Understanding the Internet Jigsaw Puzzle
Internet Basics
Copyright © 2000 by Greg Reddick
The Internet can basically be viewed as a bunch of computers hooked together with wire, fiberoptic cable, and wireless receivers, all of which understand a single protocol for exchanging data called Internet Protocol or IP. A protocol is a way of arranging data so that it can be understood by something else that understands the same protocol. It is like a common language; anyone who speaks Spanish can communicate with anyone else who also speaks Spanish.
Each machine on the Internet is given an address in the form of a four-byte number called an IP address. When written out, the IP address is generally expressed as the decimal values of the bytes separated by periods, such as 220.127.116.11.
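As a minimal sketch of the four-byte idea, Python's standard ipaddress module can show the bytes behind the dotted form. The address is the example used above; the variable names are only illustrative.

```python
import ipaddress

# The example address from the text, written in dotted-decimal form.
addr = ipaddress.IPv4Address("220.127.116.11")

# The same address as its four raw bytes.
raw = addr.packed
print(list(raw))  # one decimal value per byte

# The four bytes can also be read as a single 32-bit number.
as_number = int.from_bytes(raw, "big")
print(as_number == int(addr))
```

The dotted form is purely a notation for people; machines exchange the four bytes.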
The IP address for a computer can either be permanently attached to the particular computer, which is called a Static IP Address, or assigned to the computer when it connects to the Internet, called a Dynamic IP Address. Most dial-in services, such as America Online, use Dynamic IP addresses, whereas machines such as web servers and email servers use Static IP addresses.
A series of bytes is packaged up using the Internet Protocol into a packet and moved from one machine to another over the Internet. The Internet Protocol adds some bytes to the front of the packet, called a packet header. Among the information in the packet header is the IP address that the packet should eventually reach.
When a machine gets a packet, it looks at the destination IP address, does some lookups in its tables, and based on that information sends the packet on to another computer. This continues until the packet either reaches its destination computer or a timeout is reached. If the timeout is reached, the packet is simply deleted. [http://www.ietf.org/rfc/rfc0791.txt]
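The hop-by-hop forwarding described above can be sketched roughly as follows. This is a simplified model, not how any real router is written: the machine names, the ROUTES tables, and the 10.0.0.9 address are all hypothetical, and the limit is modeled as a hop counter (RFC 791 calls the corresponding header field Time to Live).

```python
from dataclasses import dataclass

@dataclass
class Packet:
    destination: str  # destination IP address from the packet header
    ttl: int          # remaining hops before the packet is discarded
    payload: bytes

# Hypothetical lookup tables: for each machine, which machine to hand
# a packet to for a given destination address.
ROUTES = {
    "A": {"10.0.0.9": "B"},
    "B": {"10.0.0.9": "C"},
    "C": {"10.0.0.9": "10.0.0.9"},
}

def deliver(packet: Packet, start: str) -> bool:
    """Forward hop by hop; return True if the destination is reached."""
    current = start
    while current != packet.destination:
        if packet.ttl <= 0:
            return False  # limit reached: the packet is simply dropped
        next_hop = ROUTES.get(current, {}).get(packet.destination)
        if next_hop is None:
            return False  # no table entry for this destination
        packet.ttl -= 1
        current = next_hop
    return True
```

For example, deliver(Packet("10.0.0.9", 8, b"hi"), "A") reaches the destination after three hops, while the same packet with a limit of 2 is dropped along the way.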
Each machine connected to the Internet has an Internet Service Provider or ISP. The Internet Service Provider provides the connection between the local machine and the Internet as a whole. The ISP also generally provides other services such as Domain Name Service and email. For this service, the user of the local machine generally pays a monthly or annual fee.
Several ISPs connect to a higher-level ISP, which eventually connects to a sort of super-ISP called a Backbone.
The backbone computers are connected to each other. So a local machine's packet will be sent up the chain of ISPs until it gets to a machine that is the parent of the machine the packet is meant for, whereupon it is sent down through the ISPs until it reaches the destination IP address. If, instead, the packet reaches a backbone computer that isn't a parent of the destination computer, it is redirected to another backbone that is the parent of the destination machine. All this exchange happens very quickly, so it generally takes a packet less than a second to go from one computer to another, even if it is on the other side of the planet.
Because IP addresses are hard to remember, a scheme was formulated to add a friendlier name to some IP addresses, called a Domain Name (or sometimes simply a Domain). An example of a Domain Name is MICROSOFT.COM, which gets turned into the IP address 18.104.22.168. Generally only Static IP addresses get Domain Names assigned to them.
First level domains are .COM, .NET, .ORG, .EDU, .GOV, .MIL, and .INT. Second level domains are names such as MICROSOFT.COM. Third level domains are names such as WWW.MICROSOFT.COM. Second level domains are rented from a Domain Name Registry, such as Network Solutions, Inc. There are also two-letter first level domains for each country; for example, ES for Spain.
Each local computer attached to the Internet has a designated machine that it uses for Domain Name resolution, called a Domain Name Service or DNS, usually provided by the ISP. The Domain Name Service turns a Domain Name into an IP Address when asked.
When a DNS is asked to resolve a Domain Name, such as WWW.MICROSOFT.COM, into an IP address, it talks to a machine called a Domain Server. A Domain Server is a machine that keeps a table of Domain Names and their corresponding IP Addresses. A Domain Server may either keep that information directly or delegate the responsibility for keeping that information to another Domain Server.
For example, a request for the IP Address of WWW.MICROSOFT.COM will be passed to the DNS, which asks the COM Domain Server for the information. It has delegated that information to the MICROSOFT.COM Domain Server. This Domain Server has, in turn, delegated that information to the WWW.MICROSOFT.COM Domain Server. This server keeps the information on the IP Address for WWW.MICROSOFT.COM, 22.214.171.124, and returns that information to the DNS, which forwards it on to the local computer. As this information changes infrequently, if ever, the DNS also caches the information it retrieved, so that it doesn't have to look it up again if another request to WWW.MICROSOFT.COM is made.
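The delegation chain and the caching step in this example can be modeled with a few dictionaries. This is only a sketch of the idea as described here, not the real DNS wire protocol: the SERVERS tables, the "delegate"/"address" labels, and the lookups list are assumptions made for illustration.

```python
# Hypothetical tables, one per Domain Server. Each entry either
# delegates to another Domain Server or holds the final IP address.
SERVERS = {
    "COM": {"WWW.MICROSOFT.COM": ("delegate", "MICROSOFT.COM")},
    "MICROSOFT.COM": {"WWW.MICROSOFT.COM": ("delegate", "WWW.MICROSOFT.COM")},
    "WWW.MICROSOFT.COM": {"WWW.MICROSOFT.COM": ("address", "22.214.171.124")},
}

cache = {}    # answers the DNS has already retrieved
lookups = []  # which Domain Servers were asked, kept for illustration

def resolve(name):
    """Resolve a Domain Name by following delegations, caching the result."""
    name = name.upper()
    if name in cache:
        return cache[name]  # answered without asking any Domain Server
    server = name.rsplit(".", 1)[-1]  # start at the first level domain
    while True:
        lookups.append(server)
        kind, value = SERVERS[server][name]
        if kind == "address":
            cache[name] = value
            return value
        server = value  # follow the delegation to the next Domain Server
```

The first call to resolve("www.microsoft.com") asks three Domain Servers in turn; a second call is answered from the cache and asks none.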
When a new second level domain is registered with the Domain Name Registry, it must specify the Domain Server to which the first level domain will delegate the resolving of the second level and higher Domain Names. The ISP generally provides all second level and higher Domain Servers, as well as the DNS. [http://www.ietf.org/rfc/rfc1034.txt]
When the destination machine gets the packet, it must know what it should do with it. So additional protocols are layered inside the packet. The most common is the Transmission Control Protocol or TCP. It adds a further header to the packet that tells the destination machine how to take a series of packets and put them together to form a larger message. [http://www.ietf.org/rfc/rfc0793.txt]
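Putting a series of packets back together can be illustrated with a small function. This is a deliberately simplified sketch: real TCP sequence numbers count bytes and the protocol also handles acknowledgement and retransmission, none of which is modeled here. The reassemble name and the (sequence_number, data) pairs are assumptions for illustration.

```python
def reassemble(segments):
    """Join (sequence_number, data) pairs into one message.

    Segments may arrive in any order; sorting by sequence number
    restores the original order. Duplicate sequence numbers are
    collapsed, keeping the last copy seen.
    """
    unique = dict(segments)
    return b"".join(unique[seq] for seq in sorted(unique))
```

For example, the segments (2, b"lo, "), (1, b"Hel"), (3, b"world") arriving out of order are joined back into b"Hello, world".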
Another common protocol is the User Datagram Protocol or UDP, which is used for single packet exchanges between machines. [http://www.ietf.org/rfc/rfc0768.txt] The Internet is frequently thought to use a combination of TCP and IP, and is commonly referred to as TCP/IP, although in reality it also uses UDP/IP and other protocols as well.
When a TCP or UDP packet is sent from one machine to another, it is addressed to a number on the destination machine called a port. A port is simply a number between 1 and 65535 on which a single program on each machine may wait for a packet to arrive. A packet is also designated as coming from a port, and the receiving program may send information back to that original port if it needs to.
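A small experiment on a single machine shows both ports at work. This sketch uses Python's standard socket module over the loopback address 127.0.0.1, so nothing leaves the local computer; asking for port 0 lets the operating system choose a free port, and the two-second timeouts are an arbitrary safety margin.

```python
import socket

# A program waits for packets on a port. Port 0 asks the operating
# system to pick any free port on the loopback address.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
receiver_port = receiver.getsockname()[1]

# Another socket sends a single UDP packet addressed to that port.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.settimeout(2.0)
sender.sendto(b"ping", ("127.0.0.1", receiver_port))

# The receiver learns which port the packet came from and replies to it.
data, (source_ip, source_port) = receiver.recvfrom(1024)
receiver.sendto(b"pong", (source_ip, source_port))

reply, _ = sender.recvfrom(1024)

sender.close()
receiver.close()
```

The sender never chose a port explicitly; the operating system assigned one when the first packet was sent, and that is the port the reply goes back to.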
Within a set of TCP packets, a third level of protocols is used to further identify the information within. Two common protocols are HyperText Transfer Protocol or HTTP, and File Transfer Protocol or FTP. [http://www.ietf.org/rfc/rfc2616.txt] [http://www.ietf.org/rfc/rfc0959.txt] FTP is the older of the two, and is used to move a file from one machine to another. HTTP is similar, except that the content of the packets is considered to contain a text file formatted using HyperText Markup Language or HTML. Usually, a given protocol has a default port that a packet is addressed to. For example, HTTP is by default addressed to port 80, and FTP is addressed to port 21. This default port can be overridden; an HTTP packet could be addressed to port 8080 instead, for example.
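The default-port rule can be expressed in a few lines with Python's standard urllib.parse module. The DEFAULT_PORTS table holds only the two defaults mentioned above, and port_for is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Default ports for the two protocols discussed in the text.
DEFAULT_PORTS = {"http": 80, "ftp": 21}

def port_for(url):
    """Return the explicit port in a URL, or the protocol's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # the default was overridden in the URL
    return DEFAULT_PORTS[parts.scheme]
```

With this helper, http://www.microsoft.com/ maps to port 80, while http://www.microsoft.com:8080/ maps to port 8080.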
A Firewall is a program, machine, or device that prevents certain packets passing through it from going to their intended destination. A firewall looks at each packet and, based on specified criteria, decides whether to forward the packet or reject it. A firewall may reject a packet if it doesn't come from a designated IP address, or if it is addressed to some unexpected port, for example. A company commonly places a firewall between its internal network (called an Intranet) and the Internet at large. A large organization may have several layers of firewalls between various machines.
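The two example criteria above translate into a very small filter. This is a sketch of the decision only, not of a real firewall product; the InboundPacket type, the allowed address 192.0.2.10, and the rule sets are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InboundPacket:
    source_ip: str
    destination_port: int

# Hypothetical criteria: one designated source address, two expected ports.
ALLOWED_SOURCES = {"192.0.2.10"}
ALLOWED_PORTS = {80, 21}

def should_forward(packet):
    """Forward only packets from a designated address to an expected port."""
    return (packet.source_ip in ALLOWED_SOURCES
            and packet.destination_port in ALLOWED_PORTS)
```

A packet from 192.0.2.10 to port 80 is forwarded; the same packet addressed to port 8080, or arriving from any other address, is rejected.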
A proxy is a machine that intercepts the normal communication between a local machine and a destination machine. The first time a request is made for some information, it may cache the information on the proxy machine so that further requests for the same information are retrieved from the cache instead of from the Internet. This interception is done transparently to the local machine. Large organizations use proxies to reduce the amount of traffic across their connection to the Internet. If 300 people on the Intranet all request http://www.dilbert.com, only one copy of the web site actually gets downloaded to the proxy. Further requests for that URL are retrieved from the proxy instead of from the destination machine until the proxy decides that the information might be out of date and refreshes its content. Frequently the services of a proxy and a firewall are combined in some way.
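The caching behavior can be sketched as a class that remembers what it has already downloaded. The CachingProxy name, the fetch function passed in, and the downloads counter are assumptions for illustration; a real proxy would also decide on its own when content is out of date, which is reduced here to a manual refresh.

```python
class CachingProxy:
    def __init__(self, fetch):
        self._fetch = fetch   # function that retrieves a URL from the Internet
        self._cache = {}
        self.downloads = 0    # how many times the Internet was actually used

    def get(self, url):
        """Return the content for a URL, downloading it only on a cache miss."""
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
            self.downloads += 1
        return self._cache[url]

    def refresh(self, url):
        """Forget a cached copy so the next request downloads it again."""
        self._cache.pop(url, None)
```

If 300 requests for http://www.dilbert.com pass through one CachingProxy, the downloads counter ends at 1.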
One arrangement of cheap hardware that is capable of exchanging packets from place to place is called an Ethernet, although there are many others. The hardware involved in exchanging packets is irrelevant to the Internet Protocol. An Ethernet adds some addressing information of its own to the packets. Another device that may be used in the transfer of packets is a Modem, which turns an electrical representation of sound into bytes, and vice versa.
So a packet might contain:
Ethernet addressing information
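The way each layer wraps the one inside it can be sketched as follows. The layers shown are the ones discussed earlier in this article (Ethernet, IP, TCP, HTTP, HTML); the bracketed byte strings are placeholder labels, not real header formats, and wrap is a hypothetical helper.

```python
def wrap(header: bytes, payload: bytes) -> bytes:
    """Put a header in front of whatever the inner layer produced."""
    return header + payload

# Innermost content first, then one header per layer, working outward.
html = b"<html>...</html>"
http = wrap(b"[HTTP header]", html)
tcp = wrap(b"[TCP header]", http)
ip = wrap(b"[IP header]", tcp)
frame = wrap(b"[Ethernet addressing]", ip)

print(frame)
```

Each layer only needs to understand its own header; everything after it is opaque payload handed on to the next layer.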
When a local machine needs a Dynamic IP Address, one must be assigned to it somehow. This allocation is done by a Dynamic Host Configuration Protocol Server, or DHCP Server. A DHCP Server assigns IP addresses to local machines as necessary. The ISP generally provides the DHCP service as well. [http://www.ietf.org/rfc/rfc2131.txt]
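Handing out addresses as necessary amounts to managing a pool. The sketch below models only that bookkeeping, not the DHCP message exchange itself; the AddressPool class, the machine identifiers, and the 192.0.2.x addresses are hypothetical.

```python
import ipaddress

class AddressPool:
    def __init__(self, first, count):
        start = ipaddress.IPv4Address(first)
        # Consecutive addresses available for assignment.
        self._free = [str(start + i) for i in range(count)]
        self._leases = {}  # machine identifier -> assigned address

    def request(self, machine_id):
        """Assign an address, reusing the existing one for a known machine."""
        if machine_id in self._leases:
            return self._leases[machine_id]
        if not self._free:
            raise RuntimeError("no addresses left in the pool")
        address = self._free.pop(0)
        self._leases[machine_id] = address
        return address

    def release(self, machine_id):
        """Return a machine's address to the pool when it disconnects."""
        address = self._leases.pop(machine_id, None)
        if address is not None:
            self._free.append(address)
```

Because released addresses go back into the pool, the same address may be given to a different machine later, which is what makes the address dynamic.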
A Uniform Resource Locator, or URL, is an identifier of a particular resource on the Internet. It is broken into several parts:
Put together, the example URL would be http://www.microsoft.com/ie/default.asp?query=select+*+from+table. [http://www.ietf.org/rfc/rfc2396.txt]
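The example URL can be taken apart again with Python's standard urllib.parse module. This is just one way to see the pieces; the labels in the comments are informal descriptions, not necessarily the names used in the RFC.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.microsoft.com/ie/default.asp?query=select+*+from+table"
parts = urlsplit(url)

print(parts.scheme)    # the protocol
print(parts.hostname)  # the domain name
print(parts.path)      # the resource on that machine
print(parts.query)     # the raw query string

# In a query string, "+" stands for a space.
query_values = parse_qs(parts.query)
print(query_values)
```

Decoding the query turns select+*+from+table back into the text "select * from table".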
With the exception of HTTP, all of the technologies in the previous discussion have been available for many years. What changed the Internet dramatically and turned it into what it is today is the invention of HTTP, HTML, and the Web Browser.
Continue to Part II: Web Basics
[www.xoc.net] Copyright © 1997-2009 by Gregory Reddick . All Rights Reserved. 02/20/09 01:28