
There are many ways to establish the physical connection between computers -- wire, telephone (dial-up or DSL), cable, satellite, wireless, .... We'll leave the details of the connection to the engineers, but let us consider the impact of the type of connection on what you can do with it. The speed of a connection can be measured in "bits per second (bps)", or in multiples of a thousand (kilo, Kbps), million (mega, Mbps), or even billion (giga, Gbps). For example, a fast modem can transfer about 56 Kbps, a cable modem or DSL line more like 1-2 Mbps, and the connections here on campus typically around 10-100 Mbps.
The math: if we want to transfer b bits, via a connection of c bps, it will take b/c seconds. So a small text file (50 thousand bits) would only take a second via fast dial-up, but a high-quality compressed image (maybe 8 million bits) would take more like 2 1/2 minutes. A 10 Mbps connection would handle that image in a second, but couldn't keep up with a 24 frame-per-second movie if each frame took a full 8 million bits.
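The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name transfer_seconds and the constants KBPS and MBPS are made up for illustration, and the file sizes are the rough figures used in the text.

```python
def transfer_seconds(bits, bps):
    """Seconds needed to move `bits` bits over a connection of `bps` bits per second."""
    return bits / bps

KBPS = 1_000        # a thousand bits per second
MBPS = 1_000_000    # a million bits per second

text_file = 50_000      # small text file, in bits
image = 8_000_000       # high-quality compressed image, in bits

print(transfer_seconds(text_file, 56 * KBPS))    # a bit under one second
print(transfer_seconds(image, 56 * KBPS) / 60)   # minutes over fast dial-up
print(transfer_seconds(image, 10 * MBPS))        # seconds over a 10 Mbps link

# A 24 frame-per-second movie at 8 million bits per frame needs this many bps.
movie_bps = 24 * image
print(movie_bps / MBPS)   # 192.0, far more than a 10 Mbps link can carry
```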
Once we can connect two computers, the next challenge is to connect multiple computers. Generally speaking, we can differentiate networks at two scales: local area networks (LANs) vs. wide area networks (WANs). These are convenient names, indicating whether or not the computers are physically close. Real networks can be hybrids of these, and the internet itself is just an agglomeration of lots of individual networks.
Computers in a LAN are close to each other, and their communications with each other are handled locally. For example, a LAN connects the desktop computers, printers, mail server, web server, compute clusters, etc. within the computer science department. But how are they connected, and how do they communicate with each other?
The winner? Ethernet: bus connection, broadcast messages, simple conflict resolution. Protocol: wait until the wire is free, then insert message (including id of recipient). Everyone except recipient ignores the message. If two senders start at the same time, each stops, each waits some random amount of time, and tries again (listening to see if the wire is free).
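The "wait a random time and try again" rule can be tried out in a toy simulation. The sketch below is a deliberate simplification, not real Ethernet: time is divided into slots, each message takes exactly one slot, and every sender wants to send at slot 0. The function name simulate and the parameter max_wait are invented for this example.

```python
import random

def simulate(senders, max_wait=8, seed=0):
    """Toy model of senders sharing one wire, with random waits after a collision."""
    rng = random.Random(seed)
    next_try = {s: 0 for s in senders}   # slot at which each sender will next try
    delivered = []                       # (slot, sender) for each successful send
    slot = 0
    while next_try:
        ready = [s for s, t in next_try.items() if t <= slot]
        if len(ready) == 1:
            # Only one sender started: the message goes through.
            delivered.append((slot, ready[0]))
            del next_try[ready[0]]
        elif len(ready) > 1:
            # Collision: each sender stops and waits a random number of slots.
            for s in ready:
                next_try[s] = slot + 1 + rng.randrange(max_wait)
        slot += 1
    return delivered

for slot, sender in simulate(["A", "B", "C"]):
    print(f"slot {slot}: {sender} sends")
```

Because all three senders start together, the first slot is always a collision; the random waits are what eventually let each one get a turn on the wire.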
Sharing is good. It's unlikely that each individual needs the full capacity of a wire at all times. So they share the costs of the wire, and the use of it. This approach can run into the usual problems with sharing (sometimes you really need it all; sometimes someone gets greedy, etc.), and various schemes (prioritizing, monitoring, etc.) are used to counteract them.
A WAN is a network of physically-separated computers. It can even be a network of LANs. The main point is that the communication connection is no longer shared by all computers. Thus we can't use the LAN idea of simply broadcasting a message that will be heard by the recipient.
In a WAN, only some pairs of computers are connected to each other.
A message gets passed along from computer to computer, (hopefully)
making progress from the sender toward the recipient. This is called
routing. Routing is complicated by the size of the network,
the rapid addition of new computers and new links, the fact that links
might be broken or computers might crash sometimes, etc. There are
clever algorithms to try to find good routes, and the network routers
(computers responsible for forwarding messages) frequently share
information with each other to keep those routes up to date. (The Unix
command traceroute shows the route a message takes.)
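To make the idea of a route concrete, here is a small sketch that finds a path with the fewest hops through a made-up network, using breadth-first search. Real routers use more elaborate algorithms and only have partial knowledge of the network; the machine names A through E and the function find_route are assumptions for illustration.

```python
from collections import deque

def find_route(links, source, destination):
    """Return a fewest-hops route from source to destination, or None if there is none."""
    previous = {source: None}        # how we first reached each computer
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            route = []
            while node is not None:  # walk backwards to the source
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbor in links.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# Only some pairs of computers are directly connected.
links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(find_route(links, "A", "E"))   # ['A', 'B', 'D', 'E']
```

If the link between B and D breaks, removing it from the table and searching again yields the route through C instead, which is the kind of adjustment routing has to make all the time.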
To be routed, a message must give the id of the destination
computer. We've already discussed domain
naming (e.g., how to read www.cs.dartmouth.edu
backwards). As I mentioned then, these nice readable names are for
human consumption; the computers actually work with numbers. The ids
used on the internet today, called IP addresses, are 32-bit
numbers, written as four 8-bit numbers separated by dots. For example,
www.dartmouth.edu is currently
129.170.16.79. The IP address for a domain name can
change over time (like your phone number). There is a large directory
(much like a phone directory), distributed throughout the internet,
that does the look-up from domain name to IP address (e.g., the Unix
command host). A problem (also seen with phone numbers):
we're running out of IP addresses. The next version, IPv6, has
128-bit numbers, giving about 3.4 x 10^38 different addresses, vastly
more than the roughly 4 billion that 32 bits allow.
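The relationship between the dotted form and the underlying 32-bit number can be sketched in Python. The function names dotted_to_int and int_to_dotted are invented for this example, and the address is the one quoted above, which may no longer be current.

```python
def dotted_to_int(address):
    """Pack an address like '129.170.16.79' into a single 32-bit number."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid IPv4 address: {address}")
    value = 0
    for p in parts:
        value = (value << 8) | p     # shift left 8 bits, then add the next piece
    return value

def int_to_dotted(value):
    """Unpack a 32-bit number back into four 8-bit numbers."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

number = dotted_to_int("129.170.16.79")
print(number)
print(int_to_dotted(number))   # 129.170.16.79

print(2 ** 32)    # 4294967296 possible 32-bit addresses
print(2 ** 128)   # number of possible 128-bit addresses
```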
Computer networks work very differently from telephone calls. When you call someone, a physical circuit is established between your phones, dedicated to carrying your voices to each other. As we've seen, when a computer sends a message to another, it goes via a shared connection, and hops from one computer to another along the way. In order to facilitate sharing, messages are chopped up into units called packets. An analogy is sending a letter to someone via a series of postcards. So a packet carries a portion of the message, along with the ids of the sender and receiver, and some other stuff we'll come to shortly.
Now, you can imagine a bunch of problems that could arise if you sent a bunch of postcards to a friend, who was supposed to read them like a letter. What if one got lost (or maybe just significantly delayed)? What if a later one arrived before an earlier one? What if one got smudged or torn? What if so many arrived one day that your friend couldn't manage to read them all before another pile arrived? What if your friend got two copies of one of them (maybe that wouldn't happen in the postcard world)?
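Some of these problems become manageable if every postcard is numbered. The sketch below chops a message into packets carrying a sequence number and puts them back together even when they arrive out of order or twice; it also reports which ones never arrived. The field names ("from", "to", "seq", "data") and both function names are assumptions made for illustration, not an actual packet format.

```python
def to_packets(message, size, sender, receiver):
    """Chop a message into numbered packets of at most `size` characters."""
    return [
        {"from": sender, "to": receiver, "seq": i // size, "data": message[i:i + size]}
        for i in range(0, len(message), size)
    ]

def reassemble(packets, expected_count):
    """Rebuild the message; return (message, missing sequence numbers)."""
    by_seq = {}
    for p in packets:
        by_seq.setdefault(p["seq"], p["data"])   # a duplicate is simply ignored
    missing = [n for n in range(expected_count) if n not in by_seq]
    if missing:
        return None, missing                     # cannot read the letter yet
    return "".join(by_seq[n] for n in range(expected_count)), []

packets = to_packets("Wish you were here!", 5, "me", "friend")
arrived = packets[::-1] + [packets[0]]           # reversed, plus one duplicate
print(reassemble(arrived, len(packets)))

lost_one = packets[:1] + packets[2:]             # packet 1 never arrives
print(reassemble(lost_one, len(packets)))
```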
Recall that a protocol is just a set of rules for how two parties communicate (we discussed that in the context of the Hypertext Transfer Protocol, HTTP). Internet communications are made up of layers of protocols.
Each layer must address only a limited part of the complexity of network communications, relying on lower layers to do their part. Modularity and incrementality again! The original packet gets more and more stuff added to it, as the layers provide specific information to help them do their job.
data
http info | data
tcp info | http info | data
ip info | tcp info | http info | data
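The layering shown above can be mimicked in a few lines of Python, with each layer adding its own information in front of whatever it was handed, and the receiving side peeling it off again. This is a toy sketch using strings; the functions wrap and unwrap are made up, and real headers are structured binary data rather than text.

```python
def wrap(layer_info, payload):
    """A layer adds its own information in front of what it was given."""
    return f"{layer_info} | {payload}"

def unwrap(packet):
    """The matching layer on the other end strips its information off again."""
    layer_info, payload = packet.split(" | ", 1)
    return layer_info, payload

packet = "data"
for layer in ["http info", "tcp info", "ip info"]:
    packet = wrap(layer, packet)
    print(packet)

# On the receiving side the layers are removed in the opposite order.
info, rest = unwrap(packet)
print(info)   # ip info
print(rest)   # tcp info | http info | data
```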
TCP, which is really the heart of internet communication, provides a logical (not physical) connection between two programs. The programs can act as if they are connected, and TCP handles reliable communication between them. There are several components that enable TCP to do that:
A given computer can handle many different applications. In order to distinguish, e.g., email communications from web communications, it has different ports to handle the different applications. For example, all web messages should be sent to the HTTP port, which has number 80, of the computer. An analogy is sending a letter to a company, with an "Attention" line indicating that the letter is for human resources.
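The "Attention" line idea can be sketched as a table from port numbers to the application that should receive the message. Port 80 for HTTP comes from the text and 25 is the standard port for SMTP mail delivery; the handler functions and the name deliver are hypothetical.

```python
HTTP_PORT = 80   # web
SMTP_PORT = 25   # mail delivery

# Each port is served by a different application on the same computer.
handlers = {
    HTTP_PORT: lambda data: f"web server got: {data}",
    SMTP_PORT: lambda data: f"mail server got: {data}",
}

def deliver(port, data):
    """Hand an incoming message to whichever application owns the port."""
    handler = handlers.get(port)
    if handler is None:
        return f"no application listening on port {port}"
    return handler(data)

print(deliver(80, "GET /index.html"))
print(deliver(25, "a new email"))
print(deliver(12345, "hello?"))
```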
The application messages should obey the appropriate protocol. We've already seen HTTP. As another example, Dartmouth provides a protocol for Blitz and DND. (Shown in class.) As you saw when you were programming JavaScript to extract information from a web page, life is easiest when the information is strictly structured. That's what protocols are all about.
The internet is a network of networks that has grown exponentially over the past 40+ years. Here are a few milestones in that growth, from the ISOC History site, the Computer History Museum, and Hobbes' Internet Timeline.


And it only gets bigger from there.