How the Internet works

Understanding the pillars on which the Internet rests is fundamental to understanding the server side.

Introduction

Before we begin, let's define what a protocol is.

A communications protocol is a set of rules that two or more entities use to communicate and transmit information.

The Internet relies on a multitude of protocols: for example, HTTP for web browsing, SMTP and IMAP/POP for sending and receiving email, and RTSP for streaming video.

A mail server can communicate with our email application because both use protocols such as SMTP, IMAP or POP. You could say that two or more entities speak the same language when they use the same protocol.

Internet protocol suite

The Internet protocol suite is a set of network protocols used on the Internet and other networks. It is commonly referred to as TCP/IP, after the first two protocols defined in the suite. This suite specifies how the data will be packaged, processed, sent and received.

It was created in the 70s by the U.S. Department of Defense for use in the ARPANET, the precursor network of the current Internet.

This protocol suite is divided into 4 layers. It is often compared with the 7-layer conceptual model called the "Open Systems Interconnection" (OSI) model, although TCP/IP predates OSI and is not an implementation of it. A model is a description of an idea or concept (in this case the layered system) that lets us explain and understand how it works. The division into layers abstracts away part of the complexity, so that each layer only takes care of its own task.

These four layers are: Application, Transport, Internet and Link.

[Figure: the SOURCE and DESTINATION nodes each stack the Application, Transport, Internet and Link layers; the two ROUTERs between them have only the Internet and Link layers.]
The four layers on which the Internet works

The Application and Transport layers exist only at the source and destination nodes while the Internet and Link layers also exist at intermediate nodes, such as routers and other devices.

Application layer

The application layer handles protocols related to the application through which we are going to communicate.

As we saw earlier, a protocol is a set of rules for two or more entities to communicate and understand each other.

Therefore, if we are sending an email, our mail application (the sender) will communicate with the mail server (the receiver) through a set of rules. These rules form the SMTP protocol, which is used to send emails.

On the other hand, if we access web pages through our browser, it will request the content from the web server using the HTTP or HTTPS protocol.

In this way both applications speak the same language and understand each other.

In this layer only the necessary data is processed so that the source application and the target application can communicate.

[Figure: a DATA block containing "GET / HTTP/2.0" and "HOST: example.com".]
The output of the application layer
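As a minimal sketch of what the application layer produces, the snippet below builds the raw bytes of a GET request. It uses HTTP/1.1 rather than the HTTP/2.0 shown in the figure, because HTTP/1.1 is plain text and easy to read; the function name `build_http_get` and the `Connection: close` header are choices made for this illustration.

```python
def build_http_get(host: str, path: str = "/") -> bytes:
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",         # mandatory header in HTTP/1.1
        "Connection: close",     # ask the server to close after replying
        "",                      # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_http_get("example.com")
print(request.decode("ascii"))
```

These bytes are all the application layer contributes; everything needed to deliver them is added by the layers below.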

The application layer interacts with the transport layer via a port. Protocols have default ports associated with them. For example, the SMTP protocol uses port 25, while the HTTP protocol uses port 80.
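The idea of default ports can be sketched as a small lookup. The table `DEFAULT_PORTS` and the helper `port_for` are hypothetical names for this example; only the port numbers themselves (25, 80, 443) are the standard ones.

```python
from urllib.parse import urlsplit

# Well-known default ports for a few application protocols.
DEFAULT_PORTS = {"http": 80, "https": 443, "smtp": 25}


def port_for(url: str) -> int:
    """Return the explicit port in the URL, or the protocol's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(port_for("http://example.com/"))       # default port for HTTP
print(port_for("http://example.com:8080/"))  # explicit port wins
```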

Transport layer

Sending all the data at once would be crazy, especially if that data corresponds to a YouTube video. Imagine receiving so many megabytes all at once in a single message. If there were any problems in the transmission, you would have to start from scratch again and again.

This is why the transport layer is responsible for segmenting this application data into what are known as data packets. At the destination, the transport layer reconstructs the data from these packets.

A header is added to these packets, which contains certain information depending on the transport protocol used.

[Figure: a packet made of a HEADER (Source Port: 23136, Destination Port: 80) followed by the DATA (GET / HTTP/2.0, HOST: example.com).]
Each packet segmented by the transport layer

For example, if the data to be sent corresponds to the HTTP application protocol (web browsing), port 80 will be added to this header. Thus, the transport layer of the destination machine will know to which application it should pass the data once it has reconstructed it from the packets received.
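The segmentation described above can be sketched roughly as follows. The 8-byte header layout (source port, destination port, byte offset) is a simplification invented for this example, not the real TCP or UDP header, and the packet size of 16 bytes is deliberately tiny.

```python
import struct

# Simplified header: source port, destination port, offset of the payload.
HEADER = struct.Struct("!HHI")


def segment(data: bytes, src_port: int, dst_port: int, size: int) -> list[bytes]:
    """Split data into packets carrying at most `size` payload bytes each."""
    packets = []
    for offset in range(0, len(data), size):
        header = HEADER.pack(src_port, dst_port, offset)
        packets.append(header + data[offset:offset + size])
    return packets


def reassemble(packets: list[bytes]) -> tuple[int, bytes]:
    """Return (destination port, original data) from in-order packets."""
    dst_port = HEADER.unpack_from(packets[0])[1]
    payload = b"".join(p[HEADER.size:] for p in packets)
    return dst_port, payload


data = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
packets = segment(data, src_port=23136, dst_port=80, size=16)
print(len(packets), reassemble(packets))
```

The destination port travels in every packet, which is what lets the receiving side decide which application gets the rebuilt data.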

The two flagship protocols at this layer are TCP and UDP, although there are more.

TCP protocol

TCP (Transmission Control Protocol) is a reliable protocol. This means that at the source, when TCP splits our data into several packets to be sent, it also ensures that all the packets reach their destination, resending lost packets if necessary.

On the other hand, at the destination the TCP protocol is responsible for reconstructing the application data from the packets that have arrived. The order in which the packets were received does not matter, since the TCP header carries the sequence numbers needed to reorder them.
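A toy illustration of that reordering, assuming each packet is a pair of (sequence number, payload). Real TCP sequence numbers count bytes from a random starting value and wrap around; here they simply start at 0.

```python
import random


def reorder(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the byte stream from (sequence number, payload) pairs."""
    return b"".join(payload for _, payload in sorted(segments))


data = b"The order of arrival does not matter."
segments = [(i, data[i:i + 8]) for i in range(0, len(data), 8)]
random.shuffle(segments)           # simulate out-of-order arrival
assert reorder(segments) == data   # sorting by sequence number restores it
```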

UDP protocol

UDP (User Datagram Protocol) is a protocol that stands out for its speed. The price to pay is that there is no guarantee that the packets will arrive at their destination or that they will arrive in order. This protocol is usually used to transport voice or video packets, where the loss of a few packets is acceptable and hardly alters the result.
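To show how little ceremony UDP involves, here is a small sketch that sends one datagram to a receiver on the same machine using Python's standard `socket` module. The payload and the 2-second timeout are arbitrary choices for the example.

```python
import socket

# Receiver: bind to the loopback address and let the OS pick a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever if the datagram is lost
address = receiver.getsockname()

# Sender: no connection setup, the datagram is simply sent.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame 1", address)

payload, source = receiver.recvfrom(1024)
print(payload, "from", source)

sender.close()
receiver.close()
```

Nothing in this exchange confirms delivery: if the datagram were lost, the sender would never know, and the receiver would only notice through the timeout.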

Internet layer

The Internet layer is responsible for the routing of the packets. In other words, it is the layer that contains information about the sender and receiver of the data.

IP protocol

The most common protocol at this layer is called Internet Protocol (IP). Hence TCP/IP. And yes, IPv4 and IPv6 are versions of this protocol.

At the internet layer, a new header is added to the packet (ahead of the transport layer header). This header adds information such as the version of the IP protocol used, the source IP and destination IP, the size of the packet, etc.

[Figure: a packet made of the internet HEADER (Source IP: 37.87.22.1, Destination IP: 123.35.54.68), the transport HEADER (Source Port: 23136, Destination Port: 80) and the DATA (GET / HTTP/2.0, HOST: example.com).]
The result of the packets after passing through the internet layer
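For the curious, the sketch below packs a 20-byte IPv4 header (without options) for the two addresses used in the figure. The payload length of 45 bytes, the TTL of 64 and the zeroed identification field are placeholder values chosen for this example.

```python
import ipaddress
import struct


def checksum(header: bytes) -> int:
    """Internet checksum: ones' complement of the ones' complement sum."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header (no options) carrying a TCP payload."""
    fields = [
        (4 << 4) | 5,      # version 4, header length of 5 x 32-bit words
        0,                 # type of service
        20 + payload_len,  # total length of the packet in bytes
        0,                 # identification
        0,                 # flags and fragment offset
        ttl,               # time to live (maximum number of hops)
        6,                 # protocol number 6 = TCP
        0,                 # checksum placeholder, filled in below
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    ]
    fmt = "!BBHHHBBH4s4s"
    fields[7] = checksum(struct.pack(fmt, *fields))
    return struct.pack(fmt, *fields)


header = ipv4_header("37.87.22.1", "123.35.54.68", payload_len=45)
print(header.hex())
```

A router can validate the header by running the same checksum over all 20 bytes: a correct header yields 0.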

With this data the packet is now ready to be sent by cable, antenna or any other means.

Link layer

The link layer is the lowest-level layer. Here information is added to the packet coming from the Internet layer (at this layer it is called a frame), both at the beginning (header) and at the end (footer). This information contains data such as the physical (MAC) addresses of the devices sending and receiving the frame, and it depends on the protocol used at this layer. Ethernet and Wi-Fi are the most common ones; the physical address takes its name from the medium access control (MAC) sublayer that uses it.

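[Figure: a frame made of the link HEADER (Source MAC: xx.yy, Destination MAC: aa.bb), the internet HEADER (Source IP: 37.87.22.1, Destination IP: 123.35.54.68), the transport HEADER (Source Port: 23136, Destination Port: 80), the DATA (GET / HTTP/2.0, HOST: example.com) and a FOOTER.]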
The packet (now called frame) after passing through the link layer
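As a rough sketch of the header and footer, the snippet below wraps an IP packet in a simplified Ethernet frame. The MAC addresses are made-up values, the payload is a dummy filler, and details of real Ethernet such as the preamble and minimum-size padding are left out.

```python
import struct
import zlib


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def ethernet_frame(dst_mac: str, src_mac: str, ip_packet: bytes) -> bytes:
    """Add an Ethernet-style header and a CRC-32 footer to an IP packet."""
    # Header: destination MAC, source MAC, EtherType 0x0800 (IPv4).
    header = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0800)
    body = header + ip_packet
    # Footer: frame check sequence, a CRC-32 over header and payload.
    footer = struct.pack("<I", zlib.crc32(body))
    return body + footer


frame = ethernet_frame("02:00:00:00:00:02", "02:00:00:00:00:01", b"x" * 46)
print(len(frame), frame[:14].hex())
```

The receiver recomputes the CRC over everything except the last four bytes and discards the frame if the values differ.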

Once processed, the packet is sent over the network.

How do the packets reach their destination?

Without going into further detail, the packets pass through routers and other devices whose function is to route (to mark the route, the path).

When a packet leaves a device, it leaves in a generally correct direction (packets can get lost or take longer routes than normal, hence the need for the TCP protocol).

The packet then arrives at an intermediate point, for example, an Internet provider's router. This router implements only the link and internet layers.

This device first converts the received signal into a data packet via the link layer.

It then queries (via the internet layer) the destination IP address (remember that this data was added by the source internet layer in its header). As this device is not the recipient of the packet (the destination IP does not match the one assigned to it), it routes the packet to another new address, usually also in the correct direction.

Routers take these indications from routing tables that they have and which act as indicative maps.
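A routing table lookup can be sketched with Python's `ipaddress` module. The table below is entirely hypothetical (the networks and the next-hop names are invented); the rule it applies, choosing the most specific matching network, is the usual "longest prefix match".

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "isp-gateway"),  # default route
    (ipaddress.ip_network("123.0.0.0/8"), "router-a"),
    (ipaddress.ip_network("123.35.54.0/24"), "router-b"),
]


def next_hop(destination: str) -> str:
    """Pick the matching route with the longest (most specific) prefix."""
    address = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, hop) for net, hop in ROUTES if address in net]
    return max(matches)[1]


print(next_hop("123.35.54.68"))
print(next_hop("8.8.8.8"))
```

The default route `0.0.0.0/0` matches every address, so a packet always has somewhere to go even when the router knows nothing specific about its destination.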

After a few hops the packet reaches the destination device and it does the same check using the two lower layers. Since the packet was addressed to this device, the recipient check matches and the internet layer proceeds to forward the packet to the transport layer.

The transport layer picks up the received packets and reconstructs the original data. In the case of TCP, the packets are put back in the correct order. The sequence numbers in the header added by the source TCP reveal whether any packet is missing: the receiver acknowledges the data that has arrived, and the sender retransmits whatever is not acknowledged.
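Detecting a missing packet can be sketched as follows. This is a toy model in which packets are stored by the byte offset of their first byte, starting at 0; real TCP tracks this with sequence numbers and acknowledgments rather than a count of packets, and the helper `next_expected` is a name invented for this example.

```python
def next_expected(received: dict[int, bytes]) -> int:
    """Return the first byte offset not yet received contiguously from 0."""
    offset = 0
    while offset in received:
        offset += len(received[offset])
    return offset


# Packets keyed by the offset of their first byte; the one at offset 8 is lost.
received = {0: b"GET / HT", 16: b"Host: ex"}
ack = next_expected(received)        # receiver can only confirm up to the gap
received[8] = b"TP/1.1\r\n"          # the retransmitted packet arrives
ack_after = next_expected(received)  # now everything up to the end is confirmed
print(ack, ack_after)
```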

Since the packet contains information about the port used by the destination application, once the data is reconstructed, the transport layer hands over the data to said application. For example, if the destination was port 80 (HTTP), the web server would receive the request and proceed to send the user the requested web page via a new packet stream.

Thanks to the use of ports, it is possible to have several open applications communicating over the Internet without mixing the data with other applications. Each packet knows to which application it belongs.
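The role of the port on the receiving side can be pictured as a simple lookup. The `LISTENERS` table and its handlers are hypothetical stand-ins for real applications bound to ports 80 and 25.

```python
from typing import Callable

# Hypothetical applications listening on this machine, keyed by port.
LISTENERS: dict[int, Callable[[bytes], str]] = {
    80: lambda data: f"web server got {len(data)} bytes",
    25: lambda data: f"mail server got {len(data)} bytes",
}


def deliver(dst_port: int, data: bytes) -> str:
    """Hand reconstructed data to the application bound to dst_port."""
    handler = LISTENERS.get(dst_port)
    if handler is None:
        return "no application is listening on this port"
    return handler(data)


print(deliver(80, b"GET /"))
print(deliver(9999, b""))
```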

Conclusion

This is roughly how the Internet works.

Thanks to the layered separation, understanding this technological achievement is even easier.
