This is a more in-depth overview of what happens in the link layer when one host wants to send a message to another host. The last post covered how error detection works in the link layer, so I won't go into the details of error detection in this post. Rather, I will focus on keywords like MAC, LAN, Ethernet, switches, and ARP. I know, lots of buzzwords, but that's networking :0
Each host has a network card/adapter (NIC), which has a permanent MAC (Media Access Control) address. The MAC address identifies the host's adapter, which is different from an IP address that can be dynamically allocated (so a host's IP address can change). A host always has a pair of an IP address and a MAC address.
A MAC address is a 48-bit number (2^48, or about 281 trillion possible MAC addresses!), and looks like 12-34-56-78-9A-BC.
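To make the "48-bit number" part concrete, here is a minimal sketch that converts between the dashed notation and a plain integer. The function names `parse_mac` and `format_mac` are my own for illustration, not part of any standard library.

```python
def parse_mac(text: str) -> int:
    """Convert a MAC address like '12-34-56-78-9A-BC' into a 48-bit integer."""
    parts = text.split("-")
    if len(parts) != 6 or any(len(p) != 2 for p in parts):
        raise ValueError(f"not a MAC address: {text!r}")
    # Six groups of two hex digits = 12 hex digits = 48 bits.
    return int("".join(parts), 16)


def format_mac(value: int) -> str:
    """Convert a 48-bit integer back into the dashed hex notation."""
    if not 0 <= value < 2**48:
        raise ValueError("a MAC address must fit in 48 bits")
    raw = f"{value:012X}"
    return "-".join(raw[i:i + 2] for i in range(0, 12, 2))


print(2**48)                                        # 281474976710656
print(format_mac(parse_mac("12-34-56-78-9A-BC")))   # 12-34-56-78-9A-BC
```

Storing the address as an integer is just one convenient representation; the dashed string is what you normally see printed.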
Each host is directly connected to a switch. A switch has several interfaces, and it keeps track of which hosts are reachable through which interface via its switch table. The switch table has a schema of <MAC, Interface, TTL>. Here's an explanation of how the switch table gets populated.
Single LAN (Local Area Network)
Switch table (single LAN)
The switch table is initially empty. Imagine that host A below wants to send a frame to host A'. When A's frame arrives, the switch records A's MAC address and the interface it arrived on, so the table now has one entry for A. Since the switch has no entry for A' yet, it floods the frame, i.e. sends it out of every interface except the one it came in on. When host A' replies to A, the switch already knows where A is, so it forwards the reply on A's interface only (unicast, selective sending), and the reply in turn populates the switch table with one entry for A'.
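The flood-then-learn behaviour can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the class name `LearningSwitch`, the 60-second TTL, and the interface numbers are all made up for illustration.

```python
import time


class LearningSwitch:
    """Toy model of a self-learning switch (hypothetical names and TTL)."""

    def __init__(self, interfaces, ttl_seconds=60.0):
        self.interfaces = list(interfaces)
        self.ttl_seconds = ttl_seconds
        # Switch table: MAC -> (interface, expiry time), i.e. <MAC, Interface, TTL>.
        self.table = {}

    def receive(self, src_mac, dst_mac, in_interface, now=None):
        """Process one frame and return the interfaces it is sent out on."""
        now = time.monotonic() if now is None else now
        # Learn: the sender is reachable through the interface the frame came in on.
        self.table[src_mac] = (in_interface, now + self.ttl_seconds)
        entry = self.table.get(dst_mac)
        if entry is not None and entry[1] > now:
            out_interface = entry[0]
            # Destination sits on the same interface as the sender: nothing to forward.
            return [] if out_interface == in_interface else [out_interface]
        # Unknown (or expired) destination: flood to every other interface.
        return [i for i in self.interfaces if i != in_interface]


switch = LearningSwitch(interfaces=[1, 2, 3, 4])
print(switch.receive("A", "A'", in_interface=1, now=0.0))  # [2, 3, 4] (flood)
print(switch.receive("A'", "A", in_interface=4, now=1.0))  # [1] (selective)
print(switch.receive("A", "A'", in_interface=1, now=2.0))  # [4] (selective)
```

Note that the hosts never tell the switch anything explicitly. The table fills up purely as a side effect of frames passing through.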
Switch table (larger scale, still in single LAN)
Exactly the same thing happens: flooding, then selective sending. The switch tables get populated along the way, so further communication can proceed without flooding.
Single LAN ARP: Host A sends a message to B
The switch table creates a basis for link-layer forwarding: together, the tables of the connected switches effectively capture the topology between switches and hosts. However, switches have no knowledge of IP addresses, and we know that messages are sent from one host (identified by an IP address) to another host (identified by an IP address). So given source and destination IP addresses, how do we deliver a message from host A to B? All we need to do is somehow derive a source/destination MAC address pair from this source/destination IP address pair. ARP is the protocol that allows for this.
Here, what we can assume is that
- A and B have IP addresses allocated.
- A knows the IP address of B (for example, from a DNS lookup), but not its MAC address.
ARP is the Address Resolution Protocol. Each node (router, host) holds an ARP table, which has a schema of <MAC, IP, TTL>. This is different from the switch table, which does not deal with IP addresses at all. Host A attempts to communicate with host B by sending a frame. In this post's simplified notation, the frame looks like this:
[to_MAC, to_IP, from_MAC, from_IP, data]
Step 1: Broadcasting
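The initial frame that host A sends is [FF-FF-FF-FF-FF-FF, IP(B), MAC(A), IP(A)], where host A knows the values of IP(B), MAC(A), and IP(A). FF-FF-FF-FF-FF-FF is the broadcast MAC address, used here because host A does not know the MAC address of host B yet. Every host on the LAN receives this frame, but only the host whose IP address is IP(B) responds.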
Step 2: Selective sending
Host B sends back [MAC(A), IP(A), MAC(B), IP(B)] to host A.
Through these two steps, the ARP tables of host A and B are populated for each other, and indeed, we derived a source/destination MAC address pair from the source/destination IP address pair. That's it! From now on, we just need to use the underlying switch table to deliver messages from host A to host B. Notice this layered thinking: "host A (xyz.com, identified by an IP address) wants to send a message to host B (abc.com, identified by an IP address)" is a concern of the network layer (and is how we talk about it in everyday language), but this must be followed by identifying the physical path from one host to the other, which can only be done if the MAC address is known (as IP addresses are changeable, and are not a reliable indicator of who you are sending the message to!).
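Here is a small simulation of the two ARP steps, using the post's simplified frame layout [to_MAC, to_IP, from_MAC, from_IP]. It is a sketch, not real ARP: the `Host` class, the `resolve` helper, and all the addresses are made up, the TTL column is left out, and the "LAN" is just a Python list standing in for the switch's flooding.

```python
from dataclasses import dataclass, field

BROADCAST_MAC = "FF-FF-FF-FF-FF-FF"


@dataclass
class Host:
    name: str
    ip: str
    mac: str
    arp_table: dict = field(default_factory=dict)  # IP -> MAC (TTL omitted)

    def make_request(self, target_ip):
        # Step 1: [to_MAC, to_IP, from_MAC, from_IP] with the broadcast MAC.
        return (BROADCAST_MAC, target_ip, self.mac, self.ip)

    def handle(self, frame):
        """Process a frame; return a reply frame, or None if there is none."""
        to_mac, to_ip, from_mac, from_ip = frame
        if to_ip != self.ip:
            return None  # the frame is asking about someone else, so ignore it
        # Remember the sender's IP -> MAC mapping.
        self.arp_table[from_ip] = from_mac
        if to_mac == BROADCAST_MAC:
            # Step 2: answer the sender directly (selective sending).
            return (from_mac, from_ip, self.mac, self.ip)
        return None


def resolve(sender, lan, target_ip):
    """Broadcast a request on the LAN and return the resolved MAC (or None)."""
    request = sender.make_request(target_ip)
    for host in lan:
        if host is sender:
            continue
        reply = host.handle(request)
        if reply is not None:
            sender.handle(reply)
    return sender.arp_table.get(target_ip)


a = Host("A", "10.0.0.1", "AA-AA-AA-AA-AA-AA")
b = Host("B", "10.0.0.2", "BB-BB-BB-BB-BB-BB")
c = Host("C", "10.0.0.3", "CC-CC-CC-CC-CC-CC")
print(resolve(a, [a, b, c], "10.0.0.2"))  # BB-BB-BB-BB-BB-BB
print(a.arp_table, b.arp_table, c.arp_table)
```

Host C sees the broadcast but stays silent and learns nothing in this sketch, while A and B end up with entries for each other.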
Back to our main strand of discussion!
Now, several switches form a complex graph with hosts connected to them. We abstract this connection away and say that hosts are connected to a network of switches. In this post, I use 'Ethernet' and 'LAN' loosely as synonyms for this 'network of switches'. A router can be connected to a switch as well. The picture below shows hosts connected to a network of switches (blue blob), and you can see that there's a router connecting two networks of switches.
Let's briefly walk through the diagram. Recall that an IP address comes in a pair with a MAC address. The router has two interfaces, one on each LAN, so it has two IP addresses and two MAC addresses. Each host is connected to one switch only, and has one IP address and one MAC address.
The key question that will drive us is: how do we send a packet from A to B, across a router? We will be using ARP again. However, the key distinguishing point from the previous discussion is that we are sending a message across two LANs.
Multiple LAN ARP: Host A sends a message to B
We assume that the single LAN ARP procedure has already taken place. This means host A knows the MAC address of R (the gateway router).
Here's what happens. Assume the frame still looks like this: [to_MAC, to_IP, from_MAC, from_IP, data].
The problem here is that host A does not know the MAC address of host B (MAC(B)). Do note that it knows IP(B).
Step 1: host A sends [MAC(R), IP(B), MAC(A), IP(A), data] to router R.
Step 2: router R repacks the frame, and sends [MAC(B), IP(B), MAC(R), IP(A), data] to host B. (R finds MAC(B) with the same single LAN ARP procedure, run on B's side of the network.)
So here, what's important to realize is that the MAC addresses describe the actual endpoints of that particular link-layer transmission (one hop), while the IP addresses stay constant throughout the delivery of the packet from host A to host B. This is also why the router's IP address never appears in these frames.
Furthermore, MAC(R) in step 1 is the MAC address on host A's side of router R, while MAC(R) in step 2 is the one on host B's side.
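The re-framing at the router can be sketched like this. Everything here is an assumption for illustration: the `Frame` and `Router` names, the placeholder addresses, and especially the shortcut of picking the outgoing interface by looking the destination IP up in each interface's ARP table (a real router would consult a forwarding table first).

```python
from typing import NamedTuple


class Frame(NamedTuple):
    to_mac: str
    to_ip: str
    from_mac: str
    from_ip: str
    data: str


class Router:
    """Toy two-sided router: one MAC address and one ARP table per interface."""

    def __init__(self):
        self.interfaces = {}  # name -> {"mac": str, "arp": {ip: mac}}

    def add_interface(self, name, mac, arp_table):
        self.interfaces[name] = {"mac": mac, "arp": dict(arp_table)}

    def forward(self, frame, in_interface):
        """Return (out_interface, new_frame), or None if the frame is dropped."""
        if frame.to_mac != self.interfaces[in_interface]["mac"]:
            return None  # not addressed to this side of the router
        for name, iface in self.interfaces.items():
            if name != in_interface and frame.to_ip in iface["arp"]:
                # Only the MAC addresses change; the IP addresses and data stay as they are.
                new_frame = frame._replace(
                    to_mac=iface["arp"][frame.to_ip],
                    from_mac=iface["mac"],
                )
                return name, new_frame
        return None


router = Router()
router.add_interface("lan1", mac="R1-MAC", arp_table={"10.0.1.10": "A-MAC"})
router.add_interface("lan2", mac="R2-MAC", arp_table={"10.0.2.20": "B-MAC"})

step1 = Frame("R1-MAC", "10.0.2.20", "A-MAC", "10.0.1.10", "hello")
out_interface, step2 = router.forward(step1, in_interface="lan1")
print(out_interface)  # lan2
print(step2)
```

Here "R1-MAC" plays the role of MAC(R) in step 1 and "R2-MAC" plays the role of MAC(R) in step 2, while the two IP fields are identical in both frames.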
Concluding Remarks
Ahhh, that was a very long post!! I always spend 2+ hours writing up a post (excluding the time that I take to understand the material), so it's always like 2am when I wrap up my writing. I do this because I always fully (or at least almost fully) understand a concept when I learn it, but easily forget all the important details if I don't look back. You should also take time to reflect on what you know by writing it down while looking at your notes as little as possible.
Anyways, the important concept here was understanding how the link layer is the carrier or facilitator of network layer communications. This involves firstly establishing routes between hosts, identified purely by switch tables that only deal with MAC addresses. Then, the ARP protocol comes in to make sure that the intention of a transfer expressed in IP addresses is properly translated into a transfer between MAC addresses, so that it goes to the right destination (instead of some random dude that pops up to take the IP address of the original destination host). A destination identified by MAC address is reliable, since a MAC address is meant to be unique. But note that the link layer makes no guarantee of reliable transfer, so that's where TCP (of the transport layer) comes in!