Path MTU – AWS and GCP

Part 2: Path MTU and ICMP

To understand the relationship between MTU, MSS, IP fragmentation and Path MTU Discovery (PMTUD), please refer to the following Cisco documentation.

We can confirm the Path MTU as 1390 bytes by running a tracepath command from gcp-host-1 (172.16.10.2) to aws-host-1 (10.0.1.113) as shown in the output below.

kayode@gcp-host-1:~$ tracepath 10.0.1.113
 1?: [LOCALHOST] pmtu 1460
 1: 10.0.1.113 2.445ms pmtu 1390
 1: no reply
 2: no reply
 3: no reply
...
30: no reply
 Too many hops: pmtu 1390
 Resume: pmtu 1390
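If we want to check the path MTU from a script rather than by eye, the final `Resume:` line of the output can be extracted with a few lines of Python. This is only a rough sketch: the function name `resume_pmtu` is hypothetical, and the sample text is trimmed from the output above.

```python
import re

def resume_pmtu(tracepath_output):
    """Return the PMTU reported on the 'Resume:' line, or None if it is absent."""
    match = re.search(r"^\s*Resume:\s*pmtu\s+(\d+)", tracepath_output, re.MULTILINE)
    return int(match.group(1)) if match else None

# Sample trimmed from the tracepath output shown above
sample = """\
 1?: [LOCALHOST] pmtu 1460
 1: 10.0.1.113 2.445ms pmtu 1390
 Too many hops: pmtu 1390
 Resume: pmtu 1390
"""

print(resume_pmtu(sample))  # 1390
```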

 

Earlier, we established that GCP instances have an MTU of 1460 bytes and AWS instances have an MTU of 9001 bytes. Yet the path MTU between gcp-host-1 and aws-host-1 is 1390 bytes.

Why is the path MTU lower than 1460 and 9001 bytes?

Outside of the GCP and AWS cloud infrastructure, MTU values on the Internet are typically 1500 bytes or less. This means the Internet gateways in both GCP and AWS have an MTU of 1500 bytes on their outbound Internet interfaces. Therefore, our IPsec VPN connection and any other traffic sent over the Internet gateways are limited to an MTU of 1500 bytes. IP packets larger than 1500 bytes are fragmented, or dropped if the Don’t Fragment flag is set in the IP header.

The path MTU is the maximum packet size between a source GCP host and a destination host. The path MTU takes into account not only the MTU of the GCP and AWS instances, but also the MTU of the routers and devices on the Internet that the packets pass through.

Path MTU Discovery (PMTUD) is used to determine the MTU size on the network path between two communicating hosts without the need for IP packet fragmentation. In essence, PMTUD discovers the minimum MTU that is supported along the entire path of communication between two hosts. In our scenario PMTUD has discovered that 1390 bytes is the minimum MTU along the entire path.
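Put another way, the path MTU is simply the smallest MTU of any link on the path. The sketch below illustrates this with a hypothetical hop list: the values 1460, 1500, 1390 and 9001 appear in this article, but the number and order of the hops are assumptions made purely for illustration.

```python
def path_mtu(link_mtus):
    """Return the path MTU: the smallest MTU among all links on the path."""
    if not link_mtus:
        raise ValueError("path must contain at least one link")
    return min(link_mtus)

# Hypothetical hop-by-hop MTUs from gcp-host-1 to aws-host-1 (topology assumed)
hops = [1460, 1500, 1390, 1500, 9001]

print(path_mtu(hops))  # 1390
```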

Let’s now explore in detail how PMTUD works, in the two stages explained below.

Stage 1 – Exchanging MSS Values

In our scenario, the gcp-host-1 MSS is 1420 bytes while the aws-host-1 MSS is 8961 bytes. The two hosts communicate their Send MSS values to each other.  The MSS is specified as a TCP option, initially in the TCP SYN packet during the TCP handshake. The figure below summarises the exchange of MSS values.

The following steps outline how two hosts communicate their MSS values to each other:

  1. gcp-host-1 compares its MSS buffer (65K) and its MTU – 40 (1460 – 40 = 1420) and uses the lower value (1420) as the MSS to send to aws-host-1.
  2. aws-host-1 receives gcp-host-1’s Send MSS (1420) and compares it to the value of its outbound interface MTU – 40 (9001 – 40 = 8961).
  3. aws-host-1 sets the lower value (1420) as the MSS for sending IP datagrams to gcp-host-1.
  4. aws-host-1 compares its MSS buffer (65K) and its MTU – 40 (9001 – 40 = 8961) and uses the lower value (8961) as the MSS to send to gcp-host-1.
  5. gcp-host-1 receives aws-host-1’s Send MSS (8961) and compares it to the value of its outbound interface MTU – 40 (1460 – 40 = 1420).
  6. gcp-host-1 sets the lower value (1420) as the MSS for sending IP datagrams to aws-host-1.
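The arithmetic in the six steps above can be sketched in a few lines of Python. The function names are hypothetical, the 40 bytes are assumed to be a 20-byte IPv4 header plus a 20-byte TCP header without options, and the "65K" buffer is assumed to mean 65535 bytes.

```python
IP_TCP_HEADERS = 40   # assumed: 20-byte IPv4 header + 20-byte TCP header, no options
MSS_BUFFER = 65535    # assumed exact value of the "65K" MSS buffer

def advertised_mss(interface_mtu, mss_buffer=MSS_BUFFER):
    """MSS a host announces in its SYN: the lower of its buffer and MTU - 40."""
    return min(mss_buffer, interface_mtu - IP_TCP_HEADERS)

def send_mss(local_mtu, peer_advertised_mss):
    """MSS a host actually uses: the lower of the peer's MSS and its own MTU - 40."""
    return min(peer_advertised_mss, local_mtu - IP_TCP_HEADERS)

gcp_advertised = advertised_mss(1460)        # steps 1 and 5: 1420
aws_advertised = advertised_mss(9001)        # steps 2 and 4: 8961

aws_send = send_mss(9001, gcp_advertised)    # step 3: 1420
gcp_send = send_mss(1460, aws_advertised)    # step 6: 1420

print(gcp_advertised, aws_advertised, aws_send, gcp_send)  # 1420 8961 1420 1420
```

Note that both directions end up at 1420 bytes because the smaller GCP MTU is the limiting factor on each side of the comparison.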

Both gcp-host-1 and aws-host-1 have now settled on a Send MSS of 1420 bytes (corresponding to an MTU of 1460 bytes). This implies that gcp-host-1 will send IP packets of up to 1460 bytes out of its eth0 interface towards aws-host-1.

Each host has negotiated an MSS that the other host will accept without IP packet fragmentation. However, the intermediate network devices between the hosts might not support a 1460-byte MTU. Indeed, the tracepath results indicate a path MTU of 1390 bytes, which means at least one network device along the path only supports an MTU of 1390 bytes. This is where Path MTU Discovery comes in.

Stage 2 – Path MTU Discovery and ICMP

After gcp-host-1 has established that its MTU is 1460 bytes, the next step is to forward the packet towards aws-host-1. The diagram below illustrates how the subsequent communication occurs.

When gcp-host-1 sends a packet larger than the MTU of a device along the path (Internet Router), the device returns the following ICMP message: Destination Unreachable: Fragmentation Needed and Don’t Fragment was Set (ICMP Type 3, Code 4). This is because the packet would require fragmentation at the Internet Router, whose upstream MTU of 1390 bytes is less than the packet size of 1460 bytes. Since the Don’t Fragment flag is set and fragmentation is not allowed, the router drops the packet and sends the ICMP message back to gcp-host-1.

The ICMP message also contains the MTU reported by the device (Internet Router) returning the error, which in this case is 1390 bytes. This prompts gcp-host-1 to lower the MTU it uses towards aws-host-1 to 1390 bytes and retransmit the packet. The process is repeated until all intermediate network devices can forward the IP packet all the way to the remote host aws-host-1.
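The retry loop described above can be simulated with a small Python sketch. It does not send real packets: the function name `discover_path_mtu` and the list of link MTUs are hypothetical, and every packet is assumed to carry the Don’t Fragment flag.

```python
def discover_path_mtu(local_mtu, link_mtus):
    """Simulate PMTUD with DF set; return (path_mtu, mtus_reported_via_icmp)."""
    packet_size = local_mtu
    icmp_mtus = []
    while True:
        for link_mtu in link_mtus:
            if packet_size > link_mtu:
                # The router cannot forward without fragmenting, so it drops the
                # packet and reports its MTU in an ICMP Type 3, Code 4 message.
                icmp_mtus.append(link_mtu)
                packet_size = link_mtu  # sender shrinks the packet and retries
                break
        else:
            # No link rejected the packet, so it reached the destination.
            return packet_size, icmp_mtus

# Hypothetical links between gcp-host-1 and aws-host-1 (order assumed)
links = [1500, 1390, 1500, 9001]

print(discover_path_mtu(1460, links))  # (1390, [1390])
```

In this simulated path a single ICMP message is enough, because only one link has an MTU below 1460 bytes. A path with several progressively smaller links would need one round trip per reduction.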

Significance of Path MTU

Understanding the path MTU and ICMP behaviour is crucial in troubleshooting MTU and IP packet fragmentation issues. Specifically, it helps in designing robust cloud architectures that take into consideration the propagation of ICMP Type 3, Code 4 messages.

Next, we will look at the design considerations of the GCP and AWS cloud environments in Part 3: GCP Architecture Design Considerations.
