Detect and fix rare cases where the primary ENI does not serve default traffic

During testing, we encountered a rare scenario when launching EC2 instances with multiple ENIs: the primary ENI (device index 0) does not serve default network traffic. This occurs in approximately 1 out of 10,000 launches (0.01%).

For example, with two ENIs configured on an instance, ENI-0 (deviceIndex=0) from subnet-0 and ENI-1 (deviceIndex=1) from subnet-1, Linux may assign the name eth0 to the interface in subnet-1 instead of the expected subnet-0. In our use case, default traffic must route through the primary ENI for security compliance.

This post demonstrates how we detected and fixed the ENI out-of-order issue in a simplified environment with:

  1. Customized Amazon Linux 2023 (AL2023) AMIs without predictable network interface names
  2. Two ENIs
  3. Shell script-based solution

Find the MAC Address of the Primary ENI

According to the IMDS documentation, the instance metadata provides "the instance's media access control (MAC) address. In cases where multiple network interfaces are present, this refers to the eth0 device (the device for which the device number is 0)."

We can query IMDS for the primary ENI's MAC address using this imdsv2 script:

#!/bin/bash
# bash rather than sh: the script uses the "function" keyword and "local"

# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
# Usage: imdsv2 "mac"

function fetch_metadata() {
    if [ -z "${1}" ]; then
        echo "Usage: fetch_metadata <metadata-path>" >&2
        return 1
    fi

    local METADATA_PATH=${1}
    local TOKEN_URL="http://169.254.169.254/latest/api/token"
    local METADATA_URL="http://169.254.169.254/latest/meta-data/${METADATA_PATH}"

    # Fetch the session token (-f makes curl fail on HTTP errors instead of printing the error page)
    local TOKEN
    TOKEN=$(curl -sf -X PUT "${TOKEN_URL}" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

    if [ -z "${TOKEN}" ]; then
        echo "Failed to fetch the session token" >&2
        return 1
    fi

    # Fetch the metadata using the token
    local METADATA
    METADATA=$(curl -sf -H "X-aws-ec2-metadata-token: ${TOKEN}" "${METADATA_URL}")

    if [ -z "${METADATA}" ]; then
        echo "Failed to fetch the metadata for path: ${METADATA_PATH}" >&2
        return 1
    fi

    echo "${METADATA}"
}

fetch_metadata "${@}"
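
As a cross-check, IMDS also exposes a device number for each interface under network/interfaces/macs/<mac>/device-number. The following is a minimal sketch, not part of the original solution, that prints the device number IMDS reports for every local interface. It assumes the imdsv2 script above is installed on PATH; the names device_number_path, list_device_numbers, and SYSFS_NET are hypothetical and introduced only for this illustration.

```shell
#!/bin/bash
# Sketch: print "<interface> <mac> <device-number>" for each local interface.
# Assumption: the imdsv2 script shown above is on PATH.
# SYSFS_NET is a hypothetical override so the function can be tried against a test directory.

SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Build the IMDS path that holds the device number for a MAC address.
device_number_path() {
    local mac
    mac=$(printf '%s' "${1}" | tr 'A-F' 'a-f')    # normalize to lowercase
    echo "network/interfaces/macs/${mac}/device-number"
}

# Walk the interface directories and ask IMDS for each device number.
list_device_numbers() {
    local dir name mac number
    for dir in "${SYSFS_NET}"/*; do
        [ -r "${dir}/address" ] || continue
        name=$(basename "${dir}")
        if [ "${name}" = "lo" ]; then
            continue    # skip the loopback interface
        fi
        mac=$(cat "${dir}/address")
        number=$(imdsv2 "$(device_number_path "${mac}")") || number="unknown"
        echo "${name} ${mac} ${number}"
    done
}
```

Calling list_device_numbers on an affected instance should show eth0 paired with a non-zero device number, which is another way to spot the out-of-order condition.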

Find the MAC Address for Default Network Traffic

The following get-default-route-mac script identifies which interface handles default traffic:

#!/bin/bash

# Script to find the default route device and output its MAC address
# 1. Check IPv4 default route first, then IPv6 if needed
# 2. Extract device name from the route
# 3. Output the MAC address of that device

set -e

# Try to get IPv4 default route first
default_route=$(ip route show default 2>/dev/null | head -n 1)

# If no IPv4 default route, try IPv6
if [[ -z "${default_route}" ]]; then
    default_route=$(ip -6 route show default 2>/dev/null | head -n 1)
fi

# If still no default route found, log error and exit
if [[ -z "${default_route}" ]]; then
    echo "Error: No default route found" >&2
    exit 1
fi

# Extract device name from default route
# The format is typically: default via <gateway> dev <device> [other params]
device=$(echo "${default_route}" | grep -o 'dev [^ ]*' | cut -d' ' -f2)

if [[ -z "${device}" ]]; then
    echo "Error: Could not extract device name from default route: ${default_route}" >&2
    exit 1
fi

# Get MAC address of the device
# "|| true" keeps set -e from exiting here, so the error message below is reached
mac_address=$(cat "/sys/class/net/${device}/address" 2>/dev/null || true)

if [[ -z "${mac_address}" ]]; then
    echo "Error: Could not get MAC address for device ${device}" >&2
    exit 1
fi

# Echo the MAC address
echo "${mac_address}"
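
The device extraction step can also be written as a small function that walks the words of a route line, which makes it easy to try against sample lines without touching the routing table. This is only a sketch of an alternative to the grep and cut pipeline; route_device is a hypothetical name and the sample route lines are illustrative.

```shell
#!/bin/bash
# Sketch: return the word that follows "dev" in a single route line.
# route_device is a hypothetical helper, not part of the original script.
route_device() {
    # Unquoted on purpose: split the line into words on whitespace.
    set -- ${1}
    while [ "$#" -gt 1 ]; do
        if [ "${1}" = "dev" ]; then
            echo "${2}"
            return 0
        fi
        shift
    done
    return 1    # no "dev <name>" pair in the line
}

# Example with an illustrative route line:
# route_device "default via 10.0.1.1 dev eth1 proto dhcp src 10.0.1.25 metric 512"
```

Because the function compares whole words, it only reacts to a standalone "dev" token, and its non-zero return status signals a line without a device.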

Swap Network Interface Names: eth0 ↔ eth1

With both MAC addresses available, we can compare them. If the primary ENI's MAC address differs from the default traffic MAC address, we need to swap the interface names using the swap-eth0-eth1 script:

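#!/bin/bash
set -e

# Assumes a systemd unit named "network" manages the interfaces on this AMI;
# adjust the unit name if your image manages networking differently (for example, systemd-networkd)
systemctl stop network

# Record the current MAC address behind each name
ETH0_MAC=$(cat /sys/class/net/eth0/address)
ETH1_MAC=$(cat /sys/class/net/eth1/address)

# Write udev rules that give each MAC address the other interface's name
cat > /etc/udev/rules.d/70-rename-interfaces.rules << EOF
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="${ETH1_MAC}", KERNEL=="eth*", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="${ETH0_MAC}", KERNEL=="eth*", NAME="eth1"
EOF

# Print the generated rules for the log
cat /etc/udev/rules.d/70-rename-interfaces.rules

# Reload the rules and replay "add" events for network devices so the rename applies
/sbin/udevadm control --reload-rules

/sbin/udevadm trigger --attr-match=subsystem=net --action=add

systemctl start network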
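
The comparison described above can be sketched as a small driver that calls the swap only when the two MAC addresses differ. This is an illustration under stated assumptions rather than the exact wiring we used: it assumes the three scripts are installed on PATH as imdsv2, get-default-route-mac, and swap-eth0-eth1, and the function names same_mac and fix_eni_order are hypothetical.

```shell
#!/bin/bash
# Sketch: swap interface names only when default traffic is not on the primary ENI.
# Assumption: imdsv2, get-default-route-mac and swap-eth0-eth1 (the scripts above) are on PATH.

# Succeed when two MAC addresses are equal, ignoring letter case.
same_mac() {
    local a b
    a=$(printf '%s' "${1}" | tr 'A-F' 'a-f')
    b=$(printf '%s' "${2}" | tr 'A-F' 'a-f')
    [ "${a}" = "${b}" ]
}

fix_eni_order() {
    local primary_mac default_mac
    primary_mac=$(imdsv2 mac) || return 1
    default_mac=$(get-default-route-mac) || return 1

    if same_mac "${primary_mac}" "${default_mac}"; then
        echo "Primary ENI ${primary_mac} already serves default traffic"
        return 0
    fi

    echo "Default traffic uses ${default_mac}, expected ${primary_mac}; swapping eth0 and eth1" >&2
    swap-eth0-eth1
}
```

Running fix_eni_order as root early in boot leaves the common case untouched and performs the swap only in the rare out-of-order launch.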