How do I troubleshoot OS-level network configuration issues on my Amazon EC2 Linux instance?

My Amazon Elastic Compute Cloud (Amazon EC2) Linux instance failed its status check because of network configuration issues in the operating system (OS).

Short description

If your EC2 instance fails its instance status checks, then check the system logs for a login prompt to confirm that the instance booted successfully. If the instance booted, then the instance failed its status check because of network configuration issues.
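
One way to run this check from a workstation is to pipe the console output from the AWS CLI into a search for a login prompt. The following is a minimal sketch. The function name has_login_prompt and the instance ID are placeholders, and it assumes that the prompt appears as text that contains "login:", which is typical for Linux consoles but isn't guaranteed for every AMI.

```shell
# Returns 0 if the console text on stdin contains a login prompt.
# Assumption: the prompt appears as "<hostname> login:" in the log.
has_login_prompt() {
    grep -q 'login:'
}

# Example (placeholder instance ID; requires configured AWS CLI credentials):
# aws ec2 get-console-output --instance-id i-1234567890abcdef0 --output text \
#     | has_login_prompt && echo "Instance reached the login prompt"
```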

To resolve your issue, connect to your instance, and then update your network configuration.

Resolution

Connect to your instance

Use the EC2 Serial Console to connect to your instance directly. If you can't use the EC2 Serial Console, then connect to a rescue instance to manually edit the instance configuration files. For instructions on how to use a rescue instance, see the Use a rescue instance to manually edit the file section in Why does my EC2 Linux instance go into emergency mode when I try to boot it?

Note: If you use the serial console to connect to your instance, then reboot your instance after you update the network configuration to apply the changes.

Check the cloud-init installation

Network failures might occur because you didn't install the cloud-init package, or you used the cloud-init package to update network configurations at launch.

To resolve this issue, complete the following steps to install the cloud-init package on your instance:

  1. If you used the serial console to connect, then proceed to step 2. If you used the rescue instance method, then run the following command to create a chroot environment:
    for i in dev proc sys run; do mount -o bind /$i /mnt/$i; done; chroot /mnt
    The preceding example bind-mounts the /dev, /proc, /sys, and /run directories from the original root file system. This configuration lets processes that run inside the chroot environment access the system directories.
  2. To install the cloud-init package, run the following command based on your Linux distribution.
    Amazon Linux 2 (AL2), Amazon Linux 2023 (AL2023), or Red Hat Enterprise Linux (RHEL):
    sudo yum install cloud-init -y
    Ubuntu or Debian:
    sudo apt install cloud-init -y
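
Before or after you install the package, you might want to confirm whether the cloud-init command is present. The following sketch uses a hypothetical helper named is_installed that only checks whether a command is on the PATH. It doesn't verify that the cloud-init services are enabled.

```shell
# Returns 0 if the named command is found on the PATH.
is_installed() {
    command -v "$1" >/dev/null 2>&1
}

if is_installed cloud-init; then
    # Print the installed version for reference.
    cloud-init --version
else
    echo "cloud-init not found; install it with the command for your distribution"
fi
```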

Check for hardcoded MAC addresses

Hardcoded MAC addresses in the udev configuration files cause network issues. If there's a hardcoded MAC address, then the elastic network interface can't load because the MAC address doesn't match the instance's hardware address.

The following example output from the system log shows False in the Up column for the eth0 interface, which means that the interface is down:

[    7.098826] cloud-init[963]: ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
[    7.101136] cloud-init[963]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[    7.103157] cloud-init[963]: ci-info: | Device |  Up  |           Address           |      Mask     | Scope  |     Hw-Address    |
[    7.105264] cloud-init[963]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
[    7.107320] cloud-init[963]: ci-info: |  eth0  | False|         172.31.21.19        | 255.255.240.0 | global | 0a:ff:c3:d1:93:6f |
[    7.111401] cloud-init[963]: ci-info: |   lo   | True |          127.0.0.1          |   255.0.0.0   |  host  |         .         |
[    7.115529] cloud-init[963]: ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
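
If the system log is long, then a small filter can pull out the interfaces that are down. The following sketch assumes that the table uses the column layout shown in the preceding output (Device in the first column and Up in the second). The function name list_down_devices is hypothetical.

```shell
# Prints the name of each device whose "Up" column is False in a
# cloud-init "Net device info" table read from stdin.
list_down_devices() {
    awk -F'|' '/ci-info: [|]/ {
        dev = $2; up = $3
        gsub(/ /, "", dev)   # strip the padding around the device name
        gsub(/ /, "", up)    # strip the padding around the Up value
        if (up == "False") print dev
    }'
}

# Example: list_down_devices < saved-console-output.txt
```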

To remove hardcoded MAC addresses, complete the following steps:

  1. Check the following file locations for hardcoded MAC addresses:
    /etc/udev/rules.d/
    /etc/udev/rules.d/70-persistent-net.rules
    /etc/udev/rules.d/80-net-name-slot.rules
    The preceding files might use an ATTR{address}=="MAC_ADDRESS" directive to specify a permanent MAC address.

  2. Remove the entries from the configuration files, or run the following command to rename the configuration file so that udev no longer loads it:

    sudo mv /etc/udev/rules.d/70-persistent-net.rules /etc/udev/rules.d/70-persistent-net.rules.disabled
  3. If you used the rescue instance method, then proceed to step 4. If you used the serial console to connect to your instance, then you must restart the network service. Run the following commands based on your Linux distribution.
    Ubuntu, Debian, and AL2023:

    sudo systemctl restart systemd-networkd 

    RHEL:

    sudo systemctl restart NetworkManager 

    Systems that use netplan:

    sudo netplan apply 
  4. If you used the rescue instance method, then detach the Amazon Elastic Block Store (Amazon EBS) volume from the rescue instance. Then, attach the volume back to the original instance. To receive a new MAC address, start the original instance.
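
To locate the entries in step 1 quickly, you can search the rules directory for the ATTR{address} directive. The following is a sketch with a hypothetical function name, find_hardcoded_macs. It matches only files that end in .rules, so files that you already renamed to .disabled aren't reported.

```shell
# Prints file:line:content for every rule that pins a MAC address.
# The directory defaults to /etc/udev/rules.d when no argument is given.
find_hardcoded_macs() {
    rules_dir=${1:-/etc/udev/rules.d}
    grep -Hn 'ATTR{address}==' "$rules_dir"/*.rules 2>/dev/null
}

# Example: find_hardcoded_macs
# In a rescue setup without chroot: find_hardcoded_macs /mnt/etc/udev/rules.d
```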

Check for hardcoded IP addresses

Amazon Machine Images (AMIs) that you create from instances with static IP address configurations inherit hardcoded IP addresses in their network configuration files. If you launch instances from these AMIs, then the instances don't request an IP address through the Dynamic Host Configuration Protocol (DHCP). Instead, the instances use the hardcoded IP address. This configuration causes network connectivity failures. To resolve this issue, make sure that you configured all network interfaces for DHCP before you create the AMI.

Note: You can't update AMIs that already exist. You must set the network interface to use DHCP before you create a new AMI.

Check your configuration

For Amazon Linux or RHEL, use the following configurations:

  • Make sure that the network configuration files contain the following values:
    BOOTPROTO=dhcp
    PEERDNS=yes
    PERSISTENT_DHCLIENT=yes
  • Remove the following lines from the files:
    IPADDR
    NETMASK
    GATEWAY
    DNS

Example configuration:

cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
USERCTL=yes
PEERDNS=yes
DHCPV6C=yes
DHCPV6C_OPTIONS=-nw
PERSISTENT_DHCLIENT=yes
RES_OPTIONS="timeout:2 attempts:5"
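
The preceding checks can be scripted. The following sketch uses a hypothetical function, check_ifcfg_dhcp, that reports the static entries from the list above and confirms that BOOTPROTO is set to dhcp. It also flags numbered variants such as DNS1, which is an assumption about how the static entries might appear in your files.

```shell
# Prints any static-address keys found in an ifcfg file.
# Returns 0 when the file is DHCP-only, 1 otherwise.
check_ifcfg_dhcp() {
    cfg=$1
    rc=0
    if ! grep -Eq '^BOOTPROTO="?dhcp"?[[:space:]]*$' "$cfg"; then
        echo "BOOTPROTO is not dhcp"
        rc=1
    fi
    # grep prints the offending lines and succeeds only if it finds any.
    if grep -E '^(IPADDR|NETMASK|GATEWAY|DNS)[0-9]*=' "$cfg"; then
        rc=1
    fi
    return "$rc"
}

# Example: check_ifcfg_dhcp /etc/sysconfig/network-scripts/ifcfg-eth0
```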

For Ubuntu or Debian, make sure that the netplan configuration contains dhcp4: true.

Example configuration:

cat /etc/netplan/50-cloud-init.yaml
network:
  version: 2
  ethernets:
    ens5:
      match:
        macaddress: "0a:ff:d3:15:30:9d"
      dhcp4: true
      dhcp6: false
      set-name: "ens5"

Note: The macaddress field in the netplan configuration is different from hardcoded MAC rules in udev that cause network failures. Amazon EC2 uses the macaddress value only to match the configuration's network interface. The dhcp4: true line lets Amazon EC2 dynamically assign an IP address to the instance.
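
To confirm the setting without opening the file, you can search for the dhcp4 key. The following sketch is a plain text match, not a YAML parser, so it only recognizes the literal form dhcp4: true on its own indented line. The function name netplan_has_dhcp4 is hypothetical.

```shell
# Returns 0 if the netplan file contains an indented "dhcp4: true" line.
netplan_has_dhcp4() {
    grep -Eq '^[[:space:]]+dhcp4:[[:space:]]*true[[:space:]]*$' "$1"
}

# Example:
# netplan_has_dhcp4 /etc/netplan/50-cloud-init.yaml || echo "DHCPv4 is not enabled"
```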

Restart the network service

Run the following command based on your Linux distribution to restart the network service.

Ubuntu, Debian, or AL2023:

sudo systemctl restart systemd-networkd 

RHEL:

sudo systemctl restart NetworkManager 

Systems that use netplan:

sudo netplan apply 

Verify that the primary network interface starts at boot

If the network interface fails to start at boot, then you receive the "interface eth0: failed" error message. To resolve this issue, verify that ONBOOT is set to yes in the interface configuration file.

Run the following command to display the interface configuration file:

cat /etc/sysconfig/network-scripts/ifcfg-eth0

Make sure that the file contains the following value:

ONBOOT=yes
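
If the instance has several interface files, then a loop can report the ones that don't set ONBOOT to yes. The following is a sketch with a hypothetical function name, list_not_onboot. It skips the loopback file and assumes the ifcfg-* naming used in the preceding example.

```shell
# Prints each ifcfg-* file in the directory that lacks ONBOOT=yes.
# The loopback file (ifcfg-lo) is skipped.
list_not_onboot() {
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        [ "${f##*/}" = "ifcfg-lo" ] && continue
        grep -Eq '^ONBOOT="?yes"?[[:space:]]*$' "$f" || echo "$f"
    done
}

# Example: list_not_onboot
```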

Resolve network interface naming conflicts

If the network interface name in your configuration files and the network interface on the instance don't match, then you receive the following error message:

"[FAILED] Failed to start LSB: Bring up/down networking"

This issue typically occurs when there are duplicate interface configuration files, such as ifcfg-eth0 and ifcfg-ens5.

To resolve this issue, remove the configuration file for the interface that doesn't exist on your instance. Typically, you keep the latest file. For example, if your instance uses predictable naming conventions such as ens5, then you can keep ifcfg-ens5 and remove ifcfg-eth0. Or, check cloud-init to identify the correct interface name.
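
To find the configuration files that have no matching interface, you can compare the file names against the interfaces that the kernel exposes under /sys/class/net. The following sketch assumes that the suffix of each ifcfg-* file name equals the interface name, which is the usual convention but isn't enforced. Run it on the affected instance (for example, through the serial console), because /sys on a rescue instance describes the rescue instance's interfaces. The function name list_orphan_ifcfg is hypothetical.

```shell
# Prints ifcfg-* files whose interface name has no entry under the
# sysfs network directory (default /sys/class/net).
list_orphan_ifcfg() {
    cfg_dir=${1:-/etc/sysconfig/network-scripts}
    sys_net=${2:-/sys/class/net}
    for f in "$cfg_dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        name=${f##*/ifcfg-}          # ifcfg-ens5 -> ens5
        [ -e "$sys_net/$name" ] || echo "$f"
    done
}

# Example: list_orphan_ifcfg
```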

Check whether the network interface was automatically renamed at startup

If you activate predictable network interface naming, then the interface names change from formats such as eth0 and eth1 to formats such as ens5 and enp0s3. The name changes can cause issues with scripts, firewall rules, and configurations that reference the original names. The naming order might also change after an instance restart or reboot.

To deactivate predictable network interface naming, complete the following steps:

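  1. If you used the serial console to connect, then proceed to step 2. If you used the rescue instance method, then run the following command to create a chroot environment in the /mnt directory: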

    for i in dev proc sys run; do mount -o bind /$i /mnt/$i; done; chroot /mnt

    The preceding example bind-mounts the /dev, /proc, /sys, and /run directories from the original root file system. This configuration lets processes that run inside the chroot environment access the system directories.

  2. Make sure that you activated enhanced networking with Elastic Network Adapter (ENA) on your instance.

  3. Add the net.ifnames=0 kernel parameter to the GRUB_CMDLINE_LINUX line in the /etc/default/grub file.
    Example configuration:

    GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8 net.ifnames=0"
  4. To rebuild the GRUB configuration, run the following command based on your Linux distribution.
    Amazon Linux or RHEL:

    sudo grub2-mkconfig -o /boot/grub2/grub.cfg

    Ubuntu or Debian:

    sudo update-grub
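
Before you rebuild the GRUB configuration, you can confirm that the parameter landed on the correct line. The following sketch uses a hypothetical function, has_ifnames_disabled, and assumes that the GRUB_CMDLINE_LINUX value is enclosed in double quotation marks as in the preceding example.

```shell
# Returns 0 if net.ifnames=0 is present on the GRUB_CMDLINE_LINUX line.
# The file defaults to /etc/default/grub when no argument is given.
has_ifnames_disabled() {
    grep -Eq '^GRUB_CMDLINE_LINUX=.*[" ]net\.ifnames=0([" ]|$)' "${1:-/etc/default/grub}"
}

# Example: has_ifnames_disabled || echo "net.ifnames=0 is missing"
# After the next boot, /proc/cmdline shows the parameters that the kernel received.
```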

Check your enhanced networking and enhanced network driver configuration

Test whether you activated enhanced networking on your instance. If you didn't activate it, then activate enhanced networking.

To check whether you installed the ENA driver, run the following command:

sudo modinfo ena | grep -i '^version:' || echo "ENA module not available, try modprobe ena"

If you didn't install the ENA driver, then install the latest driver. For instructions, see Linux kernel driver for ENA family on the GitHub website.
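
The modinfo command shows whether the module is available, but not whether an interface uses it. As a complementary check, the following sketch reads the driver that's bound to an interface from sysfs. The function name iface_driver and the interface name in the example are assumptions, so substitute the interface name from your instance.

```shell
# Prints the kernel driver bound to a network interface, read from sysfs.
# Returns 1 if the interface or its driver link doesn't exist.
iface_driver() {
    iface=$1
    sys_net=${2:-/sys/class/net}
    link=$(readlink "$sys_net/$iface/device/driver") || return 1
    echo "${link##*/}"    # keep only the final path component, for example "ena"
}

# Example: iface_driver eth0
```

If the function prints a driver other than ena on an instance type that supports ENA, then the interface doesn't use enhanced networking with ENA.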

Related information

How do I troubleshoot status check failures for my EC2 Linux instance?

Troubleshoot Amazon EC2 Linux instances with failed status checks

AWS OFFICIAL Updated 2 months ago