How do I resolve the "Kernel panic - not syncing" error with EC2 instances?


I want to resolve the "Kernel panic - not syncing" error that occurs after I upgrade the kernel or reboot my Amazon Elastic Compute Cloud (Amazon EC2) Linux instance because of a missing initramfs or missing kernel modules.

Short description

You receive the "Kernel panic - not syncing" error when the kernel can't find a device or address that it needs at boot, for example when the initramfs or a required kernel module is missing. To resolve this issue, launch a temporary instance and attach the faulty root disk as a secondary drive to perform diagnostics.

Important: When you stop and start an instance, data on instance store volumes is erased. Back up the data that you want to keep. Also, the public IP address of your instance changes. It's a best practice to use an Elastic IP address instead of a public IP address when you route external traffic to your instance.

Resolution

Note: The following resolution applies to Amazon Linux 2, Amazon Linux 2023, Fedora 16 and later, and Red Hat Enterprise Linux (RHEL) 7 and later.

To attach the root disk to a temporary instance, complete the following steps:

  1. Create a new key pair, or use an existing key pair.

  2. Get the volume ID and device name for the original instance's root volume.

  3. Stop the original instance.

  4. Launch a temporary instance from an Amazon Machine Image (AMI) with the same Linux operating system (OS) version, in the same Availability Zone as the original instance.

  5. Detach the root volume from the original instance and attach it to the temporary instance as a secondary volume. Note the volume device name.

  6. Use the SSH key pair to connect to the temporary instance.

  7. To change to the root user, run the following command:

    [ec2-user ~]$ sudo su

  8. To identify the block device name and partition, run the following command from the temporary instance:

    [root ~]$ lsblk
    NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    xvda    202:0    0    8G  0 disk
    └─xvda1 202:1    0    8G  0 part /
    xvdf    202:80   0  101G  0 disk
    └─xvdf1 202:81   0  101G  0 part

    The preceding example uses a Xen instance with blkfront drivers. Both /dev/xvda and /dev/xvdf are partitioned volumes. If your volume is partitioned, then run the following command to mount the partition (/dev/xvdf1) instead of the raw device (/dev/xvdf):

    [root ~]$ mount -o nouuid  /dev/xvdf1 /mnt

    If you use an instance built on the AWS Nitro system, then the volume device name is similar to /dev/nvme[0-26]n1. If your instance is built on Nitro with NVMe, then mount the partition at the /mnt directory. Use the device name that the lsblk command returned:

    [root ~]$ mount -o nouuid  /dev/nvme1n1p1 /mnt

    For more information, see Device names for volumes on Amazon EC2 instances.
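
    As a rough aid for reading the lsblk output, the following sketch lists partitions that aren't mounted yet. On a temporary instance, that is usually the partition of the attached volume. The function name unmounted_partitions is hypothetical, and the sketch assumes that the raw lsblk columns are NAME, TYPE, and MOUNTPOINT:

    ```shell
    # Hypothetical helper: read "NAME TYPE MOUNTPOINT" rows on stdin and
    # print the device path of every partition with an empty mount point.
    unmounted_partitions() {
        awk '$2 == "part" && $3 == "" { print "/dev/" $1 }'
    }

    # Usage on the temporary instance (not run here):
    #   lsblk -rno NAME,TYPE,MOUNTPOINT | unmounted_partitions
    ```

    For the Xen example output in this step, the function would print /dev/xvdf1. Confirm the name and size against the lsblk output before you mount anything.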

  9. To create a chroot environment in the /mnt directory, run the following command:

    [root ~]$ for i in dev proc sys run; do mount -o bind /$i /mnt/$i; done; chroot /mnt

    In the preceding command, the /dev, /proc, /sys, and /run directories are bind-mounted from the temporary instance's root file system into /mnt. This allows processes that run inside the chroot environment to access these system directories.
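
    Before you rely on the chroot environment, it can help to confirm that all four bind mounts are in place. The following sketch is one way to check from a shell outside the chroot environment, such as a second SSH session. The function name missing_bind_mounts is hypothetical, and the check assumes that the mountpoint command from util-linux is installed:

    ```shell
    # Hypothetical helper: print each expected bind mount under the chroot
    # target (default /mnt) that is not currently a mount point.
    missing_bind_mounts() {
        target=${1:-/mnt}
        for dir in dev proc sys run; do
            mountpoint -q "$target/$dir" || echo "$target/$dir"
        done
    }

    # Usage (not run here): no output means that all four mounts exist.
    #   missing_bind_mounts /mnt
    ```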

  10. To create a backup of the initramfs in the "/" directory, run the following command:

    [root ~]$ for file in /boot/initramfs-*.img; do cp "${file}" "/$(basename "$file")_$(date +%Y%m%d)"; done

  11. To list the default kernel, run the following command:

    [root ~]$ grubby --default-kernel

    Example output:

    /boot/vmlinuz-5.15.156-102.160.amzn2.x86_64

    The preceding output lists the kernel that tries to boot at startup.

  12. List the kernels and initramfs in the boot directory:

    [root ~]$ ls -lh /boot/vmlinuz* && ls -lh /boot/initr*

    Example output:

    -rwxr-xr-x. 1 root root 9.7M Apr 23 20:37 /boot/vmlinuz-5.10.215-203.850.amzn2.x86_64
    -rwxr-xr-x. 1 root root 9.9M Apr 23 17:00 /boot/vmlinuz-5.15.156-102.160.amzn2.x86_64
    -rw-------. 1 root root 12M May 3 23:45 /boot/initramfs-5.10.215-203.850.amzn2.x86_64.img
    -rw-------. 1 root root 9.8M May 14 08:03 /boot/initramfs-5.15.156-102.160.amzn2.x86_64.img

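    Check that each vmlinuz kernel file has a corresponding initramfs file with the same version string. A kernel that has no matching initramfs file needs its initramfs rebuilt.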
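
    The comparison in this step can also be scripted. The following sketch prints each kernel version in a boot directory that has no matching initramfs image. The function name kernels_missing_initramfs is hypothetical, and the sketch assumes the vmlinuz-VERSION and initramfs-VERSION.img naming that the example output shows:

    ```shell
    # Hypothetical helper: print the version of every kernel in a boot
    # directory (default /boot) that lacks a matching initramfs image.
    kernels_missing_initramfs() {
        boot_dir=${1:-/boot}
        for kernel in "$boot_dir"/vmlinuz-*; do
            [ -e "$kernel" ] || continue        # the glob matched nothing
            version=${kernel##*/vmlinuz-}       # strip the path and prefix
            [ -e "$boot_dir/initramfs-$version.img" ] || echo "$version"
        done
    }

    # Usage inside the chroot environment (not run here):
    #   kernels_missing_initramfs /boot
    ```

    No output means that every kernel in the directory has an initramfs file. An image that exists can still be incomplete, so this check doesn't replace the rebuild in the next step.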

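  13. To rebuild the initramfs, run the following command:

    [root ~]$ dracut --force --verbose /boot/initramfs-kernelVersion.img kernelVersion

    Note: Replace kernelVersion with the latest kernel version, such as the version of the default kernel from step 11 (for example, 5.15.156-102.160.amzn2.x86_64). The /boot/ prefix makes dracut write the image to the boot directory instead of the current directory.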
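
    One way to get the value for kernelVersion is to strip the path prefix from the grubby output in step 11. The following sketch assumes that the path has the form /boot/vmlinuz-VERSION. The function name kernel_version_from_path is hypothetical:

    ```shell
    # Hypothetical helper: turn a kernel path such as
    # /boot/vmlinuz-5.15.156-102.160.amzn2.x86_64 into its version string.
    kernel_version_from_path() {
        path=$1
        echo "${path##*/vmlinuz-}"
    }

    # Usage inside the chroot environment (not run here):
    #   version=$(kernel_version_from_path "$(grubby --default-kernel)")
    #   dracut --force --verbose "/boot/initramfs-$version.img" "$version"
    ```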

  14. To determine whether the instance is booting on UEFI or BIOS, run the following command:

    [root ~]$ boot_mode=$(ls /sys/firmware/efi/efivars >/dev/null 2>&1 && echo "EFI" || echo "BIOS"); echo "Boot mode detected: $boot_mode"

  15. To update the GRUB configuration, run the command for your boot mode and distribution.

    BIOS:

    [root ~]$ grub2-mkconfig -o /boot/grub2/grub.cfg

    Note: When you run the preceding command, you might receive the error "device-mapper: reload ioctl on osprober-linux-xvda2 (253:0) failed: Device or resource busy Command failed". To resolve this issue, add the GRUB_DISABLE_OS_PROBER=true parameter to the /etc/default/grub file, and then run the command again.

    UEFI:

    Amazon Linux 2 and Amazon Linux 2023:

    [root ~]$ grub2-mkconfig -o /boot/efi/EFI/amzn/grub.cfg

    Fedora 16+:

    [root ~]$ grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

    Red Hat 7+:

    [root ~]$ grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
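
    The output path in this step depends on the boot mode and the distribution. The following sketch maps both to the paths that this step lists. The function name grub_cfg_path is hypothetical, and the mapping assumes the ID values amzn, fedora, and rhel from the /etc/os-release file:

    ```shell
    # Hypothetical helper: print the grub.cfg path for a boot mode
    # ("BIOS" or "EFI") and an os-release ID (amzn, fedora, or rhel).
    grub_cfg_path() {
        mode=$1
        os_id=$2
        if [ "$mode" = "BIOS" ]; then
            echo /boot/grub2/grub.cfg
            return 0
        fi
        case $os_id in
            amzn)   echo /boot/efi/EFI/amzn/grub.cfg ;;
            fedora) echo /boot/efi/EFI/fedora/grub.cfg ;;
            rhel)   echo /boot/efi/EFI/redhat/grub.cfg ;;
            *)      echo "unsupported distribution: $os_id" >&2; return 1 ;;
        esac
    }

    # Usage inside the chroot environment (not run here):
    #   . /etc/os-release
    #   grub2-mkconfig -o "$(grub_cfg_path "$boot_mode" "$ID")"
    ```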

  16. To exit the chroot environment and unmount the volume, run the following commands:

    [root ~]$ exit
    [root ~]$ umount -fl /mnt

  17. Detach the secondary volume from the temporary instance and attach it to the original instance as the root device. Use the root device name that you noted in step 2.

  18. Start the original instance, and then connect to it to confirm that it boots.
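
The volume moves in steps 5 and 17 can also be done with the AWS CLI instead of the console. The following is a minimal sketch, not a tested runbook. The helper names run and move_volume are hypothetical, the IDs in the usage comments are placeholders, and the sketch assumes that the AWS CLI is installed with permissions for the EC2 calls shown. Set DRY_RUN=1 to print the commands instead of running them:

```shell
# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: detach an EBS volume, wait until it is available,
# and attach it to a target instance under the given device name.
# Arguments: volume ID, target instance ID, device name.
move_volume() {
    volume_id=$1
    target_id=$2
    device=$3
    run aws ec2 detach-volume --volume-id "$volume_id" &&
    run aws ec2 wait volume-available --volume-ids "$volume_id" &&
    run aws ec2 attach-volume --volume-id "$volume_id" \
        --instance-id "$target_id" --device "$device"
}

# Usage with placeholder IDs (not run here):
#   aws ec2 stop-instances --instance-ids i-0original          # step 3
#   aws ec2 wait instance-stopped --instance-ids i-0original
#   move_volume vol-0example i-0temporary /dev/sdf             # step 5
#   move_volume vol-0example i-0original /dev/xvda             # step 17
```

In the last usage line, /dev/xvda stands in for the original root device name. Unmount the volume on the temporary instance before you move it back.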

AWS OFFICIAL
Updated a month ago
2 comments

The first option almost worked for me. After running chroot /mnt, I had to complete a broken yum update (yum-complete-transaction), run yum update/upgrade, reinstall the kernel (yum reinstall kernel) and run grub2-mkconfig -o /boot/grub2/grub.cfg.

answered a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
answered a year ago