How do I recover my Red Hat 8 or CentOS 8 instance that fails to boot because of issues with the GRUB2 BLS configuration file?


I have a Red Hat 8 or CentOS 8 Amazon Elastic Compute Cloud (Amazon EC2) instance. I want to recover a corrupted or deleted BLS configuration (blscfg) file found under “/boot/loader/entries/”.

Short description

GRUB2 in RHEL 8 and CentOS 8 uses blscfg files and entries in /boot/loader for the boot configuration, as opposed to the previous grub.cfg format. It's a best practice to use the grubby tool to manage the blscfg files and retrieve information from /boot/loader/entries/. If the blscfg files are corrupted or missing from this location, then grubby doesn't show any results. You must regenerate the files to recover functionality. To regenerate the blscfg files, create a temporary rescue instance, and then remount your Amazon Elastic Block Store (Amazon EBS) volume on the rescue instance. From the rescue instance, regenerate the blscfg files for any installed kernels.
Important: Don't perform this procedure on an instance store-backed instance. This recovery procedure requires stopping and starting your instance, and any data on instance store volumes is lost when the instance stops. For more information, see Determine the root device type of your instance.
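
Because grubby reads its boot entries from /boot/loader/entries/, a quick way to confirm the problem is to list the entries that grubby can see. A minimal check, run on the impaired volume (directly, or later from the chroot environment described below):

    sudo grubby --info=ALL      # prints every boot entry read from /boot/loader/entries/; no output indicates missing or corrupted blscfg files
    ls /boot/loader/entries/    # the blscfg (*.conf) files themselves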

Resolution

Attach the root volume to a rescue EC2 instance

  1. Create an EBS snapshot of the root volume. For more information, see Create Amazon EBS snapshots.
  2. Open the Amazon EC2 console.
    Note: Be sure that you are in the correct Region. The Region appears in the Amazon EC2 console to the right of your account information. You can choose a different Region from the drop-down menu, if needed.
  3. Choose Instances from the navigation pane, and then choose the impaired instance.
  4. Choose Actions, select Instance State, and then choose Stop.
  5. In the Description tab, under Root device, choose /dev/sda1, and then choose the EBS ID.
  6. Choose Actions, Detach Volume, and then choose Yes, Detach. Note the Availability Zone.
  7. Launch a similar rescue EC2 instance in the same Availability Zone. This instance becomes your rescue instance.
  8. After the rescue instance launches, choose Volumes from the navigation pane, and then choose the detached root volume of the impaired instance.
  9. Choose Actions, and then choose Attach Volume.
  10. Choose the rescue instance ID (i-xxxxx), and then set an unused device. In this example, the unused device is /dev/sdf.
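
If you prefer the command line to the console, the same snapshot, stop, detach, and attach steps can be performed with the AWS CLI. A sketch, assuming placeholder volume and instance IDs that you replace with your own values:

    # Placeholder IDs -- replace with your own volume and instance IDs
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "Backup before blscfg recovery"
    aws ec2 stop-instances --instance-ids i-0aaaaaaaaaaaaaaaa              # impaired instance
    aws ec2 wait instance-stopped --instance-ids i-0aaaaaaaaaaaaaaaa
    aws ec2 detach-volume --volume-id vol-0123456789abcdef0
    aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
        --instance-id i-0bbbbbbbbbbbbbbbb --device /dev/sdf                # rescue instance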

Mount the volume of the impaired instance

  1. Use SSH to connect to the rescue instance.

  2. Run the lsblk command to view your available disk devices:

     [ec2-user@ip-10-10-1-111 /]$ lsblk
     NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    xvda    202:0    0  10G  0 disk
    ├─xvda1 202:1    0   1M  0 part
    └─xvda2 202:2    0  10G  0 part /
    xvdf    202:80   0  10G  0 disk
    ├─xvdf1 202:81   0   1M  0 part
    └─xvdf2 202:82   0  10G  0 part 

    Note: Nitro-based instances expose EBS volumes as NVMe block devices. The output that the lsblk command generates on Nitro-based instances shows the disk names as nvme[0-26]n1. For more information, see Amazon EBS and NVMe on Linux instances. A sketch of the mount sequence with NVMe device names appears after this list.

  3. Create a mount directory, and then mount the root partition of the attached volume to this new directory. In the example from step 2, /dev/xvdf2 is the root partition of the attached volume. For more information, see Make an Amazon EBS volume available for use on Linux:

     sudo mkdir /mount
     sudo mount /dev/xvdf2 /mount
  4. Bind mount /dev, /run, /proc, and /sys of the rescue instance to the same paths under the newly mounted volume:

     sudo mount -o bind /dev /mount/dev
     sudo mount -o bind /run /mount/run
    sudo mount -o bind /proc /mount/proc 
    sudo mount -o bind /sys /mount/sys
  5. Start the chroot environment:

    sudo chroot /mount
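
On a Nitro-based rescue instance, the attached volume appears under an NVMe device name instead of /dev/xvdf, as noted in step 2. A sketch of the same mount and chroot sequence, assuming the attached volume shows up as /dev/nvme1n1 with the root filesystem on the second partition:

    lsblk -o NAME,SIZE,SERIAL,MOUNTPOINT    # on Nitro instances, the SERIAL column shows the backing EBS volume ID
    sudo mkdir /mount
    sudo mount /dev/nvme1n1p2 /mount        # root partition of the attached volume (assumed device name)
    for fs in dev run proc sys; do sudo mount -o bind /$fs /mount/$fs; done
    sudo chroot /mount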

Regenerate the blscfg files

  1. Run the rpm command, and take note of the available kernels on your instance:

     [root@ip-10-10-1-111 ~]# rpm -q --last kernel
     kernel-4.18.0-147.3.1.el8_1.x86_64 Tue 21 Jan 2020 05:11:16 PM UTC
    kernel-4.18.0-80.4.2.el8_0.x86_64 Tue 18 Jun 2019 05:06:11 PM UTC
  2. To recreate the blscfg file, run the kernel-install command:
    Note: The systemd-udev rpm installation package provides the kernel-install binary:

    sudo kernel-install add 4.18.0-147.3.1.el8_1.x86_64 /lib/modules/4.18.0-147.3.1.el8_1.x86_64/vmlinuz 

    Replace 4.18.0-147.3.1.el8_1.x86_64 with your kernel version number. The blscfg file for the designated kernel regenerates under /boot/loader/entries/:

     [root@ip-10-10-1-111 ~]# ls /boot/loader/entries/
     2bb67fbca2394ed494dc348993fb9b94-4.18.0-147.3.1.el8_1.x86_64.conf
  3. Repeat step 2 for other installed kernels on the instance, as needed (a scripted version appears after this list). The latest kernel that you set becomes the default kernel.

  4. To see the current default kernel, run the grubby command with the --default-kernel option:

    sudo grubby --default-kernel
  5. Exit from chroot, and unmount the /dev, /run, /proc, and /sys mounts:

     exit
     sudo umount /mount/dev
    sudo umount /mount/run
    sudo umount /mount/proc
    sudo umount /mount/sys
    sudo umount /mount
  6. Detach the volume from the rescue instance, and then attach it back to the original instance with the correct block device mapping (/dev/sda1 in this example). The instance now boots with the default kernel.
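
If several kernels are installed, the per-kernel kernel-install calls from step 2 can be scripted rather than repeated by hand. A minimal sketch, run inside the chroot environment, that regenerates the blscfg for every installed kernel from oldest to newest so that the newest kernel ends up as the default:

    # Run inside the chroot environment on the attached volume
    for kver in $(rpm -q kernel --queryformat '%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort -V); do
        kernel-install add "$kver" "/lib/modules/$kver/vmlinuz"
    done
    grubby --default-kernel    # confirm which kernel is now the default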

Related information

How do I revert to a known stable kernel after an update prevents my Amazon EC2 instance from rebooting successfully?
