EC2 Instance not booting, stuck on grub rescue screen


I have a Debian 10 EC2 instance that was rebooted after a few updates and is now stuck at a "grub rescue" prompt. I am unable to connect to this instance via the serial console, apparently because I don't have the proper permissions (I have a ticket in to resolve that issue). Nothing I've tried has had any effect: attaching the volume to another instance and inspecting the GRUB configuration hasn't helped. I'm unsure how to proceed, and any tips or tricks would be appreciated!

Instance Screenshot

Scott
asked 21 days ago · 110 views
1 Answer

Hello,

**Check Boot Volume:** Ensure that the boot volume of your EC2 instance is still attached; if it has been detached, reattach it to the instance.

**Check Instance Status:** Verify the status of your EC2 instance in the AWS Management Console. If the instance is running, stop it, and then start it again. Sometimes, a simple restart can resolve boot issues.

**Verify Volume Attachment:** If you've attached the volume to another instance to inspect its contents, ensure that you're mounting the correct volume and that you have the necessary permissions to access it.

**Inspect GRUB Configuration:** If you have access to the instance's boot volume, check the GRUB configuration file (/boot/grub/grub.cfg or /boot/grub/menu.lst) to ensure it's correctly configured.
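If you want a quick look at what GRUB thinks it should boot, you could pull the menu entries and kernel lines out of the mounted grub.cfg with a helper like this (a sketch; the function name is mine, and the path in the usage note assumes the volume is mounted under /mnt):

```shell
# Print the boot entries and kernel command lines from a grub.cfg.
# Takes the path to the grub.cfg as its only argument.
show_grub_entries() {
    grep -E "^(menuentry|[[:space:]]*linux)" "$1"
}
```

Running `show_grub_entries /mnt/boot/grub/grub.cfg` should list each `menuentry` title and the `linux ... root=...` line beneath it, which makes a wrong `root=` device or UUID easy to spot.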

Look for any recent changes or updates that might have affected the GRUB configuration. Sometimes, updates can modify the configuration and cause boot issues.

**Repair GRUB from a Live Environment:**

Create a snapshot of the volume attached to your EC2 instance for safety. Launch a new EC2 instance in the same region and availability zone as your problematic instance.

Attach the problematic instance's boot volume as an additional volume to the new instance. SSH into the new instance.

Use tools like chroot to enter the mounted volume and run GRUB commands to reinstall GRUB.

Here are some sample commands:

sudo mount /dev/xvdf2 /mnt  # Mount the ROOT partition of the broken volume (check lsblk; not the small boot partition)

sudo mount /dev/xvdf1 /mnt/boot  # If /boot is a separate partition, mount it on top

sudo mount --bind /dev /mnt/dev

sudo mount --bind /proc /mnt/proc

sudo mount --bind /sys /mnt/sys

sudo chroot /mnt  # Enter the mounted volume

grub-install /dev/xvdf  # Replace xvdf with the whole device (not a partition) of your boot volume

update-grub

exit  # Exit the chroot environment

sudo umount /mnt/sys

sudo umount /mnt/proc

sudo umount /mnt/dev

sudo umount /mnt/boot

sudo umount /mnt

Thank you

answered 21 days ago
  • I appreciate your response. I have tried doing something like this previously (and again just now), but I always get stuck at "chroot"; I can't get past the machine complaining "chroot: failed to run command ‘/bin/bash’: No such file or directory". I have found other guides to get around this error, but none of them have worked. Below is my string of commands, where the nvme1n1p1 partition is the /boot folder for the volume I'm trying to fix:

    [root@ip-11-111-11-11 ~]# lsblk
    NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    nvme2n1     259:0    0   30G  0 disk
    nvme1n1     259:1    0  128G  0 disk
    ├─nvme1n1p1 259:3    0  255M  0 part
    ├─nvme1n1p2 259:4    0 23.8G  0 part
    └─nvme1n1p3 259:5    0  104G  0 part
    nvme0n1     259:2    0   10G  0 disk
    ├─nvme0n1p1 259:6    0    1M  0 part
    └─nvme0n1p2 259:7    0   10G  0 part /
    [root@ip-11-111-11-11 ~]# mount /dev/nvme1n1p1 /mnt
    [root@ip-11-111-11-11 ~]# mount --bind /dev /mnt/dev
    [root@ip-11-111-11-11 ~]# mount --bind /proc /mnt/proc
    [root@ip-11-111-11-11 ~]# mount --bind /sys /mnt/sys
    [root@ip-11-111-11-11 ~]# chroot /mnt
    chroot: failed to run command ‘/bin/bash’: No such file or directory

  • the nvme1n1p1 partition is the /boot folder for the volume I'm trying to fix

    [root@ip-11-111-11-11 ~]# mount /dev/nvme1n1p1 /mnt

    No, you need to mount the root partition on /mnt; looking at the output of lsblk, this is probably nvme1n1p2.

    Unmount those bind mounts and the current mount on /mnt.

    Then mount /dev/nvme1n1p2 /mnt and run ls -l /mnt. Does this look like the root filesystem (it should have directories for bin, etc, var, usr, and so on)?

    If it looks good, then mount /dev/nvme1n1p1 /mnt/boot, re-do your bind mounts, and carry on.
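The "does this look like the root filesystem" check can also be scripted; this is a minimal sketch (the function name and directory list are my own heuristic, not an exhaustive test):

```shell
# Rough check for whether a mount point looks like a Linux root filesystem.
# Uses -e rather than -d because on newer Debian, /bin is a symlink to usr/bin.
looks_like_rootfs() {
    for d in bin etc usr var; do
        [ -e "$1/$d" ] || { echo "no: missing $1/$d"; return 1; }
    done
    echo "yes"
}
```

If `looks_like_rootfs /mnt` prints `yes`, it should be safe to mount the boot partition at /mnt/boot, re-do the bind mounts, and chroot.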

  • Thank you for the clarification; the steps all seem to have worked this time around using your helpful hints. However, it made no difference: when I shut down the rescue instance, detach the volume, re-attach it to the original instance, and power it up, I get the exact same grub rescue screen. It then dawned on me that the rescue instance I used was not the same type of instance as the one I'm trying to fix; the rescue instance is a RHEL8 machine, while the problem production instance is Debian 10. Does the rescue instance absolutely have to be the exact same OS/version and hardware makeup?
