AWS DataSync Agent VM kernel corruption issue


DataSync agent reboots to a GRUB prompt with a corrupted kernel file.

Three different DataSync agents in my VMware cluster have rebooted unexpectedly and come up at a GRUB prompt. Running 'boot' from the GRUB prompt reports 'you need to load a kernel first'.

Observations: On all three DataSync VMs, it has always been the same kernel version file that is corrupted. Running: linux /boot/vmlinuz-5.4.217-126.408.amzn2.x86_64 root=/dev/sda1 returns the error "error: ../../grub-core/loader/i386/linux.c:696:invalid magic number." This indicates that the magic bytes GRUB expects near the beginning of the file are not correct.
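
To confirm the corruption outside of GRUB, the signature bytes can be checked directly. The following is a minimal sketch, assuming you have a copy of the vmlinuz file on another Linux machine (the agent itself does not allow shell access). It checks the two signatures defined by the Linux x86 boot protocol: the 0xAA55 boot flag at offset 0x1FE and the "HdrS" header at offset 0x202. As far as I understand, GRUB's "invalid magic number" error corresponds to the first of these, but treat that as an assumption. The function name check_kernel_header is hypothetical.

```python
import struct
import sys

BOOT_FLAG_OFFSET = 0x1FE  # 2-byte little-endian boot_flag, expected 0xAA55
HEADER_OFFSET = 0x202     # 4-byte header signature, expected b"HdrS"


def check_kernel_header(data: bytes) -> dict:
    """Report whether the two x86 boot-protocol signatures are present."""
    if len(data) < HEADER_OFFSET + 4:
        # File is too short to even contain the header fields.
        return {"boot_flag_ok": False, "hdrs_ok": False}
    (boot_flag,) = struct.unpack_from("<H", data, BOOT_FLAG_OFFSET)
    return {
        "boot_flag_ok": boot_flag == 0xAA55,
        "hdrs_ok": data[HEADER_OFFSET:HEADER_OFFSET + 4] == b"HdrS",
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is whatever copy of the kernel you have):
    #   python check_kernel.py vmlinuz-5.4.217-126.408.amzn2.x86_64
    with open(sys.argv[1], "rb") as f:
        print(check_kernel_header(f.read(0x400)))
```

A healthy kernel image should report both values as True. Comparing the result for the 217 file against the working 214 file would show whether the header region is what was damaged.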


Resolution: I have been able to get the VM back online by setting GRUB to use an older kernel and initrd image file. Run the following commands at the GRUB prompt:

  1. set root=(hd0,gpt1)
  2. linux /boot/vmlinuz-5.4.214-120.368.amzn2.x86_64 root=/dev/sda1 (this tells GRUB to use the 5.4.214 kernel, not the corrupted 5.4.217 version)
  3. initrd /boot/initramfs-5.4.214-120.368.amzn2.x86_64.img
  4. boot
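
If several agents need the same recovery, the four commands above can be generated for any known-good kernel version so they can be pasted into each VM console. This is only a sketch: the helper name grub_boot_commands is hypothetical, and the defaults (hd0,gpt1) and /dev/sda1 are simply the values that worked on my agents and may differ on yours.

```python
def grub_boot_commands(kernel_version,
                       grub_root="(hd0,gpt1)",
                       root_device="/dev/sda1"):
    """Build the GRUB prompt commands to boot a specific installed kernel."""
    return [
        f"set root={grub_root}",
        f"linux /boot/vmlinuz-{kernel_version} root={root_device}",
        f"initrd /boot/initramfs-{kernel_version}.img",
        "boot",
    ]


if __name__ == "__main__":
    # The older kernel that still booted on the affected agents.
    for line in grub_boot_commands("5.4.214-120.368.amzn2.x86_64"):
        print(line)
```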

After the machine boots, you have to reboot it once more to get the network adapter to function (VMware Tools doesn't load when you boot from the GRUB prompt).

My Questions

  1. Is Amazon aware of this kernel issue, or has anyone else experienced it?
  2. Does AWS push updates to the DataSync VMs?
  3. If DataSync agents are receiving updates, can we opt out of some or all of the updates, or control when they are applied? These DataSync agents caused production workflow outages when they rebooted to the GRUB prompt unannounced and unexpectedly.
  4. Are we allowed to access the full VM terminal so we can work on things like running 'update-grub' to further resolve and investigate these issues?
S4TV
asked a year ago, 300 views
1 Answer
Accepted Answer

The AWS DataSync team is currently investigating the issue. To your questions:

  1. We are investigating this issue. Thank you for the deep dive.
  2. The DataSync service automatically applies updates to the agent in the background, as noted here: https://aws.amazon.com/datasync/faqs/#Security_and_compliance.
  3. We currently do not support disabling agent updates or modifying when updates are applied.
  4. DataSync does not support customer root access to the agent.
AWS
Jeff_B
answered a year ago
EXPERT
reviewed 23 days ago
  • Just wanted to post an update: another one of our DataSync agents went offline tonight with the same issue, same kernel version and everything.