AWS DataSync Agent VM kernel corruption issue


The DataSync agent reboots to a GRUB prompt and has a corrupted kernel file.

Three different DataSync agents in my VMware cluster have rebooted unexpectedly and come up at a GRUB prompt. Running 'boot' from that prompt reports 'you need to load a kernel first'.

Observations: On all three DataSync VMs, it has always been the same kernel version file that is corrupted. Running: linux /boot/vmlinuz-5.4.217-126.408.amzn2.x86_64 root=/dev/sda1 returns the error "error: ../../grub-core/loader/i386/linux.c:696:invalid magic number." This indicates that the signature bytes GRUB expects in the header near the start of the kernel image are not correct.
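One way to confirm this kind of corruption outside of GRUB is to inspect the header fields defined by the Linux x86 boot protocol: a two-byte boot flag 0xAA55 at offset 0x1FE and the four-byte signature "HdrS" at offset 0x202. The sketch below is only an illustration. It assumes you can read the vmlinuz file from somewhere other than the agent itself (for example, a copy of the virtual disk mounted on another machine), and the function names are hypothetical.

```python
import struct

# Offsets from the Linux x86 boot protocol (documented in the kernel source tree).
BOOT_FLAG_OFFSET = 0x1FE     # 2-byte little-endian boot flag, expected 0xAA55
HEADER_MAGIC_OFFSET = 0x202  # 4-byte signature, expected b"HdrS"


def check_kernel_header(data: bytes) -> list:
    """Return a list of problems found in the boot header; empty if none."""
    if len(data) < HEADER_MAGIC_OFFSET + 4:
        return ["file too short to contain an x86 boot header"]
    problems = []
    (boot_flag,) = struct.unpack_from("<H", data, BOOT_FLAG_OFFSET)
    if boot_flag != 0xAA55:
        problems.append(f"boot flag is 0x{boot_flag:04X}, expected 0xAA55")
    if data[HEADER_MAGIC_OFFSET:HEADER_MAGIC_OFFSET + 4] != b"HdrS":
        problems.append("missing 'HdrS' signature at offset 0x202")
    return problems


def check_kernel_file(path: str) -> list:
    """Read the first 1 KiB of a kernel image and check its header."""
    with open(path, "rb") as f:
        return check_kernel_header(f.read(1024))
```

Calling check_kernel_file on the 217 kernel and on the 214 kernel would show whether only the newer file has a damaged header. It does not say anything about corruption further into the file.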


Resolution: I have been able to get the VM back online by telling GRUB to use an older kernel and initrd image file. Run the following commands at the GRUB prompt:

  1. set root=(hd0,gpt1)
  2. linux /boot/vmlinuz-5.4.214-120.368.amzn2.x86_64 root=/dev/sda1 (this tells GRUB to use the 214 kernel, not the corrupted 217 version)
  3. initrd /boot/initramfs-5.4.214-120.368.amzn2.x86_64.img
  4. boot

After the machine boots this way, you have to reboot it once more to get the network adapter to function (VMware Tools doesn't load when you boot manually from the GRUB prompt).
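The four GRUB commands above follow a fixed pattern that depends only on the kernel version string. As a small sketch (the helper name and its defaults are hypothetical, and the defaults simply mirror the values used in this post, which may differ on other agents), the sequence can be generated for any fallback kernel present in /boot:

```python
def grub_boot_commands(kernel_version, grub_root="(hd0,gpt1)", root_device="/dev/sda1"):
    """Build the manual GRUB boot sequence for a given kernel version string."""
    return [
        f"set root={grub_root}",
        f"linux /boot/vmlinuz-{kernel_version} root={root_device}",
        f"initrd /boot/initramfs-{kernel_version}.img",
        "boot",
    ]


# Print the commands for the older, working kernel mentioned in this post.
for command in grub_boot_commands("5.4.214-120.368.amzn2.x86_64"):
    print(command)
```

The kernel and initramfs versions must match, which is why both paths are derived from the same version string.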

My Questions

  1. Is Amazon aware of this kernel issue, or has anyone else experienced it?
  2. Does AWS push updates to the DataSync VMs?
  3. If DataSync agents are receiving updates, can we opt out of some or all of the updates, or control when they are applied? These DataSync agents caused production workflow outages when they rebooted to the GRUB prompt unannounced and unexpectedly.
  4. Are we allowed to access the full VM terminal so we can do things like run 'update-grub' to further investigate and resolve these issues?
S4TV
asked 2 years ago, 309 views
1 Answer
Accepted Answer

The AWS DataSync team is currently investigating the issue. To your questions:

  1. We are investigating this issue. Thank you for the deep dive.
  2. The DataSync service automatically applies updates to the agent in the background, as noted here: https://aws.amazon.com/datasync/faqs/#Security_and_compliance.
  3. We currently do not support disabling agent updates or modifying when updates are applied.
  4. DataSync does not support customer root access to the agent.
AWS
Jeff_B
answered 2 years ago
EXPERT
reviewed a month ago
  • Just wanted to post an update: another one of our DataSync agents went offline tonight with the same issue, same kernel version and everything.
