AWS DataSync Agent VM kernel corruption issue


DataSync agent reboots to a GRUB prompt and has a corrupted kernel file.

Three different DataSync agents in my VMware cluster have rebooted unexpectedly and come up at a GRUB prompt. Running 'boot' from that prompt reports 'you need to load a kernel first'.

Observations: On all 3 DataSync VMs, it has always been the same kernel version file that is corrupted. Running 'linux /boot/vmlinuz-5.4.217-126.408.amzn2.x86_64 root=/dev/sda1' at the GRUB prompt returns the error "error: ../../grub-core/loader/i386/linux.c:696:invalid magic number." This indicates that the signature bytes GRUB expects near the beginning of the kernel file are not correct.
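To confirm that a kernel image really has a damaged header, the same kind of check can be done offline. The following is a minimal sketch, assuming the image follows the standard Linux x86 boot protocol layout (a 0xAA55 boot flag at offset 0x1FE and the ASCII magic "HdrS" at offset 0x202) and assuming you can read the file at all, for example from a copy of the agent's disk. The function names are hypothetical and this is not an AWS-provided tool.

```python
import struct
from typing import List

BOOT_FLAG_OFFSET = 0x1FE     # 2-byte little-endian boot flag, expected 0xAA55
HEADER_MAGIC_OFFSET = 0x202  # 4-byte setup header magic, expected b"HdrS"


def check_kernel_image(data: bytes) -> List[str]:
    """Return a list of problems found in the leading bytes of an x86 kernel image."""
    if len(data) < HEADER_MAGIC_OFFSET + 4:
        return ["file is too short to contain a setup header"]

    problems = []
    (boot_flag,) = struct.unpack_from("<H", data, BOOT_FLAG_OFFSET)
    if boot_flag != 0xAA55:
        problems.append(f"boot flag is 0x{boot_flag:04X}, expected 0xAA55")

    magic = data[HEADER_MAGIC_OFFSET:HEADER_MAGIC_OFFSET + 4]
    if magic != b"HdrS":
        problems.append(f"header magic is {magic!r}, expected b'HdrS'")
    return problems


def check_kernel_file(path: str) -> List[str]:
    """Read only the first 1 KiB of the file and check its header fields."""
    with open(path, "rb") as f:
        return check_kernel_image(f.read(0x400))
```

An empty list from check_kernel_file("/boot/vmlinuz-5.4.217-126.408.amzn2.x86_64") would suggest the header fields are intact and the damage, if any, lies elsewhere in the file. This sketch only inspects the header; it does not verify the rest of the image.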


Resolution: I have been able to get the VM back online by telling GRUB to use an older kernel and its matching initrd image. Run the following commands at the GRUB prompt:

  1. set root=(hd0,gpt1)
  2. linux /boot/vmlinuz-5.4.214-120.368.amzn2.x86_64 root=/dev/sda1 (this tells GRUB to use the 214 kernel, not the corrupted 217 version)
  3. initrd /boot/initramfs-5.4.214-120.368.amzn2.x86_64.img
  4. boot

After the machine boots, you have to reboot it once more to get the network adapter working (VMware Tools doesn't load when you boot from the GRUB prompt).
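The important detail in the commands above is that the kernel and the initramfs must carry the same version string. A small sketch that builds the command sequence for a given fallback version makes that pairing explicit. The function name and default arguments are hypothetical; the defaults simply mirror the values used in this post and may differ on other agents.

```python
from typing import List


def grub_fallback_commands(
    kernel_version: str,
    root_device: str = "/dev/sda1",   # value taken from this post
    grub_root: str = "(hd0,gpt1)",    # value taken from this post
) -> List[str]:
    """Build the GRUB prompt commands to boot a specific installed kernel."""
    return [
        f"set root={grub_root}",
        f"linux /boot/vmlinuz-{kernel_version} root={root_device}",
        f"initrd /boot/initramfs-{kernel_version}.img",
        "boot",
    ]


if __name__ == "__main__":
    # Print the commands for the older kernel that still boots.
    for command in grub_fallback_commands("5.4.214-120.368.amzn2.x86_64"):
        print(command)
```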

My Questions

  1. Is Amazon aware of this kernel issue, or has anyone else experienced it?
  2. Does AWS push updates to the DataSync VMs?
  3. If DataSync agents are receiving updates, can we opt out of some or all of them, or control when they are applied? These DataSync agents caused production workflow outages when they rebooted to the GRUB prompt unannounced and unexpectedly.
  4. Are we allowed to access the full VM terminal so we can run things like 'update-grub' to further investigate and resolve these issues?
S4TV
asked 2 years ago, 309 views
1 Answer
Accepted Answer

The AWS DataSync team is currently investigating the issue. To your questions:

  1. We are investigating this issue. Thank you for the deep dive.
  2. The DataSync service automatically applies updates to the agent in the background, as noted here: https://aws.amazon.com/datasync/faqs/#Security_and_compliance.
  3. We currently do not support disabling agent updates or modifying when updates are applied.
  4. DataSync does not support customer root access to the agent.
AWS
Jeff_B
answered 2 years ago
EXPERT
reviewed 1 month ago
  • Just wanted to post an update: another one of our DataSync agents went offline tonight with the same issue, same kernel version and everything.
