Unexpected instance reboots following cloud-init user data script termination

We use EC2 user data scripts and rely on the instance-initiated shutdown behavior being set to terminate: the user data script runs a shutdown command as its final step, which should terminate the instance cleanly.
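
For illustration only, here is a minimal sketch of the pattern described above, written as a Python user data script rather than the original script (which is not shown in this thread). It assumes python3 is available on the AMI. The `run_then_shutdown` function and its parameters are hypothetical names; the shutdown command is passed in as an argument so the logic can be exercised without actually powering anything off.

```python
#!/usr/bin/env python3
"""Sketch: run a workload, then always request an instance shutdown."""
import subprocess
import sys


def run_then_shutdown(workload_cmd, shutdown_cmd=("shutdown", "-h", "now")):
    """Run workload_cmd, then run shutdown_cmd whether or not the workload succeeded.

    Returns the workload's return code. subprocess reports a negative value
    (for example -15) when the child was killed by a signal.
    """
    try:
        result = subprocess.run(list(workload_cmd))
        print(f"workload exited with {result.returncode}", file=sys.stderr)
        return result.returncode
    finally:
        # Reached even if the workload fails or cannot be started.
        subprocess.run(list(shutdown_cmd))
```

Note that this only covers the workload failing or being killed. If the wrapper process itself receives SIGTERM, Python's default handling ends the process without running the `finally` block, so the shutdown command would still be skipped in that case.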

However, some instances unexpectedly reboot and rerun the user data script instead of terminating as intended. We cannot reproduce the issue on demand; it occurs sporadically.

Relevant excerpts from cloud-init-output.log:

Cloud-init v. 19.3-46.amzn2.0.1 running 'init-local' at Thu, 01 Feb 2024 08:06:53 +0000. Up 4.19 seconds.
Cloud-init v. 19.3-46.amzn2.0.1 running 'init' at Thu, 01 Feb 2024 08:06:54 +0000. Up 4.74 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++

...

/var/lib/cloud/instance/scripts/part-001: line 2:  2599 Terminated [redacted]
Feb 01 08:08:07 cloud-init[2582]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [143]
Cloud-init v. 19.3-46.amzn2.0.1 running 'init-local' at Thu, 01 Feb 2024 08:08:23 +0000. Up 4.32 seconds.
Cloud-init v. 19.3-46.amzn2.0.1 running 'init' at Thu, 01 Feb 2024 08:08:24 +0000. Up 4.89 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++

This indicates that cloud-init runs twice. The first run is "Terminated" for unknown reasons, which may prevent the shutdown command from executing. cloud-init then runs again (after a restart that is not supposed to occur?), and the script fails with exit code 143 during the second run, so shutdown is never called and the instances require manual cleanup.
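
As a side note on the exit code in the log: by shell convention, a status above 128 usually means the process was killed by signal number (status - 128), so 143 corresponds to signal 15, SIGTERM. The short sketch below decodes such a status; `describe_exit_status` is a hypothetical helper name, and the convention is only a heuristic, since a process can also exit with 143 explicitly.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status, treating 128+N as 'killed by signal N'."""
    if status > 128:
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # No signal with that number exists on this platform.
            return f"exit status {status} (no matching signal)"
    return f"exit status {status}"


print(describe_exit_status(143))
```

SIGTERM is the signal a system typically sends to running processes when it shuts down or reboots, so the "Terminated" line may be a symptom of the reboot rather than its cause. The log excerpt alone does not settle which came first.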

We seek insights into the root cause of this behavior and whether it constitutes a bug.

asked 3 months ago · 178 views
1 Answer

Hi, have you tried this on Amazon Linux 2023, which ships a newer version of cloud-init? That would help rule out a limitation or bug in the cloud-init 19.x version you mentioned. Here are a few additional points you could verify:

  • Review User Data Script: Check for any commands that might inadvertently cause a reboot rather than a shutdown. Ensure that the script explicitly calls for a shutdown (e.g., shutdown -h now or poweroff) and that there are no conditions under which a reboot command (reboot) could be executed instead.
  • Instance Settings: Verify the instance's shutdown behavior is correctly set to terminate.
  • Cloud-init Configuration: Look into cloud-init's configuration to ensure it handles user data scripts as expected. You might need to adjust cloud-init settings to prevent re-execution of scripts that should only run once.
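
The instance settings check in the list above can be scripted. The following is a minimal sketch that assumes boto3 is installed and AWS credentials are configured; `get_shutdown_behavior` is a hypothetical helper name, and the instance ID in the usage comment is a placeholder. The client is passed in as an argument so the function does not create AWS connections on its own.

```python
def get_shutdown_behavior(ec2_client, instance_id):
    """Return the instance-initiated shutdown behavior ('stop' or 'terminate').

    ec2_client is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2_client.describe_instance_attribute(
        InstanceId=instance_id,
        Attribute="instanceInitiatedShutdownBehavior",
    )
    return response["InstanceInitiatedShutdownBehavior"]["Value"]


# Usage (placeholder instance ID, requires boto3 and credentials):
#   import boto3
#   print(get_shutdown_behavior(boto3.client("ec2"), "i-0123456789abcdef0"))
```

If this returns "stop" rather than "terminate", a shutdown issued from inside the instance would stop it instead of terminating it.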
answered 3 months ago
EXPERT
reviewed a month ago
  • Hi, we were previously using AL2 and plan to move to Amazon Linux 2023. We'll post an update if the issue persists after the migration.
