
Elastic Beanstalk NodeJS 20 6.3.0->6.4.0 upgrade results in hung instances and never completes


I have a Node.js app hosted on the NodeJS 20 Amazon Linux 2023 Elastic Beanstalk platform. Every time we try to update any instances to 6.4.0, the new instances never pass health checks and the upgrade never completes, forcing us to abort and stay on 6.3.0. I logged into one of the failing instances via Systems Manager and noticed the following dmesg output, which appears to show a thread associated with process 2608 hanging:

[ 243.761594] INFO: task iou-sqp-2608:2622 blocked for more than 122 seconds.
[ 243.762323] Not tainted 6.1.115-126.197.amzn2023.x86_64 #1
[ 243.762872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.763599] task:iou-sqp-2608 state:D stack:0 pid:2622 ppid:2527 flags:0x00004000
[ 243.764376] Call Trace:
[ 243.764688] <TASK>
[ 243.764898] __schedule+0x1ad/0x530
[ 243.765390] schedule+0x5a/0xd0
[ 243.765705] schedule_preempt_disabled+0x11/0x20
[ 243.766254] __mutex_lock.constprop.0+0x372/0x6c0
[ 243.766695] io_sq_thread+0x275/0x4e0
[ 243.767138] ? membarrier_register_private_expedited+0x90/0x90
[ 243.767804] ? io_sqd_handle_event+0xd0/0xd0
[ 243.768636] ret_from_fork+0x22/0x30
[ 243.768973] </TASK>
[ 366.641067] INFO: task iou-sqp-2608:2622 blocked for more than 245 seconds.
[ 366.641770] Not tainted 6.1.115-126.197.amzn2023.x86_64 #1
[ 366.642366] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 366.643104] task:iou-sqp-2608 state:D stack:0 pid:2622 ppid:2527 flags:0x00004000
[ 366.643889] Call Trace:
[ 366.644245] <TASK>
[ 366.644461] __schedule+0x1ad/0x530
[ 366.644960] schedule+0x5a/0xd0
[ 366.645267] schedule_preempt_disabled+0x11/0x20
[ 366.645869] __mutex_lock.constprop.0+0x372/0x6c0
[ 366.646343] io_sq_thread+0x275/0x4e0
[ 366.646753] ? membarrier_register_private_expedited+0x90/0x90
[ 366.647323] ? io_sqd_handle_event+0xd0/0xd0
[ 366.647802] ret_from_fork+0x22/0x30
[ 366.648152] </TASK>

Process 2608 appears to refer to my node app as shown here:

[root@ip-10-1-100-19 bin]# ps -aef | grep -i node
root 809 2 0 15:11 ? 00:00:00 [xfs-inodegc/nvm]
webapp 2608 2527 0 15:12 ? 00:00:02 node --experimental-json-modules lib/
root 3039 2581 0 15:18 pts/1 00:00:00 grep --color=auto -i node
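
For anyone who wants to check their own instances for the same symptom, here is a rough Python sketch that pulls the hung-task reports out of dmesg output and maps each one back to a process ID. It assumes the kernel names io_uring submission-queue polling threads `iou-sqp-<pid>`, where `<pid>` is the process that created the ring (that is how I read `iou-sqp-2608` above). The function name `find_hung_tasks` and the script name in the usage line are made up for illustration.

```python
import re

# Matches the kernel's hung-task report line, for example:
# "[ 243.761594] INFO: task iou-sqp-2608:2622 blocked for more than 122 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<tid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# Assumption: io_uring SQPOLL threads are named "iou-sqp-<pid of owning process>".
SQPOLL_COMM = re.compile(r"^iou-sqp-(?P<owner>\d+)$")


def find_hung_tasks(dmesg_text):
    """Return one dict per hung-task report found in the given dmesg text."""
    reports = []
    for line in dmesg_text.splitlines():
        match = HUNG_TASK.search(line)
        if match is None:
            continue
        owner = SQPOLL_COMM.match(match.group("comm"))
        reports.append({
            "comm": match.group("comm"),
            "tid": int(match.group("tid")),
            "blocked_secs": int(match.group("secs")),
            # None when the blocked task is not an io_uring SQPOLL thread.
            "owner_pid": int(owner.group("owner")) if owner else None,
        })
    return reports


if __name__ == "__main__":
    import sys

    # Reads dmesg output from standard input and prints each report.
    for report in find_hung_tasks(sys.stdin.read()):
        print(report)
```

Saved as, say, `hung_tasks.py`, it could be run on the instance with `dmesg | python3 hung_tasks.py`. The `owner_pid` it reports can then be compared against the `ps` output, as I did by hand above.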

Everything is fine on version 6.3.0. Something appears to be broken in the latest version of the Elastic Beanstalk platform. A few other threads on re:Post report the same problem, but I have not seen any solution so far other than blocking this update.

asked a year ago (185 views)
2 Answers

The issue you're experiencing with the Elastic Beanstalk NodeJS 20 platform upgrade from version 6.3.0 to 6.4.0 appears to be related to a system-level problem rather than an application-specific issue. The error messages you've shared indicate that a task (iou-sqp-2608) is being blocked for an extended period, which is likely causing the health checks to fail and preventing the upgrade from completing successfully.

This type of issue can sometimes occur due to incompatibilities between the new platform version and the underlying system components or configurations. It's not uncommon for platform updates to introduce changes that may affect system behavior, especially when it comes to low-level operations like I/O handling.

While there isn't a definitive solution provided in the available information, here are some steps you can take to troubleshoot and potentially resolve the issue:

  1. Check for any known issues or recent announcements from AWS regarding the Elastic Beanstalk NodeJS 20 platform. There may be a documented problem that AWS is working on.

  2. Review your application's dependencies and ensure they are compatible with the latest NodeJS 20 version included in the 6.4.0 platform update.

  3. Try creating a new environment with the 6.4.0 platform version and deploy your application to it. This can help isolate whether the issue is related to the upgrade process or the platform itself.

  4. If possible, simplify your application temporarily by removing any complex I/O operations or third-party modules that might be causing conflicts, then attempt the upgrade again.

  5. Consider reaching out to AWS Support for further assistance, as they may have more detailed information about this specific issue and potential workarounds.

  6. As a temporary measure, you may need to continue using version 6.3.0 until a resolution is found or AWS releases a fix in a subsequent platform update.

If the problem persists and affects your production environment, it may be prudent to wait for further updates from AWS or for other users to report successful upgrades before attempting again. Keep monitoring AWS announcements and community forums for any updates on this issue.
Sources
Troubleshooting Elastic Beanstalk - Categories and FAQs - AWS Elastic Beanstalk
Troubleshoot degraded or severe state of Elastic Beanstalk environment | AWS re:Post
Elastic Beanstalk enhanced health reporting and monitoring - AWS Elastic Beanstalk

answered a year ago

After some further research, I found that this is a known kernel bug affecting Amazon Linux 2023. A fix appears to be slated for release to AL2023 later this month, so the only option for now seems to be to wait:

https://github.com/amazonlinux/amazon-linux-2023/issues/840#issuecomment-2485782075
https://github.com/amazonlinux/amazon-linux-2023/issues/856
https://github.com/nodejs/node/issues/55587

answered a year ago
