[UBUNTU] EBS volume attachment during boot randomly causes EC2 instances to get stuck

We create and deploy custom AMIs based on Ubuntu Jammy. Since jammy-20230428, we have noticed that instances launched from any AMI based on it randomly fail during the boot process. Destroying the instance and deploying it again gets past the problem. The stack trace is always the same:

[  849.765218] INFO: task swapper/0:1 blocked for more than 727 seconds.
[  849.774999]       Not tainted 5.19.0-1025-aws #26~22.04.1-Ubuntu
[  849.787081] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  849.811223] task:swapper/0       state:D stack:    0 pid:    1 ppid:     0 flags:0x00004000
[  849.883494] Call Trace:
[  849.891369]  <TASK>
[  849.899306]  __schedule+0x254/0x5a0
[  849.907878]  schedule+0x5d/0x100
[  849.917136]  io_schedule+0x46/0x80
[  849.970890]  blk_mq_get_tag+0x117/0x300
[  849.976136]  ? destroy_sched_domains_rcu+0x40/0x40
[  849.981442]  __blk_mq_alloc_requests+0xc4/0x1e0
[  849.986750]  blk_mq_get_new_requests+0xcc/0x190
[  849.992185]  blk_mq_submit_bio+0x1eb/0x450
[  850.070689]  __submit_bio+0xf6/0x190
[  850.075545]  submit_bio_noacct_nocheck+0xc2/0x120
[  850.080841]  submit_bio_noacct+0x209/0x560
[  850.085654]  submit_bio+0x40/0xf0
[  850.090361]  submit_bh_wbc+0x134/0x170
[  850.094905]  ll_rw_block+0xbc/0xd0
[  850.175198]  do_readahead.isra.0+0x126/0x1e0
[  850.183531]  jread+0xeb/0x100
[  850.189648]  do_one_pass+0xbb/0xb90
[  850.193917]  ? crypto_create_tfm_node+0x9a/0x120
[  850.207511]  ? crc_43+0x1e/0x1e
[  850.211887]  jbd2_journal_recover+0x8d/0x150
[  850.272927]  jbd2_journal_load+0x130/0x1f0
[  850.280601]  ext4_load_journal+0x271/0x5d0
[  850.288540]  __ext4_fill_super+0x2aa1/0x2e10
[  850.296290]  ? pointer+0x36f/0x500
[  850.304910]  ext4_fill_super+0xd3/0x280
[  850.372470]  ? ext4_fill_super+0xd3/0x280
[  850.380637]  get_tree_bdev+0x189/0x280
[  850.384398]  ? __ext4_fill_super+0x2e10/0x2e10
[  850.388490]  ext4_get_tree+0x15/0x20
[  850.392123]  vfs_get_tree+0x2a/0xd0
[  850.395859]  do_new_mount+0x184/0x2e0
[  850.468151]  path_mount+0x1f3/0x890
[  850.471804]  ? putname+0x5f/0x80
[  850.475341]  init_mount+0x5e/0x9f
[  850.478976]  do_mount_root+0x8d/0x124
[  850.482626]  mount_block_root+0xd8/0x1ea
[  850.486368]  mount_root+0x62/0x6e
[  850.568079]  prepare_namespace+0x13f/0x19e
[  850.571984]  kernel_init_freeable+0x120/0x139
[  850.575930]  ? rest_init+0xe0/0xe0
[  850.579511]  kernel_init+0x1b/0x170
[  850.583084]  ? rest_init+0xe0/0xe0
[  850.586642]  ret_from_fork+0x22/0x30
[  850.668205]  </TASK>
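
Since the failure is random, it can help to detect it automatically rather than by reading the console by hand. Below is a minimal sketch, not a definitive tool, that scans console text for the hung-task report shown above. The function names `find_hung_tasks` and `boot_looks_stuck` are hypothetical, and the check for `mount_root` is an assumption based only on this particular trace. The console text could come, for example, from `aws ec2 get-console-output`.

```python
import re

# Matches the kernel's hung-task report line, e.g.
# "[  849.765218] INFO: task swapper/0:1 blocked for more than 727 seconds."
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<seconds>\d+) seconds\."
)


def find_hung_tasks(console_text):
    """Return one dict per hung-task report found in the console text."""
    reports = []
    for line in console_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            reports.append({
                "uptime": float(match.group("uptime")),
                "task": match.group("name"),
                "pid": int(match.group("pid")),
                "blocked_seconds": int(match.group("seconds")),
            })
    return reports


def boot_looks_stuck(console_text):
    """Heuristic (assumption): PID 1 is reported hung and the output
    mentions mount_root, as in the trace from this question."""
    pid1_hung = any(r["pid"] == 1 for r in find_hung_tasks(console_text))
    return pid1_hung and "mount_root" in console_text
```

A deployment script could call `boot_looks_stuck` on the console output a few minutes after launch and replace the instance when it returns `True`, which automates the destroy-and-redeploy workaround without addressing the root cause.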

This has been happening since 5.19.0-1024-aws, so I have now rolled back to 5.19.0-1022-aws. Is anyone aware of this issue?

esysc
asked a year ago, 289 views
1 Answer

I was able to reproduce this multiple times with 5.19.0-1022-aws as well, so in my opinion it does not depend on the kernel version. Our instances are all of the t3 and t3a types.

esysc
answered 10 months ago
