[UBUNTU] EBS volume attachment during boot randomly causes EC2 instances to get stuck

We build and deploy custom AMIs based on Ubuntu Jammy. Since jammy-20230428, instances launched from any AMI built on it intermittently hang during the boot process. Destroying and redeploying the instance works around the problem. The stack trace is always the same:

[  849.765218] INFO: task swapper/0:1 blocked for more than 727 seconds.
[  849.774999]       Not tainted 5.19.0-1025-aws #26~22.04.1-Ubuntu
[  849.787081] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  849.811223] task:swapper/0       state:D stack:    0 pid:    1 ppid:     0 flags:0x00004000
[  849.883494] Call Trace:
[  849.891369]  <TASK>
[  849.899306]  __schedule+0x254/0x5a0
[  849.907878]  schedule+0x5d/0x100
[  849.917136]  io_schedule+0x46/0x80
[  849.970890]  blk_mq_get_tag+0x117/0x300
[  849.976136]  ? destroy_sched_domains_rcu+0x40/0x40
[  849.981442]  __blk_mq_alloc_requests+0xc4/0x1e0
[  849.986750]  blk_mq_get_new_requests+0xcc/0x190
[  849.992185]  blk_mq_submit_bio+0x1eb/0x450
[  850.070689]  __submit_bio+0xf6/0x190
[  850.075545]  submit_bio_noacct_nocheck+0xc2/0x120
[  850.080841]  submit_bio_noacct+0x209/0x560
[  850.085654]  submit_bio+0x40/0xf0
[  850.090361]  submit_bh_wbc+0x134/0x170
[  850.094905]  ll_rw_block+0xbc/0xd0
[  850.175198]  do_readahead.isra.0+0x126/0x1e0
[  850.183531]  jread+0xeb/0x100
[  850.189648]  do_one_pass+0xbb/0xb90
[  850.193917]  ? crypto_create_tfm_node+0x9a/0x120
[  850.207511]  ? crc_43+0x1e/0x1e
[  850.211887]  jbd2_journal_recover+0x8d/0x150
[  850.272927]  jbd2_journal_load+0x130/0x1f0
[  850.280601]  ext4_load_journal+0x271/0x5d0
[  850.288540]  __ext4_fill_super+0x2aa1/0x2e10
[  850.296290]  ? pointer+0x36f/0x500
[  850.304910]  ext4_fill_super+0xd3/0x280
[  850.372470]  ? ext4_fill_super+0xd3/0x280
[  850.380637]  get_tree_bdev+0x189/0x280
[  850.384398]  ? __ext4_fill_super+0x2e10/0x2e10
[  850.388490]  ext4_get_tree+0x15/0x20
[  850.392123]  vfs_get_tree+0x2a/0xd0
[  850.395859]  do_new_mount+0x184/0x2e0
[  850.468151]  path_mount+0x1f3/0x890
[  850.471804]  ? putname+0x5f/0x80
[  850.475341]  init_mount+0x5e/0x9f
[  850.478976]  do_mount_root+0x8d/0x124
[  850.482626]  mount_block_root+0xd8/0x1ea
[  850.486368]  mount_root+0x62/0x6e
[  850.568079]  prepare_namespace+0x13f/0x19e
[  850.571984]  kernel_init_freeable+0x120/0x139
[  850.575930]  ? rest_init+0xe0/0xe0
[  850.579511]  kernel_init+0x1b/0x170
[  850.583084]  ? rest_init+0xe0/0xe0
[  850.586642]  ret_from_fork+0x22/0x30
[  850.668205]  </TASK>

This started with 5.19.0-1024-aws, so I have rolled back to 5.19.0-1022-aws for now. Is anyone else aware of this issue?
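
Since the trace is always the same, a stuck instance can be recognized automatically from its console output before it is destroyed and redeployed. Below is a minimal sketch of that check, assuming the console text has already been fetched as a string (for example from the EC2 console output). The function names `find_hung_tasks` and `boot_stuck_on_root_mount` are hypothetical helpers made up for illustration, and matching on `mount_root` is only a heuristic derived from the trace above.

```python
import re

# Matches the kernel's hung-task report line, for example:
# "[  849.765218] INFO: task swapper/0:1 blocked for more than 727 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(console_text):
    """Return (task name, pid, blocked seconds) for every hung-task report."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(console_text)
    ]


def boot_stuck_on_root_mount(console_text):
    """Heuristic (assumption): pid 1 is blocked and the log mentions mount_root."""
    blocked_init = any(pid == 1 for _, pid, _ in find_hung_tasks(console_text))
    return blocked_init and "mount_root" in console_text
```

Applied to the log above, `find_hung_tasks` reports the `swapper/0` task with pid 1, and `boot_stuck_on_root_mount` flags the instance as hung while mounting the root filesystem.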

esysc
asked a year ago, 289 views
1 Answer

I was also able to reproduce this multiple times with 5.19.0-1022-aws, so IMHO it does not depend on the kernel version. Our instances are all t3 and t3a types.

esysc
answered 10 months ago
