Amazon Linux 2 kernel problem with software RAID - mitigation



The latest kernel for Amazon Linux 2 has a problem with software RAID: a process accessing the RAID reaches 100% CPU and hangs, eventually rendering the EC2 instance unresponsive.

Is there any way to mitigate this issue? It does not happen with CentOS 7, RedHat 7 or RedHat 8, and it also does not occur with an older Amazon Linux 2 kernel (4.14.77-81.59.amzn2.x86_64).

Kernel: 4.14.232-177.418.amzn2.x86_64
OS: Amazon Linux 2

Steps to reproduce (device names may differ):
mdadm --create --verbose /dev/md0 --level=stripe --raid-devices=2 /dev/xvdb /dev/xvdc
mdadm --stop /dev/md0; mdadm --assemble /dev/md0 /dev/xvdb /dev/xvdc;

The "mdadm" process hangs when trying to assemble the RAID, as does any process that tries to access it. This happens when "/dev/md0" has been removed but "/sys/block/md0" is still present.
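The state described above (device node gone, sysfs entry still there) can be checked before running the assemble step. This is only a minimal sketch: the function name `md_is_stale` and its optional `dev_root` / `sys_root` arguments are hypothetical helpers, not part of mdadm, and it does not fix the hang, it only detects the condition.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the device node for an md array is
# missing while its sysfs directory is still present.
# Usage: md_is_stale NAME [DEV_ROOT] [SYS_ROOT]
md_is_stale() {
    name="$1"
    dev_root="${2:-/dev}"          # assumption: default device directory
    sys_root="${3:-/sys/block}"    # assumption: default sysfs block directory
    [ ! -e "$dev_root/$name" ] && [ -d "$sys_root/$name" ]
}

# Example: warn before attempting "mdadm --assemble /dev/md0 ...".
if md_is_stale md0; then
    echo "md0: /dev node missing but /sys/block entry present; assemble may hang" >&2
fi
```

Running the check between the `mdadm --stop` and `mdadm --assemble` commands shows whether the instance is in the problematic state without touching the array.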

Best regards and thank you in advance,

Edited by: rga on Jul 12, 2021 11:14 AM

Asked 3 years ago, viewed 241 times
1 Answer

The latest tested kernel (4.14.238-182.422.amzn2.x86_64) of Amazon Linux 2 does not show the mentioned "mdadm" assembly problem.
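To see how a running instance compares with the kernel mentioned above, the version strings can be compared with GNU `sort -V`. This is a sketch only: `kernel_at_least` is a hypothetical helper, and the comparison says nothing about kernels not tested in this thread (the older 4.14.77-81.59 kernel sorts lower but was also reported as unaffected).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 sorts at or after version $2.
# Relies on GNU coreutils "sort -V" (version sort).
kernel_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Kernel reported in this thread as not showing the mdadm assembly problem.
tested="4.14.238-182.422.amzn2.x86_64"

if kernel_at_least "$(uname -r)" "$tested"; then
    echo "Running kernel is at or after $tested"
else
    echo "Running kernel sorts before $tested"
fi
```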

Edited by: rga on Jul 27, 2021 7:53 AM

Answered 3 years ago
