AmazonL2 kernel problem with software raid - mitigation



The latest kernel for Amazon Linux2 experiences problems when using software RAID, which results in a RAID accessing process to reach 100% CPU and to hang, eventually rendering the ec2 instance unresponsive.

My question is if there is any way to mitigate this issue as it does not happen with CentOS7, RedHat7 and RedHat8, it also doesn't occur with older AmznL2 kernel (4.14.77-81.59.amzn2.x86_64)?

Kernel: 4.14.232-177.418.amzn2.x86_64
OS: Amazon Linux 2

Steps to reproduce (devices might be different):
mdadm --create --verbose /dev/md0 --level=stripe --raid-devices=2 /dev/xvdb /dev/xvdc
mdadm --stop /dev/md0; mdadm --assemble /dev/md0 /dev/xvdb /dev/xvdc;

The "mdadm" process hangs when trying to assemble the raid, also any process trying to access this raid. This happen when the "/dev/md0" is cleared, but "/sys/block/md0" is left.

Best regards and thank you in advance,

Edited by: rga on Jul 12, 2021 11:14 AM

demandé il y a 3 ans240 vues
1 réponse

The latest tested kernel (4.14.238-182.422.amzn2.x86_64) of Amazon Linux 2 does not show the mentioned "mdadm" assembly problem.

Edited by: rga on Jul 27, 2021 7:53 AM

répondu il y a 3 ans

Vous n'êtes pas connecté. Se connecter pour publier une réponse.

Une bonne réponse répond clairement à la question, contient des commentaires constructifs et encourage le développement professionnel de la personne qui pose la question.

Instructions pour répondre aux questions