Collecting a memory dump from EC2 to mitigate the risk of crashes caused by underlying host issues

I am looking for a risk mitigation strategy for a critical application running on EC2. If the instance crashes because of an underlying host issue, how can we capture a memory dump for diagnostics and for attaching to a support case with the application provider?

Is there a way to take memory dumps proactively and collect them in CloudWatch or S3?

AWS
Pir
asked 2 months ago, 70 views
2 answers

Hello.

Would the memory utilization metric collected by the CloudWatch agent give you enough information?
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html
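As a rough illustration of the CloudWatch agent approach, here is a minimal sketch in Python that generates an agent configuration reporting memory utilization. The helper name `build_agent_config` and the 60-second interval are assumptions made for this example; the `mem` section and the `mem_used_percent` measurement come from the agent's metrics documentation linked above.

```python
import json


def build_agent_config(interval_seconds: int = 60) -> dict:
    """Return a minimal CloudWatch agent config that reports memory usage.

    build_agent_config is a hypothetical helper for this sketch; adjust the
    interval and measurements to your own needs.
    """
    return {
        "metrics": {
            # Tag every metric with the instance ID so it can be filtered per instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                }
            },
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be saved as the agent's configuration file.
    print(json.dumps(build_agent_config(), indent=2))
```

Note that this only gives a utilization trend over time; it is not a memory dump and does not capture the contents of memory at the moment of a crash.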

It may not help much if the EC2 physical host itself fails, but you can also configure kdump to record information useful for troubleshooting when a kernel panic occurs.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/diagnostic-interrupt.html
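To get kdump output into S3, one option is a small script that runs after the instance comes back up and uploads whatever kdump wrote. The following is a sketch only: it assumes kdump writes `vmcore*` files under `/var/crash` (a common default, but check your kdump configuration), that boto3 is installed, and that the instance role is allowed to write to the bucket. The bucket name, key prefix, and helper names are hypothetical.

```python
from pathlib import Path


def find_crash_dumps(crash_dir):
    """Return kdump vmcore files under crash_dir, oldest first."""
    root = Path(crash_dir)
    if not root.is_dir():
        return []
    dumps = [p for p in root.rglob("vmcore*") if p.is_file()]
    return sorted(dumps, key=lambda p: p.stat().st_mtime)


def upload_crash_dumps(s3_client, crash_dir, bucket, prefix):
    """Upload each dump to s3://bucket/prefix/<relative path> and return the keys."""
    root = Path(crash_dir)
    keys = []
    for dump in find_crash_dumps(root):
        key = f"{prefix.rstrip('/')}/{dump.relative_to(root).as_posix()}"
        # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
        s3_client.upload_file(str(dump), bucket, key)
        keys.append(key)
    return keys


def main():
    # boto3 is third-party; it is imported here so the helpers above
    # can be exercised without it.
    import boto3

    uploaded = upload_crash_dumps(
        boto3.client("s3"),
        "/var/crash",                 # assumed kdump output directory
        "example-crash-dump-bucket",  # hypothetical bucket name
        "kdump/my-instance",          # hypothetical key prefix
    )
    print(uploaded)
```

Calling `main()` from a boot-time job would push any dumps left by the previous crash. This only works when the kernel had the chance to write the dump; if the physical host fails abruptly, there may be nothing to upload.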

EXPERT
answered 2 months ago

Hello,

If your needs relate to resource utilization monitoring, you can consider tools such as atop and sar.

Additionally, you can collect more Linux metrics with the CloudWatch agent and push them to CloudWatch.

[+] https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html

If there is an underlying hardware issue, the instance will fail its status checks (it will no longer show 2/2 checks passed), or it will be moved to other hardware. In that case, which diagnostics do you want to share with the support engineer?
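To tell a host-side problem apart from an in-guest one, it can help to look at which of the two status checks failed. Below is a minimal sketch, assuming boto3 and permission to call `describe_instance_status`; the helper names are hypothetical, and only the system and instance checks are inspected. A failed system status check generally points at the AWS infrastructure under the instance, while a failed instance status check points at the operating system or its configuration.

```python
def summarize_status_checks(response):
    """Map instance ID -> (checks passed, total checks, impaired check names).

    `response` is the dict returned by the EC2 describe_instance_status call.
    """
    summary = {}
    for item in response.get("InstanceStatuses", []):
        checks = {
            "system": item.get("SystemStatus", {}).get("Status"),
            "instance": item.get("InstanceStatus", {}).get("Status"),
        }
        passed = sum(1 for status in checks.values() if status == "ok")
        impaired = [name for name, status in checks.items() if status == "impaired"]
        summary[item["InstanceId"]] = (passed, len(checks), impaired)
    return summary


def fetch_status(instance_id):
    """Fetch the raw status for one instance (requires boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the parser works without it

    ec2 = boto3.client("ec2")
    # IncludeAllInstances=True also returns instances that are not running.
    return ec2.describe_instance_status(
        InstanceIds=[instance_id], IncludeAllInstances=True
    )
```

For example, `summarize_status_checks(fetch_status(instance_id))` returns a tuple such as `(1, 2, ["system"])` when only the system check is impaired, which is the kind of detail worth including in a support case.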

answered 2 months ago
