Collecting a memory dump from an EC2 instance to mitigate the risk of crashes caused by underlying host issues

I am looking for a risk mitigation strategy for a critical application running on EC2. If the instance crashes because of an underlying host issue, how can we take a memory dump for diagnostics and to attach to a support case with the application vendor?

Is there a way to take memory dumps proactively and collect them in CloudWatch or S3?

AWS
Pir
Asked 2 months ago, 68 views
2 Answers
Hello.

Would memory utilization metrics collected by the CloudWatch agent be enough for your purposes?
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html
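
As a rough sketch of that setup, the snippet below builds a minimal CloudWatch agent configuration that collects memory and swap usage. The function name `build_agent_config` and the 60-second interval are illustrative assumptions, not something from the question; the metric names (`mem_used_percent`, `mem_available`, `swap_used_percent`) come from the agent's Linux metric list on the page linked above.

```python
import json


def build_agent_config(interval_seconds: int = 60) -> dict:
    """Return a minimal CloudWatch agent config that collects memory metrics.

    Hypothetical helper for illustration; adjust metrics and interval as needed.
    """
    return {
        "metrics": {
            # Tag each datapoint with the instance ID so metrics are per-instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent", "mem_available"],
                    "metrics_collection_interval": interval_seconds,
                },
                "swap": {
                    "measurement": ["swap_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
            },
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be saved as the agent's configuration file.
    print(json.dumps(build_agent_config(), indent=2))
```

Note that these are utilization metrics only. They show memory pressure leading up to a crash but are not a memory dump.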

It may not help much if the physical EC2 host itself fails, but you can also configure kdump to capture a crash dump when the kernel panics. The page below covers sending a diagnostic interrupt, which can trigger such a panic on an unresponsive instance.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/diagnostic-interrupt.html
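
To get those dumps into S3, one option is a small script that runs after the instance comes back up. The sketch below assumes kdump writes to `/var/crash/<directory>/vmcore` (a common default, but check your kdump configuration), that boto3 is installed, and that the instance role may write to the bucket. The function names and the `crash-dumps/` key prefix are hypothetical.

```python
from pathlib import Path


def find_vmcores(crash_dir: str = "/var/crash") -> list[Path]:
    """Return vmcore files one level below crash_dir, sorted by path."""
    root = Path(crash_dir)
    if not root.is_dir():
        return []
    return sorted(root.glob("*/vmcore"))


def s3_key_for(instance_id: str, core: Path) -> str:
    """Build an S3 key that keeps the dump's directory name (hypothetical layout)."""
    return f"crash-dumps/{instance_id}/{core.parent.name}/vmcore"


def upload_vmcores(bucket: str, instance_id: str,
                   crash_dir: str = "/var/crash") -> list[str]:
    """Upload every vmcore found under crash_dir and return the S3 keys used."""
    import boto3  # imported lazily so the helpers above work without boto3

    s3 = boto3.client("s3")
    keys = []
    for core in find_vmcores(crash_dir):
        key = s3_key_for(instance_id, core)
        s3.upload_file(str(core), bucket, key)
        keys.append(key)
    return keys


def trigger_diagnostic_interrupt(instance_id: str) -> None:
    """Send a diagnostic interrupt. This can crash and reboot the instance."""
    import boto3

    boto3.client("ec2").send_diagnostic_interrupt(InstanceId=instance_id)
```

This only covers kernel panics inside the guest. If the host hardware fails outright, the instance may stop without writing any dump.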

Expert
Answered 2 months ago

Hello,

If what you need is resource utilization monitoring, consider tools such as atop and sar.

You can also collect additional Linux metrics with the CloudWatch agent and publish them to CloudWatch.

[+] https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html

If there is an underlying hardware issue, the instance will fail its status checks, or it will be moved to other hardware. In that case, what diagnostics do you want to share with the support engineer?
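
For reference, status check failures can be read programmatically. This is a minimal sketch, assuming boto3 is installed and credentials are configured; `failed_checks` and `fetch_failed_checks` are hypothetical helper names. The system status check is the one that reflects problems on the AWS side, such as host hardware.

```python
def failed_checks(response: dict) -> dict[str, list[str]]:
    """Map each instance ID to the status checks reported as impaired.

    Expects a dict shaped like the describe_instance_status response.
    """
    failures = {}
    for item in response.get("InstanceStatuses", []):
        failed = []
        if item.get("SystemStatus", {}).get("Status") == "impaired":
            failed.append("system")
        if item.get("InstanceStatus", {}).get("Status") == "impaired":
            failed.append("instance")
        failures[item["InstanceId"]] = failed
    return failures


def fetch_failed_checks(instance_id: str) -> dict[str, list[str]]:
    """Query EC2 for one instance and report its impaired checks."""
    import boto3  # imported lazily so failed_checks works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # also return instances that are not running
    )
    return failed_checks(response)
```

An empty list for an instance means neither check is currently reported as impaired. It does not distinguish "ok" from states such as "initializing".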

Answered 2 months ago
