Collecting a memory dump from an EC2 instance to mitigate the risk of crashes caused by underlying host issues

I am looking for a risk mitigation strategy for a critical application running on EC2. If the instance crashes because of an underlying host issue, how can we take a memory dump for diagnostic purposes and to attach to a support case with the application provider?

Is there any way to take memory dumps proactively and collect them in CloudWatch or S3?

AWS
Pir
asked 2 months ago, 85 views
2 Answers
Hello.

Would the memory utilization metrics collected by the CloudWatch agent be insufficient for your needs?
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html

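Although it may not help much if the physical EC2 host itself fails, you can also use kdump to capture a kernel crash dump for troubleshooting when a kernel panic occurs. On supported instance types, a diagnostic interrupt can be sent to an unresponsive instance to trigger that panic.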
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/diagnostic-interrupt.html
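
As a rough illustration of how this could fit together, here is a minimal Python sketch. It assumes kdump is already configured on a Linux instance, that the kernel is set to panic on the interrupt (for example with `kernel.unknown_nmi_panic=1`), and that dumps land in `/var/crash`, which is a common default but may differ on your distribution. The helper names, the bucket name, and the instance ID are hypothetical; `send_diagnostic_interrupt` and `upload_file` are the boto3 EC2 and S3 client calls. The first helper would run from a machine outside the instance, the second on the instance after it has rebooted.

```python
from pathlib import Path


def trigger_diagnostic_interrupt(ec2_client, instance_id):
    """Ask EC2 to send a diagnostic interrupt to an unresponsive instance.

    With kdump configured, the resulting kernel panic should produce a
    crash dump on the instance's disk before it reboots.
    """
    if not instance_id.startswith("i-"):
        raise ValueError(f"not an EC2 instance ID: {instance_id!r}")
    ec2_client.send_diagnostic_interrupt(InstanceId=instance_id)
    return instance_id


def upload_crash_dumps(s3_client, bucket, instance_id, crash_dir="/var/crash"):
    """Upload every file under crash_dir to the bucket, keyed by instance ID.

    Returns the list of object keys that were uploaded.
    """
    root = Path(crash_dir)
    keys = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Keep the kdump directory layout below an instance-ID prefix.
        key = f"{instance_id}/{path.relative_to(root).as_posix()}"
        s3_client.upload_file(str(path), bucket, key)
        keys.append(key)
    return keys


# Example usage (assumes boto3 is installed and credentials are configured;
# the instance ID and bucket name below are placeholders):
#
#   import boto3
#   trigger_diagnostic_interrupt(boto3.client("ec2"), "i-0123456789abcdef0")
#   upload_crash_dumps(boto3.client("s3"), "my-crash-dump-bucket",
#                      "i-0123456789abcdef0")
```

Both helpers take the client as an argument so they can be exercised with a stub before being pointed at a real account. Note that this only covers a hung or panicking guest OS; if the physical host fails outright, no dump may be written at all.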

EXPERT
answered 2 months ago
Hello,

If your needs are related to resource utilization monitoring, you can consider monitoring tools such as atop and sar.

Additionally, you can use the CloudWatch agent to collect more Linux metrics and publish them to CloudWatch.

[+] https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html
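
For illustration, here is a minimal Python sketch that generates a CloudWatch agent configuration collecting memory and swap utilization. The function name `build_agent_config` and the 60-second interval are assumptions for this example; `mem_used_percent` and `swap_used_percent` are metric names from the agent documentation linked above. These are utilization metrics only, not a memory dump.

```python
import json


def build_agent_config(interval_seconds=60):
    """Return a CloudWatch agent config dict for memory and swap metrics."""
    return {
        "metrics": {
            # Tag each metric with the ID of the instance that emitted it.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
                "swap": {
                    "measurement": ["swap_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
            },
        }
    }


# Print the JSON so it can be saved as the agent's configuration file.
print(json.dumps(build_agent_config(), indent=2))
```

The printed JSON can be saved to a file and loaded into the agent in the usual way for your installation.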

If there is an underlying hardware issue, the instance will fail its status checks or be moved to other hardware. In that case, which diagnostics do you want to share with the support engineer?

answered 2 months ago