EC2 instance crashes / OOM frequently


[Two screenshots of the instance's monitoring graphs were attached here; the images are not preserved.]

awser
Asked a year ago · 502 views
1 Answer

The OOM killer is a Linux kernel mechanism (not a separate process) that the kernel invokes when the system is critically low on memory and it needs to kill a process to free some up (it's more complicated, but that's the basic gist of it).
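To confirm that the OOM killer actually fired, you can search the kernel log for its kill messages. Below is a minimal sketch in Python, assuming the log contains lines of the form "Killed process <pid> (<name>)"; the exact wording varies between kernel versions, and the function name `find_oom_kills` is just illustrative.

```python
import re

# Assumption: the kernel reports each kill with a line containing
# "Killed process <pid> (<name>)". Wording can differ between kernel versions.
_KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) for each OOM kill found in log_text."""
    kills = []
    for line in log_text.splitlines():
        match = _KILLED.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

You could feed it the text output of `dmesg` or `journalctl -k` captured on the instance. Which process gets killed usually points at the one using the most memory.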

Your graphs show that CPU was maxed out for an hour before dropping back to zero, and at the moment it dropped, your failed instance count went from zero to one. There is nothing noteworthy in the network and disk graphs.

Putting these two together, it is likely that your system was running low on memory, so the Linux memory manager was trying to swap processes out of main memory and onto disk. As free memory shrinks, the memory manager spends more and more of its time (and more and more CPU) trying to free up pages of main memory, which drives CPU usage up to 100%, as you can see in the first graph. Running out of memory is also why the OOM killer would run (it only ever runs in extreme circumstances like this).
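If you want a quick look at memory headroom from inside the instance, `/proc/meminfo` reports it directly. Here is a rough sketch, assuming a kernel recent enough to expose the `MemAvailable` field (3.14 or later); the function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])  # values are usually in kB
    return info


def available_fraction(info):
    """Fraction of total memory the kernel estimates is still available."""
    return info["MemAvailable"] / info["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `available_fraction(read_meminfo())` periodically and logging the result would show whether available memory trends toward zero before the CPU spike. Comparing `SwapTotal` and `SwapFree` from the same dict tells you whether swap is configured and in use.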

Unfortunately, the EC2 section of the AWS Console doesn't display metrics for memory use. You'll need to set up the CloudWatch agent to collect these: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html This will help with your troubleshooting if this situation happens again.
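As a starting point, the agent's JSON configuration needs a `metrics` section that enables memory collection. The sketch below generates a minimal one in Python; the 60-second interval and the choice of `mem_used_percent` and `swap_used_percent` are my assumptions, and you should check the linked guide for where the file goes and how the agent loads it.

```python
import json


def build_memory_metrics_config(interval_seconds=60):
    """Build a minimal CloudWatch agent config that collects memory and swap usage.

    The interval and the selected measurements are illustrative choices,
    not recommendations from the original answer.
    """
    return {
        "metrics": {
            # Tag each metric with the instance ID so it can be graphed per instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
                "swap": {
                    "measurement": ["swap_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
            },
        }
    }


def render_config(config):
    """Serialize the config dict to the JSON text the agent expects."""
    return json.dumps(config, indent=2)
```

Running `print(render_config(build_memory_metrics_config()))` gives you JSON you can save as the agent configuration. Once the agent is running, the memory metric should appear in CloudWatch alongside the CPU graph, so you can see whether memory climbs before the next crash.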

Steve_M (Expert)
Answered a year ago
Reviewed by an Expert a year ago
