"Task ran out of memory" from containerd, but neither the pod nor the container was killed


Hey guys.

Over the last few weeks I have seen several cases where containerd printed a "Task ran out of memory" error, but neither the pod nor the container was killed or restarted.

I could see that the pod's memory usage was skyrocketing at that moment, and that it stabilized right after the containerd "Task ran out of memory" message was printed.

I was expecting to see something like "exit 137" or an increased pod restart counter, but there was nothing.

What does containerd actually do when it prints the "Task ran out of memory" error, and under what conditions is the error raised?

I'm running an EKS v1.27 cluster with the amazon-eks-node-1.27-v20230711 AMI (ami-00f80984c1a72a9d1).

Kernel version: 5.10.199-190.747.amzn2.aarch64, Kubelet v1.27.7-eks-e71965b.

Can someone help me here?

Thanks!

Yechan
asked 3 months ago, 360 views
2 Answers

If a container exceeds its memory limit, it is terminated and, depending on the restart policy, it may be restarted. This typically shows up as an "exit 137" status, which indicates that the process was killed, commonly because of a memory issue. Please check the following:

  • Run kubectl describe pod <pod-name> and review the container state and events.
  • Look at the kubelet and containerd logs for more detailed information about the memory handling.
  • Ensure your EKS cluster, kubelet, and containerd are up to date with the latest patches.
  • Use tools like Prometheus or CloudWatch to monitor the memory usage over time.
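The pod-level check above can also be scripted. The following is a minimal sketch, not a definitive tool: it assumes kubectl is installed and configured for the cluster, and the function names (`summarize_oom_signals`, `fetch_pod`) are hypothetical helpers introduced here for illustration. It reads the `restartCount` and `lastState.terminated` fields from the pod's `status.containerStatuses` and flags containers whose last termination had reason `OOMKilled` or exit code 137.

```python
import json
import subprocess
from typing import Any, Dict, List


def summarize_oom_signals(pod: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize restart and last-termination info for each container in a Pod.

    `pod` is a Pod object as parsed JSON (e.g. from `kubectl get pod -o json`).
    """
    summaries = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is only present if the container has
        # terminated at least once before its current run.
        terminated = status.get("lastState", {}).get("terminated") or {}
        summaries.append({
            "container": status.get("name"),
            "restart_count": status.get("restartCount", 0),
            "last_exit_code": terminated.get("exitCode"),
            "last_reason": terminated.get("reason"),
            "oom_killed": (
                terminated.get("reason") == "OOMKilled"
                or terminated.get("exitCode") == 137
            ),
        })
    return summaries


def fetch_pod(name: str, namespace: str = "default") -> Dict[str, Any]:
    """Fetch a Pod as JSON via kubectl (assumes kubectl is on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For example, `summarize_oom_signals(fetch_pod("my-pod", "my-namespace"))` (pod and namespace names are placeholders) returns one entry per container. If every entry shows a restart count of 0 and no last termination, the main container process was not restarted, which matches what the question describes.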

Jagan
answered 3 months ago
EXPERT
reviewed 25 days ago
  • Hey Jagan. Thanks for the reply.

    Unfortunately, I was not able to see any exit 137 or restart history for the pod.

    The memory usage of the pod definitely increased at that time, and it quickly stabilized when the OOM event occurred.

    I checked the containerd OOM event and the pod memory usage through Datadog, by the way.

    Do you have any idea where I should look further? I don't think checking the EKS version and so on is helpful, though.


Given the specific versions you provided (EKS v1.27, kernel version 5.10.199), it may be worth checking for any known issues or updates related to memory management in these versions. Could you please refer to the official amazon-eks-ami changelog for known issues in the version you are running: https://github.com/awslabs/amazon-eks-ami/blob/master/CHANGELOG.md
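Besides the changelog, the kernel's own OOM counters for the container's cgroup can be read directly on the node. The sketch below is illustrative only and rests on assumptions not confirmed in this thread: that you have shell access to the node, and that you know the container's cgroup directory (its location depends on the cgroup version and cgroup driver, so any path you pass in is your own lookup). It parses the flat "key value" format used by `memory.events` (cgroup v2) and `memory.oom_control` (cgroup v1) and returns the `oom_kill` counter if present. The helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_flat_keyed(text: str) -> Dict[str, int]:
    """Parse lines of the form "<key> <integer>" into a dict."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "key number".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_oom_kill_count(cgroup_dir: str) -> Optional[int]:
    """Return the cgroup's oom_kill counter, or None if it cannot be found.

    Tries memory.events (cgroup v2) first, then memory.oom_control (cgroup v1).
    """
    base = Path(cgroup_dir)
    for filename in ("memory.events", "memory.oom_control"):
        path = base / filename
        if path.is_file():
            counters = parse_flat_keyed(path.read_text())
            if "oom_kill" in counters:
                return counters["oom_kill"]
    return None
```

If this counter rises while the pod's restart count stays at zero, one possible reading (an assumption, not something established in this thread) is that the kernel killed a process inside the container other than its main process, so the container itself kept running.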

EXPERT
answered 3 months ago
EXPERT
reviewed 25 days ago
