How can I troubleshoot the pod status ErrImagePull and ImagePullBackoff errors in Amazon EKS?

7 minute read
0

My Amazon Elastic Kubernetes Service (Amazon EKS) pod status is in the ErrImagePull or ImagePullBackoff status.

Short description

If you run the kubectl command get pods and your pods are in the ImagePullBackOff status, then the pods aren't running correctly. The ImagePullBackOff status means that a container couldn't start because an image was unable to get retrieved or pulled. For more information, see Amazon EKS connector pods are in ImagePullBackOff status.

You might receive an ImagePull error if:

  • An image name, tag, or digest are incorrect.
  • The images require credentials to authenticate.
  • The registry isn't accessible.

Resolution

1. Check the pod status, error message, and verify that the image name, tag, and SHA are correct

To get the status of a pod, run the kubectl command get pods:

$ kubectl get pods -n default
NAME                              READY   STATUS             RESTARTS   AGE
nginx-7cdbb5f49f-2p6p2            0/1     ImagePullBackOff   0          86s

To get the details of a pods error message, run the kubectl command describe pod:

$ kubectl describe pod nginx-7cdbb5f49f-2p6p2

...
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  4m23s                 default-scheduler  Successfully assigned default/nginx-7cdbb5f49f-2p6p2 to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    2m44s (x4 over 4m9s)  kubelet            Pulling image "nginxx:latest"
  Warning  Failed     2m43s (x4 over 4m9s)  kubelet            Failed to pull image "nginxx:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for nginxx, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
  Warning  Failed     2m43s (x4 over 4m9s)  kubelet            Error: ErrImagePull
  Warning  Failed     2m32s (x6 over 4m8s)  kubelet            Error: ImagePullBackOff
  Normal   BackOff    2m17s (x7 over 4m8s)  kubelet            Back-off pulling image "nginxx:latest"

$ kubectl describe pod nginx-55d75d5f56-qrqmp ... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 2m20s default-scheduler Successfully assigned default/nginx-55d75d5f56-qrqmp to ip-192-168-149-143.us-east-2.compute.internal Normal Pulling 40s (x4 over 2m6s) kubelet Pulling image "nginx:latestttt" Warning Failed 39s (x4 over 2m5s) kubelet Failed to pull image "nginx:latestttt": rpc error: code = Unknown desc = Error response from daemon: manifest for nginx:latestttt not found: manifest unknown: manifest unknown Warning Failed 39s (x4 over 2m5s) kubelet Error: ErrImagePull Warning Failed 26s (x6 over 2m5s) kubelet Error: ImagePullBackOff Normal BackOff 11s (x7 over 2m5s) kubelet Back-off pulling image "nginx:latestttt" Make sure that your image tag and name exist and are correct. If the image registry requires authentication, make sure that you are authorized to access it. To verify that the image used in the pod is correct, run the following command:

$ kubectl get pods nginx-7cdbb5f49f-2p6p2  -o jsonpath="{.spec.containers[*].image}" | \sort

nginxx:latest

To understand the pod status values, see Pod phase on the Kubernetes website.

For more information, see How can I troubleshoot the pod status in Amazon EKS?

2. Amazon Elastic Container Registry (Amazon ECR) images

If you're trying to pull images from Amazon ECR using Amazon EKS, additional configuration might be required. If your image is stored in an Amazon ECR private registry, make sure that you specify the credentials imagePullSecrets on the pod. These credentials are used to authenticate with the private registry.

Create a Secret named it regcred:

kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>

Be sure to replace the following credentials:

  • <your-registry-server> is your Private Docker Registry FQDN. Use https://index.docker.io/v1/ for DockerHub.
  • <your-name> is your Docker username.
  • <your-pword> is your Docker password.
  • <your-email> is your Docker email.

You have successfully set your Docker credentials in the cluster as a Secret named regcred.

To understand the contents of the regcred Secret, view the Secret in YAML format:

kubectl get secret regcred --output=yaml

In the following example, a pod needs access to your Docker credentials in regcred:

apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: <your-private-image>
  imagePullSecrets:
  - name: regcred

Replace your.private.registry.example with the path to an image in a private registry similar to the following:

your.private.registry.example.com/bob/bob-private:v1

To pull the image from the private registry, Kubernetes requires the credentials. The imagePullSecrets field in the configuration file specifies that Kubernetes must get the credentials from a Secret named regcred.

For more options with creating a Secret, see create a Pod that uses a Secret to pull an image on the Kubernetes website.

3. Registry troubleshooting

In the following example, the registry is inaccessible due to a network connectivity issue because kubelet isn't able to reach the private registry endpoint:

$ kubectl describe pods nginx-9cc69448d-vgm4m

...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  16m                default-scheduler  Successfully assigned default/nginx-9cc69448d-vgm4m to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    15m (x3 over 16m)  kubelet            Pulling image "nginx:stable"
  Warning  Failed     15m (x3 over 16m)  kubelet            Failed to pull image "nginx:stable": rpc error: code = Unknown desc = Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     15m (x3 over 16m)  kubelet            Error: ErrImagePull
  Normal   BackOff    14m (x4 over 16m)  kubelet            Back-off pulling image "nginx:stable"
  Warning  Failed     14m (x4 over 16m)  kubelet            Error: ImagePullBackOff

The error "Failed to pull image..." means that kubelet tried to connect to the Docker Registry endpoint and failed due to a connection timeout.

To troubleshoot this error, check your subnet, security groups, and network ACL that allow communication to the specified registry endpoint.

In the following example, the registry rate limit has exceeded:

$ kubectl describe pod nginx-6bf9f7cf5d-22q48

...
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               3m54s                 default-scheduler  Successfully assigned default/nginx-6bf9f7cf5d-22q48 to ip-192-168-153-54.us-east-2.compute.internal
  Warning  FailedCreatePodSandBox  3m33s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "82065dea585e8428eaf9df89936653b5ef12b53bef7f83baddb22edc59cd562a" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m53s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "20f2e27ba6d813ffc754a12a1444aa20d552cc9d665f4fe5506b02a4fb53db36" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m35s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d9b7e98187e84fed907ff882279bf16223bf5ed0176b03dff3b860ca9a7d5e03" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c02c8b65d7d49c94aadd396cb57031d6df5e718ab629237cdea63d2185dbbfb0" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Normal   SandboxChanged          119s (x4 over 3m13s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 56s (x3 over 99s)     kubelet            Pulling image "httpd:latest"
  Warning  Failed                  56s (x3 over 99s)     kubelet            Failed to pull image "httpd:latest": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
  Warning  Failed                  56s (x3 over 99s)     kubelet            Error: ErrImagePull
  Normal   BackOff                 43s (x4 over 98s)     kubelet            Back-off pulling image "httpd:latest"
  Warning  Failed                  43s (x4 over 98s)     kubelet            Error: ImagePullBackOff

The Docker registry rate limit is 100 container image requests per six hours for anonymous usage, and 200 for Docker accounts. Image requests exceeding these limits are denied access until the six hour window elapses. To manage usage and understand registry rate limits, see Understanding Your Docker Hub Rate Limit on the Docker website.


Related information

Amazon EKS troubleshooting

How do I troubleshoot Amazon ECR issues with Amazon EKS?

Security best practices for Amazon EKS

AWS OFFICIAL
AWS OFFICIALUpdated a month ago