EKS creating pod stuck at pulling image

1

Hello everyone,

I am using EKS with Kubernetes version 1.29 and worker nodes on version 1.28.

Pods take a very long time to be created: about 1 hour for a 1.33 GB image from a private ghcr.io registry.

When I describe the pod, the events look normal. It is stuck at pulling the image without any error.

Sample event: Normal Pulling 51m kubelet Pulling image "ghcr.io/aura-nw/long-campaign-be:euphoria_6cf3d53"

To test, I created a docker-in-docker pod in the same cluster and pulled the same image. That took only about 4 minutes.
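To make the comparison with the 4-minute docker-in-docker pull easier to reproduce, here is a minimal sketch for collecting the pull-related events of one pod. The helper names `pod_events` and `pull_lines` are hypothetical, as are the pod and namespace names you pass in. When a pull finishes, kubelet normally emits a "Successfully pulled image ... in <duration>" event, so the filtered output should show how long the kubelet-side pull actually took.

```shell
# Hypothetical helper: list the events for one pod, oldest first.
# $1 = pod name, $2 = namespace (defaults to "default").
pod_events() {
  pod="$1"
  ns="${2:-default}"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}

# Hypothetical helper: keep only the image-pull lines from stdin.
pull_lines() {
  grep -E 'Pulling image|Successfully pulled image'
}
```

Usage would look like `pod_events your-pod-name your-namespace | pull_lines`.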

Please help if you know anything about this case. Thanks!

asked 2 years ago, 1.9K views
5 Answers
3
EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
EXPERT
reviewed 2 years ago
  • Thank you, but nothing in the docs helped. There is no error. The pull just takes a very long time, but it still succeeds in the end.

2

Hello,

To address the issue, check the following:

Check Image Pull Secrets:

  • Ensure you have configured imagePullSecrets correctly in your pod specification to authenticate with the private ghcr.io registry.
apiVersion: v1
kind: Pod
metadata:
  name: your-pod-name
spec:
  containers:
  - name: your-container-name
    image: ghcr.io/aura-nw/long-campaign-be:euphoria_6cf3d53
  imagePullSecrets:
  - name: your-secret-name

Create a secret with your GitHub Container Registry credentials:

kubectl create secret docker-registry your-secret-name \
  --docker-server=ghcr.io \
  --docker-username=your-username \
  --docker-password=your-personal-access-token
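As a quick sanity check on the secret created above, the sketch below decodes its stored Docker config so you can confirm that it contains an entry for ghcr.io. The function name `secret_dockerconfig` is hypothetical. Note that the output includes the credentials in clear text, so avoid running it where the terminal is logged or shared.

```shell
# Hypothetical helper: print the decoded .dockerconfigjson of a
# docker-registry secret. WARNING: the output contains credentials.
# $1 = secret name, $2 = namespace (defaults to "default").
secret_dockerconfig() {
  name="$1"
  ns="${2:-default}"
  kubectl get secret "$name" -n "$ns" \
    -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
}
```

For example, `secret_dockerconfig your-secret-name | grep ghcr.io` should print a matching line if the registry entry is present.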

Check Node Resources:

  • Ensure your worker nodes have sufficient resources (CPU, memory, network bandwidth) to pull the images efficiently. Insufficient resources can cause delays in pulling large images.
  • Monitor node status and resource usage:
kubectl describe nodes
  • Ensure VPC, subnets, and security groups are properly configured for internet access and efficient image pulling.
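To separate a node or network problem from a container runtime problem, it can help to time a pull directly on the worker node through the same CRI runtime that kubelet uses, rather than through Docker inside a pod. The sketch below assumes you can get a shell on the node (for example over SSH or SSM), that `crictl` is installed there, and that you run it with sufficient privileges. The function name `time_pull_on_node` is hypothetical.

```shell
# Hypothetical helper: time an image pull through the node's CRI runtime.
# $1 = image reference, $2 = registry credentials as USERNAME:TOKEN.
time_pull_on_node() {
  image="$1"
  creds="$2"
  start=$(date +%s)
  crictl pull --creds "$creds" "$image" || return 1
  end=$(date +%s)
  echo "pull took $((end - start))s"
}
```

If this pull is also slow, the delay is likely on the node or runtime side. If it is fast, the delay is more likely in how kubelet schedules or queues the pull.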

These steps should help in resolving the issue with image pulling in your EKS cluster.

EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
EXPERT
reviewed 2 years ago
  • Node resources seem normal. The image pulls at normal speed when I exec into a test pod and pull it there, but the image pull during container creation is very slow, without any error.

1

Hello,

When pulling images from a private registry, use imagePullSecrets in the workload manifest to specify the credentials. These credentials authenticate with the private registry, allowing the pod to pull images from the specified private repository.

https://repost.aws/knowledge-center/eks-pod-status-errors

EXPERT
answered 2 years ago
EXPERT
reviewed 2 years ago
EXPERT
reviewed 2 years ago
0

In addition: I have a VPC endpoint (vpce) for S3 but none for the ECR api and dkr endpoints. ECR is in the same Region. The EKS EC2 workers have Internet access via a NAT gateway.

answered 5 months ago
0

I have been experiencing a similar problem, presumably since the update from EKS 1.30 on AL2 to EKS 1.32 on the latest EKS-optimized AL2023.

Details:
The problem occurs intermittently. When I update my application, which consists of dozens of microservices, 1 or 2 pods get stuck in Pending status (pulling an image from the ECR cache for Docker Hub) on one of the nodes, while on other nodes the image is pulled from the ECR cache and the pod starts successfully. If I forcibly kill the stuck pod, the image downloads in a few seconds and the pod starts. Please pay serious attention to this issue. It is not related to the 12-hour refresh period of the temporary token, because it appears much more often: when I update my application several times an hour, I always find random pods on random nodes hanging with this problem.
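For anyone trying to catch this while it is happening, a minimal sketch for listing Pending pods together with the node they were scheduled on is shown below. The function name `pending_pods_by_node` is hypothetical. It only helps to see whether the stuck pods cluster on particular nodes and does not diagnose the cause.

```shell
# Hypothetical helper: list Pending pods in all namespaces.
# The wide output includes a NODE column for each pod.
pending_pods_by_node() {
  kubectl get pods -A \
    --field-selector status.phase=Pending \
    -o wide
}
```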

answered 5 months ago
