How do I troubleshoot the ErrImagePull and ImagePullBackOff Pod status errors in Amazon EKS?
My Amazon Elastic Kubernetes Service (Amazon EKS) Pod is in the ErrImagePull or ImagePullBackOff status.
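Short description
If you run the kubectl get pods command and your Pod is in the ImagePullBackOff status, then the Pod isn't running correctly. The ImagePullBackOff status means that a container couldn't start because its image couldn't be retrieved or pulled. The kubelet reports ErrImagePull when a pull attempt fails, and then reports ImagePullBackOff while it waits before the next retry. For more information, see Amazon EKS connector Pods are in ImagePullBackOff status.
You might receive an image pull error in the following cases:
- The image name, tag, or digest is incorrect.
- The image requires credentials for authentication.
- The registry isn't accessible.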
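Each of these causes tends to leave a recognizable message in the Pod events. The following is a minimal sketch, not an official tool, that maps a message to a likely cause. The function name classify_pull_error, the category labels, and the match patterns are assumptions based on the sample events in this article. Real messages vary by container runtime and registry.

```shell
#!/bin/sh
# Map a kubelet "Failed to pull image" message to a likely cause.
# The patterns below are illustrative assumptions taken from the
# sample events in this article; they are not an exhaustive list.
classify_pull_error() {
  case $1 in
    *toomanyrequests*)
      echo "rate-limit" ;;
    *"manifest unknown"*)
      echo "bad-tag-or-digest" ;;
    *"pull access denied"*|*"repository does not exist"*)
      echo "bad-name-or-missing-credentials" ;;
    *"Client.Timeout"*|*"request canceled"*)
      echo "registry-unreachable" ;;
    *)
      echo "unknown" ;;
  esac
}
```

To use it, pass the Message column of a Failed event from kubectl describe pod as a single quoted argument.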
Resolution
1. Check the Pod status and error message, and verify that the image name, tag, and digest are correct
To get the status of a Pod, run the kubectl get pods command:
$ kubectl get pods -n default
NAME                     READY   STATUS             RESTARTS   AGE
nginx-7cdbb5f49f-2p6p2   0/1     ImagePullBackOff   0          86s
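When a namespace has many Pods, it can help to filter the listing down to the ones with pull failures. The sketch below assumes the default single-namespace column layout shown above, where STATUS is the third column. The function name failed_pull_pods is hypothetical.

```shell
#!/bin/sh
# Print the names of Pods whose STATUS column shows an image pull failure.
# Reads `kubectl get pods` output for a single namespace on stdin and
# skips the header row. Assumes STATUS is the third column.
failed_pull_pods() {
  awk 'NR > 1 && ($3 == "ErrImagePull" || $3 == "ImagePullBackOff") { print $1 }'
}

# Usage against a live cluster:
#   kubectl get pods -n default | failed_pull_pods
```

With the --all-namespaces flag the columns shift by one, so the field numbers would need adjusting.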
To get details about the Pod's error message, run the kubectl describe pod command. In the first example, the image name is misspelled:
$ kubectl describe pod nginx-7cdbb5f49f-2p6p2
...
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  4m23s                  default-scheduler  Successfully assigned default/nginx-7cdbb5f49f-2p6p2 to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    2m44s (x4 over 4m9s)   kubelet            Pulling image "nginxx:latest"
  Warning  Failed     2m43s (x4 over 4m9s)   kubelet            Failed to pull image "nginxx:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for nginxx, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
  Warning  Failed     2m43s (x4 over 4m9s)   kubelet            Error: ErrImagePull
  Warning  Failed     2m32s (x6 over 4m8s)   kubelet            Error: ImagePullBackOff
  Normal   BackOff    2m17s (x7 over 4m8s)   kubelet            Back-off pulling image "nginxx:latest"
In the second example, the image tag doesn't exist:
$ kubectl describe pod nginx-55d75d5f56-qrqmp
...
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  2m20s                default-scheduler  Successfully assigned default/nginx-55d75d5f56-qrqmp to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    40s (x4 over 2m6s)   kubelet            Pulling image "nginx:latestttt"
  Warning  Failed     39s (x4 over 2m5s)   kubelet            Failed to pull image "nginx:latestttt": rpc error: code = Unknown desc = Error response from daemon: manifest for nginx:latestttt not found: manifest unknown: manifest unknown
  Warning  Failed     39s (x4 over 2m5s)   kubelet            Error: ErrImagePull
  Warning  Failed     26s (x6 over 2m5s)   kubelet            Error: ImagePullBackOff
  Normal   BackOff    11s (x7 over 2m5s)   kubelet            Back-off pulling image "nginx:latestttt"
Make sure that your image name and tag exist and are correct. If the image registry requires authentication, then make sure that you're authorized to access it. To verify the image that the Pod uses, run the following command:
$ kubectl get pods nginx-7cdbb5f49f-2p6p2 -o jsonpath="{.spec.containers[*].image}" | sort
nginxx:latest
To understand the Pod status values, see Pod phase on the Kubernetes website.
For more information, see How do I troubleshoot Pod status in Amazon EKS?
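When you check an image string by eye, it helps to see its parts separately. The following sketch splits an image reference into its name, tag, and digest with plain parameter expansion. The function name split_image_ref and the output format are assumptions for illustration. The function does not validate the reference against any registry.

```shell
#!/bin/sh
# Split an image reference into name, tag, and digest.
# A colon counts as a tag separator only in the last path component,
# so a registry port such as localhost:5000 is left in the name.
split_image_ref() {
  ref=$1
  tag=""
  digest=""
  case $ref in
    *@*)
      digest=${ref#*@}     # text after "@", for example sha256:...
      ref=${ref%%@*}       # text before "@"
      ;;
  esac
  last=${ref##*/}          # last path component, for example nginx:stable
  case $last in
    *:*)
      tag=${last##*:}      # text after the final ":"
      ref=${ref%:*}        # drop ":tag" from the name
      ;;
  esac
  printf 'name=%s tag=%s digest=%s\n' "$ref" "$tag" "$digest"
}
```

An empty tag with no digest means that the runtime falls back to the latest tag, which is a common source of surprises.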
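2. Amazon Elastic Container Registry (Amazon ECR) images
If you try to pull images from Amazon ECR with Amazon EKS, then additional configuration might be required. If the image is stored in an Amazon ECR private registry, then make sure that you specify the credentials in imagePullSecrets on the Pod. These credentials are used to authenticate with the private registry.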
Create a secret named regcred:
kubectl create secret docker-registry regcred \
  --docker-server=<your-registry-server> \
  --docker-username=<your-name> \
  --docker-password=<your-pword> \
  --docker-email=<your-email>
Make sure to replace the following values:
- <your-registry-server> is the FQDN of your private Docker registry. For Docker Hub, use https://index.docker.io/v1/.
- <your-name> is your Docker user name.
- <your-pword> is your Docker password.
- <your-email> is your Docker email address.
Your Docker credentials are now set in the cluster as a secret named regcred.
To understand the contents of the regcred secret, view the secret in YAML format:
kubectl get secret regcred --output=yaml
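The YAML output stores the credentials as base64 text, so it is not readable as is. As a small sketch, the helper below decodes base64 text from standard input. The name decode_b64 is hypothetical, and the kubectl pipeline in the comment follows the pattern in the Kubernetes documentation for reading the .dockerconfigjson key.

```shell
#!/bin/sh
# Decode base64 text read from stdin. Works for the .dockerconfigjson
# value of the secret, and for the "auth" field inside the decoded JSON,
# which holds "username:password".
decode_b64() {
  base64 --decode
}

# Usage against a live cluster (prints the stored Docker config JSON):
#   kubectl get secret regcred \
#     --output="jsonpath={.data.\.dockerconfigjson}" | decode_b64
```

The decoded output contains the registry password in plain text, so avoid running this in a shared terminal or a logged session.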
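In the following example, a Pod needs access to your Docker credentials in regcred:
apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: <your-private-image>
  imagePullSecrets:
  - name: regcred
Replace <your-private-image> with the path to an image in your private registry, similar to the following:
your.private.registry.example.com/bob/bob-private:v1
To pull the image from the private registry, Kubernetes needs credentials. The imagePullSecrets field in the configuration file specifies that Kubernetes must get the credentials from a secret named regcred.
For more options to create a secret, see Create a Pod that uses a secret to pull an image on the Kubernetes website.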
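For an Amazon ECR private registry, the registry server value follows a fixed host name pattern. The sketch below builds that host name. The function name ecr_registry_host is hypothetical, the account ID 111122223333 and the Region us-east-2 are placeholders, and the domain suffix assumes the standard aws partition. The commented command shows one possible way to feed an ECR authorization token into the regcred secret. It is an assumption about your setup, not a step from the original procedure.

```shell
#!/bin/sh
# Build the registry host name for an Amazon ECR private registry.
# Assumes the standard "aws" partition; other partitions use a
# different domain suffix.
ecr_registry_host() {
  account_id=$1
  region=$2
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$account_id" "$region"
}

# Hypothetical usage (placeholder account ID and Region):
#   kubectl create secret docker-registry regcred \
#     --docker-server="$(ecr_registry_host 111122223333 us-east-2)" \
#     --docker-username=AWS \
#     --docker-password="$(aws ecr get-login-password --region us-east-2)"
```

An ECR authorization token is short-lived, so a secret created this way must be refreshed before the token expires.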
3. Registry troubleshooting
In the following example, the registry is inaccessible because of a network connectivity issue. The kubelet can't reach the registry endpoint:
$ kubectl describe pods nginx-9cc69448d-vgm4m
...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  16m                default-scheduler  Successfully assigned default/nginx-9cc69448d-vgm4m to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    15m (x3 over 16m)  kubelet            Pulling image "nginx:stable"
  Warning  Failed     15m (x3 over 16m)  kubelet            Failed to pull image "nginx:stable": rpc error: code = Unknown desc = Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     15m (x3 over 16m)  kubelet            Error: ErrImagePull
  Normal   BackOff    14m (x4 over 16m)  kubelet            Back-off pulling image "nginx:stable"
  Warning  Failed     14m (x4 over 16m)  kubelet            Error: ImagePullBackOff
The "Failed to pull image..." error means that the kubelet tried to connect to the Docker registry endpoint and failed because the connection timed out.
To troubleshoot this error, check the subnets, security groups, and network ACLs, and confirm that they allow communication with the specified registry endpoint.
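A quick way to narrow down a network problem is to probe the registry endpoint from the worker node itself and look at how the probe fails. The sketch below translates a few well-known curl exit codes into hints. The function name explain_curl_exit and the hint wording are assumptions, and the commented probe assumes that you have shell access to the node and that curl is installed there.

```shell
#!/bin/sh
# Translate a curl exit code into a hint about where the connection failed.
# Only a few common codes are covered; anything else is reported as-is.
explain_curl_exit() {
  case $1 in
    0)  echo "endpoint reachable" ;;
    6)  echo "DNS resolution failed" ;;
    7)  echo "connection refused or blocked" ;;
    28) echo "connection timed out" ;;
    *)  echo "other curl error ($1)" ;;
  esac
}

# Usage on a worker node (endpoint taken from the event message above):
#   curl -sS -o /dev/null --max-time 10 https://registry-1.docker.io/v2/
#   explain_curl_exit $?
```

A timeout usually points to routing, security group, or network ACL rules, while a DNS failure points to the VPC DNS settings. An HTTP error response from the registry still counts as reachable here, because the request completed.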
In the following example, the registry rate limit is exceeded:
$ kubectl describe pod nginx-6bf9f7cf5d-22q48
...
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               3m54s                 default-scheduler  Successfully assigned default/nginx-6bf9f7cf5d-22q48 to ip-192-168-153-54.us-east-2.compute.internal
  Warning  FailedCreatePodSandBox  3m33s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "82065dea585e8428eaf9df89936653b5ef12b53bef7f83baddb22edc59cd562a" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m53s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "20f2e27ba6d813ffc754a12a1444aa20d552cc9d665f4fe5506b02a4fb53db36" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m35s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d9b7e98187e84fed907ff882279bf16223bf5ed0176b03dff3b860ca9a7d5e03" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c02c8b65d7d49c94aadd396cb57031d6df5e718ab629237cdea63d2185dbbfb0" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Normal   SandboxChanged          119s (x4 over 3m13s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 56s (x3 over 99s)     kubelet            Pulling image "httpd:latest"
  Warning  Failed                  56s (x3 over 99s)     kubelet            Failed to pull image "httpd:latest": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
  Warning  Failed                  56s (x3 over 99s)     kubelet            Error: ErrImagePull
  Normal   BackOff                 43s (x4 over 98s)     kubelet            Back-off pulling image "httpd:latest"
  Warning  Failed                  43s (x4 over 98s)     kubelet            Error: ImagePullBackOff
The Docker registry rate limit is 100 container image requests per six hours for anonymous use, and 200 container image requests per six hours for Docker accounts. Image requests that exceed these limits are denied until the six-hour window passes. To manage usage and understand registry rate limits, see Understanding your Docker Hub rate limit on the Docker website.
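Docker Hub reports the limit that applies to a client in response headers whose values look like 100;w=21600, where the number after w= is the window in seconds. The following sketch turns such a value into a readable sentence. The function name parse_ratelimit is hypothetical, and the commented curl commands follow the pattern in the Docker documentation for checking limits. They assume that curl and jq are installed.

```shell
#!/bin/sh
# Convert a Docker Hub rate limit header value such as "100;w=21600"
# into "<count> pulls per <hours> hours".
parse_ratelimit() {
  value=$1
  count=${value%%;*}       # number before the first ";"
  window=${value##*w=}     # window length in seconds after "w="
  echo "$count pulls per $((window / 3600)) hours"
}

# Hypothetical way to fetch the headers for anonymous use:
#   TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
#   curl -s --head -H "Authorization: Bearer $TOKEN" \
#     https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest
# Then pass the value of the ratelimit-limit or ratelimit-remaining header
# to parse_ratelimit.
```

Running the check from a worker node shows the limit for the IP address that the node uses for outbound traffic, which is often shared by every node behind the same NAT gateway.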