How do I troubleshoot the Pod status ErrImagePull and ImagePullBackOff errors in Amazon EKS?

My Amazon Elastic Kubernetes Service (Amazon EKS) Pod is in the ErrImagePull or ImagePullBackOff status.

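Short description

If you run the kubectl get pods command and your Pods are in the ImagePullBackOff status, then the Pods aren't running correctly. The ImagePullBackOff status means that a container couldn't start because its image couldn't be retrieved or pulled. The kubelet first reports ErrImagePull when a pull attempt fails, and then reports ImagePullBackOff while it waits before the next retry. For more information, see Amazon EKS connector Pods are in ImagePullBackOff status.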

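You might receive an image pull error in the following situations:

  • The image name, tag, or digest is incorrect.
  • The image requires credentials for authentication.
  • The registry isn't accessible.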

Resolution

1. Check the Pod status and error message, and verify that the image name, tag, and SHA are correct

To get the status of a Pod, run the kubectl get pods command:

$ kubectl get pods -n default
NAME                              READY   STATUS             RESTARTS   AGE
nginx-7cdbb5f49f-2p6p2            0/1     ImagePullBackOff   0          86s
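
In a namespace with many Pods, it can help to narrow the listing to the Pods that report an image pull failure. The following is a minimal sketch, not part of the original procedure: the function name filter_image_pull_failures is hypothetical, and it assumes the namespace-scoped column layout shown above, where STATUS is the third column.

```shell
# Keep the header row plus any row whose STATUS column reports an
# image pull failure (also matches Init:ImagePullBackOff).
# Assumes `kubectl get pods -n <namespace>` output on stdin,
# where STATUS is the third whitespace-separated column.
filter_image_pull_failures() {
  awk 'NR == 1 || $3 ~ /ErrImagePull|ImagePullBackOff/'
}

# Example usage against a cluster (requires kubectl access):
#   kubectl get pods -n default | filter_image_pull_failures
```

With the --all-namespaces flag, kubectl adds a NAMESPACE column at the front, so the STATUS column index would need to be 4 instead of 3.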

To get the details of a Pod's error message, run the kubectl describe pod command:

$ kubectl describe pod nginx-7cdbb5f49f-2p6p2

...
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  4m23s                 default-scheduler  Successfully assigned default/nginx-7cdbb5f49f-2p6p2 to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    2m44s (x4 over 4m9s)  kubelet            Pulling image "nginxx:latest"
  Warning  Failed     2m43s (x4 over 4m9s)  kubelet            Failed to pull image "nginxx:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for nginxx, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
  Warning  Failed     2m43s (x4 over 4m9s)  kubelet            Error: ErrImagePull
  Warning  Failed     2m32s (x6 over 4m8s)  kubelet            Error: ImagePullBackOff
  Normal   BackOff    2m17s (x7 over 4m8s)  kubelet            Back-off pulling image "nginxx:latest"

$ kubectl describe pod nginx-55d75d5f56-qrqmp

...
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  2m20s               default-scheduler  Successfully assigned default/nginx-55d75d5f56-qrqmp to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    40s (x4 over 2m6s)  kubelet            Pulling image "nginx:latestttt"
  Warning  Failed     39s (x4 over 2m5s)  kubelet            Failed to pull image "nginx:latestttt": rpc error: code = Unknown desc = Error response from daemon: manifest for nginx:latestttt not found: manifest unknown: manifest unknown
  Warning  Failed     39s (x4 over 2m5s)  kubelet            Error: ErrImagePull
  Warning  Failed     26s (x6 over 2m5s)  kubelet            Error: ImagePullBackOff
  Normal   BackOff    11s (x7 over 2m5s)  kubelet            Back-off pulling image "nginx:latestttt"

Make sure that your image tag and name exist and are correct. If the image registry requires authentication, then make sure that you're authorized to access it. To verify the image that's used in the Pod, run the following command:

$ kubectl get pods nginx-7cdbb5f49f-2p6p2 -o jsonpath="{.spec.containers[*].image}" | sort

nginxx:latest
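
Once you have the image reference, splitting it into its name, tag, and digest makes a typo easier to spot. The following is a rough sketch with a hypothetical helper named parse_image_ref. It assumes the common name[:tag][@digest] layout and treats a missing tag as latest when no digest is given.

```shell
# Split an image reference into name, tag, and digest.
# Assumes the form [registry[:port]/]repository[:tag][@digest].
parse_image_ref() {
  ref=$1
  tag=""
  digest=""
  # Anything after "@" is the digest (for example sha256:...).
  case $ref in
    *@*) digest=${ref#*@}; ref=${ref%%@*} ;;
  esac
  # Look for a tag only in the last path segment, so that a
  # registry port such as localhost:5000 is not mistaken for a tag.
  last=${ref##*/}
  case $last in
    *:*) tag=${last##*:}; ref=${ref%:*} ;;
  esac
  # With neither tag nor digest, the tag defaults to "latest".
  if [ -z "$tag" ] && [ -z "$digest" ]; then
    tag=latest
  fi
  printf 'name=%s tag=%s digest=%s\n' "$ref" "$tag" "$digest"
}
```

For example, parse_image_ref nginxx:latest prints name=nginxx tag=latest digest=, which shows the misspelled repository name on its own.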

To understand the Pod status values, see Pod phase on the Kubernetes website.

For more information, see How do I troubleshoot the Pod status in Amazon EKS?

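2. Amazon Elastic Container Registry (Amazon ECR) images

If you use Amazon EKS to pull images from Amazon ECR, then you might need additional configuration. If the image is stored in an Amazon ECR private registry, then make sure that you specify the credentials in imagePullSecrets on the Pod. These credentials are used to authenticate with the private registry.

Create a Secret named regcred: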

kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>

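Be sure to replace the following credentials:

  • <your-registry-server> is your private Docker registry FQDN. For DockerHub, use https://index.docker.io/v1/.
  • <your-name> is your Docker user name.
  • <your-pword> is your Docker password.
  • <your-email> is your Docker email address.

Your Docker credentials are now set in the cluster as a Secret named regcred.

To inspect the contents of the regcred Secret, view the Secret in YAML format: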

kubectl get secret regcred --output=yaml
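
The YAML output shows the credentials only in base64-encoded form, under the .dockerconfigjson key. To read them, the field can be decoded, as in the following minimal sketch. The helper name decode_dockerconfigjson is hypothetical, and the sketch assumes a base64 utility that accepts the --decode flag.

```shell
# Decode a base64-encoded .dockerconfigjson value read from stdin.
decode_dockerconfigjson() {
  base64 --decode
}

# Example usage against a cluster (requires kubectl access):
#   kubectl get secret regcred \
#     --output="jsonpath={.data.\.dockerconfigjson}" | decode_dockerconfigjson
```

The decoded JSON contains the registry server and the credentials in plain text, so avoid pasting it into logs or tickets.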

In the following example, a Pod needs access to your Docker credentials in regcred:

apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: <your-private-image>
  imagePullSecrets:
  - name: regcred

Replace <your-private-image> with the path to an image in your private registry, similar to the following:

your.private.registry.example.com/bob/bob-private:v1

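To pull the image from the private registry, Kubernetes needs credentials. The imagePullSecrets field in the configuration file specifies that Kubernetes must get the credentials from a Secret named regcred.

For more options for creating a Secret, see Create a Pod that uses a Secret to pull an image on the Kubernetes website.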
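
For an Amazon ECR private registry specifically, the placeholders in the create secret command can be filled from the AWS CLI. The following is a sketch under stated assumptions, not the documented procedure: the function names are hypothetical, 111122223333 is a placeholder account ID, and the host name pattern applies to standard commercial Regions. The token that aws ecr get-login-password returns is short-lived (12 hours), so a Secret created this way needs to be refreshed periodically. On nodes whose IAM role already has Amazon ECR read permissions, the kubelet can usually pull from Amazon ECR without this Secret.

```shell
# Build the host name of a private Amazon ECR registry.
# Assumes the <account-id>.dkr.ecr.<region>.amazonaws.com pattern.
ecr_registry_host() {
  account_id=$1
  region=$2
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$account_id" "$region"
}

# Create the regcred Secret from a short-lived ECR token.
# Requires the AWS CLI and kubectl; the user name for ECR is "AWS".
create_ecr_regcred() {
  account_id=$1
  region=$2
  kubectl create secret docker-registry regcred \
    --docker-server="$(ecr_registry_host "$account_id" "$region")" \
    --docker-username=AWS \
    --docker-password="$(aws ecr get-login-password --region "$region")"
}

# Example usage (placeholder account ID and Region):
#   create_ecr_regcred 111122223333 us-east-2
```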

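3. Registry troubleshooting

In the following example, the registry is inaccessible because of a network connectivity issue: the kubelet can't reach the private registry endpoint.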

$ kubectl describe pods nginx-9cc69448d-vgm4m

...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  16m                default-scheduler  Successfully assigned default/nginx-9cc69448d-vgm4m to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    15m (x3 over 16m)  kubelet            Pulling image "nginx:stable"
  Warning  Failed     15m (x3 over 16m)  kubelet            Failed to pull image "nginx:stable": rpc error: code = Unknown desc = Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     15m (x3 over 16m)  kubelet            Error: ErrImagePull
  Normal   BackOff    14m (x4 over 16m)  kubelet            Back-off pulling image "nginx:stable"
  Warning  Failed     14m (x4 over 16m)  kubelet            Error: ImagePullBackOff

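The "Failed to pull image..." error means that the kubelet tried to connect to the Docker registry endpoint and failed because the connection timed out.

To troubleshoot this error, check the subnets, security groups, and network ACLs, and confirm that they allow communication with the specified registry endpoint.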
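
A quick way to confirm whether the network path is the problem is to request the registry endpoint directly from the affected worker node. The following is a minimal sketch that assumes curl is installed on the node. The function name check_registry is hypothetical, and the 10-second timeout is an arbitrary choice.

```shell
# Report whether an HTTP(S) endpoint answers within 10 seconds.
# Any HTTP response counts as reachable, because it proves that
# the network path to the endpoint works.
check_registry() {
  url=$1
  code=$(curl -sS -o /dev/null -m 10 -w '%{http_code}' "$url" 2>/dev/null) || {
    echo "unreachable: $url"
    return 1
  }
  echo "reachable: $url (HTTP $code)"
}

# Example usage from a worker node:
#   check_registry https://registry-1.docker.io/v2/
```

An HTTP 401 response from a registry still indicates that the route, security groups, and network ACLs allow the traffic. A timeout points back to the network configuration.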

In the following example, the registry rate limit is exceeded:

$ kubectl describe pod nginx-6bf9f7cf5d-22q48

...
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               3m54s                 default-scheduler  Successfully assigned default/nginx-6bf9f7cf5d-22q48 to ip-192-168-153-54.us-east-2.compute.internal
  Warning  FailedCreatePodSandBox  3m33s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "82065dea585e8428eaf9df89936653b5ef12b53bef7f83baddb22edc59cd562a" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m53s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "20f2e27ba6d813ffc754a12a1444aa20d552cc9d665f4fe5506b02a4fb53db36" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m35s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d9b7e98187e84fed907ff882279bf16223bf5ed0176b03dff3b860ca9a7d5e03" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c02c8b65d7d49c94aadd396cb57031d6df5e718ab629237cdea63d2185dbbfb0" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Normal   SandboxChanged          119s (x4 over 3m13s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 56s (x3 over 99s)     kubelet            Pulling image "httpd:latest"
  Warning  Failed                  56s (x3 over 99s)     kubelet            Failed to pull image "httpd:latest": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
  Warning  Failed                  56s (x3 over 99s)     kubelet            Error: ErrImagePull
  Normal   BackOff                 43s (x4 over 98s)     kubelet            Back-off pulling image "httpd:latest"
  Warning  Failed                  43s (x4 over 98s)     kubelet            Error: ImagePullBackOff

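The Docker registry rate limit is 100 container image requests per six hours for anonymous usage, and 200 container image requests per six hours for Docker accounts. Image requests that exceed these limits are denied until the six-hour window elapses. To manage usage and understand registry rate limits, see Understanding your Docker Hub rate limit on the Docker website.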
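
The kubelet messages in this article each point to a different cause, and the distinguishing text can be matched mechanically. The following sketch maps a "Failed to pull image" message to a likely cause. The function name classify_pull_error and the category labels are hypothetical, and the patterns cover only the messages shown in this article, so other container runtimes or registries might word their errors differently.

```shell
# Map a kubelet "Failed to pull image" message to a likely cause.
# The patterns are taken from the example events in this article.
classify_pull_error() {
  case $1 in
    *toomanyrequests*)
      echo "rate-limit" ;;
    *"manifest unknown"*)
      echo "bad-tag-or-digest" ;;
    *"pull access denied"*|*"repository does not exist"*)
      echo "bad-name-or-missing-credentials" ;;
    *"Client.Timeout"*|*"request canceled"*)
      echo "registry-unreachable" ;;
    *)
      echo "unknown" ;;
  esac
}

# Example usage with a message copied from kubectl describe pod:
#   classify_pull_error 'manifest for nginx:latestttt not found: manifest unknown'
```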


Related information

Amazon EKS troubleshooting

How can I troubleshoot Amazon ECR issues with Amazon EKS?

Security best practices for Amazon EKS

AWS OFFICIAL, updated 1 year ago