How do I troubleshoot IRSA errors in Amazon EKS?

When I use AWS Identity and Access Management (IAM) roles for service accounts (IRSA) with Amazon Elastic Kubernetes Service (Amazon EKS), I get errors.

Resolution

To troubleshoot issues with IRSA in Amazon EKS, take the following actions based on your use case.

Check the formatting of the IAM ARN

If you incorrectly formatted the IAM role ARN in the relevant service account annotation, then you get the following error:

"An error occurred (ValidationError) when calling the AssumeRoleWithWebIdentity
operation: Request ARN is invalid"

Example of an incorrect ARN format:

 eks.amazonaws.com/role-arn: arn:aws:iam::::1234567890:role/example

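The preceding ARN is incorrect because it has extra colons (:) before the account ID. A valid role ARN uses the format arn:aws:iam::account-id:role/role-name, with exactly two colons between iam and the 12-digit account ID. To verify the correct ARN format, see IAM ARNs.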
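As a quick structural check, the following sketch tests whether a string looks like an IAM role ARN. The helper name is_valid_role_arn and the regular expression are illustrative assumptions, not an official validator: the pattern checks only the shape (partition, 12-digit account ID, non-empty role name) and doesn't confirm that the role exists.

```shell
# is_valid_role_arn: hypothetical helper. Returns 0 when the argument has the
# shape arn:<partition>:iam::<12-digit account ID>:role/<role name>.
is_valid_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws[a-z-]*:iam::[0-9]{12}:role/[A-Za-z0-9+=,.@_/-]+$'
}

# The malformed ARN from this article fails the check; a well-formed one passes.
for arn in 'arn:aws:iam::::1234567890:role/example' \
           'arn:aws:iam::123456789012:role/example'; do
  if is_valid_role_arn "$arn"; then
    echo "OK      $arn"
  else
    echo "INVALID $arn"
  fi
done
```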

Check if you have an IAM OIDC provider for your AWS account

If you didn't create an OpenID Connect (OIDC) provider, then you get the following error:

"An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: No OpenIDConnect provider found in your account for https://oidc.eks.region.amazonaws.com/id/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

To troubleshoot this error, get the IAM OIDC provider URL:

aws eks describe-cluster --name cluster-name --query "cluster.identity.oidc.issuer" --output text

Note: Replace cluster-name with your cluster name.

Example output:

https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E

To list the IAM OIDC providers, run the following command:

aws iam list-open-id-connect-providers | grep EXAMPLED539D4633E53DE1B716D3041E

Note: Replace EXAMPLED539D4633E53DE1B716D3041E with the value that the previous command returned.

If the OIDC provider doesn't exist, then run the following eksctl command to create one:

eksctl utils associate-iam-oidc-provider --cluster cluster-name --approve

Note: Replace cluster-name with your cluster name.

You can also use the AWS Management Console to create an IAM OIDC provider for your cluster.
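The lookup in this section depends on the ID at the end of the issuer URL. The following sketch shows one way to extract that ID and to build the provider ARN that you would expect to find in IAM. The helper names oidc_id_from_issuer and oidc_provider_arn, the account ID 123456789012, and the standard aws partition are assumptions for illustration.

```shell
# oidc_id_from_issuer: hypothetical helper. Prints the last path segment of
# the issuer URL, which is the ID to search for in the provider list.
oidc_id_from_issuer() {
  printf '%s\n' "${1##*/}"
}

# oidc_provider_arn: hypothetical helper. Builds the expected provider ARN
# from an account ID and the issuer URL (the https:// scheme is dropped).
oidc_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/%s\n' "$1" "${2#https://}"
}

# In practice, the issuer would come from the describe-cluster command, e.g.:
#   issuer=$(aws eks describe-cluster --name "$CLUSTER_NAME" \
#     --query "cluster.identity.oidc.issuer" --output text)
issuer='https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E'

oidc_id_from_issuer "$issuer"
oidc_provider_arn 123456789012 "$issuer"
```

You can pass the extracted ID to grep when you list the providers, as in the command earlier in this section.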

Verify the audience of the IAM OIDC provider

When you create an IAM OIDC provider, you must use sts.amazonaws.com as your audience. If the audience is incorrect, then you get the following error:

"An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Incorrect token audience"

To check the audience of the IAM OIDC provider, run the following command:

aws iam get-open-id-connect-provider --open-id-connect-provider-arn ARN-of-OIDC-provider

Note: Replace ARN-of-OIDC-provider with the ARN of your OIDC provider.

-or-

Complete the following steps:

  1. Open the Amazon EKS console.
  2. Select the name of your cluster, and then choose the Configuration tab.
  3. In the Details section, note the value of the OIDC provider URL.
  4. Open the IAM console.
  5. In the navigation pane, under Access Management, choose Identity Providers.
  6. Select the provider that matches the URL for your cluster.

To change the audience, complete the following steps:

  1. Open the IAM console.
  2. In the navigation pane, under Access Management, choose Identity Providers.
  3. Select the provider that matches the URL for your cluster.
  4. Choose Actions, and then choose Add audience.
  5. Add sts.amazonaws.com.
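If you check the audience from the AWS CLI, then you can narrow the get-open-id-connect-provider output to the ClientIDList field and test it in a script. The following is a minimal sketch: the helper name has_sts_audience is hypothetical, and the sample value stands in for real command output.

```shell
# has_sts_audience: hypothetical helper. Takes a whitespace-separated list of
# audiences and succeeds when sts.amazonaws.com is one of them.
has_sts_audience() {
  for aud in $1; do
    [ "$aud" = "sts.amazonaws.com" ] && return 0
  done
  return 1
}

# In practice, the list could come from a command such as:
#   audiences=$(aws iam get-open-id-connect-provider \
#     --open-id-connect-provider-arn "$OIDC_PROVIDER_ARN" \
#     --query 'ClientIDList' --output text)
audiences='sts.amazonaws.com'

if has_sts_audience "$audiences"; then
  echo "audience OK"
else
  echo "sts.amazonaws.com is missing from the audience list"
fi
```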

Verify that you created the IAM OIDC resource with a root certificate thumbprint

If you didn't use a root certificate thumbprint to create the OIDC provider, then you get the following error:

"An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: OpenIDConnect provider's HTTPS certificate doesn't match configured thumbprint"

Note: Non-root certificate thumbprints renew yearly, and root certificate thumbprints renew every decade. It's a best practice to use a root certificate thumbprint when you create an IAM OIDC provider.

If you used one of the following tools to create your IAM OIDC provider, then you must manually obtain the thumbprint:

  • AWS Command Line Interface (AWS CLI)
  • AWS Tools for PowerShell
  • IAM API

If you created your IAM OIDC provider in the IAM console, then it's still a best practice to manually obtain the thumbprint. Use this thumbprint to verify that the console obtained the correct value.

Run the following command to obtain the root certificate thumbprint and its expiration date:

echo | openssl s_client -servername oidc.eks.your-region-code.amazonaws.com -showcerts -connect oidc.eks.your-region-code.amazonaws.com:443 2>/dev/null | awk '/-----BEGIN CERTIFICATE-----/{cert=""} {cert=cert $0 "\n"} /-----END CERTIFICATE-----/{last_cert=cert} END{printf "%s", last_cert}' | openssl x509 -fingerprint -noout -dates | sed 's/://g' | awk -F= '{print tolower($2)}'

Note: Replace your-region-code with the AWS Region that your cluster is located in.

Example output:

9e99a48a9960b14926bb7f3b02e22da2b0ab7280
sep 2 000000 2009 gmt
jun 28 173916 2034 gmt

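In the preceding output, 9e99a48a9960b14926bb7f3b02e22da2b0ab7280 is the thumbprint, sep 2 000000 2009 gmt is the certificate start date, and jun 28 173916 2034 gmt is the certificate expiration date. Because the sed command removes every colon, the times appear without separators (for example, 173916 is 17:39:16).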
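To compare the thumbprint that you obtained with the thumbprints that are configured on the provider, you can normalize both values to the same form first. The following is a sketch under stated assumptions: the helper names normalize_thumbprint and thumbprint_configured are hypothetical, and the sample values stand in for real openssl and AWS CLI output.

```shell
# normalize_thumbprint: hypothetical helper. Drops any "... Fingerprint="
# prefix, removes colons and spaces, and lowercases the hex digits.
normalize_thumbprint() {
  printf '%s\n' "$1" | sed -e 's/^.*=//' | tr -d ': ' | tr 'A-F' 'a-f'
}

# thumbprint_configured: hypothetical helper. Succeeds when the first argument
# matches any thumbprint in the whitespace-separated second argument.
thumbprint_configured() {
  want=$(normalize_thumbprint "$1")
  for tp in $2; do
    [ "$(normalize_thumbprint "$tp")" = "$want" ] && return 0
  done
  return 1
}

# In practice, the configured list could come from a command such as:
#   configured=$(aws iam get-open-id-connect-provider \
#     --open-id-connect-provider-arn "$OIDC_PROVIDER_ARN" \
#     --query 'ThumbprintList' --output text)
computed='SHA1 Fingerprint=9E:99:A4:8A:99:60:B1:49:26:BB:7F:3B:02:E2:2D:A2:B0:AB:72:80'
configured='9e99a48a9960b14926bb7f3b02e22da2b0ab7280'

if thumbprint_configured "$computed" "$configured"; then
  echo "thumbprint matches"
else
  echo "thumbprint mismatch"
fi
```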

Check the configuration of your IAM role's trust policy

If you misconfigured the trust policy of the IAM role, then you get the following error:

"An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity"

To resolve this issue, make sure that you're using the correct IAM OIDC provider. If the IAM OIDC provider is correct, then use the IAM role configuration guide to check if you correctly configured the trust policy's conditions.
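For reference, the following sketch prints a trust policy in the shape that is commonly used for IRSA, so that you can compare it with the policy on your role. The helper name irsa_trust_policy and the account ID, namespace, and service account name in the example call are hypothetical placeholders. The sketch assumes the standard aws partition and StringEquals conditions on both the aud and sub claims.

```shell
# irsa_trust_policy: hypothetical helper. Arguments:
#   $1 account ID
#   $2 OIDC provider (the issuer URL without https://)
#   $3 Kubernetes namespace
#   $4 Kubernetes service account name
irsa_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$1:oidc-provider/$2"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "$2:aud": "sts.amazonaws.com",
          "$2:sub": "system:serviceaccount:$3:$4"
        }
      }
    }
  ]
}
EOF
}

# Example call with placeholder values.
irsa_trust_policy 123456789012 \
  oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E \
  default my-service-account
```

A mismatch in the provider ID, namespace, or service account name between this shape and the role's actual trust policy is a typical cause of the AccessDenied error.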

Verify that your pod identity webhook configuration exists and is valid

The pod identity webhook is responsible for injecting the necessary environment variables and projected volume. If you accidentally deleted or changed your webhook configuration, then IRSA stops working.

To verify that your webhook configuration exists and is valid, run the following command:

kubectl get mutatingwebhookconfiguration pod-identity-webhook -o yaml

If the pod-identity-webhook configuration doesn't exist, then run the following command to create it:

CLUSTER_CA=$(aws eks describe-cluster --name CLUSTER_NAME --region REGION_CODE --query 'cluster.certificateAuthority.data' --output text)

cat << EOF | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: pod-identity-webhook
webhooks:
- name: iam-for-pods.amazonaws.com
  clientConfig:
    url: "https://127.0.0.1:23443/mutate"
    caBundle: $CLUSTER_CA
  failurePolicy: Ignore
  rules:
  - operations: [ "CREATE" ]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  reinvocationPolicy: IfNeeded
  sideEffects: None
  admissionReviewVersions: ["v1beta1"]
EOF

Note: Replace CLUSTER_NAME with your cluster name and REGION_CODE with your cluster's AWS Region.

Verify that your pod identity webhook is injecting environment variables to your pods that use IRSA

Run one of the following commands:

kubectl get pod <pod-name> -n <ns> -o yaml | grep aws-iam-token

-or-

kubectl get pod <pod-name> -n <ns> -o yaml | grep AWS_WEB_IDENTITY_TOKEN_FILE
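When the webhook works, the pod specification typically contains the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables and a projected volume named aws-iam-token. The following sketch checks a pod specification for all three markers at once. The helper name check_irsa_injection is hypothetical, and the sample text stands in for real kubectl output.

```shell
# check_irsa_injection: hypothetical helper. Reads a pod specification on
# stdin, reports each marker, and returns 1 if any marker is missing.
check_irsa_injection() {
  spec=$(cat)
  missing=0
  for marker in AWS_ROLE_ARN AWS_WEB_IDENTITY_TOKEN_FILE aws-iam-token; do
    if printf '%s\n' "$spec" | grep -Fq -- "$marker"; then
      echo "found:   $marker"
    else
      echo "missing: $marker"
      missing=1
    fi
  done
  return "$missing"
}

# Sample input that stands in for "kubectl get pod ... -o yaml" output.
sample='env:
- name: AWS_ROLE_ARN
- name: AWS_WEB_IDENTITY_TOKEN_FILE
volumes:
- name: aws-iam-token'

if printf '%s\n' "$sample" | check_irsa_injection; then
  echo "webhook injection looks complete"
else
  echo "webhook injection is incomplete"
fi
```

Against a live cluster, you would pipe the output of either kubectl command above into check_irsa_injection instead of the sample text.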

Verify that you're using supported AWS SDKs

Make sure that you're using an AWS SDK version that allows you to assume an IAM role through the OIDC web identity token file.

Related information

Why can't I use an IAM role for the service account in my Amazon EKS pod?

How do I troubleshoot an OIDC provider and IRSA in Amazon EKS?

AWS OFFICIAL
Updated a month ago