How do I troubleshoot IRSA errors in Amazon EKS?


When I use AWS Identity and Access Management (IAM) roles for service accounts (IRSA) with Amazon Elastic Kubernetes Service (Amazon EKS), I get errors.

Short description

To troubleshoot issues with IRSA in Amazon EKS, complete one or more of the following actions based on your use case:

  • Check the formatting of the IAM Amazon Resource Name (ARN).
  • Check whether you have an IAM OpenID Connect (OIDC) provider for your AWS account.
  • Verify the audience of the OIDC provider.
  • Verify that you created the OIDC resource with a root certificate thumbprint.
  • Check the configuration of your IAM role's trust policy.
  • Verify that your pod identity webhook configuration exists and is valid.
  • Verify that your pod identity webhook is injecting environment variables into your pods that use IRSA.
  • Verify that you're using supported AWS SDKs.

Resolution

Check the formatting of the IAM ARN

If the IAM role ARN in the eks.amazonaws.com/role-arn annotation of your service account is incorrectly formatted, then you get the following error:

An error occurred (ValidationError) when calling the AssumeRoleWithWebIdentity
operation: Request ARN is invalid

Here's an example of an incorrect ARN format:

 eks.amazonaws.com/role-arn: arn:aws:iam::::1234567890:role/example

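This ARN format is incorrect because it has extra colons ( : ) between iam and the account ID. A correctly formatted IAM role ARN has exactly two colons there and a 12-digit account ID, for example arn:aws:iam::111122223333:role/example. To verify correct ARN format, see IAM ARNs.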
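To catch a malformed ARN before it reaches AWS STS, you can check the annotation value against the expected pattern. The following is a minimal sketch, not an official validator: the function name is_valid_role_arn is hypothetical, and the pattern assumes a 12-digit account ID and the documented IAM role name characters.

```shell
# Hypothetical helper: returns 0 when the value looks like a well-formed IAM role ARN.
# Assumed pattern: arn:<partition>:iam::<12-digit account ID>:role/<path and name>
is_valid_role_arn() {
  printf '%s\n' "$1" \
    | grep -Eq '^arn:aws[a-z-]*:iam::[0-9]{12}:role/[A-Za-z0-9+=,.@_/-]+$'
}

# Usage (the service account name and namespace are placeholders):
# role_arn=$(kubectl get serviceaccount <service-account> -n <ns> \
#   -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}')
# is_valid_role_arn "$role_arn" || echo "Malformed role ARN: $role_arn"
```

The check only covers the format. It doesn't confirm that the role exists or that the account ID is yours.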

Check if you have an IAM OIDC provider for your AWS account

If you didn't create an OIDC provider, then you get the following error:

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: No OpenIDConnect provider found in your account for https://oidc.eks.region.amazonaws.com/id/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

To troubleshoot this, get the IAM OIDC provider URL:

aws eks describe-cluster --name cluster-name --query "cluster.identity.oidc.issuer" --output text

Note: Replace cluster-name with the name of your cluster.

You get an output that's similar to the following example:

https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E

To list the IAM OIDC providers, run the following command:

aws iam list-open-id-connect-providers | grep EXAMPLED539D4633E53DE1B716D3041E

Note: Replace EXAMPLED539D4633E53DE1B716D3041E with the value that the previous command returned.

If the OIDC provider doesn't exist, then use the following eksctl command to create one:

eksctl utils associate-iam-oidc-provider --cluster cluster-name --approve

Note: Replace cluster-name with the name of your cluster.

You can also use the AWS Management Console to create an IAM OIDC provider for your cluster.
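The lookup and the list check above can be combined so that you don't copy the provider ID by hand. The following is a sketch under stated assumptions: both function names are hypothetical, and the provider ID is assumed to be the last path segment of the issuer URL, as in the example output.

```shell
# Hypothetical helper: print the OIDC provider ID, which is assumed to be
# the last path segment of the issuer URL.
oidc_id_from_issuer() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical wrapper around the two AWS CLI commands shown above.
# Returns 0 when an IAM OIDC provider exists for the cluster,
# 1 when none matches, and 2 when the issuer URL can't be read.
cluster_has_oidc_provider() {
  issuer=$(aws eks describe-cluster --name "$1" \
    --query "cluster.identity.oidc.issuer" --output text) || return 2
  oidc_id=$(oidc_id_from_issuer "$issuer")
  [ -n "$oidc_id" ] || return 2
  aws iam list-open-id-connect-providers --output text | grep -q "$oidc_id"
}

# Usage (replace cluster-name with the name of your cluster):
# cluster_has_oidc_provider cluster-name || echo "No IAM OIDC provider found"
```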

Verify the audience of the IAM OIDC provider

When you create an IAM OIDC provider, you must use sts.amazonaws.com as your audience. If the audience is incorrect, then you get the following error:

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Incorrect token audience

To check the audience of the IAM OIDC provider, run the following command:

aws iam get-open-id-connect-provider --open-id-connect-provider-arn ARN-of-OIDC-provider

Note: Replace ARN-of-OIDC-provider with the ARN of your OIDC provider.

-or-

Complete the following steps:

1.    Open the Amazon EKS console.

2.    Select the name of your cluster, and then choose the Configuration tab.

3.    In the Details section, note the value of the OpenID Connect provider URL.

4.    Open the IAM console.

5.    In the navigation pane, under Access Management, choose Identity Providers.

6.    Select the provider that matches the URL for your cluster.

To change the audience, complete the following steps:

1.    Open the IAM console.

2.    In the navigation pane, under Access Management, choose Identity Providers.

3.    Select the provider that matches the URL for your cluster.

4.    Choose Actions, and then choose Add audience.

5.    Add sts.amazonaws.com.
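If you prefer the AWS CLI to the console, the audience check can be sketched as follows. The function name has_sts_audience is hypothetical. The sketch assumes that the audiences are read from the ClientIDList field of the get-open-id-connect-provider response, printed as whitespace-separated text.

```shell
# Hypothetical helper: returns 0 when sts.amazonaws.com appears in a
# whitespace-separated list of audiences (client IDs), 1 otherwise.
has_sts_audience() {
  for audience in $1; do
    if [ "$audience" = "sts.amazonaws.com" ]; then
      return 0
    fi
  done
  return 1
}

# Usage (replace ARN-of-OIDC-provider with the ARN of your OIDC provider):
# audiences=$(aws iam get-open-id-connect-provider \
#   --open-id-connect-provider-arn ARN-of-OIDC-provider \
#   --query "ClientIDList" --output text)
# has_sts_audience "$audiences" || \
#   aws iam add-client-id-to-open-id-connect-provider \
#     --open-id-connect-provider-arn ARN-of-OIDC-provider \
#     --client-id sts.amazonaws.com
```

The comparison is an exact match, so a value such as sts.amazonaws.com.example doesn't count as the required audience.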

Verify that you created the IAM OIDC resource with a root certificate thumbprint

If you didn't use a root certificate thumbprint to create the OIDC provider, then you get the following error:

An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: OpenIDConnect provider's HTTPS certificate doesn't match configured thumbprint

Note: Non-root certificates renew yearly, so their thumbprints change yearly. Root certificates renew every decade. It's a best practice to use a root certificate thumbprint when you create an IAM OIDC provider.

If you used one of the following tools to create your IAM OIDC provider, then you must manually obtain the thumbprint:

  • AWS Command Line Interface (AWS CLI)
  • AWS Tools for PowerShell
  • IAM API

If you created your IAM OIDC provider in the IAM console, then it's still a best practice to manually obtain the thumbprint. With this thumbprint, you can verify that the console fetched the correct value.

Obtain the root certificate thumbprint and its expiration date:

echo | openssl s_client -servername oidc.eks.your-region-code.amazonaws.com -showcerts -connect oidc.eks.your-region-code.amazonaws.com:443 2>/dev/null | awk '/-----BEGIN CERTIFICATE-----/{cert=""} {cert=cert $0 "\n"} /-----END CERTIFICATE-----/{last_cert=cert} END{printf "%s", last_cert}' | openssl x509 -fingerprint -noout -dates | sed 's/://g' | awk -F= '{print tolower($2)}'

Note: Replace your-region-code with the AWS Region that your cluster is located in.

You receive an output that's similar to the following example:

9e99a48a9960b14926bb7f3b02e22da2b0ab7280
sep 2 000000 2009 gmt
jun 28 173916 2034 gmt

In this output, 9e99a48a9960b14926bb7f3b02e22da2b0ab7280 is the thumbprint, sep 2 000000 2009 gmt is the certificate start date, and jun 28 173916 2034 gmt is the certificate expiration date. The sed command removes all colons, so 000000 and 173916 are the times 00:00:00 and 17:39:16.
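To compare the root certificate thumbprint with the value that's configured on the provider, you can use a check like the following sketch. The function names are hypothetical. The sketch assumes that the configured thumbprints are read from the ThumbprintList field of the get-open-id-connect-provider response.

```shell
# Hypothetical helper: convert an openssl-style fingerprint such as
# "9E:99:A4:..." into the lowercase form without colons.
normalize_thumbprint() {
  printf '%s\n' "$1" | tr -d ':' | tr 'A-F' 'a-f'
}

# Hypothetical helper: returns 0 when the thumbprint ($1) appears in a
# whitespace-separated list of configured thumbprints ($2).
thumbprint_is_configured() {
  wanted=$(normalize_thumbprint "$1")
  for configured in $2; do
    if [ "$(normalize_thumbprint "$configured")" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

# Usage (replace ARN-of-OIDC-provider and root-thumbprint with your values):
# configured=$(aws iam get-open-id-connect-provider \
#   --open-id-connect-provider-arn ARN-of-OIDC-provider \
#   --query "ThumbprintList" --output text)
# thumbprint_is_configured root-thumbprint "$configured" || \
#   aws iam update-open-id-connect-provider-thumbprint \
#     --open-id-connect-provider-arn ARN-of-OIDC-provider \
#     --thumbprint-list root-thumbprint
```

Note that update-open-id-connect-provider-thumbprint replaces the whole thumbprint list, so include every thumbprint that you want to keep.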

Check the configuration of your IAM role's trust policy

If the trust policy of the IAM role is misconfigured, then you get the following error:

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity

To resolve this issue, make sure that you're using the correct IAM OIDC provider. If the IAM OIDC provider is correct, then use the IAM role configuration guide to check if the trust policy's conditions are correctly configured.
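For comparison, the following sketch prints the trust policy shape that IRSA commonly uses, so that you can compare it with your role's policy. The function name render_irsa_trust_policy and all argument values are hypothetical. The sketch assumes the aws partition and a StringEquals condition on both the aud and sub claims.

```shell
# Hypothetical helper: print an IRSA trust policy for one service account.
# Arguments: account ID, Region, OIDC provider ID, namespace, service account.
render_irsa_trust_policy() {
  account_id=$1
  region=$2
  oidc_id=$3
  namespace=$4
  service_account=$5
  provider="oidc.eks.${region}.amazonaws.com/id/${oidc_id}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${provider}:aud": "sts.amazonaws.com",
          "${provider}:sub": "system:serviceaccount:${namespace}:${service_account}"
        }
      }
    }
  ]
}
EOF
}

# Usage: print the expected policy, then fetch the role's actual policy.
# render_irsa_trust_policy 111122223333 us-west-2 <oidc-id> <ns> <service-account>
# aws iam get-role --role-name <role-name> \
#   --query "Role.AssumeRolePolicyDocument"
```

Common mismatches are a provider ID from a different cluster, a wrong Region in the provider URL, and a namespace or service account name in the sub condition that doesn't match the pod's service account.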

Verify that your pod identity webhook configuration exists and is valid

The pod identity webhook is responsible for injecting the necessary environment variables and projected volume. If you accidentally deleted or changed your webhook configuration, then IRSA stops working.

To verify that your webhook configuration exists and is valid, run the following command:

kubectl get mutatingwebhookconfiguration pod-identity-webhook -o yaml

Verify that your pod identity webhook is injecting environment variables to your pods that use IRSA

Run one of the following commands:

kubectl get pod <pod-name> -n <ns> -o yaml | grep aws-iam-token

-or-

kubectl get pod <pod-name> -n <ns> -o yaml | grep AWS_WEB_IDENTITY_TOKEN_FILE
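Both checks can run in one pass over the pod manifest. The following is a minimal sketch with a hypothetical function name, check_irsa_injection. In addition to the two strings from the commands above, it looks for AWS_ROLE_ARN, which the webhook also injects. The match is a plain text search, not a YAML parse.

```shell
# Hypothetical helper: read a pod manifest on stdin and report which of the
# fields that the pod identity webhook injects are missing.
# Returns 0 when all fields are present, 1 otherwise.
check_irsa_injection() {
  manifest=$(cat)
  missing=0
  for field in AWS_ROLE_ARN AWS_WEB_IDENTITY_TOKEN_FILE aws-iam-token; do
    if ! printf '%s\n' "$manifest" | grep -q -- "$field"; then
      echo "missing: $field"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
# kubectl get pod <pod-name> -n <ns> -o yaml | check_irsa_injection
```

If fields are missing, note that the webhook mutates pods only at creation time. Pods that were created before you annotated the service account must be recreated.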

Verify that you're using supported AWS SDKs

Make sure that you're using an AWS SDK version that supports assuming an IAM role through the OIDC web identity token file.

Related information

Why can't I use an IAM role for the service account in my Amazon EKS pod?

How do I troubleshoot an OIDC provider and IRSA in Amazon EKS?

AWS OFFICIAL
Updated 17 days ago