How do I troubleshoot an OIDC provider and IRSA in Amazon EKS?

My Pods can't use the AWS Identity and Access Management (IAM) role permissions with the Amazon Elastic Kubernetes Service (Amazon EKS) service account token.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Check whether you have an existing IAM OIDC provider for your cluster

If an OpenID Connect (OIDC) provider doesn't exist, then you receive an error similar to the following:

"WebIdentityErr: failed to retrieve credentials\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E\n\tstatus code: 400"

To check whether you have an existing IAM OIDC provider, complete the following steps:

  1. To check your cluster's OIDC provider URL, run the following describe-cluster AWS CLI command:

    aws eks describe-cluster --name cluster_name --query "cluster.identity.oidc.issuer" --output text

    Note: Replace cluster_name with your cluster name.
    Example output:

    https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E
  2. To list the IAM OIDC providers in your account, run the following list-open-id-connect-providers command:

    aws iam list-open-id-connect-providers | grep EXAMPLED539D4633E53DE1B716D3041E

    Note: Replace EXAMPLED539D4633E53DE1B716D3041E with the ID at the end of the OIDC provider URL that you received from the previous command.
    If the command returns an output, then you already have a provider for your cluster. If the command doesn't return an output, then you must create an IAM OIDC provider. Example output:

    "Arn": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"

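The two commands above can be chained so that the ID doesn't need to be copied by hand. The following is a minimal sketch, not an official AWS script. The function names oidc_id_from_issuer and oidc_provider_arn are hypothetical helpers, and the ARN format assumes the standard aws partition (the AWS China Regions use aws-cn instead).

```shell
# Hypothetical helper: print the trailing ID segment of an OIDC issuer URL.
oidc_id_from_issuer() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: build the IAM OIDC provider ARN for an account ID
# and issuer URL. Assumes the "aws" partition.
oidc_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/%s\n' "$1" "${2#https://}"
}

# Example usage against a real cluster (requires AWS CLI credentials):
#   issuer=$(aws eks describe-cluster --name cluster_name \
#     --query "cluster.identity.oidc.issuer" --output text)
#   aws iam list-open-id-connect-providers | grep "$(oidc_id_from_issuer "$issuer")"
```

If the grep in the commented usage prints nothing, then no IAM OIDC provider matches the cluster's issuer.
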
Check whether your IAM role has the required permissions and an attached IAM policy

Complete the following steps:

  1. Open the IAM console.
  2. In the navigation pane, choose Roles.
  3. Choose the role that's associated with your Kubernetes service account.
  4. Choose the Permissions tab. Then, check the policy that's attached to the role to make sure that it contains the permissions required for your configuration.
  5. Choose the Trust relationships tab. Then, verify that the format of your trust policy matches the format of the following JSON policy:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
            "StringEquals": {
              "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:sub": "system:serviceaccount:SERVICE_ACCOUNT_NAMESPACE:SERVICE_ACCOUNT_NAME",
              "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:aud": "sts.amazonaws.com"
            }
          }
        }
      ]
    }
    Or, run the following get-role command to check your trust relationship:
    aws iam get-role --role-name EKS-IRSA
    Note: Replace EKS-IRSA with your IAM role for service accounts (IRSA) role name.
    Example output:
    {
      "Role": {
        "Path": "/",
        "RoleName": "EKS-IRSA",
        "RoleId": "AROAQ55NEXAMPLELOEISVX",
        "Arn": "arn:aws:iam::ACCOUNT_ID:role/EKS-IRSA",
        "CreateDate": "2021-04-22T06:39:21+00:00",
        "AssumeRolePolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"
              },
              "Action": "sts:AssumeRoleWithWebIdentity",
              "Condition": {
                "StringEquals": {
                  "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:aud": "sts.amazonaws.com",
                  "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:sub": "system:serviceaccount:SERVICE_ACCOUNT_NAMESPACE:SERVICE_ACCOUNT_NAME"
                }
              }
            }
          ]
        },
        "MaxSessionDuration": 3600,
        "RoleLastUsed": {
          "LastUsedDate": "2021-04-22T07:01:15+00:00",
          "Region": "AWS_REGION"
        }
      }
    }
    In the output JSON, check the AssumeRolePolicyDocument section to verify the trust relationship policy.
  6. (Optional) Update the trust relationship for the role to the correct AWS Region, Kubernetes service account name, or Kubernetes namespace.
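
Most trust relationship mismatches come from the condition keys or the sub value. The following sketch prints the strings that the trust policy is expected to contain, so that you can compare them with the get-role output. The function names irsa_sub_value and irsa_condition_key are hypothetical helpers and aren't part of the AWS CLI.

```shell
# Hypothetical helper: build the value expected for the ":sub" condition.
# Arguments: namespace, service account name.
irsa_sub_value() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: build a condition key from the issuer URL and a claim
# name (sub or aud). The key is the issuer URL without the https:// prefix.
irsa_condition_key() {
  printf '%s:%s\n' "${1#https://}" "$2"
}

# Example usage (requires AWS CLI credentials). The --query expression
# narrows the get-role output to the trust policy conditions:
#   aws iam get-role --role-name EKS-IRSA \
#     --query "Role.AssumeRolePolicyDocument.Statement[].Condition"
#   irsa_sub_value SERVICE_ACCOUNT_NAMESPACE SERVICE_ACCOUNT_NAME
```

The sub value is case sensitive, so a namespace or service account name that differs only in case still fails the StringEquals condition.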

Check whether you created a service account

To check whether a service account exists, run the following command:

kubectl get sa -n YOUR_NAMESPACE

Note: Replace YOUR_NAMESPACE with your Kubernetes namespace.

Example output:

NAME      SECRETS   AGE
default   1         28d
irsa      1         66m

Make sure that the output lists your service account. If you don't have a service account, then see Configure service accounts for Pods on the Kubernetes website.

Verify that the service account has the correct IAM role annotations

To verify that your service account has the correct IAM role annotations, run the following command:

kubectl describe sa irsa -n YOUR_NAMESPACE

Note: Replace irsa with your Kubernetes service account name and YOUR_NAMESPACE with your Kubernetes namespace.

Example output:

Name:                irsa
Namespace:           default
Labels:              none
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/IAM_ROLE_NAME
Image pull secrets:  none
Mountable secrets:   irsa-token-v5rtc
Tokens:              irsa-token-v5rtc
Events:              none

Check Annotations to make sure that the IAM role is correct. If it isn't, then run the following command to edit the service account:

kubectl edit sa SERVICE_ACCOUNT_NAME -n NAMESPACE

Note: Replace SERVICE_ACCOUNT_NAME with your Kubernetes service account name and NAMESPACE with your namespace.

Then, update the value for Annotations with the correct IAM role.
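
As an alternative to the interactive editor, the annotation can be set in one command with kubectl annotate. The following is a sketch under assumptions: irsa_annotation is a hypothetical helper that only builds the key=value string, and the ARN format assumes the standard aws partition.

```shell
# Hypothetical helper: build the IRSA annotation for an account ID and
# IAM role name. Assumes the "aws" partition.
irsa_annotation() {
  printf 'eks.amazonaws.com/role-arn=arn:aws:iam::%s:role/%s\n' "$1" "$2"
}

# Example usage (requires cluster access). --overwrite replaces an
# existing, incorrect annotation value:
#   kubectl annotate serviceaccount irsa -n YOUR_NAMESPACE \
#     "$(irsa_annotation ACCOUNT_ID IAM_ROLE_NAME)" --overwrite
```

Running Pods don't pick up a changed annotation, so they must be recreated afterward.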

Verify that you correctly specified the serviceAccountName in your Pod

To verify the serviceAccountName, run the following command:

kubectl get pod POD_NAME -o yaml -n YOUR_NAMESPACE | grep -i serviceAccountName:

Note: Replace POD_NAME with your Kubernetes Pod and YOUR_NAMESPACE with your namespace.

Example output:

serviceAccountName: irsa

If the value in the output is the incorrect service account name, then edit the deployment manifest with the correct name. Then, redeploy the deployment manifest.
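
To compare the value in a script instead of reading grep output, the field can be read directly with a jsonpath expression. The following is a minimal sketch. The function names pod_service_account and check_pod_service_account are hypothetical, and the message wording is an assumption.

```shell
# Print the service account that a Pod runs as.
# Arguments: Pod name, namespace.
pod_service_account() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.serviceAccountName}'
}

# Compare the Pod's service account with the expected name.
# Arguments: Pod name, namespace, expected service account name.
check_pod_service_account() {
  actual=$(pod_service_account "$1" "$2")
  if [ "$actual" = "$3" ]; then
    echo "ok: $1 uses $3"
  else
    echo "mismatch: $1 uses '$actual', expected '$3'"
    return 1
  fi
}

# Example usage (requires cluster access):
#   check_pod_service_account POD_NAME YOUR_NAMESPACE irsa
```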

Check the environment variables and permissions

To check the Pod's environment variables, run the following command:

kubectl -n YOUR_NAMESPACE exec -it POD_NAME -- env | grep AWS

Example output:

AWS_REGION=ap-southeast-2
AWS_ROLE_ARN=arn:aws:iam::111122223333:role/EKS-IRSA
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_DEFAULT_REGION=ap-southeast-2

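Make sure that the output includes the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables, and that AWS_ROLE_ARN is the ARN of the IAM role that you expect. If these variables are missing, then check the service account annotation and recreate the Pod.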
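
The two variables that matter for IRSA in the example output are AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. The following sketch checks for both in the output of env. The function name check_irsa_env is a hypothetical helper, and the message wording is an assumption.

```shell
# Read "env" output on standard input and report any missing IRSA variable.
# Returns 0 when both variables are present, 1 otherwise.
check_irsa_env() {
  env_text=$(cat)
  missing=0
  for name in AWS_ROLE_ARN AWS_WEB_IDENTITY_TOKEN_FILE; do
    if ! printf '%s\n' "$env_text" | grep "^${name}=" >/dev/null; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (requires cluster access):
#   kubectl -n YOUR_NAMESPACE exec POD_NAME -- env | check_irsa_env
```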

Verify that the application uses a supported AWS SDK

The AWS SDK version that your application uses must be equal to or later than the minimum version that supports IRSA for that SDK.

Recreate Pods

If you created Pods before you applied IRSA, then run the following command to recreate the Pods:

kubectl rollout restart deploy nginx

Example output:

deployment.apps/nginx restarted

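For daemonsets or statefulsets, run the following command:

kubectl rollout restart RESOURCE_TYPE RESOURCE_NAME

Note: Replace RESOURCE_TYPE with daemonset or statefulset, and RESOURCE_NAME with the name of your daemonset or statefulset.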

If you created only one Pod, then you must delete the Pod and recreate it. Complete the following steps:

  1. To delete the Pod, run the following command:
    kubectl delete pod POD_NAME
    Note: Replace POD_NAME with the name of your Pod.
  2. To recreate the Pod, run the following command:
    kubectl apply -f SPEC_FILE
    Note: Replace SPEC_FILE with your Kubernetes manifest file path and file name.
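
The restart commands in this section differ only in the resource type. The following sketch wraps them in one function that rejects types that kubectl rollout restart doesn't handle, such as a standalone Pod. The function name restart_workload is hypothetical.

```shell
# Restart a workload so that its Pods are recreated.
# Arguments: kind (deployment, daemonset, or statefulset), name, namespace.
restart_workload() {
  case "$1" in
    deployment|daemonset|statefulset)
      kubectl rollout restart "$1" "$2" -n "$3"
      ;;
    *)
      # A standalone Pod must be deleted and recreated instead.
      echo "unsupported kind: $1" >&2
      return 1
      ;;
  esac
}

# Example usage (requires cluster access):
#   restart_workload deployment nginx default
```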

Verify that the audience is correct

If you created the OIDC provider with the incorrect audience, then you receive the following error:

"Error - An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Incorrect token audience"

To check the IAM identity provider for your cluster, run the following get-open-id-connect-provider command:

aws iam get-open-id-connect-provider --open-id-connect-provider-arn arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E

Note: Replace ACCOUNT_ID with your account ID, AWS_REGION with your Region, and EXAMPLED539D4633E53DE1B716D3041E with the ID at the end of your OIDC provider URL.

Example output:

{
  "Url": "oidc.eks.AWS_REGION.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E",
  "ClientIDList": [
    "sts.amazonaws.com"
  ],
  "ThumbprintList": [
    "9e99a48a9960b14926bb7f3b02e22da2b0ab7280"
  ],
  "CreateDate": "2021-01-21T04:29:09.788000+00:00",
  "Tags": []
}

In the output, make sure that ClientIDList includes sts.amazonaws.com. If it doesn't, then add an identity provider to the role and enter sts.amazonaws.com for Audience.
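
The audience check can also be scripted. The following sketch assumes that the ClientIDList is requested with --query "ClientIDList" --output text, which is expected to print the entries separated by tabs on one line. The function name has_sts_audience is a hypothetical helper.

```shell
# Read a whitespace-separated list of client IDs on standard input.
# Returns 0 only if one entry is exactly sts.amazonaws.com.
has_sts_audience() {
  tr '\t ' '\n\n' | grep -xF 'sts.amazonaws.com' >/dev/null
}

# Example usage (requires AWS CLI credentials):
#   aws iam get-open-id-connect-provider \
#     --open-id-connect-provider-arn PROVIDER_ARN \
#     --query "ClientIDList" --output text | has_sts_audience \
#     && echo "audience ok"
```

The match is on the whole entry, so a different audience that only contains sts.amazonaws.com as a substring doesn't pass.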

Verify that you configured the correct thumbprint

If the thumbprint that you configured in the IAM OIDC provider isn't correct, then you receive the following error:

"failed to retrieve credentials caused by: InvalidIdentityToken: OpenIDConnect provider's HTTPS certificate doesn't match configured thumbprint"

To automatically configure the correct thumbprint, use eksctl or the Amazon EKS console to create the IAM identity provider. For other ways to obtain a thumbprint, see Obtain the thumbprint for an OpenID Connect identity provider.

(AWS China Region only) Check the AWS_DEFAULT_REGION environment variable

To deploy an IRSA-applied Pod or daemonset to a cluster in the AWS China Region, you must set the AWS_DEFAULT_REGION environment variable in the Pod specification. If you don't set the AWS_DEFAULT_REGION environment variable, then you might receive the following error for your Pod or daemonset:

"An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid"

To add the AWS_DEFAULT_REGION environment variable to your Pod or daemonset specification, create a deployment manifest similar to the following example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    metadata:
      labels:
        app: my-app
    spec:
      serviceAccountName: my-app
      containers:
      - name: my-app
        image: my-app:latest
        env:
        - name: AWS_DEFAULT_REGION
          value: "AWS_REGION"
...

Or, run the following command to set the environment variable:

kubectl set env deployment deployment_name AWS_DEFAULT_REGION=example_region -n NAMESPACE

Note: Replace deployment_name with your deployment name, example_region with the AWS China Region, and NAMESPACE with your namespace.

2 Comments

Thanks! This is an extremely thorough and helpful article. However, it recommends running containers as the root user, which is a known bad security practice. There is a workaround, which is mentioned in the AWS best practices regarding this very issue (i.e., the dangers of running containers as root).

Could you link to this best practice, or at least explain the workaround (i.e. using securityContext + fsGroup) instead of recommending running containers as root?

Here's the best practices document to which I refer: https://docs.aws.amazon.com/whitepapers/latest/security-practices-multi-tenant-saas-applications-eks/forbid-running-tenant-containers-as-root.html

replied 3 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
EXPERT
replied 3 years ago