How do I troubleshoot metadata errors that I receive when I use AWS SDKs in my ECS Fargate tasks?

I want to troubleshoot metadata errors that I receive when I use AWS Software Development Kits (AWS SDK) in my Amazon Elastic Container Service (Amazon ECS) for AWS Fargate tasks.

Short description

Troubleshoot metadata errors that you receive when you use AWS SDKs in Amazon ECS for your Fargate tasks based on the following scenarios:

  • You can't retrieve instance metadata on Fargate tasks.
  • You received a "Missing credentials in config" or "Could not load credentials" error.
  • You received intermittent metadata errors.
  • You received a timeout error from the instance metadata service.

Resolution

Can't retrieve instance metadata on Fargate tasks

Fargate tasks can't access the Amazon EC2 instance metadata service. To retrieve metadata for your Fargate tasks, use the Amazon ECS task metadata endpoint instead. Complete the following steps:

  1. Use Amazon ECS Exec to access a container in your task:
    Note: Replace example-cluster-name with the name of your cluster, example-task-id with the required task ID, and example-container-name with the name of your container.

    aws ecs execute-command --cluster example-cluster-name \
        --task example-task-id \
        --container example-container-name \
        --interactive \
        --command "/bin/sh"
  2. Retrieve the metadata as follows:
    For tasks on Fargate that use platform version 1.4.0 or later, use the task metadata endpoint version 4:

    curl -s "${ECS_CONTAINER_METADATA_URI_V4}/task"

    For tasks on Fargate that use platform versions earlier than 1.4.0, use the task metadata endpoint version 3:

    curl -s "${ECS_CONTAINER_METADATA_URI}/task"
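
A script that must run on either platform version can choose the endpoint from whichever environment variable is set. The following is a minimal sketch and not an AWS tool. It assumes a POSIX shell, and the function name task_metadata_url is hypothetical.

```shell
# Print the task metadata URL for this container.
# Prefers the version 4 variable and falls back to the earlier one.
# task_metadata_url is a hypothetical helper name for illustration.
task_metadata_url() {
    if [ -n "${ECS_CONTAINER_METADATA_URI_V4:-}" ]; then
        printf '%s/task\n' "$ECS_CONTAINER_METADATA_URI_V4"
    elif [ -n "${ECS_CONTAINER_METADATA_URI:-}" ]; then
        printf '%s/task\n' "$ECS_CONTAINER_METADATA_URI"
    else
        echo "No task metadata endpoint variable is set" >&2
        return 1
    fi
}
```

Inside the container, the helper can be combined with curl, for example: url=$(task_metadata_url) && curl -s "$url"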

Received "Missing credentials in config or could not load credentials" error

Check whether you received an error message that's similar to one of the following:

"Missing credentials in config"
"Could not load credentials from any providers"
"Fail to retrieve token"

This error occurs when the Fargate task is launched without a task role added to its task definition. This error also occurs when a task role override isn't specified in the manual RunTask API operation and no other AWS credentials are provided.

To resolve this error, complete the following steps:

  1. Use Amazon ECS Exec to access a container in your task:
    Note: Replace example-cluster-name with the name of your cluster, example-task-id with the required task ID, and example-container-name with the name of your container.

    aws ecs execute-command --cluster example-cluster-name \
        --task example-task-id \
        --container example-container-name \
        --interactive \
        --command "/bin/sh"
  2. Check the AWS Identity and Access Management (IAM) task role that's associated with your task:

    curl -s "http://169.254.170.2${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}"

    Example output:

    {
      "RoleArn": "arn:aws:iam::ACCOUNT_ID:role/<task_role_name>",
      "AccessKeyId": "XXXXXXXXXXXXXXXXX",
      "SecretAccessKey": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "Token": "IQoJb......",
      "Expiration": "2024-03-29T19:19:25Z"
    }

    Note: If a Fargate task doesn't have a task IAM role in the task definition, then a 404 page not found error is returned.

  3. In your Fargate task application code, make sure that the AWS SDK uses its default credential provider chain. The default chain includes the container credentials provider, which loads credentials from the endpoint that the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable identifies.
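
The check in step 2 can be wrapped so that it reports a missing task role without printing the secret values in the response. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and it treats an empty AWS_CONTAINER_CREDENTIALS_RELATIVE_URI variable as a sign that the task has no task role.

```shell
# Print the credentials endpoint URL that the container credentials
# provider would query, or fail if the variable is empty.
# container_credentials_url is a hypothetical helper name.
container_credentials_url() {
    if [ -z "${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI:-}" ]; then
        echo "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set; check the task role" >&2
        return 1
    fi
    printf 'http://169.254.170.2%s\n' "$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"
}

# Print only the HTTP status code of the credentials endpoint, so that
# the response body with the access keys is discarded and not displayed.
container_credentials_status() {
    url=$(container_credentials_url) || return 1
    curl -s -o /dev/null -w '%{http_code}\n' --max-time 5 "$url"
}
```

A status of 200 suggests that the endpoint serves credentials for the task role. A 404 status matches the missing task role case that's described in the preceding note.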

Intermittent metadata errors

Intermittent errors when you access a metadata endpoint have the following possible causes:

  • Resource exhaustion, such as sustained high CPU or memory utilization
  • A high number of threads or processes that run concurrently
  • Short CPU spikes
  • High disk pressure on a task's volume that's caused by an I/O-intensive operation in the application
  • Frequent queries to the task metadata service from within the container, for example when the application fetches credentials again for each API call

To resolve intermittent metadata errors, take the following actions:

  • Configure the Fargate task with an appropriate amount of CPU and memory in the task definition. Set up Amazon CloudWatch Container Insights to track metrics from your containerized applications.
  • Use CloudWatch Container Insights and the Amazon ECS task metadata endpoint to monitor the task's storage utilization. Check your disk utilization and determine whether you must increase the storage.
  • Reduce the rate at which your application code queries the task metadata service. For example, create AWS SDK client objects one time, and then reuse them for subsequent API calls in your application code.
  • Use the latest AWS SDK version. Newer versions of the AWS SDK automatically fetch credentials from the endpoint in the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable when you make API calls.
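
In SDK code, reusing one client object keeps the query rate low. For a shell script that polls the task metadata endpoint, a similar effect can be approximated by reusing a saved response for a short time. The following is a minimal sketch, not a recommended implementation. The function name cached_fetch and its arguments are hypothetical, and it assumes that date +%s is available in the container. Use it for task metadata only, and don't write credentials to disk.

```shell
# Fetch a URL at most once per max_age seconds; otherwise reuse the saved copy.
# cached_fetch and its arguments are hypothetical names for illustration.
# Assumes that `date +%s` prints the current time in seconds.
cached_fetch() {
    url=$1
    cache_file=$2
    max_age=$3
    now=$(date +%s)
    if [ -s "$cache_file" ] && [ -s "$cache_file.ts" ]; then
        fetched_at=$(cat "$cache_file.ts")
        if [ $((now - fetched_at)) -lt "$max_age" ]; then
            # The saved response is still fresh: no request is made.
            cat "$cache_file"
            return 0
        fi
    fi
    # The cache is missing or stale: query the endpoint and save the response.
    if curl -s -f --max-time 5 "$url" > "$cache_file.tmp"; then
        mv "$cache_file.tmp" "$cache_file"
        echo "$now" > "$cache_file.ts"
        cat "$cache_file"
    else
        rm -f "$cache_file.tmp"
        return 1
    fi
}
```

For example, cached_fetch "${ECS_CONTAINER_METADATA_URI_V4}/task" /tmp/task-metadata.json 60 queries the endpoint no more than one time each minute. The file path and the 60-second value are illustrative.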

Received a timeout error from the instance metadata service

If you receive a timeout error from the instance metadata service, then a request exceeded the timeout threshold that's defined in your application code. This issue occurs when metadata service requests are queued because the task lacks CPU or memory, or because a high number of processes are running.

To resolve this error, take the following actions:

  • Check the timeout that's defined in your application code. Make sure that the timeout value is appropriate for the instance metadata service.
  • Implement retries when you retrieve credentials. Also, reduce the frequency of calls to the metadata service.
  • Configure the Fargate task with an appropriate amount of CPU and memory in the task definition.
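
The timeout and retry advice above can be sketched for a shell script as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name fetch_with_retry, the default of four attempts, and the timeout values are all hypothetical and must be tuned for your workload.

```shell
# Fetch a URL with a per-request timeout, retrying with exponential backoff.
# fetch_with_retry, max_attempts, and the default values are hypothetical.
fetch_with_retry() {
    url=$1
    max_attempts=${2:-4}
    delay=1
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # --max-time bounds the whole request; -f turns HTTP errors into failures.
        if curl -s -f --connect-timeout 2 --max-time 5 "$url"; then
            return 0
        fi
        if [ "$attempt" -lt "$max_attempts" ]; then
            # Wait before the next attempt, and double the wait each time.
            sleep "$delay"
            delay=$((delay * 2))
        fi
        attempt=$((attempt + 1))
    done
    echo "Request failed after $max_attempts attempts: $url" >&2
    return 1
}
```

The backoff spaces out the retries so that they don't add load to a task that's already short on CPU. For simple cases, curl's own --retry and --retry-delay options might be enough.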

Related information

Amazon ECS task role

AWS OFFICIAL
Updated 5 months ago