ECS task with the AWSXrayFullAccess permission policy attached fails to report trace data to X-Ray

When an ECS task with the AWSXrayFullAccess permission policy has Trace collection enabled, it fails to report trace data to X-Ray. The failure reason can be seen in the logs of the sidecar container aws-otel-collector as follows:

2024-06-21T07:47:35.526Z	error	exporterhelper/common.go:292	Exporting failed. Rejecting data.	
{
     "kind": "exporter",
     "data_type": "traces",
     "name": "awsxray",
     "error": "NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors",
     "rejected_items": 1
}

From the error message, it appears that the AWS X-Ray exporter cannot find any AWS credentials, even though the role running this task has been assigned what should be sufficient permissions, including:

  • AmazonECSTaskExecutionRolePolicy
  • AWSXrayCrossAccountSharingConfiguration
  • AWSXrayFullAccess

The task definition is as follows:

{
    "taskDefinitionArn": "arn:aws:ecs:ap-southeast-1:654654375602:task-definition/webhook-service-dispatch:17",
    "containerDefinitions": [
        {
            "name": "webhook-service-dispath-container",
            "image": "application-image:latest",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "command": [
                "./webhook-service",
                "dispatch"
            ],
            "environment": [
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/webhook-service-dispatch",
                    "awslogs-create-group": "true",
                    "awslogs-region": "ap-southeast-1",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "systemControls": []
        },
        {
            "name": "aws-otel-collector",
            "image": "public.ecr.aws/aws-observability/aws-otel-collector:v0.39.1",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "command": [
                "--config=/etc/ecs/ecs-xray.yaml"
            ],
            "environment": [],
            "mountPoints": [],
            "volumesFrom": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/ecs-aws-otel-sidecar-collector",
                    "awslogs-create-group": "true",
                    "awslogs-region": "ap-southeast-1",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "systemControls": []
        }
    ],
    "family": "webhook-service-dispatch",
    "executionRoleArn": "arn:aws:iam::654654375602:role/ecsTaskExecutionRole",
    "networkMode": "awsvpc",
    "revision": 17,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.ecr-auth"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "ecs.capability.execution-role-ecr-pull"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "1024",
    "memory": "3072",
    "runtimePlatform": {
        "cpuArchitecture": "X86_64",
        "operatingSystemFamily": "LINUX"
    },
    "registeredAt": "2024-06-22T03:16:07.754Z",
    "registeredBy": "arn:aws:iam::654654375602:user/pharaoh",
    "tags": []
}

The relevant code snippet for configuring the OpenTelemetry trace exporter is as follows:

exporter, err := otlptrace.New(
     context.TODO(),
     otlptracegrpc.NewClient(
          otlptracegrpc.WithInsecure(),
          otlptracegrpc.WithEndpoint("0.0.0.0:4317"),
     ), // Go requires a trailing comma when the closing parenthesis is on its own line
)
if err != nil {
     return nil, errors.Trace(err)
}
sampler := sdktrace.AlwaysSample()
idg := xray.NewIDGenerator()
tp := sdktrace.NewTracerProvider(
     sdktrace.WithSampler(sampler),
     sdktrace.WithBatcher(exporter),
     sdktrace.WithResource(
          resource.NewWithAttributes(
               semconv.SchemaURL,
               semconv.ServiceNameKey.String(serviceName),
          ),
     ),
     sdktrace.WithIDGenerator(idg),
)
otel.SetTracerProvider(tp)
otel.SetTextMapPropagator(xray.Propagator{})

I would appreciate it if someone could help identify the cause of this issue and suggest how to resolve it.

1 Answer
Accepted Answer

Hello,

I see an issue with the task definition configuration: it sets only a task execution role, which is primarily used for pulling container images and writing logs to CloudWatch. For the application and the aws-otel-collector sidecar container to access AWS X-Ray, you also need to specify a task role (taskRoleArn) that grants the necessary X-Ray permissions.

Refer to this documentation to create a task role and configure it in the task definition: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html

Reference:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html
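The distinction above can be checked mechanically. The following is a minimal sketch, not an official tool: it parses a task definition JSON and reports whether the taskRoleArn field is absent. The helper name missingTaskRole and the trimmed-down input are assumptions made for illustration; only the executionRoleArn value is taken from the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskDefinitionRoles holds the two role fields of an ECS task definition.
type taskDefinitionRoles struct {
	ExecutionRoleArn string `json:"executionRoleArn"`
	TaskRoleArn      string `json:"taskRoleArn"`
}

// missingTaskRole (hypothetical helper) reports whether the task
// definition JSON lacks a non-empty taskRoleArn.
func missingTaskRole(taskDefJSON []byte) (bool, error) {
	var roles taskDefinitionRoles
	if err := json.Unmarshal(taskDefJSON, &roles); err != nil {
		return false, fmt.Errorf("parse task definition: %w", err)
	}
	return roles.TaskRoleArn == "", nil
}

func main() {
	// Trimmed-down version of the task definition from the question:
	// it has an execution role but no task role.
	taskDef := []byte(`{
		"family": "webhook-service-dispatch",
		"executionRoleArn": "arn:aws:iam::654654375602:role/ecsTaskExecutionRole"
	}`)

	missing, err := missingTaskRole(taskDef)
	if err != nil {
		panic(err)
	}
	fmt.Println("task role missing:", missing)
}
```

For the task definition in the question this reports that the task role is missing, which is consistent with the collector finding no credentials regardless of which policies are attached to the execution role.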

answered 3 months ago
  • Thank you very much for your reply.

    I have noticed this before, so the permissions policy for the ecsTaskExecutionRole has been modified. As mentioned in the problem description, two additional policies, AWSXrayCrossAccountSharingConfiguration and AWSXrayFullAccess, have been added to the task role, but it still does not work properly.

  • Oh, I see what you mean now: the task role and the task execution role are two different concepts. After adding a task role and attaching the X-Ray permissions to it, the issue was resolved. Thank you very much!
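To confirm from inside a container that a task role is in effect, one option is to look for the environment variables that point the AWS SDK at a container credentials endpoint. This is a rough sketch under the assumption that ECS injects AWS_CONTAINER_CREDENTIALS_RELATIVE_URI (or that AWS_CONTAINER_CREDENTIALS_FULL_URI is set) when a task role is configured; the helper name hasContainerCredentials is made up for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// hasContainerCredentials (hypothetical helper) reports whether either
// environment variable that directs the AWS SDK to a container
// credentials endpoint is set. getenv is injected so the check can be
// exercised without touching the real environment.
func hasContainerCredentials(getenv func(string) string) bool {
	return getenv("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI") != "" ||
		getenv("AWS_CONTAINER_CREDENTIALS_FULL_URI") != ""
}

func main() {
	if hasContainerCredentials(os.Getenv) {
		fmt.Println("container credentials endpoint is configured")
	} else {
		// Assumption: on ECS this usually means no task role is set.
		fmt.Println("no container credentials endpoint; check taskRoleArn")
	}
}
```

Running this in a container of the task before and after adding the task role should show the difference. It only checks that the variables exist, not that the role's policies allow X-Ray calls.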
