An ECS task with the AWSXrayFullAccess permission policy and trace collection enabled fails to report trace data to X-Ray. The logs of the aws-otel-collector sidecar container show the reason:
2024-06-21T07:47:35.526Z error exporterhelper/common.go:292 Exporting failed. Rejecting data.
{
    "kind": "exporter",
    "data_type": "traces",
    "name": "awsxray",
    "error": "NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors",
    "rejected_items": 1
}
From the error message, it appears that the AWS X-Ray exporter cannot find any AWS credentials, even though the role running this task has been assigned what should be sufficient permissions, including:
- AmazonECSTaskExecutionRolePolicy
- AWSXrayCrossAccountSharingConfiguration
- AWSXrayFullAccess
The task definition is as follows:
{
    "taskDefinitionArn": "arn:aws:ecs:ap-southeast-1:654654375602:task-definition/webhook-service-dispatch:17",
    "containerDefinitions": [
        {
            "name": "webhook-service-dispath-container",
            "image": "application-image:latest",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "command": [
                "./webhook-service",
                "dispatch"
            ],
            "environment": [
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/webhook-service-dispatch",
                    "awslogs-create-group": "true",
                    "awslogs-region": "ap-southeast-1",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "systemControls": []
        },
        {
            "name": "aws-otel-collector",
            "image": "public.ecr.aws/aws-observability/aws-otel-collector:v0.39.1",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "command": [
                "--config=/etc/ecs/ecs-xray.yaml"
            ],
            "environment": [],
            "mountPoints": [],
            "volumesFrom": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/ecs-aws-otel-sidecar-collector",
                    "awslogs-create-group": "true",
                    "awslogs-region": "ap-southeast-1",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "systemControls": []
        }
    ],
    "family": "webhook-service-dispatch",
    "executionRoleArn": "arn:aws:iam::654654375602:role/ecsTaskExecutionRole",
    "networkMode": "awsvpc",
    "revision": 17,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.ecr-auth"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "ecs.capability.execution-role-ecr-pull"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "1024",
    "memory": "3072",
    "runtimePlatform": {
        "cpuArchitecture": "X86_64",
        "operatingSystemFamily": "LINUX"
    },
    "registeredAt": "2024-06-22T03:16:07.754Z",
    "registeredBy": "arn:aws:iam::654654375602:user/pharaoh",
    "tags": []
}
The relevant code snippet for configuring the OpenTelemetry trace exporter is as follows:
// Export spans over OTLP/gRPC to the aws-otel-collector sidecar,
// which shares the task's network namespace (awsvpc mode).
exporter, err := otlptrace.New(
	context.TODO(),
	otlptracegrpc.NewClient(
		otlptracegrpc.WithInsecure(),
		otlptracegrpc.WithEndpoint("0.0.0.0:4317"),
	),
)
if err != nil {
	return nil, errors.Trace(err)
}

// Sample every trace and generate X-Ray compatible trace IDs.
sampler := sdktrace.AlwaysSample()
idg := xray.NewIDGenerator()
tp := sdktrace.NewTracerProvider(
	sdktrace.WithSampler(sampler),
	sdktrace.WithBatcher(exporter),
	sdktrace.WithResource(
		resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String(serviceName),
		),
	),
	sdktrace.WithIDGenerator(idg),
)

// Register the provider globally and propagate context using X-Ray headers.
otel.SetTracerProvider(tp)
otel.SetTextMapPropagator(xray.Propagator{})
I would appreciate it if someone could help identify the cause of this issue and suggest how to resolve it.
Thank you very much for your reply.
I had noticed this earlier, so I modified the permissions policy of the ecsTaskExecutionRole. As mentioned in the problem description, I added two policies, AWSXrayCrossAccountSharingConfiguration and AWSXrayFullAccess, to the task role, but it still does not work properly.
Oh, I see what you mean now: the task role and the task execution role are two different concepts. After I added a task role and attached the X-Ray permissions to it, the issue was resolved. Thank you very much!
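For anyone who lands here with the same NoCredentialProviders error: the execution role (executionRoleArn) is what ECS itself uses to pull images and write logs, while the containers in the task, including aws-otel-collector, get their AWS credentials from the task role (the taskRoleArn field of the task definition), which is missing from the task definition above. As far as I know, ECS sets the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable in the task's containers only when a task role is assigned, so a quick check like the one below can help tell the two cases apart. This is only a rough sketch and not part of the original code: hasContainerCredentials is a made-up helper name, and also checking AWS_CONTAINER_CREDENTIALS_FULL_URI is my own assumption to cover other container credential setups.

```go
package main

import (
	"fmt"
	"os"
)

// hasContainerCredentials reports whether the environment advertises a
// container credentials endpoint. The helper name is hypothetical.
// Assumption: ECS injects AWS_CONTAINER_CREDENTIALS_RELATIVE_URI only when
// the task definition sets a task role (taskRoleArn).
func hasContainerCredentials(getenv func(string) string) bool {
	keys := []string{
		"AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
		"AWS_CONTAINER_CREDENTIALS_FULL_URI", // assumption: other container setups
	}
	for _, key := range keys {
		if getenv(key) != "" {
			return true
		}
	}
	return false
}

func main() {
	if hasContainerCredentials(os.Getenv) {
		fmt.Println("container credentials endpoint found: a task role appears to be attached")
	} else {
		fmt.Println("no container credentials endpoint found: check taskRoleArn in the task definition")
	}
}
```

Running this at application startup inside the task gives a hint about whether the missing piece is the task role itself or the policies attached to it. Taking the lookup function as a parameter keeps the check easy to exercise outside ECS.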