Unable to see X-Ray trace for AWS Batch job

0

I am running an AWS Batch job that starts an EC2 instance and runs a Java application, and I am trying to generate an X-Ray trace for the job. In my Docker entrypoint I install the X-Ray daemon and start it:

curl https://s3.us-west-2.amazonaws.com/aws-xray-assets.us-west-2/xray-daemon/aws-xray-daemon-3.x.rpm -o /tmp/xray.rpm
yum install -v -y /tmp/xray.rpm
xray -o -n us-west-2 &

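My Java code is instrumented as follows:

import com.amazonaws.xray.AWSXRay;
import com.amazonaws.xray.entities.Segment;
import com.amazonaws.xray.entities.Subsegment;

// Declared before the try block so that both are in scope in finally.
Segment segment = AWSXRay.beginSegment("load-data-segment");
// Only one {} placeholder here, so the serialized segment is not logged.
logger.info("Segment started traceID={}, segment", segment.getTraceId(), segment.prettySerialize());
Subsegment subSegment = AWSXRay.beginSubsegment("load-data-sub-segment");
try {
    // getTraceId() on the subsegment comes back as null in the job's log output.
    logger.info("started subSegment traceId={}, {}", subSegment.getTraceId(), subSegment.prettySerialize());
    subSegment.putAnnotation("job_id", "load-data-job");
    ...
} finally {
    logger.info("Ending subsegment {}, {}", subSegment.getTraceId(), subSegment.prettySerialize());
    AWSXRay.endSubsegment();
    logger.info("Ending segment {}, {}", segment.getTraceId(), segment.prettySerialize());
    AWSXRay.endSegment();
}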

I can see from the CloudWatch logs that the daemon is installed and started, and that the segments are created and eventually ended normally.

...
EVENTS  1667282262329   Running X-Ray: xray -o -n us-west-2 &   1667282258652
EVENTS  1667282262329   Getting Secret: dev-shared-kc-local.yml 1667282258654
EVENTS  1667282262329   2022-11-01T05:57:38Z [Info] Initializing AWS X-Ray daemon 3.3.5 1667282258658
EVENTS  1667282262329   2022-11-01T05:57:38Z [Info] Using buffer memory limit of 321 MB 1667282258658
EVENTS  1667282262329   2022-11-01T05:57:38Z [Info] 5136 segment buffers allocated  1667282258658
EVENTS  1667282262329   2022-11-01T05:57:38Z [Info] Using region: us-west-2 1667282258679
EVENTS  1667282262329   2022-11-01T05:57:38Z [Info] HTTP Proxy server using X-Ray Endpoint : https://xray.us-west-2.amazonaws.com   1667282258679
EVENTS  1667282262329   2022-11-01T05:57:38Z [Info] Starting proxy http server on 127.0.0.1:2000    1667282258679
...
EVENTS  1667282262329   2022-11-01 05:57:41,933 [main] [default] INFO  - Segment started traceID=1-6360b555-5161dc6ff45fcca197606de8, segment   1667282261934
EVENTS  1667282262329   2022-11-01 05:57:41,941 [main] [default] INFO  - started subSegment traceId=null, { 1667282261941
EVENTS  1667282262329     "name" : "load-data-sub-segment", 1667282261941
EVENTS  1667282262329     "id" : "348f5fd5734d2002",    1667282261941
EVENTS  1667282262329     "start_time" : 1.667282261934E9,  1667282261941
EVENTS  1667282262329     "in_progress" : true  1667282261941
EVENTS  1667282262329   }   1667282261941
...
...
EVENTS  1667282596467   2022-11-01 06:03:16,334 [main] [default] INFO  - Ending subsegment null, {  1667282596334
...
EVENTS  1667282596467   2022-11-01 06:03:16,336 [main] [default] INFO  - Ending segment 1-6360b555-5161dc6ff45fcca197606de8

However, I cannot find any traces in the X-Ray console. I use queries such as annotation.job_id, but nothing shows up. I think I have added all the necessary policies to my role (AWSXRayDaemonWriteAccess), but I suspect I may not have set it up properly. Where and how can I start debugging this? I am at my wits' end. Thank you for any suggestions!
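One thing that may help narrow this down: the second field of an X-Ray trace ID is the trace's start time as hex-encoded epoch seconds, so the trace ID already printed in the logs tells you which time window to select in the console (or you can look the trace up by ID directly instead of relying on the annotation query). The following is only a minimal sketch using the Java standard library; the class name TraceIdTime and the method startTime are made up for illustration and are not part of the X-Ray SDK.

```java
import java.time.Instant;

public class TraceIdTime {
    /**
     * Decodes the start time embedded in an X-Ray trace ID of the form
     * 1-<8 hex digits>-<24 hex digits>. Hypothetical helper, not an SDK API.
     */
    static Instant startTime(String traceId) {
        String[] parts = traceId.split("-");
        if (parts.length != 3 || !parts[0].equals("1")
                || parts[1].length() != 8 || parts[2].length() != 24) {
            throw new IllegalArgumentException("Not an X-Ray trace ID: " + traceId);
        }
        // The middle field is the epoch time in seconds, written in hexadecimal.
        long epochSeconds = Long.parseLong(parts[1], 16);
        return Instant.ofEpochSecond(epochSeconds);
    }

    public static void main(String[] args) {
        // Trace ID taken from the job log above.
        System.out.println(startTime("1-6360b555-5161dc6ff45fcca197606de8"));
        // prints 2022-11-01T05:57:41Z
    }
}
```

The decoded time matches the log timestamp of the "Segment started" line, so a console search should cover that window.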

asked 2 years ago, 558 views
2 Answers
0

The code looks good overall; I don't see any problem. Could you add a log statement for segment.isSampled()? The X-Ray SDK does not emit a segment if its sampled flag is 0/false.

AWS
answered 2 years ago
  • Thank you for the suggestion. Here's the output:

    2022-11-03 20:11:36,083 [main] [default] INFO  - Segment: isSampled: true, isRecording: true, isEmitted false, isError false, isFault false, inProgress true, isRecording true
    2022-11-03 20:11:36,093 [main] [default] INFO  - subSegment: isFault false, isError false, isEmitted false, inProgress true, isThrottle false        1667506296094
    
0

I still don't see any problem. Please enable the daemon's debug logs by starting it with xray -o -n us-west-2 -l dev, and check whether the X-Ray daemon prints log lines like these:

2022-11-06T20:54:30-08:00 [Debug] Received request on HTTP Proxy server : /GetSamplingRules
2022/11/06 20:54:31 http: proxy error: context canceled
2022-11-06T20:55:00-08:00 [Debug] Skipped telemetry data as no segments found
2022-11-06T20:55:05-08:00 [Debug] Received request on HTTP Proxy server : /GetSamplingRules
2022-11-06T20:55:09-08:00 [Debug] processor: sending partial batch
2022-11-06T20:55:09-08:00 [Debug] processor: segment batch size: 1. capacity: 50
2022-11-06T20:55:10-08:00 [Info] Successfully sent batch of 1 segments (1.501 seconds)
AWS
answered 2 years ago
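
To separate a daemon or IAM problem from an SDK problem, it can also help to hand the daemon a segment directly, bypassing the SDK. The daemon accepts segment documents over UDP (127.0.0.1:2000 by default), each prefixed with a one-line JSON header. The sketch below uses only the Java standard library and must run in the same container as the daemon; the class name DaemonProbe, the segment name daemon-probe, and the helper methods are invented for illustration. With -l dev enabled, a working daemon would be expected to log a line such as "Successfully sent batch of 1 segments" shortly afterwards.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Locale;

public class DaemonProbe {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns a random lowercase hex string with the given number of digits. */
    static String randomHex(int digits) {
        StringBuilder sb = new StringBuilder(digits);
        for (int i = 0; i < digits; i++) {
            sb.append(Character.forDigit(RANDOM.nextInt(16), 16));
        }
        return sb.toString();
    }

    /** Builds a trace ID: version 1, hex epoch seconds, 24 random hex digits. */
    static String newTraceId(long epochSeconds) {
        return String.format(Locale.ROOT, "1-%08x-%s", epochSeconds, randomHex(24));
    }

    /**
     * Builds the UDP payload: the daemon header line followed by a minimal
     * segment document. Assumes the name needs no JSON escaping.
     */
    static String buildPayload(String name, long startMillis, long endMillis) {
        String segment = String.format(Locale.ROOT,
                "{\"name\":\"%s\",\"id\":\"%s\",\"trace_id\":\"%s\","
                        + "\"start_time\":%.3f,\"end_time\":%.3f}",
                name, randomHex(16), newTraceId(startMillis / 1000),
                startMillis / 1000.0, endMillis / 1000.0);
        return "{\"format\": \"json\", \"version\": 1}\n" + segment;
    }

    public static void main(String[] args) throws Exception {
        long now = System.currentTimeMillis();
        String payload = buildPayload("daemon-probe", now - 1000, now);
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        // Assumes the daemon listens on its default UDP address.
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(bytes, bytes.length,
                    InetAddress.getByName("127.0.0.1"), 2000));
        }
        // Print the payload so the trace ID can be searched for in the console.
        System.out.println(payload);
    }
}
```

If the probe segment appears in the console, the daemon and the role's permissions are probably fine and the problem lies on the application side; if it does not, the daemon's debug output is the next place to look.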