Not able to send trace to X-Ray from AWS Glue Python script

I need to send traces from an AWS Glue job (written as a Python script) to AWS X-Ray. Since X-Ray does not support AWS Glue out of the box, I had to write a little extra code to instrument the Python script so that it can send traces. I found an article from Chariot Solutions and tried to follow its steps, but it is not working, and it does not raise an error either. According to the article, it seems we don't even need to spin up a daemon because we have a custom emitter. Here is the code:

import boto3
import io
import json
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
import aws_xray_sdk.core
aws_xray_sdk.core.patch_all()

class DirectEmitter:
 
    def __init__(self):
        self.xray_client = None  # lazily initialize
 
    def send_entity(self, entity):
        if not self.xray_client:
            self.xray_client = boto3.client('xray')
        segment_doc = json.dumps(entity.to_dict())
        self.xray_client.put_trace_segments(TraceSegmentDocuments=[segment_doc])
 
    def set_daemon_address(self, address):
        pass
 
    @property
    def ip(self):
        return None
 
    @property
    def port(self):
        return None

aws_xray_sdk.core.xray_recorder.configure(
    emitter=DirectEmitter(),
    context_missing='LOG_ERROR',
    sampling=False)

sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)

zip_file = 'data.zip'
bucket_name = 'mybucket-dev-etl-1'
output_folder = 'myfolder/obf/output'
raw_folder = 'myfolder/obf/raw'

segment = aws_xray_sdk.core.xray_recorder.begin_segment('segment_name')
s3 = boto3.client('s3')
obj = s3.get_object(Bucket=bucket_name, Key=zip_file)
zip_data = io.BytesIO(obj['Body'].read())
segment.put_metadata('key', 'krish-dict', 'namespace')
subsegment = aws_xray_sdk.core.xray_recorder.begin_subsegment('subsegment_name')
with aws_xray_sdk.core.xray_recorder.capture('subsegment_name'):
    extracted_files = extract_zip(zip_data)  # calls an external library to extract the file; the library is not imported here for security reasons
for file_name, file_content in extracted_files.items():
    subsegment.put_annotation('key', 'krish-value')
    s3.put_object(Bucket=bucket_name, Key=f'{raw_folder}/{file_name}', Body=file_content)

print('extracting complete')
job.commit()
aws_xray_sdk.core.xray_recorder.end_subsegment()
aws_xray_sdk.core.xray_recorder.end_segment()

Am I missing anything here? It seems the traces are not being sent for some reason, maybe because there is no daemon? But my understanding is that with the custom emitter we don't need to run a separate daemon explicitly. Any comments or advice would be much appreciated.

asked 3 months ago, 144 views
1 Answer
To send X-Ray trace segments from your service, you have to run the X-Ray daemon, which receives the segments and forwards them to X-Ray. The article you referred to also ran a daemon to send segments from the SDK to the X-Ray backend.
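
For debugging the custom emitter shown in the question, it may help to know that `put_trace_segments` does not raise when X-Ray rejects a segment document. Rejected documents are reported in the `UnprocessedTraceSegments` list of the response, so an emitter that ignores the response can fail silently. The following is a minimal sketch, not a confirmed fix: `CheckedDirectEmitter` and `client_factory` are hypothetical names introduced here, and the sketch assumes, as the question's code does, that the entity exposes `to_dict()`.

```python
import json
import logging

logger = logging.getLogger("xray_direct_emitter")


class CheckedDirectEmitter:
    """Hypothetical variant of the question's DirectEmitter that inspects the API response."""

    def __init__(self, client_factory):
        # client_factory is any zero-argument callable returning an object with
        # put_trace_segments, e.g. lambda: boto3.client('xray') (assumption).
        self._client_factory = client_factory
        self._client = None  # created lazily on first send
        self.unprocessed = []  # every rejection seen so far, kept for inspection

    def send_entity(self, entity):
        if self._client is None:
            self._client = self._client_factory()
        segment_doc = json.dumps(entity.to_dict())
        response = self._client.put_trace_segments(
            TraceSegmentDocuments=[segment_doc]
        )
        # Rejected documents come back in the response instead of as an exception.
        rejected = response.get("UnprocessedTraceSegments", [])
        for item in rejected:
            logger.error(
                "X-Ray rejected segment %s: %s (%s)",
                item.get("Id"),
                item.get("Message"),
                item.get("ErrorCode"),
            )
        self.unprocessed.extend(rejected)
        return rejected

    def set_daemon_address(self, address):
        # No daemon is involved, so the address is ignored.
        pass

    @property
    def ip(self):
        return None

    @property
    def port(self):
        return None
```

In the Glue script this could be passed to the recorder as `emitter=CheckedDirectEmitter(lambda: boto3.client('xray'))`. If nothing is logged and no trace appears, two other things worth checking are that `end_segment()` is actually reached (a segment is only emitted once it is closed) and that the job's IAM role allows `xray:PutTraceSegments`.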

AWS
answered 3 months ago
