"[Errno 90] Message too long" with Tracer and X-Ray


Hello, good day.

My engineering team is implementing a Lambda function in Python that calls a container in ECS, which returns a transcript of an audio file. For some of these files the transcript is very long, and those seem to trigger the error in the title of this post, which we see in Sentry. Looking into the error, I found that disabling response capture (as below) should solve the problem:

@tracer.capture_lambda_handler(capture_response=False, capture_error=True)
def lambda_handler(event: dict, context: LambdaContext) -> str:
    ...

However, this did not solve the problem: fetching the long transcript triggered the error again. The X-Ray integration is configured in the serverless.yml file with the following lines:

SENTRY_DSN: "https:....."
AWS_XRAY_SDK_ENABLED: ${env:AWS_XRAY_SDK_ENABLED, "True"}

Is it possible to disable the message size check, or is there another way to work around this error? Thank you very much in advance.

asked a year ago, 363 views
1 Answer

Hey, Powertools for AWS maintainer here (note: we have on-call for GitHub, while re:Post is monitored by AWS Premium Support).

Without knowing how your code is structured, I suspect you have a large call stack that X-Ray is unable to process (64K trace limit). In practical terms, I suspect you are calling other functions inside lambda_handler that are also decorated with @tracer.capture_method.

Here is an example scenario in which I can reproduce this:

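from aws_lambda_powertools import Tracer
from aws_lambda_powertools.utilities.typing import LambdaContext

tracer = Tracer()

@tracer.capture_method(capture_response=False)
def process_something():
    ...

@tracer.capture_lambda_handler(capture_response=False, capture_error=True)
def lambda_handler(event: dict, context: LambdaContext) -> str:
    imaginary_batch = event.get("Records", [])
    # Each call emits a "## process_something" subsegment under "## lambda_handler"
    for iteration in imaginary_batch:
        process_something()
    return "done"  # placeholder so the handler matches its -> str annotation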

If my theory is correct, comment out the @tracer.capture_lambda_handler line and it should work. Leave the other functions decorated with @tracer.capture_method as they are.

Let me know how it goes!

Heitor Lessa


If the theory is correct, here is the technical explanation of why this happens despite using capture_response=False.

The entry point is lambda_handler, so an X-Ray subsegment named ## lambda_handler begins the moment the function is invoked. That subsegment cannot complete until the function itself completes. Once it completes, it is sent to X-Ray for processing.

However, if your entry point calls other functions (think deeply nested) that also emit subsegments (## process_something), the ## lambda_handler subsegment keeps growing before it can complete, and it eventually errors out as it reaches the 64K limit in X-Ray.

Visually, it looks like this:

  • ## lambda_handler
    • ## process_something
    • ## process_something
    • ## process_something
    • ## process_something
    • ## process_something
    • ... N more times, until: Message too long
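
To make the picture above concrete, here is a toy model in plain Python. It assumes the parent subsegment is serialized as one JSON document together with all of its children. The field names, the helper functions (make_subsegment, document_size, children_until_too_long), and the per-child size are made up for illustration and are not the X-Ray SDK's actual wire format.

```python
import json

MAX_DOCUMENT_BYTES = 64 * 1024  # the 64K limit mentioned above


def make_subsegment(name, children=None):
    """Build a simplified, hypothetical subsegment document."""
    return {
        "name": name,
        "id": "0" * 16,
        "start_time": 1700000000.0,
        "end_time": 1700000000.5,
        "subsegments": children or [],
    }


def document_size(subsegment):
    """Size in bytes of the JSON-serialized document."""
    return len(json.dumps(subsegment).encode("utf-8"))


def children_until_too_long(limit=MAX_DOCUMENT_BYTES):
    """Count how many children it takes for the parent to exceed the limit."""
    children = []
    while True:
        parent = make_subsegment("## lambda_handler", children)
        if document_size(parent) > limit:
            return len(children)
        children.append(make_subsegment("## process_something"))


if __name__ == "__main__":
    print(children_until_too_long())
```

In this model no single child is anywhere near the limit and none of them carries a response, yet the parent document still ends up too large once enough of them pile up.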

Why capture_response=False is ineffective here

This flag only helps when, say, the process_something function returns a response larger than 64K, for example a large transcript or the contents of an S3 file. That's why it works for most people.
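
As a rough illustration of that case, the sketch below models a single subsegment whose return value is attached as metadata only when response capture is on. Everything here (subsegment_document, fits, the "metadata" layout) is a hypothetical stand-in written in plain Python, not Powertools or X-Ray SDK code.

```python
import json

LIMIT_BYTES = 64 * 1024  # the 64K limit discussed above


def subsegment_document(name, response, capture_response):
    """Hypothetical subsegment; the response is stored only when captured."""
    doc = {"name": name, "metadata": {}}
    if capture_response:
        doc["metadata"]["response"] = response
    return doc


def fits(doc, limit=LIMIT_BYTES):
    """True if the JSON-serialized document stays within the limit."""
    return len(json.dumps(doc).encode("utf-8")) <= limit


# Roughly 100 KB of text, standing in for a long transcript
transcript = "word " * 20_000

with_response = subsegment_document(
    "## process_something", transcript, capture_response=True
)
without_response = subsegment_document(
    "## process_something", transcript, capture_response=False
)

print(fits(with_response))     # False
print(fits(without_response))  # True
```

Dropping the response shrinks one oversized subsegment, but it does nothing about the number of subsegments, which is the situation described earlier.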

AWS
answered a year ago
