SageMaker Inference Timeout Configuration

Hello,

I am trying to deploy this CartoonGAN solution on an ml.x5.xlarge instance behind an AWS SageMaker endpoint. The deployment succeeded, and I can invoke the endpoint with a base64-encoded image without errors. However, the response reports that my inference task was terminated after timing out at 10.01 seconds. Is there a timeout setting I can increase so the caller waits longer for the model to respond? Or is there anything else I can do to solve this?

Here is my deployment code

from sagemaker.pytorch.model import PyTorchModel

pytorch_model = PyTorchModel(
    model_data=model_uri,
    role=role_arn,
    framework_version=torch_version,
    py_version=py_version,
    entry_point='serve.py',
    source_dir='.',
)
pytorch_model.deploy(
    endpoint_name=endpoint_name,
    instance_type=instance_type,
    initial_instance_count=1,
)

The log from CloudWatch

2024-04-02T22:48:54.258+07:00	INIT_START Runtime Version: python:3.10.v29 
2024-04-02T22:48:54.662+07:00	START RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4 Version: $LATEST
2024-04-02T22:49:04.677+07:00	2024-04-02T15:49:04.675Z d57785ae-bc40-40f4-8f8b-dc1f77a84de4 Task timed out after 10.01 seconds
2024-04-02T22:49:04.677+07:00	END RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4
2024-04-02T22:49:04.677+07:00	REPORT RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4	Duration: 10013.20 ms	Billed Duration: 10000 ms	Memory Size: 512 MB	Max Memory Used: 74 MB	Init Duration: 403.33 ms

Thank you,

asked a month ago, 148 views
1 Answer
I suspect that the issue might be related to the model's initialization or inference code in the serve.py script. Enable debugging logs to identify any specific bottlenecks in the inference process.
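One way to get that timing information is to wrap the handlers in serve.py with a small logging decorator. The sketch below is an assumption about how serve.py is laid out: `predict_fn` is shown as a hypothetical stand-in, and the logger name `serve` is made up for illustration. It uses only the Python standard library.

```python
import functools
import logging
import time

# Send DEBUG-level messages to stderr, which the container forwards to CloudWatch.
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("serve")  # hypothetical logger name


def log_duration(func):
    """Log how long each call to func takes, in milliseconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.debug("%s took %.1f ms", func.__name__, elapsed_ms)
    return wrapper


# Hypothetical handler: stands in for the real predict_fn in serve.py.
@log_duration
def predict_fn(input_data, model):
    return model(input_data)
```

Applying the same decorator to the model-loading and input-decoding handlers would show whether the time goes to loading, preprocessing, or the forward pass.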

A customer's model containers must respond to requests within 60 seconds. The model itself can have a maximum processing time of 60 seconds before responding to invocations. If your model is going to take 50-60 seconds of processing time, the SDK socket timeout should be set to be 70 seconds. Source
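If you call the endpoint with boto3, the socket timeout mentioned above corresponds to the `read_timeout` option of `botocore.config.Config`. The following is a minimal sketch, not a definitive setup: the helper names, the 10-second margin, and the choice to disable retries are assumptions made for illustration.

```python
def build_timeout_settings(model_seconds, margin_seconds=10):
    """Return botocore Config keyword arguments for a given model processing time."""
    if not 0 < model_seconds <= 60:
        raise ValueError("the model container must respond within 60 seconds")
    return {
        # Wait slightly longer than the model needs, e.g. 70 s for a 60 s model.
        "read_timeout": model_seconds + margin_seconds,
        "connect_timeout": 10,
        # Assumption: do not retry automatically, so a slow request is not repeated.
        "retries": {"max_attempts": 0},
    }


def make_runtime_client(model_seconds):
    """Create a SageMaker runtime client with the extended read timeout."""
    # Imported here so build_timeout_settings works without boto3 installed.
    import boto3
    from botocore.config import Config

    config = Config(**build_timeout_settings(model_seconds))
    return boto3.client("sagemaker-runtime", config=config)
```

A client created with `make_runtime_client(60)` can then call `invoke_endpoint` as usual. Creating the client requires AWS credentials and a configured region.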

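Also, if you're using AWS Lambda to invoke the SageMaker endpoint, check the timeout setting of your Lambda function. The REPORT line in your log (Billed Duration, Memory Size, Init Duration) follows the Lambda log format, and the 10.01-second cutoff suggests the function's timeout is currently set to 10 seconds. The default timeout for a new Lambda function is 3 seconds, but it can be extended up to 15 minutes.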
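The Lambda timeout can be changed in the console or programmatically. Below is a minimal sketch using boto3's `update_function_configuration`; the helper names are hypothetical, and the function name you pass in is whatever your own Lambda function is called.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # 15 minutes, the upper limit for Lambda


def validate_lambda_timeout(seconds):
    """Check that the requested timeout is within Lambda's allowed range."""
    if not 1 <= seconds <= LAMBDA_MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout must be between 1 and {LAMBDA_MAX_TIMEOUT_SECONDS} seconds"
        )
    return int(seconds)


def set_lambda_timeout(function_name, seconds):
    """Update the timeout of an existing Lambda function."""
    # Imported here so validate_lambda_timeout works without boto3 installed.
    import boto3

    client = boto3.client("lambda")
    return client.update_function_configuration(
        FunctionName=function_name,
        Timeout=validate_lambda_timeout(seconds),
    )
```

For a model that needs up to 60 seconds, a Lambda timeout somewhat above the client's read timeout (for example 75 seconds) leaves room for the function's own overhead.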

answered a month ago
