SageMaker Inference Timeout Configuration


Hello,

I was trying to deploy this CartoonGAN solution on an ml.x5.xlarge instance using an AWS SageMaker endpoint. The deployment succeeded and I can invoke the endpoint, passing a base64-encoded image, without any errors. However, the response says that my inference task was terminated after timing out at 10.01 seconds. Is there a timeout setting I can increase so that the caller waits longer for the model to respond? Or is there anything else I can do to solve this?

Here is my deployment code:

from sagemaker.pytorch.model import PyTorchModel

pytorch_model = PyTorchModel(
    model_data=model_uri,
    role=role_arn,
    framework_version=torch_version,
    py_version=py_version,
    entry_point='serve.py',
    source_dir='.',
)
pytorch_model.deploy(
    endpoint_name=endpoint_name,
    instance_type=instance_type,
    initial_instance_count=1,
)

Here is the log from CloudWatch:

2024-04-02T22:48:54.258+07:00	INIT_START Runtime Version: python:3.10.v29 
2024-04-02T22:48:54.662+07:00	START RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4 Version: $LATEST
2024-04-02T22:49:04.677+07:00	2024-04-02T15:49:04.675Z d57785ae-bc40-40f4-8f8b-dc1f77a84de4 Task timed out after 10.01 seconds
2024-04-02T22:49:04.677+07:00	END RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4
2024-04-02T22:49:04.677+07:00	REPORT RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4	Duration: 10013.20 ms	Billed Duration: 10000 ms	Memory Size: 512 MB	Max Memory Used: 74 MB	Init Duration: 403.33 ms

Thank you,

asked 2 months ago · 176 views
1 Answer

I suspect the issue is related to the model's initialization or inference code in the serve.py script. Enable debug logging there to find out which step of the inference process is slow.
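One way to get that timing information is a small decorator around the handler functions. The sketch below assumes serve.py follows the usual model_fn / predict_fn handler convention of the SageMaker PyTorch serving container; the `timed` helper and the placeholder `predict_fn` body are illustrative and not taken from the original script.

```python
import functools
import logging
import time

# Send DEBUG messages to the container's standard streams so they can be collected.
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("serve")


def timed(fn):
    """Log how long each call to fn takes, then return its result unchanged."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.debug("%s took %.3f s", fn.__name__, elapsed)

    return wrapper


@timed
def predict_fn(input_data, model):
    # Placeholder body: the real handler would run the CartoonGAN forward pass.
    return model(input_data)
```

Applying the same decorator to model_fn and input_fn would show whether the time goes into loading the model, decoding the base64 image, or the forward pass itself.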

The SageMaker documentation states: "A customer's model containers must respond to requests within 60 seconds. The model itself can have a maximum processing time of 60 seconds before responding to invocations. If your model is going to take 50-60 seconds of processing time, the SDK socket timeout should be set to be 70 seconds." (Source)
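If you call the endpoint with boto3, the socket timeout mentioned above corresponds to the `read_timeout` option of `botocore.config.Config`. The following is a minimal sketch, assuming boto3 is installed and credentials and a region are already configured; the helper names and the JSON content type are assumptions for illustration.

```python
def socket_timeout_seconds(model_seconds, margin_seconds=10):
    """Client read timeout: model processing time plus a safety margin."""
    return model_seconds + margin_seconds


def make_runtime_client(read_timeout):
    # Imported here so the arithmetic helper above works without the AWS SDK.
    import boto3
    from botocore.config import Config

    config = Config(
        read_timeout=read_timeout,    # seconds to wait for the response
        retries={"max_attempts": 0},  # do not resend a slow request automatically
    )
    return boto3.client("sagemaker-runtime", config=config)


def invoke(endpoint_name, payload, model_seconds=60):
    """Invoke the endpoint and return the raw response bytes."""
    client = make_runtime_client(socket_timeout_seconds(model_seconds))
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # assumed; use what serve.py expects
        Body=payload,
    )
    return response["Body"].read()
```

With the default arguments, a model that needs up to 60 seconds gets a 70-second read timeout on the client side.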

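Also, if you are using AWS Lambda to invoke the SageMaker endpoint, check the timeout setting of your Lambda function. The START, END, and REPORT lines in your CloudWatch log are in the Lambda log format, and "Task timed out after 10.01 seconds" is the message Lambda writes when a function exceeds its configured timeout, so your function's timeout appears to be set to 10 seconds. The default timeout for a new Lambda function is 3 seconds, but it can be extended up to 15 minutes.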
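The Lambda timeout can be changed in the console or through the API. Here is a minimal sketch using boto3's `update_function_configuration`, assuming boto3 is installed and your credentials are allowed to update the function; the helper names are illustrative, and the function name is whatever your Lambda is called.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # 15 minutes, the upper limit for Lambda


def checked_timeout(seconds):
    """Return seconds as an int, rejecting values Lambda would not accept."""
    if not 1 <= seconds <= LAMBDA_MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout must be between 1 and {LAMBDA_MAX_TIMEOUT_SECONDS} seconds"
        )
    return int(seconds)


def set_lambda_timeout(function_name, seconds):
    """Update the function's timeout and return the value Lambda reports back."""
    # Imported here so the validation helper above works without the AWS SDK.
    import boto3

    client = boto3.client("lambda")
    response = client.update_function_configuration(
        FunctionName=function_name,
        Timeout=checked_timeout(seconds),
    )
    return response["Timeout"]
```

Pick a value a little above the endpoint's worst-case response time, so that the Lambda function is not the first component to give up.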

EXPERT
answered 2 months ago
