SageMaker Inference Timeout Configuration


Hello,

I am trying to deploy this CartoonGAN solution on an ml.x5.xlarge instance using an AWS SageMaker endpoint. The deployment succeeded and I can invoke the endpoint, passing a base64-encoded image, without any errors. However, the response says my inference task was terminated after timing out at 10.01 seconds. Is there a timeout setting I can increase so the caller waits longer for the model to respond, or is there something else I can do to solve this?

Here is my deployment code

from sagemaker.pytorch.model import PyTorchModel

# Wrap the trained model artifact together with the custom inference script.
pytorch_model = PyTorchModel(
    model_data=model_uri,
    role=role_arn,
    framework_version=torch_version,
    py_version=py_version,
    entry_point='serve.py',
    source_dir='.',
)

# Create a real-time endpoint backed by a single instance.
pytorch_model.deploy(
    endpoint_name=endpoint_name,
    instance_type=instance_type,
    initial_instance_count=1,
)

The log from CloudWatch

2024-04-02T22:48:54.258+07:00	INIT_START Runtime Version: python:3.10.v29 
2024-04-02T22:48:54.662+07:00	START RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4 Version: $LATEST
2024-04-02T22:49:04.677+07:00	2024-04-02T15:49:04.675Z d57785ae-bc40-40f4-8f8b-dc1f77a84de4 Task timed out after 10.01 seconds
2024-04-02T22:49:04.677+07:00	END RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4
2024-04-02T22:49:04.677+07:00	REPORT RequestId: d57785ae-bc40-40f4-8f8b-dc1f77a84de4	Duration: 10013.20 ms	Billed Duration: 10000 ms	Memory Size: 512 MB	Max Memory Used: 74 MB	Init Duration: 403.33 ms

Thank you,

Asked 2 months ago · 176 views
1 Answer

I suspect that the issue might be related to the model's initialization or inference code in the serve.py script. Enable debugging logs to identify any specific bottlenecks in the inference process.
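One way to find such bottlenecks is to time each stage of the inference script and write the result to the log. The sketch below is only an illustration: the `timed` decorator and the logger name are made up for this example, and `predict_fn` here is a stand-in, not the actual handler from the asker's serve.py.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("serve.timing")  # hypothetical logger name


def timed(func):
    """Log how long each call to `func` takes, in milliseconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            logger.info("%s took %.1f ms", func.__name__, elapsed_ms)
    return wrapper


# Stand-in for a handler in serve.py (assumption: the real one differs).
@timed
def predict_fn(input_data, model):
    return model(input_data)
```

Applying the same decorator to the model-loading and input-decoding functions should show whether the time goes into loading, preprocessing, or the forward pass.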

The SageMaker documentation states: "A customer's model containers must respond to requests within 60 seconds. The model itself can have a maximum processing time of 60 seconds before responding to invocations. If your model is going to take 50-60 seconds of processing time, the SDK socket timeout should be set to be 70 seconds." (Source)
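If you call the endpoint with boto3, the socket timeout mentioned above corresponds to the `read_timeout` option of `botocore.config.Config`. Here is a minimal sketch, assuming boto3 is the client in use; the helper names `socket_timeout_for` and `make_runtime_client` and the 10-second margin are my own choices for illustration.

```python
def socket_timeout_for(processing_seconds, margin_seconds=10):
    """Client read timeout: expected model processing time plus a margin."""
    if not 0 < processing_seconds <= 60:
        raise ValueError(
            "real-time endpoints must respond within 60 seconds"
        )
    return processing_seconds + margin_seconds


def make_runtime_client(processing_seconds=60, region_name=None):
    """Build a sagemaker-runtime client that waits long enough for the model."""
    # Imported here so socket_timeout_for works without boto3 installed.
    import boto3
    from botocore.config import Config

    config = Config(
        read_timeout=socket_timeout_for(processing_seconds),
        # Assumption: retries are disabled so a slow request is not re-sent.
        retries={"max_attempts": 0},
    )
    return boto3.client(
        "sagemaker-runtime", region_name=region_name, config=config
    )
```

You would then call `make_runtime_client().invoke_endpoint(EndpointName=endpoint_name, ContentType=..., Body=...)` as usual. With a model that needs up to 60 seconds, the client waits 70 seconds before giving up.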

Also, if you're using AWS Lambda to invoke the SageMaker endpoint, check the timeout setting of your Lambda function. The default timeout for a new Lambda function is 3 seconds, but it can be extended up to 15 minutes.
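The CloudWatch log in the question has Lambda-style lines (INIT_START, REPORT, Billed Duration) and stops at about 10 seconds, so the function's timeout is probably set to 10 seconds. That is a guess from the log format, not something the question confirms. Assuming it is right, the timeout can be raised in the console or with boto3, roughly as below; the helper names are invented for this example and the function name is whatever your Lambda is called.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # 15 minutes, the Lambda upper limit


def checked_lambda_timeout(seconds):
    """Return the timeout as an int, rejecting values Lambda does not allow."""
    if not 1 <= seconds <= LAMBDA_MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout must be between 1 and {LAMBDA_MAX_TIMEOUT_SECONDS} seconds"
        )
    return int(seconds)


def raise_lambda_timeout(function_name, seconds):
    """Update the timeout of an existing Lambda function."""
    # Imported here so the validation helper works without boto3 installed.
    import boto3

    client = boto3.client("lambda")
    return client.update_function_configuration(
        FunctionName=function_name,
        Timeout=checked_lambda_timeout(seconds),
    )
```

For example, `raise_lambda_timeout("my-invoker-function", 70)` (a placeholder name) would leave enough room for a model that uses the full 60 seconds.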

Expert
answered 2 months ago
