1 Answer
I suspect the issue is related to the model's initialization or inference code in the serve.py script. Enable debug-level logging there so you can see which step (loading the model, preprocessing, or the prediction itself) is the bottleneck.
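As a rough illustration of what that logging could look like, here is a minimal sketch using only the Python standard library. The `load_model` and `predict` functions are hypothetical stand-ins, not the contents of your actual serve.py; the idea is simply to wrap each stage in a timing decorator and log the elapsed time at DEBUG level.

```python
import functools
import logging
import time

# Emit DEBUG messages with timestamps; container stdout/stderr is normally
# what ends up in the endpoint's logs.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("serve")


def log_duration(func):
    """Log how long each call to the wrapped function takes."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.debug("%s took %.3f s", func.__name__, elapsed)

    return wrapper


@log_duration
def load_model():
    # Hypothetical placeholder for the real model-loading code.
    time.sleep(0.01)
    return {"name": "dummy-model"}


@log_duration
def predict(model, payload):
    # Hypothetical placeholder for the real inference code.
    return {"model": model["name"], "echo": payload}
```

If the log shows `load_model` running on every request, or `predict` approaching the 60-second mark, that tells you where to look first.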
Also keep the endpoint's response-time limit in mind, as quoted from the linked source: "A customer's model containers must respond to requests within 60 seconds. The model itself can have a maximum processing time of 60 seconds before responding to invocations. If your model is going to take 50-60 seconds of processing time, the SDK socket timeout should be set to be 70 seconds." (Source)
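If you call the endpoint with boto3, the socket timeout can be raised through `botocore.config.Config`. The following is a sketch under some assumptions: the helper names, the 10-second margin (taken from the 60 to 70 second example in the quote), the connect timeout, and the choice to disable retries are all illustrative and not something the quoted guidance prescribes.

```python
def timeout_settings(expected_processing_seconds, margin_seconds=10):
    """Build botocore Config keyword arguments for a slow endpoint.

    The read timeout is set a little above the expected processing time,
    mirroring the "60 seconds of processing, 70 seconds of socket timeout"
    guidance. The margin is an assumption you may want to tune.
    """
    return {
        "read_timeout": expected_processing_seconds + margin_seconds,
        "connect_timeout": 10,
        # Assumption: avoid automatically re-sending a slow request.
        "retries": {"max_attempts": 0},
    }


def make_runtime_client(region_name, expected_processing_seconds=60):
    """Create a SageMaker runtime client with the longer socket timeout."""
    # Imported here so the helper above stays usable without boto3 installed.
    import boto3
    from botocore.config import Config

    config = Config(**timeout_settings(expected_processing_seconds))
    return boto3.client(
        "sagemaker-runtime", region_name=region_name, config=config
    )
```

A client created this way is then used as usual, for example `make_runtime_client("us-east-1").invoke_endpoint(...)`. Raising the client timeout does not lift the 60-second limit on the container itself.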
Also, if you're using AWS Lambda to invoke the SageMaker endpoint, check the timeout setting of your Lambda function. The default timeout for a new Lambda function is 3 seconds, but it can be extended up to 15 minutes.
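The Lambda timeout can be changed in the console or programmatically. As a hedged sketch, the snippet below uses boto3's `update_function_configuration`; the helper names and the example function name are hypothetical.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # 15 minutes


def validated_lambda_timeout(seconds):
    """Return the timeout if it is within Lambda's allowed range."""
    if not 1 <= seconds <= LAMBDA_MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"Lambda timeout must be between 1 and "
            f"{LAMBDA_MAX_TIMEOUT_SECONDS} seconds, got {seconds}"
        )
    return seconds


def raise_lambda_timeout(function_name, seconds):
    """Update the timeout of an existing Lambda function."""
    # Imported here so the validation helper works without boto3 installed.
    import boto3

    client = boto3.client("lambda")
    return client.update_function_configuration(
        FunctionName=function_name,
        Timeout=validated_lambda_timeout(seconds),
    )
```

For example, `raise_lambda_timeout("my-invoker-function", 90)` would give the function enough headroom to wait for an endpoint that takes up to 60 seconds to respond.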