Received server error (0) from model when invoking the model

Hi,

I was trying to deploy my inference model to an endpoint and got a ModelError when invoking it:

"ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from model with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again.". See Link: https://eu-north-1.console.aws.amazon.com/cloudwatch/home?region=eu-north-1#logEventViewer:group=/aws/sagemaker/Endpoints/SageMakerEndpointtts7"

I'm not sure what caused this issue, and I couldn't figure out how the latency metrics in CloudWatch would be useful in this case. Does anyone know how to approach solving it? It would also be great to know why this happens. Thanks in advance for any help!

2 Answers

For a real-time inference endpoint, the model container must respond to InvokeEndpoint requests within 60 seconds. In other words, the model itself can spend at most 60 seconds processing a request before it responds. You can check the model latency using CloudWatch metrics.

kraft
answered 9 months ago

Hello Lamia Mohamed

Real-time inference is ideal for online inference with low-latency or high-throughput requirements. Use real-time inference for a persistent and fully managed endpoint (REST API) that can handle sustained traffic, backed by the instance type of your choice. Real-time inference supports payload sizes up to 6 MB and processing times of up to 60 seconds.

This is also stated in the SageMaker documentation.

I can give a few suggestions based on this error: benchmark the model, and test both the model and the model container locally.

If the model container produces an inference within the 60-second timeout, it should be fine to run on SageMaker.
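As a rough illustration of the local test suggested above, here is a minimal sketch that times one request against a container running on your own machine. It assumes the container is already started locally and serves the standard SageMaker `/invocations` route on port 8080; the JSON payload, the content type, and the helper names `time_call`, `invoke_local`, and `main` are hypothetical and should be adapted to what your model actually expects.

```python
import json
import time
import urllib.request

# Response limit for real-time endpoints quoted in this thread.
TIMEOUT_SECONDS = 60


def time_call(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def invoke_local(payload, url="http://localhost:8080/invocations",
                 content_type="application/json"):
    """POST a JSON payload to a locally running container (assumed URL)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Wait longer than the SageMaker limit so slow responses are still measured.
    with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS * 2) as response:
        return response.read()


def main():
    sample = {"text": "hello"}  # hypothetical payload; use a realistic request
    _, elapsed = time_call(lambda: invoke_local(sample))
    verdict = "within" if elapsed < TIMEOUT_SECONDS else "over"
    print(f"Local inference took {elapsed:.1f} s ({verdict} the {TIMEOUT_SECONDS} s limit)")
```

Calling `main()` with the container running prints the measured time. Local hardware may differ from the endpoint's instance type, so treat the number as an indication rather than a guarantee.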

ModelLatency is helpful because SageMaker requires the container to respond within 60 seconds [1]: if you see ModelLatency at or above 60 seconds, that confirms the container isn't responding fast enough. At that point, you'll need to figure out why your container isn't running quickly enough. If it is a SageMaker-owned model, I would suggest looking into the inference logic and contacting AWS Support.

[1] https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-code-container-response
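The ModelLatency check described above can be sketched with boto3 as follows. This is a minimal example under stated assumptions: the endpoint name `SageMakerEndpointtts7` is taken from the log group in the question, the variant name `AllTraffic` is only a guess at a common default, and the helper names are hypothetical. ModelLatency is reported in microseconds, so the value is converted to seconds before comparing it with the 60-second limit.

```python
from datetime import datetime, timedelta, timezone

# Response limit for real-time endpoints quoted in this thread.
TIMEOUT_SECONDS = 60


def max_model_latency_seconds(cloudwatch, endpoint_name,
                              variant_name="AllTraffic", hours=3):
    """Return the largest ModelLatency datapoint in seconds, or None if no data."""
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/SageMaker",
        MetricName="ModelLatency",
        Dimensions=[
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},  # assumed variant name
        ],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=300,  # 5-minute buckets
        Statistics=["Maximum"],
    )
    datapoints = response.get("Datapoints", [])
    if not datapoints:
        return None
    # ModelLatency is reported in microseconds; convert to seconds.
    return max(point["Maximum"] for point in datapoints) / 1_000_000


def main():
    import boto3  # imported here so the helper above can be tested without AWS access

    cloudwatch = boto3.client("cloudwatch", region_name="eu-north-1")
    latency = max_model_latency_seconds(cloudwatch, "SageMakerEndpointtts7")
    if latency is None:
        print("No ModelLatency datapoints in the selected window.")
    elif latency >= TIMEOUT_SECONDS:
        print(f"Max ModelLatency {latency:.1f} s: the container is hitting the limit.")
    else:
        print(f"Max ModelLatency {latency:.1f} s: below the {TIMEOUT_SECONDS} s limit.")
```

Calling `main()` with valid AWS credentials prints the worst latency seen in the last three hours. If nothing is returned, check that the variant name matches the one configured on your endpoint.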

AWS
SUPPORT ENGINEER
answered 9 months ago
