SageMaker endpoint running but constantly restarting

I have deployed a model to a SageMaker endpoint using BentoML/BentoCTL, a tool for building APIs and containerizing models. To test, I use curl with a JSON payload to make a request. When I run the generated Docker container on my local machine, I can invoke it successfully and get responses back, so I don't think the problem is in the Docker image.

When I deploy to SageMaker, I receive {"message":"Service Unavailable"} in response to my curl request. I can see the endpoint running in the SageMaker Endpoints dashboard. In the CloudWatch logs, it appears that the endpoint is constantly restarting: messages that are printed at startup (e.g. TensorFlow loading messages) are written to the log over and over.

I thought that this might be due to using an instance type with low memory (t2.medium) so I switched to m5.4xlarge as a test, but the result is the same.

What can I do? How can I determine what's causing the endless restarts?
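
One check that may help narrow this down, offered as a sketch rather than a diagnosis: SageMaker hosting expects the container to answer GET /ping and POST /invocations on port 8080, so a container that responds to a custom curl path locally could still fail those probes. The helper below only reports the HTTP status of a request; the function name `probe` and the base URL used in the usage note are assumptions for illustration, not part of BentoML or SageMaker.

```python
import urllib.error
import urllib.request


def probe(base_url, path="/ping", data=None, timeout=2.0):
    """Return the HTTP status code for one request, or None if no response arrives.

    Passing `data` (bytes) turns the request into a POST with a JSON content type.
    """
    headers = {"Content-Type": "application/json"} if data is not None else {}
    request = urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or dropped connection.
        return None
```

With the container running locally and port 8080 published, `probe("http://localhost:8080")` checks the health route and `probe("http://localhost:8080", "/invocations", data=b"{}")` checks the inference route. The `{}` payload is a placeholder; the real request body depends on the model. A result other than 200 from `/ping` would be worth investigating before looking elsewhere.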

2 Answers

What do you mean by "restart"? Do you mean the endpoint is in the "Updating" state? Do you have an autoscaling policy attached to the endpoint? Do you see any errors in the CloudWatch logs?
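
These three questions can also be checked programmatically. The following is a minimal sketch, assuming boto3 is installed and credentials are configured, and assuming the endpoint logs live in the usual `/aws/sagemaker/Endpoints/<endpoint-name>` log group. The helper names, the error keyword list, and the startup marker are hypothetical choices for illustration.

```python
import re

# Hypothetical keyword list; adjust it to what the container actually logs.
ERROR_PATTERN = re.compile(
    r"\b(error|exception|traceback|killed|out of memory|oom)\b", re.IGNORECASE
)


def find_suspect_lines(messages, pattern=ERROR_PATTERN):
    """Return the log messages that match the error pattern, in order."""
    return [m for m in messages if pattern.search(m)]


def count_startups(messages, marker):
    """Count how many messages contain a startup marker string."""
    return sum(1 for m in messages if marker in m)


def fetch_endpoint_log_messages(endpoint_name, region_name=None, limit=200):
    """Fetch up to `limit` log messages for the endpoint from CloudWatch Logs."""
    import boto3  # imported lazily so the pure helpers work without AWS access

    logs = boto3.client("logs", region_name=region_name)
    response = logs.filter_log_events(
        logGroupName=f"/aws/sagemaker/Endpoints/{endpoint_name}", limit=limit
    )
    return [event["message"] for event in response.get("events", [])]


def describe_endpoint_state(endpoint_name, region_name=None):
    """Return (endpoint status, names of scaling policies attached to it)."""
    import boto3

    sagemaker = boto3.client("sagemaker", region_name=region_name)
    status = sagemaker.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]

    autoscaling = boto3.client("application-autoscaling", region_name=region_name)
    policies = autoscaling.describe_scaling_policies(ServiceNamespace="sagemaker")
    attached = [
        p["PolicyName"]
        for p in policies["ScalingPolicies"]
        if p["ResourceId"].startswith(f"endpoint/{endpoint_name}/")
    ]
    return status, attached
```

A status of `Updating` or a non-empty policy list from `describe_endpoint_state` would answer the first two questions. For the third, passing the result of `fetch_endpoint_log_messages` to `find_suspect_lines` surfaces candidate error lines, and `count_startups` with a string from the repeated startup output gives a rough count of restarts in the fetched window.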

AWS
Answered 2 years ago

AWS
Answered 2 years ago
