SageMaker endpoint running but constantly restarting


I have deployed a model to a SageMaker endpoint using BentoML/bentoctl, a tool for building APIs and containerizing models. To test, I use curl with a JSON payload to make a request. When I run the generated Docker container on my local machine, I can invoke it successfully and get responses back, so I don't think the problem is in the Docker image.
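One thing a local curl test of the prediction route may not cover is the health check. SageMaker hosting expects the container to answer `GET /ping` on port 8080 with HTTP 200, in addition to serving `POST /invocations`. The following is a minimal sketch for checking that locally; the function name `check_ping` and the base URL are assumptions for illustration, and it presumes the container was started with port 8080 published.

```python
import urllib.error
import urllib.request


def check_ping(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/ping answers with HTTP 200.

    `base_url` is hypothetical, e.g. "http://localhost:8080" for a
    container started locally with port 8080 published.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, a timeout, or an HTTP error status
        # are all treated as an unhealthy container.
        return False
```

Calling `check_ping("http://localhost:8080")` against the locally running container should return `True` if the health route is served; a `False` here would be one possible explanation for a container that SageMaker keeps replacing.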

When I deploy to SageMaker, I receive the message {"message":"Service Unavailable"} in response to my curl request. I can see the endpoint running in the SageMaker/Endpoints dashboard. Looking at the CloudWatch logs, it appears that the endpoint is constantly restarting: messages that are printed at startup (e.g. TensorFlow loading messages) are written to the log over and over.

I thought that this might be due to using an instance type with low memory (t2.medium) so I switched to m5.4xlarge as a test, but the result is the same.

What can I do? How can I determine what's causing the endless restarts?

2 Answers

What do you mean by "restart"? Does the endpoint go into the "Updating" status? Do you have an autoscaling policy attached to the endpoint? Do you see any errors in the CloudWatch logs?
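To look for errors around each restart, one option is to export the endpoint's log lines (they normally land in the CloudWatch log group `/aws/sagemaker/Endpoints/<endpoint-name>`) and inspect what was logged just before every repeat of a startup message. Below is a minimal sketch of that idea; the helper name `lines_before_restarts` and the sample log lines are made up for illustration and are not real SageMaker or BentoML output.

```python
from typing import List


def lines_before_restarts(
    log_lines: List[str], startup_marker: str, context: int = 3
) -> List[List[str]]:
    """For each repeat of the startup marker, return the lines logged just before it."""
    starts = [i for i, line in enumerate(log_lines) if startup_marker in line]
    # The first occurrence is the normal boot; every later one suggests a restart.
    return [log_lines[max(0, i - context):i] for i in starts[1:]]


# Hypothetical log excerpt, for illustration only.
sample = [
    "Loading TensorFlow model",
    "Model loaded",
    "worker timeout",
    "Loading TensorFlow model",
    "Model loaded",
    "worker timeout",
    "Loading TensorFlow model",
]

for chunk in lines_before_restarts(sample, "Loading TensorFlow model", context=2):
    print(chunk)
```

If the same message shows up before every restart (an out-of-memory kill, a worker timeout, a failed health check), that is usually the place to start digging.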

AWS
answered 2 years ago

AWS
answered 2 years ago
