How do I restart a SageMaker endpoint?

Sometimes, when a "CUDA error: device-side assert triggered" error occurs, for example because of an invalid input, the SageMaker endpoint stops working, even for later valid inputs. A Stack Overflow answer says that restarting is the only way to recover from this error. However, I can't find a way to simply restart a SageMaker endpoint, so all I have been doing is deleting the existing endpoint and creating a new one. This takes a lot of time. Is there any way to simply restart the machine behind the SageMaker endpoint?

Asked 10 months ago, 978 views
2 answers

Hi, did you try updating your endpoint with a new endpoint configuration via the update-endpoint CLI command? https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/update-endpoint.html

It says:

Deploys the new EndpointConfig specified in the request, switches to using newly created 
endpoint, and then deletes resources provisioned for the endpoint using the previous 
EndpointConfig (there is no availability loss).

You may also want to automate detection of endpoint status changes via EventBridge and link the rule to a Lambda function that performs the endpoint update, to minimize your downtime. See https://docs.aws.amazon.com/sagemaker/latest/dg/automating-sagemaker-with-eventbridge.html#eventbridge-deployment-state
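As a rough illustration of the EventBridge wiring, the sketch below creates a rule that matches SageMaker endpoint state-change events and targets a Lambda function. The function name create_endpoint_state_rule, the rule name, and the Lambda ARN are hypothetical placeholders, and the Lambda that performs the update is assumed to exist already. Note that this only helps if the failure actually shows up as an endpoint status change, which this thread does not confirm for the CUDA error.

```shell
# Sketch only: route SageMaker endpoint state-change events to a Lambda function.
# Arguments (both hypothetical): $1 = rule name, $2 = ARN of an existing Lambda.
create_endpoint_state_rule() {
  local rule_name="$1"
  local lambda_arn="$2"
  local rule_arn

  # Match every endpoint state-change event; the Lambda decides how to react.
  rule_arn=$(aws events put-rule \
    --name "$rule_name" \
    --event-pattern '{"source":["aws.sagemaker"],"detail-type":["SageMaker Endpoint State Change"]}' \
    --query 'RuleArn' --output text) || return 1

  # Allow EventBridge to invoke the function on behalf of this rule.
  aws lambda add-permission \
    --function-name "$lambda_arn" \
    --statement-id "${rule_name}-invoke" \
    --action 'lambda:InvokeFunction' \
    --principal events.amazonaws.com \
    --source-arn "$rule_arn" || return 1

  # Attach the function as the rule's target.
  aws events put-targets \
    --rule "$rule_name" \
    --targets "Id=1,Arn=$lambda_arn"
}
```

It could be called as `create_endpoint_state_rule my-endpoint-rule <lambda-arn>`. The rule deliberately has no status filter, so the Lambda would need to inspect the event detail itself.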

Best, Didier

AWS Expert, answered 10 months ago

It looks like you can't restart an endpoint by calling update-endpoint with the same endpoint configuration, because the endpoint's current configuration is "in use":

aws sagemaker update-endpoint --endpoint-name my-endpoint --endpoint-config-name my-endpoint-config

An error occurred (ValidationException) when calling the UpdateEndpoint operation: Cannot update endpoint "my-endpoint" with the currently in use endpoint configuration "my-endpoint-config.
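One possible way around this validation error is to clone the current endpoint configuration under a new name and update the endpoint to the clone, which should trigger a fresh deployment as described in the update-endpoint documentation quoted earlier. The sketch below is untested against a live endpoint. The function name restart_endpoint and the "-restart-<timestamp>" naming scheme are hypothetical, and only ProductionVariants is copied, so settings such as data capture or a KMS key would need to be carried over separately.

```shell
# Sketch only: "restart" an endpoint by redeploying a copy of its current config.
# Argument: $1 = endpoint name.
restart_endpoint() {
  local endpoint_name="$1"
  local current_config new_config variants

  # Look up the endpoint configuration currently in use.
  current_config=$(aws sagemaker describe-endpoint \
    --endpoint-name "$endpoint_name" \
    --query 'EndpointConfigName' --output text) || return 1

  # Fetch its production variants as JSON so they can be reused unchanged.
  variants=$(aws sagemaker describe-endpoint-config \
    --endpoint-config-name "$current_config" \
    --query 'ProductionVariants' --output json) || return 1

  # A new name is required; the timestamp suffix is an arbitrary choice.
  new_config="${endpoint_name}-restart-$(date +%s)"

  aws sagemaker create-endpoint-config \
    --endpoint-config-name "$new_config" \
    --production-variants "$variants" || return 1

  # Updating to a differently named config avoids the "in use" error.
  aws sagemaker update-endpoint \
    --endpoint-name "$endpoint_name" \
    --endpoint-config-name "$new_config" || return 1

  # Block until the endpoint reports InService again.
  aws sagemaker wait endpoint-in-service --endpoint-name "$endpoint_name"
}
```

Usage would be `restart_endpoint my-endpoint`. The old endpoint configuration is left in place, so it can be deleted afterwards once the update has succeeded.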
Answered 8 months ago
