Cost of autoscaling an Amazon SageMaker endpoint to zero

I want to use an Amazon SageMaker endpoint for a custom classification model. The endpoint only needs to handle sporadic input (say, a few times a week). For this purpose I want to use autoscaling that scales the number of instances down to 0 when the endpoint is not in use.

Are there any costs associated with having an endpoint with 0 instances?

Thanks!

1 Answer
Accepted Answer

You don't pay any compute costs while the endpoint is scaled down to 0 instances. But I think you can design it better. There are a few other options in SageMaker (assuming you are using a real-time endpoint):

  1. Try SageMaker Serverless Inference instead. It is purely serverless, so you pay only when the endpoint is serving inference requests. I think that would fit your requirement better.
  2. You could also use Lambda, which will reduce your hosting costs, but you have to do more work setting up the inference stack yourself.
  3. There is also SageMaker Asynchronous Inference, but it is mostly useful for inference that takes a long time to process each request. I mention it because it also supports scaling to 0 when no traffic is coming.
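
As a rough illustration of options 1 and 3, here is a minimal sketch of the request payloads involved. It assumes boto3 is used; the helper names, the "AllTraffic" variant name, and the default memory and concurrency values are made up for this example and should be adapted to your model. The two builder functions only assemble the arguments, and the two `send_*` functions (which need boto3 and AWS credentials) pass them to `create_endpoint_config` and `register_scalable_target`.

```python
# Sketch only: helper names and default values are illustrative assumptions.

# Memory sizes accepted by SageMaker Serverless Inference, in MB.
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_endpoint_config(config_name, model_name,
                               memory_mb=2048, max_concurrency=5):
    """Build kwargs for SageMaker create_endpoint_config (serverless variant)."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # assumed variant name
                "ModelName": model_name,
                # ServerlessConfig replaces InstanceType / InitialInstanceCount.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def async_scale_to_zero_target(endpoint_name, variant_name="AllTraffic",
                               max_instances=1):
    """Build kwargs for Application Auto Scaling register_scalable_target.

    MinCapacity=0 is what allows an asynchronous endpoint to drop to zero
    instances when there is no traffic.
    """
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": 0,
        "MaxCapacity": max_instances,
    }


def send_serverless_config(kwargs):
    """Create the endpoint config; requires boto3 and AWS credentials."""
    import boto3
    return boto3.client("sagemaker").create_endpoint_config(**kwargs)


def send_scalable_target(kwargs):
    """Register the scalable target; requires boto3 and AWS credentials."""
    import boto3
    client = boto3.client("application-autoscaling")
    return client.register_scalable_target(**kwargs)
```

For example, `serverless_endpoint_config("demo-config", "demo-model")` returns a payload with a single serverless variant. For the asynchronous case, registering the target with a minimum of 0 is only the first step: a scaling policy still has to be attached so the endpoint scales back up when requests arrive.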
AWS
Expert
answered 2 years ago
  • Thanks for your answer! As I understand it, I could also use SageMaker Batch Transform (given that I have the inputs saved in an S3 bucket), and that would save my predictions automatically to an S3 output bucket. Do you think that inference type could also be useful for this use case?