Cost of autoscaling an Amazon SageMaker endpoint to zero


I want to use an Amazon SageMaker endpoint for a custom classification model. The endpoint only needs to handle sporadic input (say, a few requests a week). For this purpose I want to use autoscaling that scales the number of instances down to 0 when the endpoint is not in use.

Are there any costs associated with having an endpoint with 0 instances?

Thanks!

1 Answer
Accepted Answer

You don't pay any compute costs while the endpoint is scaled down to 0 instances. However, I think there are better designs. SageMaker offers a few other options (assuming you are currently using a real-time endpoint):

  1. Try SageMaker Serverless Inference instead. It is fully serverless, so you pay only while the endpoint is serving inference requests. I think that would fit your requirement better.
  2. You could also use Lambda, which would reduce your hosting costs, but you would have to set up the inference stack yourself.
  3. There is also SageMaker Asynchronous Inference, but it is mostly useful for requests that take a long time to process. I mention it because it also supports scaling to 0 instances when no traffic is coming in.
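
As a rough illustration of options 1 and 3, the sketch below builds the request parameters for a serverless endpoint configuration and for an Application Auto Scaling target with a minimum capacity of 0. The helper names, the default memory size and concurrency, and the variant name "AllTraffic" are assumptions chosen for the example, not values from this thread. The helpers only build dictionaries; the boto3 calls that would consume them are shown as comments.

```python
from typing import Any, Dict

# Memory sizes accepted by Serverless Inference (1 GB steps up to 6 GB).
VALID_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_endpoint_config(
    config_name: str,
    model_name: str,
    memory_mb: int = 2048,      # assumed default, tune for your model
    max_concurrency: int = 5,   # assumed default for sporadic traffic
) -> Dict[str, Any]:
    """Build kwargs for sagemaker.create_endpoint_config (option 1)."""
    if memory_mb not in VALID_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {VALID_MEMORY_MB}")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # hypothetical variant name
                "ModelName": model_name,
                # ServerlessConfig replaces InstanceType/InitialInstanceCount.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def async_scale_to_zero_target(
    endpoint_name: str,
    variant_name: str = "AllTraffic",  # hypothetical variant name
    max_instances: int = 1,
) -> Dict[str, Any]:
    """Build kwargs for application-autoscaling.register_scalable_target
    so that an asynchronous endpoint may scale in to 0 instances (option 3)."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": 0,
        "MaxCapacity": max_instances,
    }


# Usage (requires AWS credentials and an existing SageMaker model):
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(
#       **serverless_endpoint_config("my-config", "my-model"))
#   boto3.client("application-autoscaling").register_scalable_target(
#       **async_scale_to_zero_target("my-async-endpoint"))
```

A scaling policy (for example, one that tracks the asynchronous queue backlog) would still have to be attached to the scalable target before the endpoint actually scales in and out; that step is not shown here.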
AWS
Expert
Answered 2 years ago
  • Thanks for your answer! As I understand it, I could also use SageMaker Batch Transform (given that my inputs are saved in an S3 bucket), and it would automatically save my predictions to an S3 output bucket. Do you think that inference type could also be useful for this use case?
