Cost of autoscaling an Amazon SageMaker endpoint to zero


I want to use an Amazon SageMaker endpoint for a custom classification model. The endpoint only needs to handle sporadic input (say, a few times a week). For this purpose I want to use autoscaling that scales the number of instances down to 0 when the endpoint is not in use.

Are there any costs associated with having an endpoint with 0 instances?

Thanks!

1 Answer
Accepted Answer

You don't pay any compute costs while the endpoint is scaled down to 0 instances. But I think you can design it better. There are a few other options in SageMaker hosting (assuming you are currently using a real-time endpoint):

  1. Try SageMaker Serverless Inference instead. It is fully serverless, so you pay only while the endpoint is serving inference requests. I think that would fit your requirement better.
  2. You could also use AWS Lambda, which would reduce your hosting costs, but you would have to set up the inference stack yourself.
  3. There is also SageMaker Asynchronous Inference, but it is mostly useful when each request takes a long time to process. I mention it because it also supports scaling to 0 instances when no traffic is coming in.
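
To make options 1 and 3 more concrete, here is a minimal sketch of the request parameters involved. It only builds plain dictionaries and does not call AWS. The helper names, the variant name "AllTraffic", and the default memory and concurrency values are assumptions chosen for illustration, not recommendations; the dictionary keys follow the boto3 `create_endpoint_config` and Application Auto Scaling `register_scalable_target` request shapes as I understand them.

```python
from typing import Any, Dict

# Memory sizes accepted for serverless endpoints (assumption: 1 GB steps, 1-6 GB).
ALLOWED_MEMORY_MB = {1024, 2048, 3072, 4096, 5120, 6144}


def serverless_endpoint_config(
    config_name: str,
    model_name: str,
    memory_mb: int = 2048,
    max_concurrency: int = 1,
) -> Dict[str, Any]:
    """Build kwargs for sagemaker.create_endpoint_config (option 1).

    A ServerlessConfig replaces the instance type and instance count,
    so there is no instance fleet to scale down.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {sorted(ALLOWED_MEMORY_MB)}")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # hypothetical variant name
                "ModelName": model_name,
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def async_scale_to_zero_target(
    endpoint_name: str,
    variant_name: str = "AllTraffic",
    max_instances: int = 1,
) -> Dict[str, Any]:
    """Build kwargs for application-autoscaling.register_scalable_target (option 3).

    MinCapacity=0 is what allows an asynchronous endpoint to drop to
    zero instances when its queue is empty.
    """
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": 0,
        "MaxCapacity": max_instances,
    }


if __name__ == "__main__":
    # Hypothetical resource names, printed only to show the request shapes.
    print(serverless_endpoint_config("my-serverless-config", "my-classifier"))
    print(async_scale_to_zero_target("my-async-endpoint"))
```

With credentials configured, these dictionaries could be passed as `boto3.client("sagemaker").create_endpoint_config(**cfg)` and `boto3.client("application-autoscaling").register_scalable_target(**target)`. Registering the target alone is not enough for option 3: a scaling policy is still needed to tell the endpoint when to scale back up from zero.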
AWS
EXPERT
answered 2 years ago
  • Thanks for your answer! As I understand it, I could also use SageMaker Batch Transform (given that my inputs are saved in an S3 bucket), and it would automatically save my predictions to an S3 output bucket. Do you think that inference type could also be useful for this use case?
