Spot instances for inference and SageMaker?


Is it possible to deploy Spot inf1 instances on SageMaker? We run an API 24/7, and keeping it up is costly, considering we only have 2 hours of peak load a day.

We don't shut those machines off because we can get random bursts of traffic during the day that CPU instances can't handle. Alternatively, we could deploy Spot EC2 Inf instances; however, I'm unsure how I would invoke them from API Gateway and Lambda. Does anybody have a tip or recommendation for our case?

Thanks!

1 Answer

You could integrate an EC2 Spot Fleet with the Application Auto Scaling service to scale Spot instances out and in as traffic changes. To scale down to 0 instances, you will need a queue that holds requests while the fleet scales from 0 to 1 or more instances: your application inserts each request into an SQS queue, where it waits until an instance is available to process it. See this page for more information on configuring Application Auto Scaling with Spot instances: https://docs.aws.amazon.com/autoscaling/application/userguide/services-that-can-integrate-ec2.html
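
As a rough illustration of the queueing step, here is a minimal Python sketch of how a Lambda function behind the API might place a request on the queue. The function names, the message layout (`request_id`, `payload`), and the queue URL are assumptions made for this example, not something from your setup; only `boto3.client("sqs").send_message` is the real SQS call.

```python
import json
import uuid


def build_sqs_message(queue_url, payload, request_id=None):
    """Build the keyword arguments for SQS send_message for one request.

    The body layout (request_id + payload) is a hypothetical convention
    chosen for this sketch.
    """
    request_id = request_id or str(uuid.uuid4())
    body = json.dumps({"request_id": request_id, "payload": payload})
    return {"QueueUrl": queue_url, "MessageBody": body}


def enqueue_request(queue_url, payload):
    """Send one inference request to the queue and return its SQS message ID.

    Needs boto3 and AWS credentials, so the import is done lazily to keep
    build_sqs_message usable without AWS access.
    """
    import boto3

    sqs = boto3.client("sqs")
    response = sqs.send_message(**build_sqs_message(queue_url, payload))
    return response["MessageId"]
```

A worker on each Spot instance would then poll the queue, run inference, and store the result somewhere the caller can fetch it by `request_id`; that part depends on your application and is not shown here.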

For the scaling policy, you can use the SQS queue length metric. Here is how to set a target tracking policy for Application Auto Scaling: https://docs.aws.amazon.com/autoscaling/application/userguide/create-target-tracking-policy-cli.html
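
The same target tracking setup can be sketched with boto3 instead of the CLI. This is only an outline under assumptions: the policy name, the capacity limits, and the helper names are invented for the example, and it takes `ApproximateNumberOfMessagesVisible` in the `AWS/SQS` namespace as the queue length metric. Whether the fleet scales out from exactly 0 quickly enough for your traffic is something to verify in testing.

```python
def build_target_tracking_config(queue_name, target_queue_length):
    """Target tracking configuration keyed on the visible-message count of a queue."""
    return {
        "TargetValue": float(target_queue_length),
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateNumberOfMessagesVisible",
            "Namespace": "AWS/SQS",
            "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
            "Statistic": "Average",
        },
    }


def apply_policy(spot_fleet_request_id, queue_name, target_queue_length,
                 min_capacity=0, max_capacity=4):
    """Register the Spot Fleet as a scalable target and attach the policy.

    Needs boto3 and AWS credentials; min/max capacity are placeholder values.
    """
    import boto3

    client = boto3.client("application-autoscaling")
    resource_id = f"spot-fleet-request/{spot_fleet_request_id}"
    dimension = "ec2:spot-fleet-request:TargetCapacity"

    client.register_scalable_target(
        ServiceNamespace="ec2",
        ResourceId=resource_id,
        ScalableDimension=dimension,
        MinCapacity=min_capacity,
        MaxCapacity=max_capacity,
    )
    client.put_scaling_policy(
        PolicyName="sqs-queue-length-target-tracking",  # hypothetical name
        ServiceNamespace="ec2",
        ResourceId=resource_id,
        ScalableDimension=dimension,
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration=build_target_tracking_config(
            queue_name, target_queue_length
        ),
    )
```

A raw queue length target ignores how many instances are already running; a backlog-per-instance metric is a common refinement if the simple version oscillates.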

AWS
Will_B
answered 2 years ago
Expert
reviewed 1 month ago
