Spot Instances for inference on SageMaker?

Is it possible to deploy Spot inf1 instances on SageMaker? We run an API 24/7, and it's costly to keep it up, considering we only have 2 hours of peak traffic a day.

We don't shut those machines off because we might get random bursts of traffic during the day that CPU instances can't handle. Alternatively, we could deploy Spot EC2 Inf instances; however, I'm unsure how I would invoke them from API Gateway and Lambda. Does anybody have a tip or recommendation for our case?

Thanks!

1 Answer

You could integrate an EC2 Spot Fleet with Application Auto Scaling to add or remove Spot Instances as traffic changes. To scale down to 0 instances, you will need a queue to hold requests while the fleet scales from 0 to 1 or more instances: your application inserts requests into an SQS queue, and they wait there until an instance is available. For more information on configuring Application Auto Scaling with Spot Instances, see: https://docs.aws.amazon.com/autoscaling/application/userguide/services-that-can-integrate-ec2.html
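As a rough sketch of the setup described above, assuming Python with boto3: the first helper builds the arguments for registering a Spot Fleet request as a scalable target with a minimum capacity of 0, and the second builds the SQS message that a Lambda function behind the API might send. The function names, the fleet request ID, and the queue URL are placeholders of mine, not values from your account, and whether your fleet type supports scaling to 0 is worth checking against the linked documentation.

```python
import json


def scalable_target_params(spot_fleet_request_id, max_capacity):
    """Build arguments for Application Auto Scaling's register_scalable_target."""
    return {
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{spot_fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
        "MinCapacity": 0,  # lets the fleet scale in to zero instances
        "MaxCapacity": max_capacity,
    }


def queue_message_params(queue_url, request_payload):
    """Build arguments for SQS send_message (e.g. called from the Lambda)."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(request_payload),
    }


def register_fleet(spot_fleet_request_id, max_capacity):
    """Apply the scalable target; needs boto3 and AWS credentials."""
    import boto3  # imported here so the builders above work without AWS access

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(
        **scalable_target_params(spot_fleet_request_id, max_capacity)
    )


def enqueue_request(queue_url, request_payload):
    """Send one inference request to the queue; needs boto3 and credentials."""
    import boto3

    sqs = boto3.client("sqs")
    return sqs.send_message(**queue_message_params(queue_url, request_payload))
```

Note that with a queue in between, the API call becomes asynchronous: the caller gets an acknowledgement when the message is enqueued and has to fetch the inference result separately.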

For the scaling policy, you can track the SQS queue length metric. This page shows how to create a target tracking policy for Application Auto Scaling: https://docs.aws.amazon.com/autoscaling/application/userguide/create-target-tracking-policy-cli.html
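The linked page uses the CLI; a roughly equivalent sketch with boto3 might look like the following. It assumes the queue length is read from the CloudWatch metric ApproximateNumberOfMessagesVisible in the AWS/SQS namespace. The policy name, cooldown values, and function names are illustrative assumptions, not recommendations.

```python
def queue_tracking_policy_params(spot_fleet_request_id, queue_name, target_queue_length):
    """Build arguments for Application Auto Scaling's put_scaling_policy."""
    return {
        "PolicyName": "sqs-queue-length-target-tracking",  # hypothetical name
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{spot_fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Queue length the policy tries to hold the metric at.
            "TargetValue": float(target_queue_length),
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateNumberOfMessagesVisible",
                "Namespace": "AWS/SQS",
                "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
                "Statistic": "Average",
            },
            "ScaleOutCooldown": 60,   # seconds; illustrative value
            "ScaleInCooldown": 300,   # seconds; illustrative value
        },
    }


def put_queue_tracking_policy(spot_fleet_request_id, queue_name, target_queue_length):
    """Create or update the policy; needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without AWS access

    client = boto3.client("application-autoscaling")
    return client.put_scaling_policy(
        **queue_tracking_policy_params(
            spot_fleet_request_id, queue_name, target_queue_length
        )
    )
```

The scalable target for the same ResourceId has to be registered before the policy is attached.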

Will_B (AWS), answered 2 years ago
Reviewed by an expert 1 month ago
