Spot instances for inference and SageMaker?


Is it possible to deploy Spot Inf1 instances on SageMaker? We run an API 24/7, and keeping it up is costly, considering we only have 2 hours of peak traffic a day.

We don't shut those machines off because we might get random bursts of traffic during the day that CPU instances can't handle. Alternatively, we could deploy Spot EC2 Inf instances; however, I'm unsure how I would invoke them from API Gateway and Lambda. Does anybody have a tip or recommendation for our case?

Thanks!

1 Answer

You could integrate an EC2 Spot Fleet with the Application Auto Scaling service to scale Spot Instances out or in as traffic arrives. To scale down to 0 instances, you will need a queue to hold requests while the fleet scales from 0 to 1 or more instances. Your application would then put requests on an SQS queue and wait for an instance to become available. For more information on configuring Application Auto Scaling with Spot Instances, see: https://docs.aws.amazon.com/autoscaling/application/userguide/services-that-can-integrate-ec2.html
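
As a rough sketch of the setup described above, the Python below builds the parameters for registering a Spot Fleet request as a scalable target with a minimum capacity of 0, and shows how the application might put a request on the queue. The boto3 calls `register_scalable_target` and `send_message` are real; the helper names, the fleet request ID, the queue URL, and the payload shape are hypothetical placeholders, not anything from the original setup.

```python
import json


def scalable_target_params(spot_fleet_request_id, max_capacity, min_capacity=0):
    """Build kwargs for Application Auto Scaling's register_scalable_target.

    spot_fleet_request_id is a placeholder such as "sfr-EXAMPLE" (hypothetical).
    min_capacity=0 lets the fleet scale all the way in when idle.
    """
    return {
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{spot_fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def enqueue_request(sqs_client, queue_url, payload):
    """Put one inference request on the SQS queue as a JSON message.

    The payload shape is an assumption; a worker on the Spot instance
    would poll the queue and process the message.
    """
    return sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(payload),
    )


# Usage (requires boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   aas = boto3.client("application-autoscaling")
#   aas.register_scalable_target(**scalable_target_params("sfr-EXAMPLE", max_capacity=4))
#   sqs = boto3.client("sqs")
#   enqueue_request(sqs, "https://sqs.<region>.amazonaws.com/<account>/<queue>", {"input": "..."})
```

With this shape, a Lambda function behind the API would call something like `enqueue_request` instead of invoking an instance directly, which is what allows the fleet to sit at 0 instances between bursts.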

To configure the scaling policy, you can use the SQS queue length metric. Here is how to set a target tracking policy for Application Auto Scaling: https://docs.aws.amazon.com/autoscaling/application/userguide/create-target-tracking-policy-cli.html
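
A target tracking policy on queue length could look roughly like the sketch below. It builds the parameters for boto3's `put_scaling_policy` using a customized metric specification that points at the SQS `ApproximateNumberOfMessagesVisible` metric in the `AWS/SQS` namespace. The helper name, policy name, target value, and cooldowns are illustrative assumptions and would need tuning; whether a raw queue-length target behaves well for a given workload (as opposed to a backlog-per-instance metric) is also something to verify.

```python
def queue_length_policy_params(spot_fleet_request_id, queue_name, target_messages=10.0):
    """Build kwargs for Application Auto Scaling's put_scaling_policy.

    Tracks the number of visible messages in the named SQS queue.
    target_messages and both cooldowns are assumed values, not recommendations.
    """
    return {
        "PolicyName": "sqs-queue-length-target-tracking",  # hypothetical name
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{spot_fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_messages,
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateNumberOfMessagesVisible",
                "Namespace": "AWS/SQS",
                "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
                "Statistic": "Average",
            },
            "ScaleOutCooldown": 60,   # seconds, assumed
            "ScaleInCooldown": 300,   # seconds, assumed
        },
    }


# Usage (requires boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   aas = boto3.client("application-autoscaling")
#   aas.put_scaling_policy(**queue_length_policy_params("sfr-EXAMPLE", "inference-requests"))
```

The scalable target for the same `ResourceId` has to be registered before the policy is attached.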

AWS
Will_B
answered 2 years ago
EXPERT
reviewed a month ago
