Spot instances for inference and SageMaker?
Is it possible to deploy Spot inf1 instances on SageMaker? We run an API 24/7, and keeping it up is costly considering we only have about 2 hours of peak load a day.
We don't shut those machines off because we can get random bursts of traffic during the day that CPU instances can't handle. Alternatively, we could deploy Spot EC2 Inf instances; however, I'm unsure how I would invoke them from API Gateway and Lambda. Does anybody have a tip or recommendation for our case?
You could integrate an EC2 Spot Fleet with the Application Auto Scaling service to add or remove Spot instances as traffic changes. To scale down to 0 instances, you will need a queue to hold requests while the fleet spins up from 0 to 1 or more instances. Your application would then insert requests into an SQS queue and wait for an instance to become available. See this link for more information on configuring Application Auto Scaling with Spot instances: https://docs.aws.amazon.com/autoscaling/application/userguide/services-that-can-integrate-ec2.html
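As a rough sketch of the queueing step, a Lambda function behind API Gateway could accept the request and put it on the SQS queue instead of calling the instance directly. The environment variable INFERENCE_QUEUE_URL, the request_id field, and the 202 response shape are assumptions made for illustration, not part of any AWS-prescribed pattern; the API Gateway proxy integration event format is assumed as well.

```python
import json
import os
import uuid


def build_message(payload: dict) -> dict:
    """Wrap an inference request with an ID so a result can be matched to it later."""
    return {"request_id": str(uuid.uuid4()), "payload": payload}


def handler(event, context, sqs_client=None):
    """Lambda entry point; assumes an API Gateway proxy integration event."""
    if sqs_client is None:
        import boto3  # imported lazily so the module also loads without boto3 installed

        sqs_client = boto3.client("sqs")

    message = build_message(json.loads(event.get("body") or "{}"))
    sqs_client.send_message(
        QueueUrl=os.environ["INFERENCE_QUEUE_URL"],  # hypothetical variable name
        MessageBody=json.dumps(message),
    )
    # 202 Accepted: the request is queued, not yet processed by an instance.
    return {
        "statusCode": 202,
        "body": json.dumps({"request_id": message["request_id"]}),
    }
```

A worker on each Spot instance would then poll the queue, run inference, and store the result somewhere the caller can fetch it by request_id. How results get back to the caller is left open here.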
For the scaling policy, you can use the SQS queue length metric. Here is how to set a target tracking policy with Application Auto Scaling: https://docs.aws.amazon.com/autoscaling/application/userguide/create-target-tracking-policy-cli.html
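The linked page uses the AWS CLI; the same idea can be sketched with boto3. This is a minimal example under stated assumptions: the policy name, the target backlog of 10 messages, the capacity limits, and the cooldowns are placeholder values, and the Spot Fleet request ID must be your own. Whether target tracking alone brings a fleet back up from 0 capacity is worth verifying for your setup; treat this as a starting point.

```python
def build_target_tracking_config(queue_name: str, target_backlog: float) -> dict:
    """Target tracking configuration keyed on the number of visible SQS messages."""
    return {
        "TargetValue": target_backlog,
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateNumberOfMessagesVisible",
            "Namespace": "AWS/SQS",
            "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
            "Statistic": "Average",
        },
        "ScaleOutCooldown": 60,   # seconds; placeholder value
        "ScaleInCooldown": 300,   # seconds; placeholder value
    }


def apply_policy(fleet_request_id, queue_name, min_capacity=0, max_capacity=4,
                 target_backlog=10.0, client=None):
    """Register the Spot Fleet as a scalable target and attach the policy."""
    if client is None:
        import boto3  # imported lazily so the module also loads without boto3 installed

        client = boto3.client("application-autoscaling")

    target = {
        "ServiceNamespace": "ec2",
        "ResourceId": f"spot-fleet-request/{fleet_request_id}",
        "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
    }
    client.register_scalable_target(
        MinCapacity=min_capacity, MaxCapacity=max_capacity, **target
    )
    client.put_scaling_policy(
        PolicyName="sqs-backlog-target-tracking",  # hypothetical name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration=build_target_tracking_config(
            queue_name, target_backlog
        ),
        **target,
    )
```

Calling apply_policy("sfr-...", "my-queue") with your own fleet request ID and queue name would register the fleet and attach the policy. Raw queue length does not account for how many instances are already running, so a backlog-per-instance metric may track load more closely.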