ECS + Spot Integration - Multiple ASGs vs SpotFleet
The customer is deploying a workload (stateless web services) on ECS and would like to use Spot Instances to reduce cost.
They want an ECS cluster that keeps a base number of On-Demand Instances and then scales out with demand, adding Spot Instances first and falling back to On-Demand Instances if the Spot request fails.
What is the recommended way to achieve this objective?
One option I can see is to use multiple Auto Scaling groups (ASGs), one for On-Demand and one for Spot Instances, with alarms based on service or cluster utilization.
I would also like to explore using Spot Fleet with ECS directly, but I don't see a way to make sure that On-Demand Instances are added to the cluster if the Spot requests fail. Is there any way to achieve this with Spot Fleet?
EC2 Spot Fleet supports a static amount of On-Demand (OD) capacity alongside a scalable target capacity for Spot. They should start there.
The customer should focus on diversifying their Spot requests across AZs and instance types/sizes (super simple with Spot Fleet and containers) to reduce the likelihood of not being able to provision Spot capacity. Keep in mind that Spot and OD pull from the same capacity pool. If they cannot get a Spot Instance in a given pool, they very well might not be able to get an OD instance in that pool either. The reaction to not getting a Spot Instance should be to try a different pool: either another AZ for the same instance type/size, or a different instance type/size in the same AZ. Spot Fleet handles this automatically when the request specifies multiple launch specifications or launch templates.
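To make the diversification idea concrete, here is a minimal sketch in Python that builds a Spot Fleet request with one launch specification per instance type and subnet combination, so each combination is a separate capacity pool. The function name `build_spot_fleet_config`, the AMI ID, the subnet IDs, the role ARN, and the instance types are all hypothetical placeholders; the dictionary keys follow the `SpotFleetRequestConfig` shape as I understand it from the EC2 API, so check them against the current documentation before relying on this.

```python
from itertools import product


def build_spot_fleet_config(fleet_role_arn, ami_id, instance_types,
                            subnet_ids, target_capacity, on_demand_base):
    """Build a diversified Spot Fleet request config (illustrative only)."""
    # One launch specification per (instance type, subnet) pair.
    # Each pair is a distinct capacity pool the fleet can draw from.
    launch_specs = [
        {"ImageId": ami_id, "InstanceType": itype, "SubnetId": subnet}
        for itype, subnet in product(instance_types, subnet_ids)
    ]
    return {
        "IamFleetRole": fleet_role_arn,
        "Type": "maintain",                  # keep replacing interrupted capacity
        "AllocationStrategy": "diversified",  # spread Spot across the pools
        "TargetCapacity": target_capacity,    # assumed to be the total capacity
        "OnDemandTargetCapacity": on_demand_base,  # static OD portion
        "LaunchSpecifications": launch_specs,
    }


# All identifiers below are made-up placeholders.
config = build_spot_fleet_config(
    fleet_role_arn="arn:aws:iam::123456789012:role/example-spot-fleet-role",
    ami_id="ami-0123456789abcdef0",
    instance_types=["m5.large", "c5.large", "r5.large"],
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    target_capacity=10,
    on_demand_base=2,
)
print(len(config["LaunchSpecifications"]), "capacity pools requested")
```

With boto3 installed and credentials configured, a config like this could be submitted with `boto3.client("ec2").request_spot_fleet(SpotFleetRequestConfig=config)`. The instances would still need ECS agent configuration (for example via user data) to join the cluster, which this sketch leaves out.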
If they still want an ASG with OD as a "backup", they could keep an ASG that scales on an alarm that triggers when the Spot Fleet's FulfilledCapacity stays below its TargetCapacity for a given period of time, and join this ASG to the same ECS cluster. The critical part is that the OD ASG pulls from a different pool of capacity than the ones defined in the Spot Fleet.
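As a rough illustration of that alarm, the sketch below builds the arguments for a CloudWatch metric math alarm that fires when TargetCapacity minus FulfilledCapacity stays above zero. The function name `build_capacity_gap_alarm`, the fleet request ID, and the scaling policy ARN are hypothetical; the `AWS/EC2Spot` namespace and `FleetRequestId` dimension are what I believe Spot Fleet publishes, but treat them as assumptions and confirm them in the CloudWatch console. The period and number of evaluation periods are arbitrary example values.

```python
def build_capacity_gap_alarm(fleet_request_id, scale_out_policy_arn,
                             period_seconds=60, evaluation_periods=5):
    """Build put_metric_alarm arguments for an unfulfilled-capacity alarm."""

    def fleet_metric(query_id, metric_name):
        # Helper: one Spot Fleet metric, used only as input to the expression.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2Spot",  # assumed namespace
                    "MetricName": metric_name,
                    "Dimensions": [
                        {"Name": "FleetRequestId", "Value": fleet_request_id}
                    ],
                },
                "Period": period_seconds,
                "Stat": "Average",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"spot-fleet-capacity-gap-{fleet_request_id}",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 0,
        "EvaluationPeriods": evaluation_periods,
        "DatapointsToAlarm": evaluation_periods,  # gap must persist throughout
        "TreatMissingData": "notBreaching",
        "Metrics": [
            fleet_metric("target", "TargetCapacity"),
            fleet_metric("fulfilled", "FulfilledCapacity"),
            {
                "Id": "gap",
                "Expression": "target - fulfilled",
                "Label": "UnfulfilledCapacity",
                "ReturnData": True,  # the alarm evaluates this expression
            },
        ],
        # ARN of the OD ASG's scale-out policy (placeholder value below).
        "AlarmActions": [scale_out_policy_arn],
    }


alarm_args = build_capacity_gap_alarm(
    fleet_request_id="sfr-00000000-0000-0000-0000-000000000000",
    scale_out_policy_arn="arn:aws:autoscaling:region:123456789012:scalingPolicy:example",
)
print(alarm_args["AlarmName"])
```

With boto3 available, the alarm could be created with `boto3.client("cloudwatch").put_metric_alarm(**alarm_args)`. A matching scale-in alarm for when the gap closes would be needed as well, and the OD ASG should use instance types or AZs that are not in the Spot Fleet's launch specifications.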