ECS + Spot Integration - Multiple ASGs vs SpotFleet
A customer is deploying a workload (stateless web services) on ECS and would like to use Spot Instances to reduce cost.
The customer wants an ECS cluster that has a base number of On-Demand Instances and then scales with demand, trying to add Spot Instances first and falling back to On-Demand Instances if the Spot request fails.
What is the recommended way to achieve this objective?
One option I can see is to use multiple Auto Scaling groups, one for On-Demand and one for Spot Instances, with alarms based on service or cluster utilization.
I would also like to explore using Spot Fleet with ECS directly, but I don't see a way to make sure that On-Demand Instances are added to the cluster if the Spot requests in the fleet fail. Is there any way to achieve this with Spot Fleet?
EC2 Spot Fleet supports a static amount of On-Demand (OD) capacity alongside a scalable target capacity for Spot. The customer should start there.
The customer should focus on diversifying their Spot requests across AZs and instance types/sizes (super simple with Spot Fleet and containers) to reduce the likelihood of not being able to provision Spot capacity. Keep in mind that Spot and OD pull from the same capacity pool. If they cannot get a Spot Instance in a given pool, they very well might not be able to get an OD instance in the same pool either. The reaction to not getting a Spot Instance should be to try a different pool: either the same instance type/size in another AZ, or a different instance type/size in the same AZ. Spot Fleet handles this automatically when the request specifies multiple launch specifications or launch templates.
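As a rough illustration of the two points above (a fixed OD base plus Spot diversified across pools), here is a minimal sketch that builds a Spot Fleet request configuration as a plain Python dict. The helper name `build_spot_fleet_config`, the launch template name, the role ARN, the subnet IDs, and the instance types are all placeholders made up for this example, and the choice of `capacityOptimized` is only one possible allocation strategy. It assumes that `TargetCapacity` is the total and `OnDemandTargetCapacity` is the On-Demand portion of it, which is worth confirming against the current EC2 documentation.

```python
from itertools import product


def build_spot_fleet_config(fleet_role_arn, launch_template_name, instance_types,
                            subnet_ids, total_capacity, on_demand_base,
                            template_version="$Latest"):
    """Build a SpotFleetRequestConfig dict with an OD base and diversified pools.

    Every (instance type, subnet) pair becomes one override, so the fleet can
    fall back to another capacity pool when one pool has no Spot capacity.
    """
    if on_demand_base > total_capacity:
        raise ValueError("on_demand_base cannot exceed total_capacity")

    # One override per capacity pool: instance type x subnet (one subnet per AZ).
    overrides = [
        {"InstanceType": instance_type, "SubnetId": subnet_id}
        for instance_type, subnet_id in product(instance_types, subnet_ids)
    ]

    return {
        "IamFleetRole": fleet_role_arn,
        "Type": "maintain",  # keep replacing interrupted instances
        "AllocationStrategy": "capacityOptimized",  # illustrative choice
        "TargetCapacity": total_capacity,          # assumed: total units
        "OnDemandTargetCapacity": on_demand_base,  # assumed: OD share of the total
        "LaunchTemplateConfigs": [
            {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": launch_template_name,
                    "Version": template_version,
                },
                "Overrides": overrides,
            }
        ],
    }


# Placeholder values, not real resources.
config = build_spot_fleet_config(
    fleet_role_arn="arn:aws:iam::123456789012:role/example-spot-fleet-role",
    launch_template_name="example-ecs-container-instance",
    instance_types=["m5.large", "m5a.large", "c5.large"],
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    total_capacity=10,
    on_demand_base=2,
)
```

With boto3, a dict like this would typically be passed as `ec2.request_spot_fleet(SpotFleetRequestConfig=config)`. The launch template is expected to register the instance with the ECS cluster (for example through user data), which is outside the scope of this sketch.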
If they still want an ASG with OD as a "backup", they could keep an ASG that scales on an alarm that triggers when FulfilledCapacity is < TargetCapacity for the Spot Fleet for a given period of time, and join this ASG to the same ECS cluster. It is critical that the OD ASG pulls from a different pool of capacity than the pools defined in the Spot Fleet.
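The backup alarm described above could be sketched with CloudWatch metric math as follows. This only builds the parameter dict; the helper name `build_shortfall_alarm`, the fleet request ID, and the scaling policy ARN are hypothetical. It assumes the Spot Fleet metrics are published in the `AWS/EC2Spot` namespace with a `FleetRequestId` dimension, and the five one-minute periods are an arbitrary example of "a given period of time".

```python
def build_shortfall_alarm(fleet_request_id, scale_out_policy_arn,
                          period_seconds=60, evaluation_periods=5):
    """Build put_metric_alarm parameters that fire when the Spot Fleet's
    FulfilledCapacity stays below its TargetCapacity."""

    def fleet_metric(metric_id, metric_name):
        # Input metric for the math expression; not evaluated by the alarm itself.
        return {
            "Id": metric_id,
            "ReturnData": False,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2Spot",  # assumed namespace
                    "MetricName": metric_name,
                    "Dimensions": [
                        {"Name": "FleetRequestId", "Value": fleet_request_id}
                    ],
                },
                "Period": period_seconds,
                "Stat": "Average",
            },
        }

    return {
        "AlarmName": f"spot-fleet-shortfall-{fleet_request_id}",
        "Metrics": [
            fleet_metric("target", "TargetCapacity"),
            fleet_metric("fulfilled", "FulfilledCapacity"),
            {
                # The alarm evaluates this expression: units still missing.
                "Id": "shortfall",
                "Expression": "target - fulfilled",
                "Label": "Unfulfilled Spot Fleet capacity",
                "ReturnData": True,
            },
        ],
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": 0,
        "EvaluationPeriods": evaluation_periods,
        "TreatMissingData": "notBreaching",
        # Hypothetical scale-out policy attached to the backup OD ASG.
        "AlarmActions": [scale_out_policy_arn],
    }


alarm_params = build_shortfall_alarm(
    fleet_request_id="sfr-00000000-example",
    scale_out_policy_arn="arn:aws:autoscaling:region:account:example-policy",
)
```

With boto3 these parameters would typically be passed as `cloudwatch.put_metric_alarm(**alarm_params)`. A matching scale-in alarm for the OD ASG, triggered once the fleet is fully fulfilled again, would be needed as well and is not shown here.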