It sounds like you're using managed scaling in the capacity provider? Capacity providers use a target tracking scaling policy, which creates managed alarms for you. The target tracking alarm settings aren't user-configurable, and there is currently a low-usage alarm (to trigger scale-in) that fires after 15 consecutive 1-minute periods below the threshold, which is derived from the target value you set (the lower the target, the more conservative the scale-in will be).
ECS publishes the CapacityProviderReservation metric to CloudWatch, which is what the alarms trigger on. After the alarm fires, Auto Scaling evaluates whether it can scale in and may terminate instances: https://aws.amazon.com/blogs/containers/deep-dive-on-amazon-ecs-cluster-auto-scaling/
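If you want to watch that metric yourself, here is a minimal boto3 sketch of reading it. The cluster and capacity provider names are placeholders, and the namespace/dimensions are my assumption of what ECS publishes for managed scaling; adjust them to what you see in your CloudWatch console.

```python
# Minimal sketch (boto3 configured; names are placeholders).
# Reads the CapacityProviderReservation metric that ECS managed scaling
# publishes, the same metric the target tracking alarms evaluate.
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/ECS/ManagedScaling",  # assumed namespace for managed scaling metrics
    MetricName="CapacityProviderReservation",
    Dimensions=[
        {"Name": "ClusterName", "Value": "my-cluster"},                  # placeholder
        {"Name": "CapacityProviderName", "Value": "my-cap-provider"},    # placeholder
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=60,  # 1-minute periods, matching the alarm evaluation period
    Statistics=["Average"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])
```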
If you enable managed termination protection, instances will be protected from scale-in until there are no running tasks on them (daemon tasks are ignored): https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cluster-auto-scaling.html
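For reference, a hedged sketch of how managed scaling and managed termination protection are enabled when creating a capacity provider with boto3. The ASG name/ARN and capacity provider name are placeholders, and note that managed termination protection also requires instance scale-in protection on the ASG itself.

```python
# Minimal sketch (boto3; names and ARN are placeholders).
import boto3

autoscaling = boto3.client("autoscaling")
ecs = boto3.client("ecs")

# Managed termination protection requires scale-in protection on the ASG.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="my-asg",  # placeholder
    NewInstancesProtectedFromScaleIn=True,
)

ecs.create_capacity_provider(
    name="my-cap-provider",  # placeholder
    autoScalingGroupProvider={
        "autoScalingGroupArn": "arn:aws:autoscaling:...:autoScalingGroup:...",  # placeholder
        "managedScaling": {
            "status": "ENABLED",
            "targetCapacity": 90,  # target for the CapacityProviderReservation metric
        },
        "managedTerminationProtection": "ENABLED",
    },
)
```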
There are Auto Scaling cooldowns, which can delay the termination of additional instances for a set period of time. Here is the reference: Scaling cooldowns for Amazon EC2 Auto Scaling
This is wrong. Cooldowns in an ASG apply to simple scaling policies, and capacity providers use target tracking.
Yes, I am using managed scaling in the capacity provider. When you say "lower the target, the more conservative the scale-in will be," does "target" refer to the target capacity of the capacity provider? Is there an approach that might work where I don't use managed scaling, or would that defeat the purpose of using the capacity provider?
Thank you for this response and links to documentation. I'll read through those.
Yes, exactly. If the target capacity value is lower, instances won't be terminated as soon (not a time delay, but still a form of buffer). For example, setting it to 100 means "I don't want to launch more instances until the existing ones are full". You can use a different metric and configure your own alarms with a step scaling policy on the ASG, but that loses out on a lot of the benefits. Managed Termination Protection in particular is very nice, and it requires you to use Managed Scaling. Is there a reason you need a time delay even with Managed Termination Protection?
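To illustrate the target-capacity knob mentioned above, here is a minimal sketch of adjusting it on an existing capacity provider with boto3; the provider name is a placeholder, and I'm assuming UpdateCapacityProvider is the call you'd use to change the managed scaling configuration in place.

```python
# Minimal sketch (boto3; the capacity provider name is a placeholder).
import boto3

ecs = boto3.client("ecs")

# A lower targetCapacity (e.g. 90 instead of 100) keeps spare headroom,
# so scale-in behaves more conservatively; 100 means "fill existing
# instances before launching more".
ecs.update_capacity_provider(
    name="my-cap-provider",  # placeholder
    autoScalingGroupProvider={
        "managedScaling": {
            "status": "ENABLED",
            "targetCapacity": 90,
        },
    },
)
```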
We often run batch jobs that are triggered by a person or by some event. Frequently, several more events come in after that, spread out over time. Between jobs we would ideally not scale down too quickly, to avoid the instance startup time so that subsequent jobs can start faster than the first batch did. A configurable scale-down delay would be nice because we could easily tweak the time before scale-in events to match our use case and balance job speed against cost.