Auto Scaling Group not scaling based on ECS desired task count
I have an EC2-backed ECS cluster which contains an ASG (using Cluster Auto Scaling) that is allowed to scale between 1 and 5 EC2 instances. There is also a service defined on this cluster which is set to scale between 1 and 5 tasks, with each task reserving almost the full resources of a single instance.
I have configured the service to scale its desired task count depending on the size of various queues within an Amazon MQ instance, which is all handled by CloudWatch alarms. The scaling of the desired task count works as expected, but the ASG doesn't provision new EC2 instances to fit the number of desired tasks unless I manually go in and change the desired capacity of the ASG. This means the new tasks never get deployed, as ECS can't find any suitable instances to deploy them to.
I don't know if I'm missing something, but all the documentation I have found on ECS Auto Scaling Groups says that it should scale instances to fit the total resources requested by the desired number of tasks.
If I manually increase the desired capacity in the ASG and add an additional task that gets deployed on that new instance, the CapacityProviderReservation still remains at 100%. If I then remove that second task, after a while the ASG will scale in and remove the instance that no longer has any tasks running on it, which is the expected behaviour.
Any pointers would be greatly appreciated.
As a side note, this is all set up using the Python CDK.
Edit: Clarified that the ASG is currently using CAS (as far as I can tell) and added details about scaling in working as expected
Many thanks Tom
The ASG doesn't know about the ECS service and won't scale by itself. There are two common ways to set up scaling on it to make sure instances are added to the cluster when needed:
- Use the same metric for both the ASG and the service so that both get alarms triggered at the same time. This is potentially subject to a race condition, where the queue length metric triggers an alarm for the ASG or the service and the alarm moves back to OK before the other one scales.
- Use CAS (Cluster Auto Scaling) to trigger the ASG based on the tasks running/pending in the cluster.
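To make the CAS option a bit more concrete, here is a rough sketch of how the CapacityProviderReservation metric is derived, going by the AWS documentation: the number of instances needed divided by the number of instances running, times 100. The function names are made up for illustration, and the `instances_needed_for` helper is a deliberate simplification that assumes identical tasks and a fixed number of tasks per instance. As far as I understand, ECS itself only counts tasks that are running or in the PROVISIONING state against the capacity provider, so treat this as an approximation rather than what ECS actually executes.

```python
import math


def capacity_provider_reservation(instances_needed: int, instances_running: int) -> float:
    """Approximate the CapacityProviderReservation metric (M / N * 100).

    instances_needed  (M): instances required for running + provisioning tasks.
    instances_running (N): instances currently in the Auto Scaling group.
    """
    if instances_needed < 0 or instances_running < 0:
        raise ValueError("instance counts must be non-negative")
    if instances_running == 0:
        # Documented special cases: nothing needed -> 100, something needed -> 200.
        return 100.0 if instances_needed == 0 else 200.0
    return instances_needed / instances_running * 100


def instances_needed_for(task_count: int, tasks_per_instance: int = 1) -> int:
    """Hypothetical helper: assumes every task is identical and that
    `tasks_per_instance` of them fit on one instance."""
    if tasks_per_instance < 1:
        raise ValueError("tasks_per_instance must be at least 1")
    return math.ceil(task_count / tasks_per_instance)


# One instance running, two tasks wanted, one task per instance:
needed = instances_needed_for(task_count=2, tasks_per_instance=1)
print(capacity_provider_reservation(needed, instances_running=1))
```

With a target capacity of 100, a value above 100 is what would be expected to trigger a scale-out and a value below 100 a scale-in. If the metric stays at 100 while extra tasks are desired, that suggests those tasks are not being counted towards the instances needed.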
Thanks for the response. I'm fairly certain that I'm currently using CAS. In the CDK I attached the auto scaling group to the ECS Cluster as a capacity provider which, based on the link you provided and elsewhere, should cause ECS to manage the EC2 scaling.
Interestingly, I manually increased the desired capacity of the ASG to 2, and once my cluster scaled down to a single task the ASG scaled in the extra EC2 instance that was running. So scaling in seems to be working correctly, but scaling out isn't.
It is as if it's not detecting tasks until they are at least pending, which obviously they can't get to unless there is capacity available.