Auto Scaling Group not scaling based on ECS desired task count

I have an EC2-backed ECS cluster which contains an ASG (using Cluster Auto Scaling) that is allowed to scale between 1 and 5 EC2 instances. There is also a service defined on this cluster which is set to scale between 1 and 5 tasks, with each task reserving almost the full resources of a single instance.

I have configured the service to scale its desired task count depending on the size of various queues within an Amazon MQ instance, which is all handled by CloudWatch alarms. The scaling of the desired task count works as expected, but the ASG doesn't provision new EC2 instances to fit the number of desired tasks unless I manually go in and change the desired capacity of the ASG. This means the new tasks never get deployed, as ECS can't find any suitable instances to deploy them to.

I don't know if I'm missing something, but all the documentation I have found on ECS Auto Scaling Groups says that it should scale instances to fit the total resources requested by the desired number of tasks.

If I manually increase the desired capacity in the ASG and add an additional task, which gets deployed on the new instance, the CapacityProviderReservation still remains at 100%. If I then remove that second task, after a while the ASG scales in and removes the instance that no longer has any tasks running on it, which is the expected behaviour.

Any pointers would be greatly appreciated.

As a side note, this is all set up using the Python CDK.

Edit: Clarified that the ASG is currently using CAS (as far as I can tell) and added details about scaling in working as expected

Many thanks, Tom

1 Answer

Hi Tom,

The ASG doesn't know about the ECS service and won't scale by itself. There are two common ways to set up scaling on it to make sure instances are added to the cluster when needed:

  1. Use the same metric for both the ASG and the service so that they both get alarms triggered at the same time. This could potentially be subject to a race condition where the queue length metric triggers an alarm for the ASG or the service, and the alarm ends up moving back to OK before the other one scales.
  2. Use CAS (Cluster Auto Scaling) to trigger the ASG based on the tasks running/pending in the cluster.
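
For reference, CAS drives the ASG through the CloudWatch metric CapacityProviderReservation, which AWS describes as M / N × 100, where N is the number of instances currently in the ASG and M is the number ECS estimates it needs. The Python sketch below only illustrates that arithmetic. The function and parameter names are hypothetical, and the special cases for N = 0 follow the AWS description of the metric as I understand it, so treat it as an approximation rather than the actual implementation.

```python
def capacity_provider_reservation(needed: int, running: int) -> float:
    """Approximate the CapacityProviderReservation metric, in percent.

    needed  -- M: instances ECS estimates are required (hypothetical name)
    running -- N: instances currently in the Auto Scaling group
    """
    if needed < 0 or running < 0:
        raise ValueError("instance counts must be non-negative")
    if running == 0:
        # Special cases as described by AWS: 100 when nothing is needed,
        # 200 when capacity is needed but no instance is running.
        return 100.0 if needed == 0 else 200.0
    return needed / running * 100


# Two instances needed, two running: the metric sits at a target of 100.
print(capacity_provider_reservation(needed=2, running=2))
# Two instances needed, one running: the metric is above the target.
print(capacity_provider_reservation(needed=2, running=1))
```

With a target capacity of 100, the ASG is only asked to scale out when the metric rises above 100, which requires ECS to count the not-yet-placed tasks towards M. A reading that stays at 100% with two tasks on two instances is consistent with this formula.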
answered a month ago
  • Hi Shahad,

    Thanks for the response. I'm fairly certain that I'm currently using CAS. In the CDK I attached the auto scaling group to the ECS cluster as a capacity provider which, based on the link you provided and elsewhere, should cause ECS to manage the EC2 scaling.

    Interestingly, I manually increased the desired capacity of the ASG to 2, and once my cluster scaled down to a single task the ASG scaled in the extra EC2 instance that was running. So scaling in seems to be working correctly, but scaling out isn't.

    It is as if it's not detecting tasks until they are at least pending, which obviously they can't get to unless there is capacity available.
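
One thing that may be worth checking here: as far as the ECS documentation describes it, CAS counts tasks that are in the PROVISIONING state, and tasks only enter that state when the service is launched with a capacity provider strategy rather than with the EC2 launch type. The sketch below assumes a service description shaped like the entries returned by the ECS DescribeServices API (the `launchType` and `capacityProviderStrategy` keys). The helper name and the sample data are hypothetical and are not taken from the cluster in this thread.

```python
def uses_capacity_provider(service: dict) -> bool:
    """Return True if the service description lists a capacity provider strategy."""
    strategy = service.get("capacityProviderStrategy") or []
    return len(strategy) > 0


# Hypothetical sample descriptions for illustration only.
launch_type_service = {"serviceName": "worker-a", "launchType": "EC2"}
strategy_service = {
    "serviceName": "worker-b",
    "capacityProviderStrategy": [
        {"capacityProvider": "my-asg-provider", "weight": 1, "base": 0}
    ],
}

for svc in (launch_type_service, strategy_service):
    print(svc["serviceName"], uses_capacity_provider(svc))
```

If the check comes back False for the real service, the tasks are probably being placed by launch type, in which case CAS would have nothing to react to when they cannot be placed. In the CDK, I believe the `capacity_provider_strategies` argument of `ecs.Ec2Service` is the place to set this.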
