ECS services not scaling in (scale in protection is disabled)

Hello. I have an ECS cluster (EC2-based) attached to a capacity provider (CSP). The services scale out fine, but they aren't scaling in. I've already checked scale-in protection and it's disabled (Disable Scale In: false).

Description of the environment:

  • 1 cluster (ec2-based), 2 services
  • Services are attached to an ALB (registering and deregistering fine)
  • Services have auto scaling enabled, tracking memory (above 90%), with no scale-in protection, 1 task minimum, 3 tasks maximum.
  • Services use a capacity provider, which appears to work as intended: it creates new EC2 instances when new tasks are provisioned and drops them once they have 0 tasks running, registering and deregistering as expected.
  • The CloudWatch alarms work fine, alarming when expected (for both low and high usage).
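For readers trying to reproduce this, the setup above could be sketched roughly as the following request payloads for the Application Auto Scaling API. The cluster name, service name, and policy name are hypothetical placeholders, and the assumption is that the policy is a target tracking policy on the predefined `ECSServiceAverageMemoryUtilization` metric with a 90% target.

```python
def build_target_tracking_request(cluster, service, target_pct=90.0):
    """Build the payloads for a memory-based target tracking policy.

    The dicts mirror the parameters of the Application Auto Scaling
    RegisterScalableTarget and PutScalingPolicy operations. With boto3 they
    would be passed as:
        client = boto3.client("application-autoscaling")
        client.register_scalable_target(**scalable_target)
        client.put_scaling_policy(**policy)
    """
    resource_id = f"service/{cluster}/{service}"
    scalable_target = {
        "ServiceNamespace": "ecs",
        "ResourceId": resource_id,
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": 1,  # 1 task minimum, as described above
        "MaxCapacity": 3,  # 3 tasks maximum, as described above
    }
    policy = {
        "PolicyName": "memory-target-tracking",  # hypothetical name
        "ServiceNamespace": "ecs",
        "ResourceId": resource_id,
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_pct,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageMemoryUtilization",
            },
            "DisableScaleIn": False,  # scale-in is allowed in this setup
        },
    }
    return scalable_target, policy
```

The function only builds dictionaries, so it can be inspected without AWS credentials before anything is sent to the API.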

Description of the test and what's "not working":

  • Started with 1 task for each service and 1 instance for both services.
  • I got into one of the containers and ran a memory test, pushing its usage above 90%.
  • The service detected it and requested a new task.
  • No existing instance could fit the new task, so ECS asked the CSP/Auto Scaling group for a new EC2 instance.
  • The new instance was provisioned, registered in the cluster and ran the new task.
  • The service's average memory usage decreased from ~93% to ~73% (averaged across both tasks).
  • All was fine; the memory stress ran for 20 minutes.
  • After the memory stress was over, the memory usage dropped to ~62%.
  • The CloudWatch alarm was triggered (maybe even earlier, when usage was at 73%; I didn't check).
  • The service is still running 2 tasks right now (after 3 hours or more) and it's not decreasing the Desired Count from 2 to 1.
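One possible reading of the numbers above, offered as an assumption rather than a confirmed diagnosis: target tracking treats the metric as roughly inversely proportional to the task count, so it would only remove a task if the remaining tasks were expected to stay at or below the target. AWS does not publish the exact algorithm, so the proportional formula below (and the helper name `estimated_desired_count`) is just a rough model for checking the arithmetic.

```python
import math


def estimated_desired_count(current_tasks, metric_pct, target_pct,
                            min_tasks=1, max_tasks=3):
    """Estimate the task count a proportional target tracking model would pick.

    Assumption: total load is current_tasks * metric_pct, and the policy wants
    the smallest task count that keeps the average at or below target_pct.
    """
    raw = current_tasks * metric_pct / target_pct
    desired = math.ceil(raw)
    # Clamp to the scalable target's min/max capacity.
    return max(min_tasks, min(max_tasks, desired))


print(estimated_desired_count(2, 62, 90))  # 2: 2 * 62 / 90 = 1.38, rounded up
print(estimated_desired_count(2, 44, 90))  # 1: 2 * 44 / 90 = 0.98, rounded up
print(estimated_desired_count(1, 93, 90))  # 2: 1 * 93 / 90 = 1.03, rounded up
```

Under this model, two tasks averaging ~62% against a 90% target would stay at two tasks, because a single task would be estimated at ~124%. The average would need to drop to about 45% or lower before going from 2 tasks to 1.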

Is there anything I'm missing here? I've already run a couple of tests, changing the service auto scaling thresholds and other settings, but nothing changes this behaviour.

Any help would be appreciated.

Thanks in advance.

  • 2 quick questions:

    1. Just confirming: is it the ECS service auto scaling that's not scaling in, and because of that the capacity provider isn't lowering its metric to have the ASG scale in?
    2. Do you have multiple target tracking policies on the ECS service? If so, they all need to want to scale in at the same time for a scale-in to happen.
  • Hello Shahad, thank you for your quick response.

    1. Yes, it seems that it's the ECS service auto scaling that isn't scaling in (tasks). The CP/ASG is scaling in as expected (when an EC2 instance has 0 tasks running, that is, when CapacityProviderReservation is below 100%).
    2. I read about it, but there was* only one target tracking policy in place. (update below)

1 Answer

*UPDATE:

I was going through some blogs, manuals and documentation and tried something different: I separated the scale-in policy from the scale-out policy.

Here's what I changed:

  • Disabled scale-in on the policy that tracked whether memory was above 90%.
  • Created a separate step scaling policy and a new alarm that checks whether memory is below 70%. The action is configured to remove one task at a time.
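The two changes above could look roughly like the payloads below, again for the Application Auto Scaling `PutScalingPolicy` operation. This is a sketch, not the exact configuration used: the policy names, the 300-second cooldown, and the helper name `build_split_policies` are assumptions. The "memory below 70%" alarm itself would be created separately in CloudWatch, with the step policy's ARN as its alarm action.

```python
def build_split_policies(cluster, service):
    """Build a scale-out-only target tracking policy plus a step scale-in policy."""
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    scale_out = {
        **common,
        "PolicyName": "memory-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": 90.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageMemoryUtilization",
            },
            "DisableScaleIn": True,  # this policy now only scales out
        },
    }
    scale_in = {
        **common,
        "PolicyName": "memory-step-scale-in",  # hypothetical name
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "ChangeInCapacity",
            # Applies whenever the metric is below the alarm threshold (70%):
            # remove one task per alarm action.
            "StepAdjustments": [
                {"MetricIntervalUpperBound": 0.0, "ScalingAdjustment": -1},
            ],
            "Cooldown": 300,  # assumed value, in seconds
            "MetricAggregationType": "Average",
        },
    }
    return scale_out, scale_in
```

If removing one task at a time turns out to be too slow, additional entries in `StepAdjustments` with larger negative adjustments for lower metric ranges are one option to consider.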

It works that way, but I don't think it's the ideal solution, and I still don't know why the target tracking policy isn't scaling in. It's still a very small project, but once it needs to scale out and in by dozens of tasks, the scale-in process will be much slower (scale-in should be slower than scale-out, but in this case it could be much, much slower, depending on how many tasks were added).

I could go back to the previous setup if anyone comes up with something to check and try out.

Thanks again.

answered 2 years ago
