ECS tasks do not scale in even though CPU and Memory metrics are under the policy targets


Hi everyone

I have a problem with Service Auto Scaling in ECS. I set up scaling policies for CPU and Memory as shown in the screenshots below. [Images: scaling policy configuration]

But when I check my service's metrics in ECS, they are under the policy targets, yet scale-in is not triggered. [Image: service metrics]

My CloudWatch scale-in alarms are triggered. [Images: CloudWatch alarms]

I found another post with the same problem, but it doesn't say how to fix it: https://repost.aws/questions/QU6_7Dd4GZTkabvNWDJIKKww/ecs-auto-scale-in-doesn-t-work-event-though-i-got-cloudwatch-scale-in-alarm

My ECS service is currently running 2 tasks. I set the minimum to 1 and the maximum to 2.

Thanks for your support.

2 Answers
Accepted Answer

You may need to check the box for "Turn off scale-in" on the Memory policy.

Target tracking scales in conservatively. This means it will choose not to scale in if it believes the action is unsafe and might immediately cause a scale-out. Just looking at the metrics, Memory is at 56% across 2 tasks. So in theory, if a scale-in happened, going down to 1 task, there would be 112% usage on the one remaining task, which would cause a scale-out (and also crash the task). You can verify this by looking at the "not scaled reasons" in the Application Auto Scaling activity history: https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-scaling-activities.html#include-not-scaled-activities-with-the-aws-cli
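The linked page shows the AWS CLI; for completeness, here is a minimal Python/boto3 sketch of the same check. The cluster and service names (my-cluster, my-service) are hypothetical, so substitute your own:

```python
# Sketch: list recent scaling activities for an ECS service,
# including the reasons Application Auto Scaling chose NOT to scale.
# "my-cluster" and "my-service" are hypothetical names.
import boto3

client = boto3.client("application-autoscaling")

resp = client.describe_scaling_activities(
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    IncludeNotScaledActivities=True,  # surfaces "not scaled" reasons
    MaxResults=20,
)

for activity in resp["ScalingActivities"]:
    print(activity["StartTime"], activity["StatusCode"], activity["Description"])
    # NotScaledReasons is only present on activities that were skipped
    for reason in activity.get("NotScaledReasons", []):
        print("  not scaled:", reason["Code"])
```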

Because the operating system doesn't free up memory immediately when it isn't actively being used, it's likely that scale-in is actually safe, and the 1 remaining task after a scale-in would still be under your 90% threshold. This is why you should consider turning off scale-in for the Memory metric: the metric value does not reflect the real active memory usage of the OS.
If you make this change, scale-in will happen whenever the CPU metric is low.
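For reference, a rough boto3 sketch of what the equivalent API call looks like; DisableScaleIn is the API counterpart of the console checkbox. Resource and policy names are hypothetical, and the 90% target matches the threshold mentioned above:

```python
# Sketch: target tracking policy on ECS service memory with scale-in
# disabled, so only the CPU policy drives scale-in. Names are hypothetical.
import boto3

client = boto3.client("application-autoscaling")

client.put_scaling_policy(
    PolicyName="memory-target-tracking",
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 90.0,  # the 90% memory target discussed above
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageMemoryUtilization"
        },
        "DisableScaleIn": True,  # equivalent to "Turn off scale-in" in the console
    },
)
```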

AWS EXPERT · answered 2 months ago
AWS EXPERT · reviewed 2 months ago
  • Thanks for the advice. Before trying your suggestion, I tested setting the Memory scaling policy target to 160%, and it triggered a scale-in down to 1 task. Do you have any documentation related to your recommendation that "Just looking at the metrics, the Memory is 56% on 2 tasks. So in theory, if a scale-in happened going down to 1 task, this means there would be 112% usage on the one remaining task, which would cause a scale-out (and also crash the task)"? I think it's true, but I can't find documentation about this.

  • Not that specifically, but it's discussed in the second-to-last bullet of this doc, starting with "You may see gaps between the target value...": https://docs.aws.amazon.com/autoscaling/application/userguide/target-tracking-scaling-policy-overview.html#target-tracking-considerations


The issue you're experiencing with ECS service auto-scaling not triggering scale-in actions despite metrics being under the policy thresholds is a common challenge. There are a few potential reasons and solutions to consider:

  1. Conflicting policies: When you have multiple scaling policies (CPU and Memory in this case), scale-in only occurs when all policies agree to scale in. This prevents oscillation where one metric might trigger scale-out while another triggers scale-in. Ensure that both CPU and Memory utilization are consistently low for scale-in to occur.

  2. Cooldown periods: Check your scale-in cooldown period. If it's set too high, it might prevent scale-in actions even when conditions are met. Consider adjusting this value if necessary (the sketch after this list shows how to inspect both the cooldowns and the capacity limits).

  3. Minimum capacity: Verify that your current task count isn't already at the minimum capacity set for your service. If it is, no further scale-in will occur.

  4. Task placement strategy: If you're using a binpack strategy, tasks might not redistribute optimally after a scale-in event. This can lead to tasks remaining on different instances instead of consolidating.

  5. Capacity Provider settings: If you're using a Capacity Provider for your Auto Scaling Group, check if Managed Termination Protection is enabled. This feature prevents instances from being scaled in as long as there are tasks running on them.
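If it helps, a minimal boto3 sketch for checking points 2 and 3, i.e. the registered min/max capacity and the cooldowns configured on each policy. Cluster and service names are hypothetical:

```python
# Sketch: inspect the scalable target (min/max capacity) and each
# policy's cooldowns for an ECS service. Names are hypothetical.
import boto3

client = boto3.client("application-autoscaling")
resource_id = "service/my-cluster/my-service"

# Point 3: the min/max registered for the service
targets = client.describe_scalable_targets(
    ServiceNamespace="ecs",
    ResourceIds=[resource_id],
)
for t in targets["ScalableTargets"]:
    print("min:", t["MinCapacity"], "max:", t["MaxCapacity"])

# Point 2: the cooldowns (and scale-in flag) on each policy
policies = client.describe_scaling_policies(
    ServiceNamespace="ecs",
    ResourceId=resource_id,
)
for p in policies["ScalingPolicies"]:
    cfg = p.get("TargetTrackingScalingPolicyConfiguration", {})
    print(p["PolicyName"],
          "scale-in cooldown:", cfg.get("ScaleInCooldown"),
          "scale-out cooldown:", cfg.get("ScaleOutCooldown"),
          "scale-in disabled:", cfg.get("DisableScaleIn"))
```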

To address these issues:

  1. Use Target Tracking scaling policies instead of step scaling if possible. They're generally more effective at maintaining your desired utilization levels.

  2. If using step scaling, ensure your alarm thresholds and scaling adjustments are appropriately configured.

  3. Consider implementing a custom task rebalancer using a Lambda function triggered periodically by CloudWatch Events. This can help redistribute tasks more efficiently after scale-in events.

  4. If using a Capacity Provider, try disabling Managed Termination Protection and enabling Managed Draining instead. Set the target capacity utilization to 100% to encourage better bin-packing (see the sketch after this list).

  5. Monitor the CapacityProviderReservation metric in CloudWatch to gain insights into your cluster's scaling behavior.
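A minimal boto3 sketch of points 4 and 5 under stated assumptions: the capacity provider and cluster names are hypothetical, and the metric namespace/dimensions are as documented for ECS cluster auto scaling:

```python
# Sketch: adjust a capacity provider (point 4) and read the
# CapacityProviderReservation metric (point 5).
# "my-capacity-provider" and "my-cluster" are hypothetical names.
from datetime import datetime, timedelta, timezone

import boto3

ecs = boto3.client("ecs")
cloudwatch = boto3.client("cloudwatch")

# Point 4: turn off managed termination protection, turn on managed
# draining, and target 100% capacity utilization for tighter bin-packing.
# (The managedDraining field requires a reasonably recent boto3/botocore.)
ecs.update_capacity_provider(
    name="my-capacity-provider",
    autoScalingGroupProvider={
        "managedScaling": {"status": "ENABLED", "targetCapacity": 100},
        "managedTerminationProtection": "DISABLED",
        "managedDraining": "ENABLED",
    },
)

# Point 5: fetch the last hour of CapacityProviderReservation.
# Values above 100 mean more instances are needed; below 100 means
# there is spare capacity that could be scaled in.
now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/ECS/ManagedScaling",
    MetricName="CapacityProviderReservation",
    Dimensions=[
        {"Name": "ClusterName", "Value": "my-cluster"},
        {"Name": "CapacityProviderName", "Value": "my-capacity-provider"},
    ],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```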

By addressing these points, you should be able to improve your ECS service's auto-scaling behavior, particularly for scale-in actions.
Sources
Issue with AWS ECS Auto-Scaling and Binpack Task Placement Strategy: Tasks Not Shifting Back After Scale-In | AWS re:Post
Autoscalling in ECS cluster service does not behave as expected causing target deregistration and service downtime | AWS re:Post
ECS Service auto scaling policy conflict the one to other policy | AWS re:Post
Amazon ECS EC2-backed auto scale-out and scale-in | AWS re:Post

answered 2 months ago
