ECS starts tasks before the target group finishes deregistering the container

Hey there!

I've been facing an issue on ECS. I have a task definition with two containers in it, one of which depends on the other. Whenever I deploy to this service (EC2 launch type), ECS tries to stop the current task and then begins draining connections to it. So far so good. But shortly after that (~1 min) it tries to start new tasks, even though the old ones are still in the RUNNING state and the target group has not yet deregistered those containers. I have deregistration_delay set to 300, which is the default value.

If I reduce deregistration_delay, that does not happen. However, I need more time for those containers to drain completely.

Can't ECS respect deregistration_delay and start deploying new tasks only after the target group has deregistered the old ones?

Docker Version: 19.03.13-ce
Agent Version: 1.51.0
Kernel Version: 4.14.225-121.362.amzn1.x86_64
AMI: amzn-ami-2018.03.20210331-amazon-ecs-optimized

1 Answer

You can try setting the minimumHealthyPercent to 0 and maximumPercent to 100. As described in the docs:

If minimumHealthyPercent is below 100%, the scheduler can ignore desiredCount temporarily during a deployment. For example, if desiredCount is four tasks, a minimum of 50% allows the scheduler to stop two existing tasks before starting two new tasks. Tasks for services that don't use a load balancer are considered healthy if they're in the RUNNING state. Tasks for services that use a load balancer are considered healthy if they're in the RUNNING state and are reported as healthy by the load balancer.

The maximumPercent parameter represents an upper limit on the number of running tasks during a deployment. You can use it to define the deployment batch size. For example, if desiredCount is four tasks, a maximum of 200% starts four new tasks before stopping the four older tasks (provided that the cluster resources required to do this are available).

Even so, this might not do quite what you want. The system is designed to avoid downtime. If the previous task is deregistered and stopped before the updated task starts, the service will be down for a period of time, which is not what most users want.
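
To make the arithmetic in the quoted docs concrete, here is a minimal sketch of how the two percentages translate into task-count limits. The function name deployment_bounds is hypothetical, and the rounding (lower limit rounded up, upper limit rounded down) is my reading of the ECS documentation, so treat it as an assumption rather than the scheduler's actual code.

```python
def deployment_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts allowed during a deployment.

    Hypothetical helper for illustration only; it is not part of any AWS SDK.
    Assumption: the lower limit is rounded up and the upper limit is rounded down.
    """
    # Ceiling division using integers only, to avoid floating-point surprises.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Floor division for the upper limit.
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


# The example from the docs: four tasks, 50% minimum, 200% maximum.
print(deployment_bounds(4, 50, 200))
# One task with the 0 / 100 settings suggested above.
print(deployment_bounds(1, 0, 100))
```

With desiredCount of 1, a minimum of 0% and a maximum of 100%, the limits allow at most one task at a time. These are only count limits, though; the sketch does not model how long a draining task keeps running while the target group finishes deregistration.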

EXPERT
bwhaley
answered 2 years ago
  • Hey @bwhaley! I already have minimumHealthyPercent set to 0 and maximumPercent set to 100 for this service. When deploying, ECS starts a new task without stopping the current one. The new task fails because the container instance does not have enough resources for two tasks running at the same time: either the port is already in use or there is not enough memory. If I lower deregistration_delay, say to 60 seconds, it works as expected: the current task is stopped before ECS starts the new one.
