Scaling ECS Fargate - graceful session draining



Background: A customer is building an RDP (Remote Desktop Protocol, the Windows remote access protocol) session broker. The usage pattern is interactive: hundreds or thousands of users open and use RDP sessions. The customer runs the session broker logic on ECS Fargate, and each container can host multiple RDP sessions.

Challenge: When scaling in the cluster, or when rolling out a new version by deploying new container images to ECS Fargate, the customer would like to gracefully drain the sessions on containers that are due to be terminated (i.e. keep those RDP sessions uninterrupted until the user ends them).

How would you suggest implementing this? Does it necessarily require application-level logic?

BTW, we've also suggested that the customer look at a Fargate-task-per-RDP-session approach (in which the container terminates once the session has ended), so they will evaluate this alternative as well. It obviously needs to be tested with a focus on cost, scale and cold-start time.

asked 2 years ago, 351 views
1 Answer
Accepted Answer

First, see our recent blog post at

During a deployment, ECS creates new tasks and begins shutting down the older ones. Each older task is sent a SIGTERM signal, which the application can catch in order to arrange an orderly shutdown. Older tasks are also deregistered from any load balancers. Once the final session has ended, the application must exit.
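As a rough illustration of catching SIGTERM and exiting after the final session ends, here is a minimal Python sketch. The `SessionDrainer` class and `run_until_drained` function are hypothetical names, not part of ECS or of the customer's broker; the sketch assumes the broker calls `try_open_session` and `close_session` around each RDP session.

```python
import signal
import threading
import time


class SessionDrainer:
    """Hypothetical helper: counts active sessions and tracks a draining flag."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.draining = False

    def request_drain(self, signum=None, frame=None):
        # Signature matches a signal handler. It only assigns a flag and takes
        # no lock, so it cannot deadlock if the signal interrupts the main
        # thread while that thread holds self._lock.
        self.draining = True

    def try_open_session(self):
        # Refuse new sessions once draining has been requested.
        with self._lock:
            if self.draining:
                return False
            self._active += 1
            return True

    def close_session(self):
        with self._lock:
            self._active -= 1

    def is_drained(self):
        # True only after a drain was requested and no sessions remain.
        with self._lock:
            return self.draining and self._active == 0


def run_until_drained(drainer, poll_seconds=1.0):
    """Install the SIGTERM handler, then block until the last session ends.

    Must be called from the main thread, as signal.signal() requires.
    """
    signal.signal(signal.SIGTERM, drainer.request_drain)
    while not drainer.is_drained():
        time.sleep(poll_seconds)
```

The broker would call `run_until_drained(drainer)` from its main thread and return from `main` afterwards, so the process exits on its own once the last session closes. Note that the wait is still bounded by the grace period ECS allows before SIGKILL.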

Once SIGTERM is sent, a configurable grace period begins, after which ECS sends SIGKILL (which cannot be caught or ignored) to the task's containers. On EC2 this period can be practically unlimited, but on Fargate it is currently capped at 120 seconds. (We have a PFR to extend this.) So, after 120 seconds, the Fargate task will be forcibly terminated.
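The grace period is set per container through the `stopTimeout` field of the ECS container definition, in seconds. The sketch below builds such a fragment as a Python dictionary and rejects values above the 120-second Fargate limit described above. The `container_definition` helper, the container name and the image are hypothetical placeholders; the dictionary would still need to be embedded in a full task definition.

```python
# Fargate's current upper bound on the grace period, in seconds.
FARGATE_MAX_STOP_TIMEOUT = 120


def container_definition(name, image, stop_timeout):
    """Return a container definition fragment with a validated stopTimeout.

    Hypothetical helper: only the fields relevant to graceful shutdown
    are included here.
    """
    if not 0 <= stop_timeout <= FARGATE_MAX_STOP_TIMEOUT:
        raise ValueError(
            f"stopTimeout must be between 0 and {FARGATE_MAX_STOP_TIMEOUT} "
            f"seconds on Fargate, got {stop_timeout}"
        )
    return {
        "name": name,
        "image": image,
        # Seconds ECS waits between SIGTERM and SIGKILL for this container.
        "stopTimeout": stop_timeout,
    }
```

For example, `container_definition("rdp-broker", "example/rdp-broker:latest", 120)` requests the longest grace period Fargate currently allows.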

If the customer needs a longer grace period, we recommend using EC2 for the time being. Otherwise, to stay on Fargate, they will need some out-of-band mechanism for signaling shutdown, driven by automation that is more involved than a standard ECS deployment.
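One possible shape for such out-of-band automation, purely as an assumption: the automation marks certain tasks as draining through a channel of its own, each task reports its active session count, and the automation stops a task only once that count reaches zero. The selection step could look like the sketch below. The function name and both inputs are hypothetical, and how the flags and counts are stored and reported is left open.

```python
def tasks_safe_to_stop(draining_tasks, session_counts):
    """Return the draining task IDs that report zero active sessions.

    draining_tasks: iterable of task IDs the automation has told to drain.
    session_counts: dict mapping task ID to its last reported session count.
    A task with no reported count is treated as NOT safe to stop.
    """
    return [
        task_id
        for task_id in draining_tasks
        if session_counts.get(task_id) == 0
    ]
```

Treating a missing count as unsafe is deliberate: a task that has not reported yet may still be hosting sessions.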

answered 2 years ago
