How to guarantee regular AMI updates to Capacity Provider

I'm trying to create an ECS cluster that uses an EC2 capacity provider, using the "ECS-Optimized" AMIs provided by Amazon found here: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/retrieve-ecs-optimized_AMI.html

I'd like to ensure that the EC2 instances that act as "container instances" for the cluster can only live up to some maximum amount of time, in order to guarantee that all instances eventually get replaced by new ones running the latest versions of those AMIs. That way we would get all important security updates to the operating system and the ECS agent in a reasonable amount of time.

Autoscaling groups have a setting called the "maximum instance lifetime" that would seem to be exactly what we want. However, because I'm using "managed scaling" with "managed termination protection" for the capacity provider, it seems like the ASG never actually attempts to terminate old instances on its own, since all instances have scale-in protection enabled until they are drained of tasks. So it seems like the "maximum instance lifetime" setting does not actually work when the number of tasks running in the cluster remains about constant.

Is there some ECS option to automatically drain container instances after a set amount of time? Otherwise, how can we guarantee that we eventually start using the latest versions of those AMIs?

Nathan
asked a year ago · 322 views
1 Answer
Accepted Answer

You can use this approach:

  1. Use Amazon CloudWatch Events (now Amazon EventBridge): Create a scheduled rule that triggers a Lambda function at regular intervals, such as once a day.

  2. Create a Lambda function: The Lambda function will perform the following steps:

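    • Get the ARNs of the container instances in the ECS cluster using the listContainerInstances API, then pass them to the describeContainerInstances API to get each instance's EC2 instance ID.
    • Iterate through the container instances and check their launch time. The launch time comes from the EC2 describeInstances API; describeContainerInstances itself only reports when the instance registered with the cluster (registeredAt).
    • If the age of a container instance exceeds your desired maximum instance lifetime, call the updateContainerInstancesState API to set the container instance to DRAINING state. This prevents new tasks from being scheduled on that instance, and service tasks already running there are replaced on other instances.
    • Optionally, call the EC2 terminateInstances API to terminate instances that have been in the DRAINING state for a certain period of time. The Auto Scaling group then launches replacements.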
  3. Schedule the Lambda function: Set the Lambda function as the target of the rule from step 1 so that it runs at the chosen interval, for example once a day. Each run drains the container instances that have exceeded the maximum instance lifetime, so they are eventually replaced by new instances.
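
The steps above could be sketched roughly as follows with boto3. This is a minimal sketch under stated assumptions, not a tested implementation: the 14-day `MAX_AGE`, the `CLUSTER_NAME` environment variable, and the function names are all hypothetical. The AWS clients are passed in as arguments so the selection logic can be exercised without real AWS calls.

```python
import os
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=14)  # assumed maximum instance lifetime


def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def drain_old_instances(ecs, ec2, cluster, max_age=MAX_AGE, now=None):
    """Set ACTIVE container instances older than max_age to DRAINING.

    Returns the container instance ARNs that were set to DRAINING.
    """
    now = now or datetime.now(timezone.utc)

    # 1. Collect the ARNs of all ACTIVE container instances in the cluster.
    arns = []
    paginator = ecs.get_paginator("list_container_instances")
    for page in paginator.paginate(cluster=cluster, status="ACTIVE"):
        arns.extend(page["containerInstanceArns"])

    # 2. Map EC2 instance IDs to container instance ARNs
    #    (describe_container_instances accepts up to 100 ARNs per call).
    arn_by_instance_id = {}
    for batch in chunks(arns, 100):
        resp = ecs.describe_container_instances(
            cluster=cluster, containerInstances=batch)
        for ci in resp["containerInstances"]:
            arn_by_instance_id[ci["ec2InstanceId"]] = ci["containerInstanceArn"]

    # 3. Read launch times from EC2 and keep instances past the maximum age.
    expired = []
    for batch in chunks(sorted(arn_by_instance_id), 100):
        resp = ec2.describe_instances(InstanceIds=batch)
        for reservation in resp["Reservations"]:
            for inst in reservation["Instances"]:
                if now - inst["LaunchTime"] > max_age:
                    expired.append(arn_by_instance_id[inst["InstanceId"]])

    # 4. Drain them (update_container_instances_state accepts up to 10 per call).
    for batch in chunks(expired, 10):
        ecs.update_container_instances_state(
            cluster=cluster, containerInstances=batch, status="DRAINING")
    return expired


def lambda_handler(event, context):
    import boto3  # imported here so the logic above can be tested without it
    drained = drain_old_instances(
        boto3.client("ecs"), boto3.client("ec2"),
        cluster=os.environ["CLUSTER_NAME"])  # hypothetical variable name
    return {"drained": drained}
```

The function's execution role would need permission for the ECS and EC2 calls used here (`ecs:ListContainerInstances`, `ecs:DescribeContainerInstances`, `ecs:UpdateContainerInstancesState`, `ec2:DescribeInstances`).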

By following this approach, you can enforce a maximum instance lifetime for your container instances in the ECS cluster. The replacement instances run the latest version of the ECS-Optimized AMI as long as the Auto Scaling group's launch template or launch configuration points at that AMI.
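
For the optional termination step, one possible sketch is shown below. Note that it swaps in a different criterion from the one described above: instead of tracking how long an instance has been DRAINING, it terminates only DRAINING instances that report no running or pending tasks. The function name is hypothetical, and `ecs` and `ec2` are assumed to be boto3 clients.

```python
def terminate_drained_instances(ecs, ec2, cluster):
    """Terminate DRAINING container instances that no longer run any tasks.

    Returns the EC2 instance IDs that were terminated.
    """
    idle = []
    paginator = ecs.get_paginator("list_container_instances")
    for page in paginator.paginate(cluster=cluster, status="DRAINING"):
        arns = page["containerInstanceArns"]
        if not arns:
            continue
        # Each page holds at most 100 ARNs, which is also the limit
        # for a single describe_container_instances call.
        resp = ecs.describe_container_instances(
            cluster=cluster, containerInstances=arns)
        for ci in resp["containerInstances"]:
            if ci["runningTasksCount"] == 0 and ci["pendingTasksCount"] == 0:
                idle.append(ci["ec2InstanceId"])

    if idle:
        # Terminating through EC2 leaves the Auto Scaling group below its
        # desired capacity, so it launches replacement instances.
        ec2.terminate_instances(InstanceIds=idle)
    return idle
```

Waiting for the task counts to reach zero avoids cutting off tasks that are still shutting down, at the cost of never terminating an instance whose tasks refuse to stop.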

Amol_M
answered a year ago
  • I was hoping there was some feature built in to ECS so we wouldn't have to go through all this effort, but it looks like this is the best option for now. Thanks!
