Issue with ECS Service Rolling Deployment


We're facing an issue with our ECS service during the rolling deployment process.

Context: We have an ECS service that we regularly update with new task definitions. This service is crucial for our application, and any disruption affects our operations.

Expected behaviour: When we update the service with a new task definition, the service should stop an old task, start a new task with the new task definition, and continue with the rolling deployment.

Actual behaviour: We get an error saying that there are no available instances that can handle the task. The console reports that one of the old tasks has been stopped. However, after SSHing into the EC2 instance, we see that all running containers still belong to the old task definition. Once we stop the container manually on the instance, the new task deploys successfully.
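One way to cross-check the mismatch between the console and the instance is to compare each task's `desiredStatus` with its `lastStatus` in a `DescribeTasks` response. The sketch below is only an illustration: `find_stuck_tasks` is a hypothetical helper, the sample response and its ARNs are made up, and we are assuming the stuck task would show up as "ECS wants it stopped but it has not reached STOPPED".

```python
from typing import Any, Dict, List


def find_stuck_tasks(describe_tasks_response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return tasks that ECS wants stopped but that have not reached STOPPED.

    The input is expected to have the shape of an ECS DescribeTasks response,
    i.e. a dict with a "tasks" list.
    """
    stuck = []
    for task in describe_tasks_response.get("tasks", []):
        desired = task.get("desiredStatus")
        last = task.get("lastStatus")
        if desired == "STOPPED" and last != "STOPPED":
            stuck.append(
                {
                    "taskArn": task.get("taskArn", ""),
                    "taskDefinitionArn": task.get("taskDefinitionArn", ""),
                    "lastStatus": last or "",
                    "stoppedReason": task.get("stoppedReason", ""),
                }
            )
    return stuck


# Made-up sample data for illustration only (not taken from our account).
sample_response = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:region:111111111111:task/example/old-1",
            "taskDefinitionArn": "arn:aws:ecs:region:111111111111:task-definition/example:1",
            "desiredStatus": "STOPPED",
            "lastStatus": "RUNNING",
        },
        {
            "taskArn": "arn:aws:ecs:region:111111111111:task/example/old-2",
            "taskDefinitionArn": "arn:aws:ecs:region:111111111111:task-definition/example:1",
            "desiredStatus": "RUNNING",
            "lastStatus": "RUNNING",
        },
    ]
}

for task in find_stuck_tasks(sample_response):
    print(task["taskArn"], task["lastStatus"])
```

In practice the response would come from the ECS API (for example boto3's `describe_tasks`) rather than a hard-coded dict. A task listed here while its container is still running on the instance would match what we observe.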

We have tried new EC2 instances and have been monitoring both the tasks and the instances, but we don't see any problem on our end.

We have a QA environment where we can update the service without this issue.

Our configuration:

  • Min capacity: 30%
  • Max capacity: 200%
  • Min tasks: 2
  • Max tasks: 4
  • AMI ID: ami-0fec9863172e50c93
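For reference, here is a small sketch of what these percentages allow during a rolling deployment. It assumes that "Min capacity" and "Max capacity" are the service's minimum healthy percent and maximum percent, and that the desired count sits somewhere between our min and max tasks (2 to 4). ECS rounds the minimum up and the maximum down. `rolling_bounds` is a hypothetical helper name.

```python
def rolling_bounds(desired_count: int, min_healthy_percent: int, max_percent: int):
    """Return (minimum running tasks, maximum total tasks) during a deployment.

    Integer arithmetic avoids floating-point surprises:
    the lower bound is rounded up, the upper bound is rounded down.
    """
    min_running = -(-desired_count * min_healthy_percent // 100)  # ceiling division
    max_total = desired_count * max_percent // 100                # floor division
    return min_running, max_total


# Assumed desired counts between our min tasks (2) and max tasks (4).
for desired in (2, 3, 4):
    low, high = rolling_bounds(desired, min_healthy_percent=30, max_percent=200)
    print(f"desired={desired}: at least {low} running, at most {high} total")

# Output:
# desired=2: at least 1 running, at most 4 total
# desired=3: at least 1 running, at most 6 total
# desired=4: at least 2 running, at most 8 total
```

Under these assumptions, with a desired count of 2 the scheduler may either stop one old task first or start up to two new tasks alongside the old ones, provided the instances have enough free resources for them.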

Would you be able to help us diagnose and resolve this issue? Any guidance would be greatly appreciated.

Asked 6 months ago, 150 views