Issue with ECS Service Rolling Deployment

We're facing an issue with our ECS service during the rolling deployment process.

Context: We have an ECS service that we regularly update with new task definitions. This service is crucial for our application, and any disruption affects our operations.

Expected behaviour: When we update the service with a new task definition, the service should stop an old task, start a new task with the new task definition, and continue with the rolling deployment.

Actual behaviour: We get an error saying that there are no available instances that can handle the task. The console tells us that one of the old tasks has been stopped. However, after SSHing into the EC2 instance, we observed that all running containers still belong to the old task definition. After we stop the container manually on the instance, the new task is deployed successfully.
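One way to check the mismatch described above from the API side is to look for tasks that ECS has marked for stopping but that have not yet reached the STOPPED state. The following is only a sketch: `find_stuck_tasks` is a hypothetical helper, and it assumes the input has the shape of the `tasks` list returned by boto3's `describe_tasks` (fields `desiredStatus`, `lastStatus`, `taskDefinitionArn`, `containerInstanceArn`). The sample data and ARNs are made up for illustration.

```python
from typing import Any


def find_stuck_tasks(tasks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return tasks whose desiredStatus is STOPPED but whose lastStatus is not.

    `tasks` is assumed to look like the "tasks" list from an ECS
    describe_tasks response; this helper itself never calls AWS.
    """
    return [
        {
            "taskArn": t.get("taskArn"),
            "lastStatus": t.get("lastStatus"),
            "taskDefinitionArn": t.get("taskDefinitionArn"),
            "containerInstanceArn": t.get("containerInstanceArn"),
        }
        for t in tasks
        if t.get("desiredStatus") == "STOPPED" and t.get("lastStatus") != "STOPPED"
    ]


# Made-up sample data standing in for a describe_tasks response.
sample_tasks = [
    {"taskArn": "task/a", "desiredStatus": "STOPPED", "lastStatus": "RUNNING",
     "taskDefinitionArn": "taskdef/app:1", "containerInstanceArn": "ci/1"},
    {"taskArn": "task/b", "desiredStatus": "RUNNING", "lastStatus": "RUNNING",
     "taskDefinitionArn": "taskdef/app:1", "containerInstanceArn": "ci/2"},
    {"taskArn": "task/c", "desiredStatus": "STOPPED", "lastStatus": "STOPPED",
     "taskDefinitionArn": "taskdef/app:1", "containerInstanceArn": "ci/1"},
]

stuck = find_stuck_tasks(sample_tasks)

# Against a real cluster, the input could come from boto3 (the cluster and
# service names below are placeholders, not values from this post):
#   ecs = boto3.client("ecs")
#   arns = ecs.list_tasks(cluster="my-cluster", serviceName="my-service",
#                         desiredStatus="STOPPED")["taskArns"]
#   tasks = ecs.describe_tasks(cluster="my-cluster", tasks=arns)["tasks"] if arns else []
```

If a task from the old task definition shows up in the result for a long time, that would be consistent with the container still running on the instance even though the console reports the task as stopped.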

We have tried with new EC2 instances and have been monitoring the tasks and the instances, and we don’t see any problem on our end.

We have a QA environment where we can update the service without this issue.

Our configuration:

  • Min capacity: 30%
  • Max capacity: 200%
  • Min tasks: 2
  • Max tasks: 4
  • AMI ID: ami-0fec9863172e50c93
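To make the numbers above concrete, here is a small sketch of how these percentages translate into task counts during a rolling deployment. It assumes that "Min capacity" and "Max capacity" correspond to the ECS deployment settings `minimumHealthyPercent` and `maximumPercent`, and that ECS rounds the minimum up and the maximum down, which is my understanding of the documented behaviour. The function name `rolling_bounds` is hypothetical.

```python
def rolling_bounds(desired_count: int, min_healthy_percent: int, max_percent: int) -> tuple[int, int]:
    """Return (minimum running tasks, maximum total tasks) during a deployment.

    Assumes the minimum is rounded up and the maximum is rounded down.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    lower = -(-desired_count * min_healthy_percent // 100)  # ceiling division
    upper = desired_count * max_percent // 100              # floor division
    return lower, upper


# Bounds for the smallest and largest task counts in the configuration above.
bounds_for_2 = rolling_bounds(2, 30, 200)  # (1, 4)
bounds_for_4 = rolling_bounds(4, 30, 200)  # (2, 8)
```

Under these assumptions, with 2 desired tasks the scheduler may drop to 1 running task or go up to 4 in total. These bounds only describe what the scheduler is allowed to do; a new task can still fail to be placed if no instance has enough free CPU, memory, or ports at that moment.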

Would you be able to help us diagnose and resolve this issue? Any guidance would be greatly appreciated.

Asked 7 months ago, 157 views
No answers