My Amazon Elastic Container Service (Amazon ECS) task is taking a long time to move to the STOPPED state. Or, my Amazon ECS task is stuck in the RUNNING state when the container instance is set to DRAINING.
Short description
When you set an ECS instance to DRAINING, Amazon ECS prevents new tasks from being scheduled for placement on the container instance. Amazon ECS stops service tasks in the PENDING state immediately. Amazon ECS transitions service tasks in the RUNNING state to the STOPPED state. Standalone tasks in the PENDING or RUNNING state are unaffected. To stop standalone tasks, wait for them to stop on their own or stop them manually.
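The draining rules above can be summarized as a small decision function. This is only an illustrative sketch: `draining_effect` and its return strings are hypothetical names, not part of any AWS SDK, and the function models just the behavior described in this article.

```python
def draining_effect(started_by_service: bool, last_status: str) -> str:
    """Describe what DRAINING does to one task, per the rules in this article.

    Hypothetical helper for illustration only; not an AWS API.
    """
    if last_status not in ("PENDING", "RUNNING"):
        # The article only describes PENDING and RUNNING tasks.
        return "not applicable"
    if not started_by_service:
        # Standalone tasks keep running until they exit or are stopped manually.
        return "unaffected"
    if last_status == "PENDING":
        return "stopped immediately"
    # A RUNNING service task is moved toward the STOPPED state.
    return "transitioned to STOPPED"


print(draining_effect(True, "RUNNING"))
print(draining_effect(False, "RUNNING"))
```

A standalone task that reports "unaffected" is expected to stay in RUNNING on a draining instance, so it is not a sign of a stuck drain.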
Issues with configuration parameters or tasks can keep tasks in the RUNNING state or delay their transition to the STOPPED state.
To troubleshoot these issues, complete the following tasks:
- Update your DeploymentConfiguration parameters
- Update the deregistration delay value
- Update the ECS_CONTAINER_STOP_TIMEOUT value
- Look for other task-related issues
Resolution
To troubleshoot Amazon ECS tasks that take a long time to stop, complete the following tasks.
Update your DeploymentConfiguration parameters
Complete the following steps:
- Open the Amazon ECS console.
- In the navigation pane, choose Clusters. Then, choose the cluster where your container instance is draining.
- Choose the Infrastructure tab.
- Under Container instances, filter by Status for DRAINING.
- Choose your container instance, and then identify the service that owns the tasks that are draining or taking a long time to drain.
- Choose the Services tab, select the service, and then choose Deployments.
- Check the values for minimumHealthyPercent and maximumPercent.
Note: Service tasks on the container instance that are in the RUNNING state are stopped and replaced according to the service's deployment configuration parameters. For more information, see Draining Amazon ECS container instances.
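To see why these two values matter, the following sketch estimates how much room a service has to replace tasks while an instance drains. It assumes that the lower bound from minimumHealthyPercent rounds up and the upper bound from maximumPercent rounds down; `drain_headroom` and its field names are hypothetical and only illustrate the arithmetic.

```python
def drain_headroom(desired_count: int,
                   minimum_healthy_percent: int,
                   maximum_percent: int) -> dict:
    """Estimate how many tasks can be stopped or started during a drain.

    Assumption: the minimum is rounded up and the maximum is rounded down.
    """
    # Ceiling division with integers: fewest tasks that must keep running.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Floor division: most tasks that may exist at the same time.
    max_total = desired_count * maximum_percent // 100
    can_stop_first = max(desired_count - min_running, 0)
    can_start_first = max(max_total - desired_count, 0)
    return {
        "min_running": min_running,
        "max_total": max_total,
        "can_stop_first": can_stop_first,
        "can_start_first": can_start_first,
        # With no room in either direction, draining tasks cannot be replaced.
        "blocked": can_stop_first == 0 and can_start_first == 0,
    }


print(drain_headroom(4, 100, 100))
print(drain_headroom(4, 50, 200))
```

Under these assumptions, a service with both values set to 100 has no room to stop a task or to start a replacement, which would keep its tasks in RUNNING on the draining instance. Lowering minimumHealthyPercent or raising maximumPercent gives the scheduler room to proceed.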
Update the deregistration delay value
Important: The following steps apply only to services that use the Application Load Balancer or Network Load Balancer. If your service uses the Classic Load Balancer, then check the connection draining values.
Complete the following steps:
- Open the Amazon ECS console.
- In the navigation pane, choose Clusters, and then choose the cluster where your container instance is draining.
- Choose the Services tab, and then select the service with the task that's stuck in RUNNING.
- Choose Target Group Name.
- On the Details tab, scroll down, and then check the Deregistration delay value.
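The same value can also be read outside the console. The sketch below assumes a response shaped like the output of the Elastic Load Balancing `describe-target-group-attributes` operation, where the attribute key is `deregistration_delay.timeout_seconds` and the default is 300 seconds. The helper name and the sample data are hypothetical.

```python
DEFAULT_DEREGISTRATION_DELAY_SECONDS = 300  # assumed default when the key is absent


def deregistration_delay_seconds(response: dict) -> int:
    """Return the deregistration delay from a target group attributes response."""
    for attribute in response.get("Attributes", []):
        if attribute.get("Key") == "deregistration_delay.timeout_seconds":
            # Attribute values are returned as strings.
            return int(attribute["Value"])
    return DEFAULT_DEREGISTRATION_DELAY_SECONDS


# Hypothetical sample response, for illustration only.
sample_response = {
    "Attributes": [
        {"Key": "stickiness.enabled", "Value": "false"},
        {"Key": "deregistration_delay.timeout_seconds", "Value": "120"},
    ]
}

print(deregistration_delay_seconds(sample_response))
```

A task stays registered with the load balancer for up to this many seconds while in-flight requests complete, so a large value lengthens the time that the task takes to stop.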
Update the ECS_CONTAINER_STOP_TIMEOUT value
Complete the following steps:
- Use Secure Shell (SSH) to connect to your container instance.
- To find the ECS_CONTAINER_STOP_TIMEOUT value, run the following command:
docker inspect ecs-agent --format '{{json .Config.Env}}'
- If there's a value for ECS_CONTAINER_STOP_TIMEOUT, then check the duration value.
Note: ECS_CONTAINER_STOP_TIMEOUT is an ECS container agent parameter that defines how long Amazon ECS waits before it forcibly ends a container that doesn't exit on its own. The timer starts when the task is stopped. If the ECS_CONTAINER_STOP_TIMEOUT parameter doesn't appear in the output, then Amazon ECS uses the default value of 30 seconds.
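The docker inspect command prints the agent's environment as a JSON array of KEY=VALUE strings, which can be tedious to scan by eye. As a minimal sketch, the following function extracts the timeout from that output and falls back to the 30-second default described in the note. The function name and the sample output are hypothetical.

```python
import json

DEFAULT_STOP_TIMEOUT = "30s"  # default described in the note above


def stop_timeout_from_env(inspect_output: str) -> str:
    """Return ECS_CONTAINER_STOP_TIMEOUT from '{{json .Config.Env}}' output."""
    # The output is a JSON array of "KEY=VALUE" strings, or null if empty.
    env = json.loads(inspect_output) or []
    for entry in env:
        key, _, value = entry.partition("=")
        if key == "ECS_CONTAINER_STOP_TIMEOUT":
            return value
    return DEFAULT_STOP_TIMEOUT


# Hypothetical sample output, for illustration only.
sample_output = '["ECS_CLUSTER=demo","ECS_CONTAINER_STOP_TIMEOUT=2m"]'

print(stop_timeout_from_env(sample_output))
print(stop_timeout_from_env('["ECS_CLUSTER=demo"]'))
```

To use it, paste or pipe the command's output into the function as a string. A result much larger than the default suggests that containers are allowed a long grace period before they are forcibly ended.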
Look for other task-related issues
To look for other task-related issues, use SSH to connect to your Linux instance. Then, complete the following tasks:
Related information
Draining Amazon ECS container instances