Why isn't my Amazon ECS service stable?
My Amazon Elastic Container Service (Amazon ECS) service periodically restarts itself. I want my Amazon ECS service to be stable or steady.
Short description
Your Amazon ECS service can fail to be stable for one of the following reasons:
- Container health checks fail.
- Elastic Load Balancing (ELB) health checks fail.
- An Amazon ECS task exits with non-zero exit codes.
- A container instance doesn't meet the Amazon ECS task requirements.
- A container instance unexpectedly terminates.
- Amazon ECS tasks in the service fail to start.
Resolution
Use Amazon ECS events on the Amazon ECS console to check why the service isn't stable.
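The same events are also returned in the events list of the DescribeServices API response. The following is a minimal sketch that filters out steady-state notices so that only the events worth investigating remain. It assumes a response dictionary shaped like the one that boto3's describe_services call returns. The helper name recent_problem_events, the service name, and the sample messages are illustrative and not taken from a real cluster.

```python
from datetime import datetime, timezone

# Text that ECS includes in the event it emits when a service is stable.
STEADY_STATE_MARKER = "has reached a steady state"


def recent_problem_events(response, limit=10):
    """Return (createdAt, message) pairs for events that are not steady-state notices.

    The events keep the order in which the response lists them.
    """
    problems = []
    for service in response.get("services", []):
        for event in service.get("events", []):
            message = event.get("message", "")
            if STEADY_STATE_MARKER in message:
                continue
            problems.append((event.get("createdAt"), message))
    return problems[:limit]


# Illustrative response (hypothetical service name and messages).
sample_response = {
    "services": [
        {
            "serviceName": "web",
            "events": [
                {
                    "createdAt": datetime(2025, 3, 1, 12, 5, tzinfo=timezone.utc),
                    "message": "(service web) (task 1234) failed container health checks.",
                },
                {
                    "createdAt": datetime(2025, 3, 1, 12, 0, tzinfo=timezone.utc),
                    "message": "(service web) has reached a steady state.",
                },
            ],
        }
    ]
}

for created_at, message in recent_problem_events(sample_response):
    print(created_at.isoformat(), message)
```

With real data, you might pass in the result of boto3.client("ecs").describe_services(cluster="my-cluster", services=["web"]), where the cluster and service names are placeholders.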
Container health checks fail
Before you deploy your application to Amazon ECS, use health checks to make sure that your container works as expected. If your Amazon ECS service suddenly fails container health checks, then check the application logs. If your application uses the awslogs driver, then check the logs in Amazon CloudWatch.
Note: Health check parameters that you specify in a container definition override Docker health checks that exist in the container image.
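As an illustration of the health check parameters in a container definition, the following sketch builds a healthCheck block as a Python dictionary. The helper name build_health_check, the port, and the /health path are assumptions. The value ranges reflect the documented limits for the ECS HealthCheck parameters as best understood here, so verify them against the current API reference before you rely on them.

```python
def build_health_check(command, interval=30, timeout=5, retries=3, start_period=0):
    """Build a healthCheck block for an ECS container definition.

    Raises ValueError when a value falls outside the assumed allowed range.
    """
    # (value, minimum, maximum) for each field; ranges are assumptions to verify.
    limits = {
        "interval": (interval, 5, 300),
        "timeout": (timeout, 2, 60),
        "retries": (retries, 1, 10),
        "startPeriod": (start_period, 0, 300),
    }
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return {
        # CMD-SHELL runs the command with the container's default shell.
        "command": ["CMD-SHELL", command],
        "interval": interval,
        "timeout": timeout,
        "retries": retries,
        "startPeriod": start_period,
    }


# Hypothetical application that serves a health endpoint on port 8080.
health_check = build_health_check(
    "curl -f http://localhost:8080/health || exit 1",
    start_period=60,
)
print(health_check)
```

The resulting dictionary can be placed under the healthCheck key of a container definition when you register a task definition.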
ELB health checks fail
Load balancers periodically run health checks on each registered target to determine which targets are safe to direct traffic to. When an Amazon ECS service fails because of ELB health checks, there might be a communication issue between the load balancer and the service.
To check whether you correctly configured the load balancers and the Amazon ECS service, take the following actions:
- Confirm that the container security group allows traffic from the load balancer.
- Run the following command within the container to make sure that the application listens on the correct port:
  netstat -tulpn | grep LISTEN
- Make sure that the health check path is correct.
- Check the application logs for errors.
- Run a curl command for the health check path from within the Amazon Elastic Compute Cloud (Amazon EC2) instance. Or, activate ECS Exec on AWS Fargate and run a curl command for the health check path within the container to confirm the response code.
- Monitor the CPU and memory metrics of the service. High CPU utilization can make the application unresponsive and cause ELB health checks to fail.
- Set the minimum health check grace period to be 1.5 - 2.0 times longer than the time it takes the application to reach the ACTIVE state.
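The grace period guidance in the last item is simple arithmetic. The following sketch computes the suggested range from a measured startup time. The helper name suggested_grace_period is hypothetical, and the 45-second startup time is an example value only.

```python
import math


def suggested_grace_period(startup_seconds, low_factor=1.5, high_factor=2.0):
    """Return (minimum, maximum) grace period in whole seconds.

    Applies the 1.5x to 2.0x multipliers to the measured startup time
    and rounds each result up to the next whole second.
    """
    if startup_seconds < 0:
        raise ValueError("startup_seconds must not be negative")
    return (
        math.ceil(startup_seconds * low_factor),
        math.ceil(startup_seconds * high_factor),
    )


# Example: an application that needs about 45 seconds to start.
minimum, maximum = suggested_grace_period(45)
print(f"Set the grace period between {minimum} and {maximum} seconds")
```

A value in this range can then be set as the service's healthCheckGracePeriodSeconds.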
An Amazon ECS task exits with a non-zero exit code
If there's an issue in a container, then Amazon ECS tasks within the service exit with a non-zero exit code. All tasks must have at least one essential container. If the essential container exits for any reason, then the whole task fails and the Amazon ECS service becomes unstable.
The following exit codes are reasons why your Amazon ECS task might fail:
- You get the 1 exit code when there's an application error. For more information about the error, review the application logs.
- You get the 137 exit code when the container was forced to exit (SIGKILL) or there's an out-of-memory (OOM) error. To check whether you're experiencing an OOM issue, review your CloudWatch metrics.
- You get the 139 exit code when there's a segmentation fault. This usually happens when the application tries to access memory that isn't available, or when an environment variable is unset or not valid.
- You get the 255 exit code when the ENTRYPOINT or CMD command in your container fails because of an error. To confirm that this is the cause, review your CloudWatch metrics.
Note: You can use the DescribeTasks API to view the details of a stopped task. However, the details for the stopped task appear in the results for only 1 hour.
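To show how the exit codes above map onto DescribeTasks output, the following sketch summarizes containers that exited with a non-zero code. It assumes a response dictionary shaped like the one that boto3's describe_tasks call returns. The helper name summarize_stopped_tasks, the task ARN, and the sample values are hypothetical, and the hints only restate the list above.

```python
# Hints that restate the exit codes described in this article.
EXIT_CODE_HINTS = {
    1: "application error; review the application logs",
    137: "forced exit (SIGKILL) or out-of-memory error; review CloudWatch metrics",
    139: "segmentation fault; check memory access and environment variables",
    255: "ENTRYPOINT or CMD command failed",
}


def summarize_stopped_tasks(response):
    """Return one finding per container that exited with a non-zero exit code."""
    findings = []
    for task in response.get("tasks", []):
        for container in task.get("containers", []):
            exit_code = container.get("exitCode")
            # Skip containers that have no exit code yet or that exited cleanly.
            if exit_code is None or exit_code == 0:
                continue
            findings.append(
                {
                    "taskArn": task.get("taskArn"),
                    "container": container.get("name"),
                    "exitCode": exit_code,
                    "hint": EXIT_CODE_HINTS.get(exit_code, "not covered in this article"),
                    "stoppedReason": task.get("stoppedReason"),
                }
            )
    return findings


# Illustrative response (hypothetical ARN and container names).
sample_tasks = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/demo/abc123",
            "stoppedReason": "Essential container in task exited",
            "containers": [
                {"name": "app", "exitCode": 137},
                {"name": "sidecar", "exitCode": 0},
            ],
        }
    ]
}

for finding in summarize_stopped_tasks(sample_tasks):
    print(finding["container"], finding["exitCode"], finding["hint"])
```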
A container instance doesn't meet the Amazon ECS task requirements
For more information about how to resolve container requirements, see How do I resolve the "no container instance met all of its requirements" error in Amazon ECS?
A container instance unexpectedly terminates
If you use a capacity provider in your Amazon ECS cluster without managed termination protection, then the capacity provider might terminate instances that have running tasks. This happens during a scale-in action and causes the Amazon ECS service to become unstable. Activate managed termination protection so that the capacity provider doesn't terminate container instances that have running tasks.
To activate managed termination protection, you must turn on instance scale-in protection for the Auto Scaling group.
To turn on scale-in protection, complete the following steps:
- Open the Amazon EC2 console.
- In the navigation pane, choose Auto Scaling Groups, and then select your Auto Scaling group.
- On the Details tab, under Advanced configurations, choose Edit.
- Under Instance scale-in protection, select Enable instance scale-in protection.
- Choose Update.
To activate managed termination protection, complete the following steps:
- Open the Amazon ECS console.
- In the navigation pane, choose Clusters.
- On the Clusters page, select your cluster.
- On the Cluster : name page, choose Infrastructure, and then choose Update.
- On the Create capacity providers page, under Auto Scaling group, under Scaling policies, configure the following options:
  Select Turn on managed scaling.
  Select Turn on scaling protection.
- Choose Update.
Note: Make sure that other tools that you use don't remove the AmazonECSManaged tag from the Auto Scaling group. When a tool removes the tag, Amazon ECS can't manage the scaling.
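The two console procedures above can also be scripted. The following is a minimal sketch, not a definitive implementation. It assumes that you pass in clients created with boto3.client("autoscaling") and boto3.client("ecs"). The helper name enable_termination_protection and the resource names in the usage note are placeholders.

```python
def enable_termination_protection(autoscaling_client, ecs_client,
                                  asg_name, capacity_provider_name):
    """Turn on scale-in protection, then managed termination protection.

    The clients are passed in so that the function can be exercised
    with stand-in objects before you run it against an AWS account.
    """
    # Step 1: protect newly launched instances in the Auto Scaling group
    # from scale-in.
    autoscaling_client.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        NewInstancesProtectedFromScaleIn=True,
    )
    # Step 2: turn on managed scaling and managed termination protection
    # for the capacity provider that uses the group.
    return ecs_client.update_capacity_provider(
        name=capacity_provider_name,
        autoScalingGroupProvider={
            "managedScaling": {"status": "ENABLED"},
            "managedTerminationProtection": "ENABLED",
        },
    )
```

For example, after you create the two clients, you might call enable_termination_protection(autoscaling, ecs, "my-asg", "my-capacity-provider"). The Auto Scaling group setting in this sketch applies to instances launched after the change, so instances that are already running might need scale-in protection set separately.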
Amazon ECS tasks within the service fail to start
When you create or update a service, the service might not be stable because the Amazon ECS task can't pull the image.
To resolve this issue, see the following AWS Knowledge Center articles:
- How do I resolve a "ResourceInitializationError" when I try to pull secrets or retrieve Amazon ECR authentication for ECS tasks?
- How do I resolve the "cannotpullcontainererror" error for my Amazon ECS tasks on Fargate?
Related information
How do I troubleshoot container health check failures for Amazon ECS tasks?
Why has my Amazon ECS task stopped?
