How do I troubleshoot the container health check failures for Amazon ECS tasks?

My Amazon Elastic Container Service (Amazon ECS) task fails the container health check.

Short description

If you receive the following error, then the Amazon ECS containers in your task are using health checks that your service can't pass:

(service AWS-Service) (task ff3e71a4-d7e5-428b-9232-2345657889) failed container health checks

Resolution

To troubleshoot Amazon ECS container health check failures, complete the following steps:

  • Before you provision to Amazon ECS, locally test the container to make sure that it passes the container health checks.
  • Confirm that the command that you pass to the container is correct and that you use the correct syntax for your Amazon ECS tasks.
  • Give your container enough time to initiate.
  • If your Amazon ECS task continues to run for an extended time, then check your application logs and Amazon CloudWatch logs.

Locally test the container to make sure that it passes the container health check

Before you provision your container to Amazon ECS, make sure that your container works as expected and can pass the specified container health check. To test the container, add a HEALTHCHECK instruction to your Dockerfile (see the HEALTHCHECK section of the Dockerfile reference on the Docker website) and run the container locally. Confirm that the container passes the health check that's defined in the Dockerfile. Then, specify the health check configuration in the task definition to allow the Amazon ECS container agent to monitor and report the health check.

Note: Amazon ECS doesn't monitor Docker health checks that are embedded in a container image and aren't specified in the container definition. Health check parameters that are specified in a container definition override Docker health checks that exist in the container image.
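
The local test described above can be scripted. The following is a minimal sketch, assuming that the Docker CLI is installed and that a container built from your image is already running locally. The function names are illustrative and aren't part of any AWS or Docker tooling. The sketch reads the health state that Docker records for a container (`docker inspect --format '{{json .State.Health}}'`) and reduces it to the fields that usually matter when you debug a failing check.

```python
import json
import subprocess


def summarize_health(health_json):
    """Reduce the JSON printed by `docker inspect` to the key debugging fields."""
    health = json.loads(health_json)
    if not health:
        # Docker prints `null` when the container has no health check at all.
        return {"status": "none", "failing_streak": 0,
                "last_exit_code": None, "last_output": ""}
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return {
        "status": health.get("Status"),            # starting, healthy, or unhealthy
        "failing_streak": health.get("FailingStreak", 0),
        "last_exit_code": last.get("ExitCode"),     # 0 means the last probe passed
        "last_output": (last.get("Output") or "").strip(),
    }


def inspect_health(container):
    """Run `docker inspect` for a local container name or ID (needs the Docker CLI)."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .State.Health}}", container],
        capture_output=True, text=True, check=True,
    )
    return summarize_health(result.stdout)
```

For example, `inspect_health("my-test-container")` (a hypothetical container name) returns a dictionary whose `status` should read `healthy` before you copy the same command into the task definition.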

Confirm that you use the correct syntax for your Amazon ECS tasks

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, make sure that you're using the most recent version of the AWS CLI.

Use the correct commands and syntax for your Amazon ECS tasks. If you use the AWS Management Console JSON panel, the AWS CLI, or APIs, then enclose the command in square brackets as a list of quoted strings.

Example command:

["CMD-SHELL", "curl -f http://localhost/ || exit 1"]

If you use the AWS Management Console to edit your ECS task, then you don't need to include the brackets:

CMD-SHELL, curl -f http://localhost/ || exit 1

Don't split the health check command into separately quoted elements, such as ["CMD-SHELL", "healthcheck.sh", "||", "exit 1"]. With CMD-SHELL, the whole shell command must be a single string. Instead, use the following command syntax:

["CMD-SHELL", "healthcheck.sh || exit 1"]
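
To catch the split-command mistake before you register a task definition, you can lint the command list. The following is a rough sketch, not an official validator: the function name and the set of operators that it looks for are assumptions made for illustration. It flags a CMD-SHELL entry that isn't followed by exactly one string, and shell operators that appear as their own list elements.

```python
# Operators that only a shell interprets; as separate list elements they are
# passed along as literal arguments instead of being interpreted.
SHELL_OPERATORS = {"||", "&&", "|", ";"}


def check_health_command(command):
    """Return a list of likely problems in an ECS healthCheck command list."""
    if not command or not all(isinstance(part, str) for part in command):
        return ["command must be a non-empty list of strings"]
    problems = []
    if command[0] == "CMD-SHELL" and len(command) != 2:
        problems.append(
            "CMD-SHELL should be followed by one string that holds the whole shell command"
        )
    stray = [part for part in command[1:] if part in SHELL_OPERATORS]
    if stray:
        problems.append(f"shell operators passed as separate elements: {stray}")
    return problems
```

An empty result means that the sketch found nothing suspicious. It doesn't prove that the command succeeds inside the container.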

Give your container enough time to initiate

If your container takes a long time to initiate, then your container can fail the container health check. Set the startPeriod health check parameter in the advanced container definition parameters. This gives your Amazon ECS container time to bootstrap before any failed health checks are included in the maximum number of retries.
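
As an illustration, the following sketch builds a healthCheck block with a start period and rejects values outside the allowed ranges. The ranges and defaults in the code reflect the Amazon ECS task definition parameters as understood at the time of writing (all durations in seconds), so verify them against the current documentation. The helper name and the 120-second value are hypothetical.

```python
import json

# Assumed limits for the ECS healthCheck parameters (durations in seconds).
LIMITS = {
    "interval": (5, 300),
    "timeout": (2, 60),
    "retries": (1, 10),
    "startPeriod": (0, 300),
}


def build_health_check(command, interval=30, timeout=5, retries=3, start_period=0):
    """Return a healthCheck dictionary for a container definition."""
    values = {
        "interval": interval,
        "timeout": timeout,
        "retries": retries,
        "startPeriod": start_period,
    }
    for name, value in values.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return {"command": command, **values}


# Example: allow a slow-starting application two minutes before failures count.
health_check = build_health_check(
    ["CMD-SHELL", "curl -f http://localhost/ || exit 1"],
    start_period=120,
)
print(json.dumps(health_check, indent=2))
```

You can paste the printed JSON under the healthCheck key of the container definition. Failed checks during the start period don't count toward retries.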

For tasks that are running for a long time, check your application logs and Amazon CloudWatch logs

If your Amazon ECS container is running for a long time and fails the container health check, then check your application logs. If your Amazon ECS task uses the awslogs log driver, then check your application logs in CloudWatch.
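
To find the right log stream in CloudWatch, it helps to know how the awslogs driver names streams. The following sketch assumes that the awslogs-stream-prefix option is set in the task definition, in which case the stream name has the form prefix/container-name/task-id. The function name and the sample values are hypothetical.

```python
def awslogs_stream_name(stream_prefix, container_name, task):
    """Build the CloudWatch log stream name for one container in one task.

    `task` can be a bare task ID or a full task ARN; the ID is the last
    path segment of the ARN.
    """
    task_id = task.rsplit("/", 1)[-1]
    return f"{stream_prefix}/{container_name}/{task_id}"


# Hypothetical values for illustration only.
stream = awslogs_stream_name(
    "web",
    "app",
    "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef0123456789abcdef",
)
print(stream)
```

You can then look up that stream in the log group that's named by the awslogs-group option, either in the CloudWatch console or with the CloudWatch Logs API.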

Note: AWS Fargate is a managed service. Therefore, you can't access the underlying infrastructure. To troubleshoot this issue, launch your Amazon ECS tasks in Amazon Elastic Compute Cloud (Amazon EC2). Then, use SSH to connect to your Amazon EC2 instances. You can also use Amazon ECS Exec to interact directly with your ECS containers.

Related information

How can I get my Amazon ECS tasks running using the Amazon EC2 launch type to pass the Application Load Balancer health check in Amazon ECS?

AWS OFFICIAL, updated a year ago
4 Comments

Is there a way to alert on this health check?

replied 2 years ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 2 years ago

Exactly my thinking. The current state of these logs, UI is horrible.

replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied a year ago