Why are the tasks in my Amazon ECS cluster failing to start?

I'm trying to place a task in my Amazon Elastic Container Service (Amazon ECS) cluster. However, my task placement is failing, and my task won't change to the RUNNING state in my cluster.

Short description

To successfully place your task in your cluster, choose one of the following solutions:

  • If you launched your task as part of an Amazon ECS service, then complete the steps in the Check your service event messages and the Check the stopped task for errors sections.
  • If you ran your task as a standalone task or scheduled task, then complete the steps in the Check the stopped task for errors section.

Resolution

Check your service event messages

  1. Open the Amazon ECS console.
  2. In the navigation menu, choose Clusters, and then select the cluster that contains your service.
  3. On the Services tab of your cluster's page, in the Service Name column, select the service that you want to inspect.
  4. On your service's page, choose Events.
  5. In the Message column, look for errors or other useful information.

Based on your findings from step 5, review Service event messages to troubleshoot your error.

Note: Service events display only the last 100 events.
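
The same events can also be read programmatically. The following is a minimal Python sketch, assuming the boto3 SDK and its describe_services call. The function names are hypothetical, and the client is passed in as an argument so that the helpers themselves don't depend on AWS credentials. The phrase used for filtering is taken from the placement error quoted in Related information and might not cover every failure message.

```python
from typing import Any, Dict, List


def recent_service_events(ecs_client: Any, cluster: str, service: str,
                          limit: int = 10) -> List[Dict[str, Any]]:
    """Return the newest `limit` events of one service.

    `ecs_client` is assumed to behave like boto3.client("ecs").
    """
    response = ecs_client.describe_services(cluster=cluster, services=[service])
    # Services that can't be described are reported under "failures".
    if response.get("failures"):
        raise LookupError(f"describe_services failed: {response['failures']}")
    services = response.get("services", [])
    if not services:
        raise LookupError(f"service {service!r} not found in cluster {cluster!r}")
    events = services[0].get("events", [])
    # Sort by timestamp explicitly instead of relying on the response order.
    newest_first = sorted(events, key=lambda e: e["createdAt"], reverse=True)
    return newest_first[:limit]


def placement_problems(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only events whose message reports a failed task placement."""
    return [e for e in events if "unable to place a task" in e["message"]]
```

With boto3 installed and credentials configured, a call such as recent_service_events(boto3.client("ecs"), "my-cluster", "my-service") would return the latest events, where my-cluster and my-service are placeholder names.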

Check the stopped task for errors

Important: You can view a stopped task only if it stopped within the last hour.

  1. Open the Amazon ECS console.
  2. In the navigation menu, choose Clusters, and then select the cluster that contains your stopped task.
  3. On your cluster's page, choose the Tasks tab.
  4. In the Desired task status table header, choose Stopped, and then select the stopped task to inspect. The most recent stopped tasks are listed first.
  5. On the Details tab of your stopped task, inspect the Stopped reason field to find out why your task was stopped.
  6. If there is a container that's stopped and the Stopped reason is Task failed to start, expand the container, and then inspect the Status reason row to see what caused the task state to change.
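
The console steps above can be approximated with the ECS API. This is a minimal Python sketch, assuming the boto3 list_tasks and describe_tasks calls. The function name and the report layout are hypothetical. It also assumes that the task-level stoppedReason field and the container-level reason field correspond to the Stopped reason and Status reason values shown in the console.

```python
from typing import Any, Dict, List


def stopped_task_report(ecs_client: Any, cluster: str) -> List[Dict[str, Any]]:
    """Summarize why the stopped tasks in `cluster` stopped.

    `ecs_client` is assumed to behave like boto3.client("ecs").
    """
    listing = ecs_client.list_tasks(cluster=cluster, desiredStatus="STOPPED")
    # Only the first page of results is read; follow "nextToken" for more.
    task_arns = listing.get("taskArns", [])
    if not task_arns:
        return []
    # describe_tasks accepts at most 100 task ARNs per call.
    described = ecs_client.describe_tasks(cluster=cluster, tasks=task_arns[:100])
    report = []
    for task in described.get("tasks", []):
        report.append({
            "taskArn": task["taskArn"],
            "stoppedReason": task.get("stoppedReason", ""),
            "containers": [
                {
                    "name": container.get("name"),
                    "reason": container.get("reason", ""),
                    "exitCode": container.get("exitCode"),
                }
                for container in task.get("containers", [])
            ],
        })
    return report
```

Each entry pairs one task's stop reason with the per-container details, which mirrors steps 5 and 6.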

Based on your findings from step 5, review the following information to resolve your error:

  • Task failed ELB health checks in (elb elb-name): The current task failed the Elastic Load Balancing health check for the load balancer that's associated with the task's service. For more information, see Troubleshooting service load balancers.
    Note: This cause applies only to tasks that were launched as part of a service.
  • Scaling activity initiated by (deployment deployment-id): When you reduce the desired count of a stable service, some tasks must be stopped to reach the desired number. You see this Stopped reason for tasks that are stopped due to the downscaling of services. For more information, see Troubleshooting service auto scaling.
    Note: This cause applies only to tasks that were launched as part of a service.
  • Host EC2 (instance id) stopped/terminated: You see this Stopped reason if you stop or terminate an Amazon Elastic Compute Cloud (Amazon EC2) container instance with running tasks. To investigate why your Amazon EC2 instance was terminated, see Why did Amazon EC2 terminate my instance?
  • Container instance deregistration forced by user: If you force the deregistration of a container instance with running tasks, then you see this Stopped reason.
  • Essential container in task exited: If a container marked as essential in the task definition exits or dies, the task might be stopped. You see this Stopped reason when an essential container exiting is the cause of a stopped task. In this case, the findings from step 6 provide more diagnostic information about why the container stopped.
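
When you review many stopped tasks, the list above can be turned into a small lookup. This is a minimal Python sketch. The function name and the wording of the hints are illustrative, and the match is a case-insensitive substring test because the exact text of a Stopped reason includes variable parts such as the load balancer name or deployment ID.

```python
from typing import List, Tuple

# (marker, hint) pairs taken from the Stopped reasons described above.
STOPPED_REASON_HINTS: List[Tuple[str, str]] = [
    ("Task failed ELB health checks",
     "The task failed the load balancer health check of its service."),
    ("Scaling activity initiated by",
     "The service's desired count was reduced, so tasks were stopped."),
    ("Host EC2",
     "The EC2 container instance was stopped or terminated."),
    ("Container instance deregistration forced by user",
     "The container instance was force-deregistered with running tasks."),
    ("Essential container in task exited",
     "An essential container exited; inspect the container's Status reason."),
]


def explain_stopped_reason(stopped_reason: str) -> str:
    """Return a short hint for a task's Stopped reason, if one is known."""
    lowered = stopped_reason.lower()
    for marker, hint in STOPPED_REASON_HINTS:
        if marker.lower() in lowered:
            return hint
    return "No known match; review the stopped task error codes and API failure reasons."
```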

Also, review API failure reasons.

Based on your findings from step 6, review the following information to resolve your error:

  • If the container status has the error CannotPullContainerError, see CannotPullContainer task errors.
  • For other error messages returned and more information on these error messages, see Stopped tasks error codes.
  • If this inspection doesn't provide enough information and you used the EC2 launch type, then connect to the container instance with SSH and inspect the Docker container locally. For more information, see Inspect Docker containers.
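
The routing described above can be sketched in Python. This sketch assumes that a container's status reason begins with an error name followed by a colon, as in the CannotPullContainerError messages quoted in Related information. Other reasons might not follow that pattern, and the function names are hypothetical.

```python
from typing import Optional


def container_error_code(reason: str) -> Optional[str]:
    """Extract a leading error name such as 'CannotPullContainerError'.

    Returns None when the reason has no 'Name: details' shape.
    """
    head, separator, _ = reason.partition(":")
    head = head.strip()
    # An error name is assumed to be a single token before the first colon.
    if not separator or not head or " " in head:
        return None
    return head


def suggested_reading(reason: str) -> str:
    """Name the documentation topic to review for a container status reason."""
    if container_error_code(reason) == "CannotPullContainerError":
        return "CannotPullContainer task errors"
    return "Stopped tasks error codes"
```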

Note: If you're using any task placement constraints or strategies, your cluster must use instances that meet the requirements of your constraints or strategies.


Related information

How do I resolve the "[AWS service] was unable to place a task because no container instance met all of its requirements" error in Amazon ECS?

How do I resolve "the closest matching container-instance container-instance-id has insufficient CPU units available" error in Amazon ECS?

How do I resolve "the closest matching container-instance container-instance-id encountered error 'AGENT'" error for my service in Amazon ECS?

How can I resolve the Amazon ECR error "CannotPullContainerError: API error" in Amazon ECS?

How can I resolve the “CannotPullContainerError: Error response from daemon:Get https://registry-name/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)” error in Amazon ECS?

AWS OFFICIAL
Updated 2 years ago