ECS (EC2 launch type) new deployment container fails to start with no clear error message

Hi everyone,

I have an application deployed on an AWS ECS cluster using the EC2 launch type. When a new deployment is triggered for a specific service, the new container is not successfully created.

In the ECS console, under Tasks → Stopped tasks, the Error code / Stopped reason is either empty or very generic (e.g., "Task failed to start"), which does not help in identifying the root cause.

Environment details:

  • ECS launch type: EC2 (c5.2xlarge)
  • Available resources on the EC2 instance before starting the new task: 4.75 vCPU and 8.244 GB memory
  • New container resource requirements: 0.75 vCPU and 3 GB memory

The EC2 instance hosting this service has enough available CPU and memory at the time of deployment.

Even though there are sufficient resources available, the task fails to start, and no detailed reason is provided in the ECS console.

Any advice or suggestions would be greatly appreciated!

Thanks in advance!

asked 9 months ago, 160 views
1 Answer

To troubleshoot your ECS deployment failure on EC2, you'll need to gather more detailed information since the console is showing only generic error messages.

Here are steps to diagnose the issue:

  1. Check the stopped task details and logs:
  • Open the Amazon ECS console
  • Navigate to Clusters > Your Cluster > Tasks tab
  • Select the Task ID of the failed task
  • Check the stopped reason and the status of each container for more details
  • Review the logs for specific errors (note that a stopped task stays visible in the console for only about an hour after it stops; if the task definition uses the awslogs log driver, the container logs remain in CloudWatch Logs after that)
  2. Use the DescribeTasks API to get more detailed information about the task's stoppedReason
  3. Verify your container instance configuration:
  • Confirm that your cluster has active instances
  • Ensure the task's CPU/memory requirements don't exceed the container instance's remaining capacity
  • Check if there are any host port conflicts with existing tasks
  4. Check IAM permissions:
  • Verify that your container instance role has the necessary permissions
  • Ensure the task execution role has the proper permissions to pull images from ECR (if you're using ECR)
  • The IAM policy "AmazonECSTaskExecutionRolePolicy" should be attached to the task execution role
  5. Verify network configuration:
  • Check security groups and network settings
  • Ensure the task has proper network connectivity
  6. Check for compatibility issues:
  • Ensure the task definition is compatible with the EC2 launch type (its requiresCompatibilities setting should include EC2)
  • Verify that your container image URI is correct and accessible
  7. Look for deployment circuit breaker errors:
  • If the deployment circuit breaker was triggered, it indicates repeated failures
  • This could be due to failed health checks, missing ECR images, or resource constraints
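
As a rough illustration of the DescribeTasks step above, the following Python sketch pulls the failure-related fields out of a DescribeTasks response. It assumes boto3 is installed and AWS credentials are configured; the function names and the cluster/service arguments are hypothetical and not part of the original answer. Container-level `reason` and `exitCode` values often say more than the task-level stopped reason shown in the console.

```python
from typing import Any


def summarize_stopped_task(task: dict[str, Any]) -> dict[str, Any]:
    """Collect the failure-related fields from one DescribeTasks task entry."""
    return {
        "taskArn": task.get("taskArn"),
        "stopCode": task.get("stopCode"),
        "stoppedReason": task.get("stoppedReason"),
        # Each container can carry its own reason and exit code.
        "containers": [
            {
                "name": c.get("name"),
                "exitCode": c.get("exitCode"),
                "reason": c.get("reason"),
            }
            for c in task.get("containers", [])
        ],
    }


def fetch_stopped_tasks(cluster: str, service: str) -> list[dict[str, Any]]:
    """Look up recently stopped tasks of a service (hypothetical helper).

    Requires boto3 and AWS credentials. Only the first page of results
    is read, which is enough for a quick look at recent failures.
    """
    import boto3  # imported lazily so the summarizer works without boto3

    ecs = boto3.client("ecs")
    arns = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )["taskArns"]
    if not arns:
        return []
    tasks = ecs.describe_tasks(cluster=cluster, tasks=arns)["tasks"]
    return [summarize_stopped_task(t) for t in tasks]
```

Calling `fetch_stopped_tasks("my-cluster", "my-service")` (both names are placeholders) soon after a failed deployment should return one summary per stopped task, as long as the tasks are still within the window in which ECS reports them.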

Since you've already verified that resources are sufficient, the issue might be related to permissions, network configuration, or container image access. Creating and properly configuring the container instance role and task execution role is particularly important for the EC2 launch type.
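
To double-check the capacity and port-conflict items above against what ECS itself reports, one option is to compare the task's requirements with the `remainingResources` list that DescribeContainerInstances returns for the instance. The sketch below is a minimal, hypothetical helper (the function name and sample numbers are assumptions). It relies on the API's documented conventions that CPU is expressed in CPU units (1024 per vCPU), MEMORY in MiB, and that the PORTS entry lists host ports already reserved on the instance.

```python
from typing import Any


def placement_blockers(
    remaining: list[dict[str, Any]],
    cpu_units: int,
    memory_mib: int,
    host_ports: list[int],
) -> list[str]:
    """Return human-readable reasons why a task would not fit on an instance.

    `remaining` is the remainingResources list of one container instance.
    An empty result means no blocker was found by these three checks.
    """
    by_name = {r["name"]: r for r in remaining}
    blockers: list[str] = []

    free_cpu = by_name.get("CPU", {}).get("integerValue", 0)
    if cpu_units > free_cpu:
        blockers.append(f"CPU: need {cpu_units} units, {free_cpu} free")

    free_mem = by_name.get("MEMORY", {}).get("integerValue", 0)
    if memory_mib > free_mem:
        blockers.append(f"MEMORY: need {memory_mib} MiB, {free_mem} free")

    # PORTS is a string set of host ports that are already reserved.
    used_ports = set(by_name.get("PORTS", {}).get("stringSetValue", []))
    for port in host_ports:
        if str(port) in used_ports:
            blockers.append(f"PORT: host port {port} is already in use")

    return blockers


# Illustrative values only: 0.75 vCPU = 768 CPU units, 3 GB taken as 3072 MiB.
example_remaining = [
    {"name": "CPU", "type": "INTEGER", "integerValue": 4864},
    {"name": "MEMORY", "type": "INTEGER", "integerValue": 8441},
    {"name": "PORTS", "type": "STRINGSET",
     "stringSetValue": ["22", "2375", "2376", "51678", "51679", "8080"]},
]
print(placement_blockers(example_remaining, 768, 3072, [8080]))
```

In this made-up example CPU and memory fit, but a static host port mapping on 8080 collides with a port already reserved on the instance, which is one way a task can fail to be placed even when the resource figures look sufficient.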
Sources
ECS deployment task failed (deployed on EC2) | AWS re:Post
Troubleshoot my failed Amazon ECS deployments | AWS re:Post
Troubleshoot Amazon ECS deployment failures with an activated circuit breaker | AWS re:Post
Troubleshooting Amazon ECS TaskFailedToStart errors - Amazon Elastic Container Service
Troubleshoot schedule issues for Amazon ECS tasks on EC2 instances | AWS re:Post

answered 9 months ago
EXPERT
reviewed 9 months ago
