EC2, ECS Agent Failing to Start – IAM Role & Docker API Version Issues


Hello AWS Community,

I'm experiencing issues with the ECS agent on my EC2 instance. The agent fails to start properly, and when I check the logs, I see multiple errors related to IAM role permissions and Docker API version mismatch.

Logs: ECS Agent Restarting Constantly:

Mar 14 21:37:02 amazon-ecs-init[4660]: level=warn msg="ECS Agent failed to start, retrying in 1.084438935s"
Mar 14 21:37:03 amazon-ecs-init[4660]: level=info msg="Removing existing agent container ID: 6c20e43d704f..."
Mar 14 21:37:03 amazon-ecs-init[4660]: level=info msg="Starting Amazon Elastic Container Service Agent"

Docker API Version Issue:

level=info msg="Unable to get Docker client for version 1.22: Error response from daemon: client version 1.22 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version"
level=info msg="Unable to get Docker client for version 1.23: Error response from daemon: client version 1.23 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version"

IAM Role Error – ECS Agent Lacks Proper Permissions:

level=error msg="Error getting ECS instance credentials: no valid providers in chain: failed to refresh cached credentials, failed to get ecs-instance-role EC2 IMDS role credentials, api error AssumeRoleUnauthorizedAccess: EC2 cannot assume the role ecs-instance-role."

Current Issues:

ECS Agent fails to start and restarts repeatedly.
IAM role "ecs-instance-role" seems to lack proper permissions.
Docker API version on the instance is outdated (1.22, 1.23 instead of 1.24+).

curl http://localhost:51678/v1/metadata does not respond, meaning the ECS agent is not running correctly.

Troubleshooting Steps Taken:

Checked ECS agent logs:

sudo journalctl -u ecs --no-pager --lines=50

Checked IAM Role Permissions for ECS Instance Role:

Verified that the role "ecs-instance-role" is attached.
Checked if "AmazonEC2ContainerServiceforEC2Role" policy is assigned.

Checked Docker Version:

docker --version

Restarted ECS Agent:

sudo systemctl restart ecs

Reinstalled ECS Agent:

sudo yum remove -y ecs-init && sudo yum install -y ecs-init

Updated the instance packages with yum update.

AWS documentation followed:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-logs-collector.html
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-config.html

[Attachment: role description]

I checked whether anything was missing in the infrastructure and created it manually, but that did not work. [Attachment: infrastructure]

I ran the automation task from the official documentation. It reported success but did not resolve anything. [Attachment: automation task]

ECS log: [Attachment: ECS log]

The Terraform project I'm creating, which you can test or modify: https://github.com/easy-post-ia-1/terraform-front

Questions:

How can I update Docker to ensure it supports API version 1.24+?
What specific IAM permissions should I check for "ecs-instance-role"?
Why is the ECS agent failing to start properly?
Are there any additional troubleshooting steps?

Any help would be greatly appreciated! 🚀

1 Answer

Based on the issues you're experiencing, it appears there are multiple problems that need to be addressed to get your ECS agent running properly on your EC2 instance. Let's break down the issues and provide solutions:

  1. Docker API Version Issue: The messages show the Docker daemon rejecting client API versions 1.22 and 1.23 because its minimum supported API version is 1.24. They are logged at info level while the agent probes for an API version the daemon accepts, so by themselves they do not show that Docker is outdated and are unlikely to be what stops the agent. If you still want to confirm that Docker is current:
  • Update Docker to the latest version available for your AMI by running:
    sudo yum update -y docker

  • After updating, restart the Docker service:
    sudo service docker restart

  2. IAM Role Permissions: The error says that EC2 cannot assume the "ecs-instance-role", which is the most likely reason the agent cannot obtain credentials. To fix this:
  • Ensure that the IAM role attached to your EC2 instance has the necessary permissions. At a minimum, it should have the "AmazonEC2ContainerServiceforEC2Role" policy attached.
  • Check the trust relationship of the IAM role. It should allow the EC2 service (ec2.amazonaws.com) to assume the role through sts:AssumeRole.
  • Verify that the instance profile is correctly associated with your EC2 instance.
  3. ECS Agent Failing to Start: The constant restarting of the ECS agent could be due to the above issues or additional configuration problems. To troubleshoot:
  • Check the ECS agent configuration file (/etc/ecs/ecs.config) for any misconfigurations.
  • Ensure that the EC2 instance has outbound internet access to communicate with ECS service endpoints.
  • Verify that the security groups and network ACLs allow necessary traffic.
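
The trust-relationship check from the IAM Role Permissions item above can be sketched in shell. This is a minimal sketch under stated assumptions, not an official AWS procedure: `trust_allows_ec2` is a hypothetical helper name, and the pattern match only looks for `ec2.amazonaws.com` as a `Service` principal. It does not evaluate `Effect`, `Action`, or conditions, so treat a pass as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Sketch: read an IAM trust policy (JSON) on stdin and succeed only if
# ec2.amazonaws.com appears as a Service principal.
# trust_allows_ec2 is a hypothetical helper name, not an AWS tool.
trust_allows_ec2() {
  # Strip whitespace so pretty-printed and compact JSON match the same way,
  # then accept either "Service":"ec2..." or "Service":[..., "ec2..."].
  tr -d ' \n\t' | grep -Eq '"Service":(\[[^]]*)?"ec2\.amazonaws\.com"'
}

# Possible usage on a machine with AWS CLI credentials
# (role name taken from the log message in the question):
#   aws iam get-role --role-name ecs-instance-role \
#     --query 'Role.AssumeRolePolicyDocument' --output json \
#     | trust_allows_ec2 && echo "EC2 is a trusted principal" \
#     || echo "EC2 is NOT a trusted principal"
```

If the check fails, the role's trust policy probably names a different service principal, and EC2 would then be unable to assume the role regardless of which permission policies are attached.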

Additional troubleshooting steps:

  • Check the full ECS agent logs for more detailed error messages:
    sudo cat /var/log/ecs/ecs-agent.log
    
  • Verify DNS resolution is working correctly in your VPC.
  • Ensure the instance can access the EC2 instance metadata service.
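
To check the metadata service and the role credentials from the instance itself, something like the following may help. It assumes IMDSv2 is reachable at the standard link-local address and that the `iam/info` path returns a JSON document with a `Code` field; `imds_get` and `iam_info_code` are hypothetical helper names introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch: query the EC2 instance metadata service (IMDSv2) and report the
# status code of the attached instance profile.
# imds_get and iam_info_code are hypothetical helper names.

# Fetch one metadata path using a short-lived IMDSv2 session token.
imds_get() {
  local token
  token=$(curl -sf --max-time 5 -X PUT \
    "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -s --max-time 5 -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/$1"
}

# Extract the value of the "Code" field from JSON on stdin.
iam_info_code() {
  sed -n 's/.*"Code"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Possible usage on the instance:
#   imds_get iam/info | iam_info_code
```

A value of `Success` would suggest the instance is receiving role credentials. A value such as `AssumeRoleUnauthorizedAccess`, the code that appears in the log in the question, points back at the role's trust relationship rather than at networking.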

If issues persist after trying these steps, you may need to:

  1. Terminate the problematic instance and launch a new one with the latest ECS-optimized AMI.
  2. Double-check all network configurations, including VPC settings, route tables, and endpoints.
  3. Verify that your AWS account is properly set up for ECS usage, including having the necessary service-linked roles.

Remember to reboot your instance after making significant changes to allow new configurations to take effect.
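
After a restart or reboot, the introspection endpoint mentioned in the question gives a quick health check. This is a minimal sketch, assuming the endpoint returns JSON containing a `Cluster` field once the agent is running; `agent_cluster` and `check_agent` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: confirm the ECS agent answers on its local introspection endpoint.
# agent_cluster and check_agent are hypothetical helper names.

# Extract the value of the "Cluster" field from JSON on stdin.
agent_cluster() {
  sed -n 's/.*"Cluster"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Query the endpoint and print the cluster the agent reports.
check_agent() {
  local body cluster
  body=$(curl -s --max-time 5 http://localhost:51678/v1/metadata) || {
    echo "ECS agent introspection endpoint did not respond" >&2
    return 1
  }
  cluster=$(printf '%s' "$body" | agent_cluster)
  if [ -n "$cluster" ]; then
    echo "agent reports cluster: $cluster"
  else
    echo "endpoint responded but no Cluster field was found" >&2
    return 1
  fi
}

# Possible usage on the instance:
#   sudo systemctl restart ecs && sleep 30 && check_agent
```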

answered a month ago
  • I did all of this, but it still doesn't work :c
