Fargate task with public IP times out on ECR registry auth

I'm unable to run services with ECS because my Fargate task cannot reach ECR. The task stops with the following error:

Task stopped at: 2024-01-19T08:58:58.698Z
ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post "https://api.ecr.eu-west-1.amazonaws.com/": dial tcp 63.34.60.177:443: i/o timeout. Please check your task network configuration.

I have checked everything that should be needed for egress traffic.

1 - Auto-assign public IP is enabled for my tasks. As you can see, this is working fine:

Enter image description here

2 - Service security group outbound rules allow all traffic:

Enter image description here

3 - The VPC network ACL is the default one, with the following rules for both inbound and outbound, so this shouldn't be the problem either:

  • 100 - allow all IPv4
  • 110 - allow all IPv6
  • default deny all

4 - Public subnets are associated with a route table that has a default route to the Internet Gateway:

Enter image description here

If I create an EC2 instance with the same security group in one of those subnets, I can reach ECR without issues. I don't understand why the task's traffic is being rejected. In the VPC flow logs I can see that packets are rejected, but no security group or NACL rule explains this.

Enter image description here

Any advice on how to troubleshoot further?

EDIT: Added the route table's subnet associations

Enter image description here

The VPC was created using Terraform:

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.1.1"

  name           = "${local.namespace}-vpc"
  cidr           = "10.10.0.0/20"
  azs            = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  public_subnets = ["10.10.0.0/23", "10.10.2.0/23", "10.10.4.0/23"]

  enable_nat_gateway   = false
  enable_dns_hostnames = true
  enable_dns_support   = true
  enable_dhcp_options  = true
}
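
For reference, checks 1 and 2 above would roughly correspond to the Terraform below. This is only a minimal sketch, not the configuration from the question: the resource labels (`service`, `app`) and the references `aws_ecs_cluster.main` and `aws_ecs_task_definition.app` are hypothetical, while `module.vpc.vpc_id` and `module.vpc.public_subnets` are outputs of the VPC module shown above.

```hcl
# Hypothetical sketch: resource labels and the cluster/task definition
# references are assumptions, not taken from the original configuration.
resource "aws_security_group" "service" {
  name   = "${local.namespace}-service"
  vpc_id = module.vpc.vpc_id

  # Check 2: allow all outbound traffic (IPv4 and IPv6).
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}

resource "aws_ecs_service" "app" {
  name            = "${local.namespace}-app"
  cluster         = aws_ecs_cluster.main.id         # assumed to exist elsewhere
  task_definition = aws_ecs_task_definition.app.arn # assumed to exist elsewhere
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    # Tasks run in the public subnets created by the VPC module.
    subnets         = module.vpc.public_subnets
    security_groups = [aws_security_group.service.id]

    # Check 1: without a NAT gateway, the task needs a public IP to reach ECR.
    assign_public_ip = true
  }
}
```

Because `enable_nat_gateway` is `false`, the public IP plus the Internet Gateway route is the only egress path in this layout, so each of these settings has to be in place for the image pull to succeed.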
asked 3 months ago · 251 views
1 Answer

Can you post a screenshot of the explicit subnet associations for the route table?

I’ve a hunch there’s a config issue.

EXPERT
answered 3 months ago
  • Sure, I cannot post screenshots here, but I updated the question with that.

  • Hello.

    I think you should also check the task definition settings. By the way, which IAM policy is attached to the IAM role in the task definition? Is the task execution role set? https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html

  • Thanks, Alejandro. Did you create your own security group and assign it to the ECS service using Terraform? Can you share the security group ID on the ECS task and its rules? Your VPC and route tables look good. Thanks for the screenshot.

  • Thanks for your help, Gary! I think this is an account-related issue that I need to clarify with AWS technical support. I made a minimal reproducible Terraform configuration and deployed it in a different AWS account, and it worked perfectly. However, the failure is 100% reproducible in the original account with exactly the same Terraform code, the same region and AZs, and locked provider versions.

  • OK, everything you have shared so far looks fine. Hope you get it resolved.
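
The task execution role asked about in the comments could be sketched in Terraform as follows. This is a minimal sketch under assumptions: the labels `ecs_tasks_assume` and `task_execution` and the role name are hypothetical and do not come from the thread. Only the `ecs-tasks.amazonaws.com` service principal and the AWS managed policy `AmazonECSTaskExecutionRolePolicy` are standard.

```hcl
# Hypothetical sketch: labels and the role name are illustrative only.

# Trust policy that lets ECS tasks assume the role.
data "aws_iam_policy_document" "ecs_tasks_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "task_execution" {
  name               = "${local.namespace}-task-execution"
  assume_role_policy = data.aws_iam_policy_document.ecs_tasks_assume.json
}

# AWS managed policy that grants ECR pull and CloudWatch Logs permissions.
resource "aws_iam_role_policy_attachment" "task_execution" {
  role       = aws_iam_role.task_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```

The task definition would then set `execution_role_arn = aws_iam_role.task_execution.arn`. Note that the error in the question is an `i/o timeout` while connecting to the ECR endpoint, which points at the network path and not at this role.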
