CannotPullContainerError failed to copy: httpReadSeeker in Batch using Fargate


My Batch job fails with the error CannotPullContainerError (failed to copy: httpReadSeeker). I'm using a private VPC and have defined all the required endpoints:

  • com.amazonaws.eu-central-1.s3
  • com.amazonaws.eu-central-1.ecr.api
  • com.amazonaws.eu-central-1.ecr.dkr
  • com.amazonaws.eu-central-1.logs
  • com.amazonaws.eu-central-1.ecs
  • com.amazonaws.eu-central-1.ecs-agent
  • com.amazonaws.eu-central-1.ecs-telemetry
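As a quick cross-check of the list above, here is a minimal Python sketch that compares a set of configured endpoint service names with the ones an ECR pull on Fargate (platform version 1.4.0 or later) is documented to need: `ecr.api`, `ecr.dkr`, the S3 gateway endpoint, and `logs` when the `awslogs` log driver is used. The function name and the `uses_awslogs` flag are made up for illustration and are not part of any AWS SDK.

```python
def missing_endpoints(region, configured, uses_awslogs=True):
    """Return the required endpoint service names that are not in `configured`."""
    required = ["ecr.api", "ecr.dkr", "s3"]
    if uses_awslogs:
        # Only needed when the job definition sends logs to CloudWatch Logs.
        required.append("logs")
    prefix = f"com.amazonaws.{region}."
    present = set(configured)
    return [prefix + name for name in required if prefix + name not in present]


# The endpoints listed in the question.
configured = [
    "com.amazonaws.eu-central-1.s3",
    "com.amazonaws.eu-central-1.ecr.api",
    "com.amazonaws.eu-central-1.ecr.dkr",
    "com.amazonaws.eu-central-1.logs",
    "com.amazonaws.eu-central-1.ecs",
    "com.amazonaws.eu-central-1.ecs-agent",
    "com.amazonaws.eu-central-1.ecs-telemetry",
]

print(missing_endpoints("eu-central-1", configured))  # prints []
```

An empty result only says that the names are present. It does not check the endpoint type (gateway or interface for S3), whether private DNS is enabled, which subnets and route tables the endpoints are associated with, or whether the interface endpoints' security groups allow HTTPS from the task.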

For the S3 endpoint, the policy is the default one:

{
	"Statement": [
		{
			"Action": "*",
			"Effect": "Allow",
			"Principal": "*",
			"Resource": "*"
		}
	]
}

The image is stored in an ECR private repository and exists (the process runs just fine when using Batch with EC2 instead of Fargate).

The compute environment uses the service role AWSServiceRoleForBatch.

The Batch job uses a custom executionRoleArn that has the standard policy AmazonECSTaskExecutionRolePolicy on it.

I have looked through re:Post for similar errors, but none of the suggested fixes works for me (for example: use VPC endpoints when the VPC is private, and make sure the S3 endpoint policy allows access to the starport bucket).

Thanks,

David

2 Answers

Hi Ismail

The endpoint for S3 (com.amazonaws.eu-central-1.s3) is already in the VPC with a full-access policy. In any case, I have expanded the policy to:

{
	"Statement": [
		{
			"Action": "*",
			"Effect": "Allow",
			"Principal": "*",
			"Resource": "*"
		},
		{
			"Sid": "Access-to-specific-bucket-only",
			"Principal": "*",
			"Action": [
				"s3:GetObject"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::prod-eu-central-1-starport-layer-bucket/*"
			]
		}
	]
}

but the error is the same.

Thanks,

David

David
answered 9 months ago
  • Hi David,

    The policy you added should have given you access to the starport bucket. However, since the issue persists, I suspect that the problem may lie in the networking setup.

    Fargate tasks need a network path to ECR to pull images, and this can be provided either through a NAT gateway or through VPC endpoints. If your Fargate tasks are running in a private subnet, you must ensure the subnet is associated with a route table that has a default route to a NAT gateway or NAT instance.

    You mentioned your network is public with an internet gateway but without a public IP. For Fargate to pull container images without a public IP, the best practice would be to set up a private subnet associated with a route table that has a default route to a NAT gateway or instance. This will allow Fargate tasks to reach the internet without needing a public IP.

    Here's how you can check this:

    • Navigate to the VPC section in your AWS console.
    • Find the subnet in which your Fargate tasks are running.
    • Check the route table associated with that subnet.
    • Ensure that there is a route to a NAT gateway or NAT instance.

    This setup will ensure your Fargate tasks can reach the necessary AWS services without needing a public IP.
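The route table check described above can also be sketched in Python. The function below takes one entry of the `RouteTables` list in the shape returned by the EC2 DescribeRouteTables API (for example via boto3 `describe_route_tables`) and reports whether it has a default route to a NAT gateway or NAT instance, and which gateway endpoint prefix lists it routes to. The function name and all IDs in the sample are hypothetical.

```python
def classify_routes(route_table):
    """Summarise how a subnet's route table can reach ECR and S3."""
    nat_default_route = False
    gateway_endpoint_prefix_lists = []
    for route in route_table.get("Routes", []):
        if route.get("State", "active") != "active":
            continue  # skip blackhole routes
        is_default = route.get("DestinationCidrBlock") == "0.0.0.0/0"
        if is_default and (route.get("NatGatewayId") or route.get("InstanceId")):
            nat_default_route = True
        # Gateway endpoints (such as S3) appear as prefix-list routes
        # whose target ID starts with "vpce-".
        prefix_list = route.get("DestinationPrefixListId")
        if prefix_list and str(route.get("GatewayId", "")).startswith("vpce-"):
            gateway_endpoint_prefix_lists.append(prefix_list)
    return {
        "nat_default_route": nat_default_route,
        "gateway_endpoint_prefix_lists": gateway_endpoint_prefix_lists,
    }


# Hypothetical route table of a private subnet that only has an S3 gateway endpoint.
private_table = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
        {
            "DestinationPrefixListId": "pl-0123456789abcdef0",
            "GatewayId": "vpce-0123456789abcdef0",
            "State": "active",
        },
    ]
}

print(classify_routes(private_table))
```

A table with neither a NAT default route nor a gateway endpoint prefix list leaves the task with no path to the S3 layer bucket, even when the ECR interface endpoints exist.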

  • Hi Ismail,

    The VPC is private (not public) and has no access to the internet. This is why I have defined the endpoints in the VPC. My understanding is that either internet access exists (public VPC, NAT...) OR endpoints are defined. I'm using the second option.

    Thanks,

    David

  • Hi Ismail,

    Yes, if a public IP is used (along with the NAT and the internet gateway), the system works. But I was trying to use a private network with no public IP, using endpoints. It would be great to know why it fails in my case.

    Thanks,

    David


Hello David,

The CannotPullContainerError you're experiencing seems to be associated with the inability of your Fargate task to pull the Docker image from your ECR repository.

Given that your VPC is private, the task needs a network path to ECR to pull images. This path can be provided either through a NAT gateway or through VPC endpoints. It appears that you have set up the necessary endpoints for ECR and other services; however, pulling images from ECR also requires access to the S3 service, as ECR stores image layers in an S3 bucket, namely the "starport" bucket.

Please ensure that you have an S3 gateway endpoint configured in your VPC and that it has a policy allowing access to the "starport" bucket. The policy should look like this:

{
  "Statement": [
    {
      "Sid": "Access-to-specific-bucket-only",
      "Principal": "*",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::prod-region-starport-layer-bucket/*"]
    }
  ]
}

Please refer to the following document on how to set up an S3 gateway endpoint: https://docs.aws.amazon.com/AmazonECR/latest/userguide/vpc-endpoints.html#ecr-setting-up-s3-gateway
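To sanity-check an endpoint policy against the layer bucket, a rough Python sketch like the following may help. It builds the bucket ARN for a Region (following the `prod-region-starport-layer-bucket` naming shown above) and looks for an Allow statement that covers `s3:GetObject` on it. The helper names are illustrative, and the matching is deliberately simplified: it uses shell-style wildcards and ignores `Deny` statements, `Condition`, `Principal`, and `NotAction`, so it is not a full IAM policy evaluation.

```python
from fnmatch import fnmatchcase


def starport_resource(region):
    """ARN pattern for objects in the ECR layer bucket of `region`."""
    return f"arn:aws:s3:::prod-{region}-starport-layer-bucket/*"


def _as_list(value):
    # Policy fields may be a single string or a list of strings.
    return value if isinstance(value, list) else [value]


def allows_layer_download(policy, region):
    """True if some Allow statement covers s3:GetObject on the layer bucket."""
    # Replace the trailing "*" with a made-up object key to test against.
    sample_object = starport_resource(region)[:-1] + "example-layer"
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # Action names are case-insensitive; resource ARNs are not.
        action_ok = any(fnmatchcase("s3:getobject", a.lower()) for a in actions)
        resource_ok = any(fnmatchcase(sample_object, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False


template_policy = {
    "Statement": [
        {
            "Sid": "Access-to-specific-bucket-only",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Effect": "Allow",
            "Resource": ["arn:aws:s3:::prod-region-starport-layer-bucket/*"],
        }
    ]
}

print(allows_layer_download(template_policy, "eu-central-1"))  # prints False
```

The template returns `False` for `eu-central-1` because `region` in the resource ARN is a placeholder; with the Region filled in (`prod-eu-central-1-starport-layer-bucket`) the same check returns `True`.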

Also, bear in mind that Fargate tasks in private subnets that do not reach ECR through VPC endpoints need a route table with a default route to a NAT gateway or NAT instance in order to pull images. You might find this re:Post useful for more information on this topic: https://repost.aws/questions/QULIQs1kYOQKO1RpDaBkq-Wg/cannotpullcontainererror-in-the-private-network

AWS
answered 9 months ago
