I am attempting to push a 12GB Docker image to ECR. The push works for every layer except the largest one, which retries and fails repeatedly.

I am logged into an EC2 Amazon Linux instance running Docker version 18.06.1-ce, build e68fc7a215d7133c34aa18e3b72b4a21fd0c6136

I am logging into ECR the standard way: https://docs.aws.amazon.com/AmazonECR/latest/userguide/Registries.html

aws ecr get-login --region ${REGION} --no-include-email > login.sh

This generates the command to log in to ECR, which looks something like the line below. My script executes it and the login is successful.

docker login -u AWS -p <reallylongpass> https://123123123123.dkr.ecr.us-east-1.amazonaws.com
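
A minimal sketch of an alternative login flow, assuming a newer AWS CLI that provides 'aws ecr get-login-password' (this is not the 'get-login' command used above) and a Docker client that supports '--password-stdin'. Piping the token keeps the password off the command line and out of login.sh. The function name ecr_login and its arguments are hypothetical.

```shell
# ecr_login REGION REGISTRY_HOST
# Hypothetical helper: fetches an ECR token and pipes it to 'docker login',
# so the password never appears as a command-line argument.
# Assumes an AWS CLI version that has 'aws ecr get-login-password'.
ecr_login() {
    region=$1
    registry=$2
    aws ecr get-login-password --region "$region" |
        docker login --username AWS --password-stdin "$registry"
}

# Example (registry host taken from the question):
# ecr_login us-east-1 123123123123.dkr.ecr.us-east-1.amazonaws.com
```

This only changes how the credentials are passed; it is not expected to affect the failing upload itself.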

The local docker image looks great and has been tagged properly.
The image is VALID and I am able to 'docker run'/'docker exec -it <containerid> bash' and launch it just fine.
'docker images' looks like:

123123123123.dkr.ecr.us-east-1.amazonaws.com/myrepo latest 1faee4304174 4 days ago 12.1GB
myrepo latest 1faee4304174 4 days ago 12.1GB
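
The two entries above share one image ID because the ECR name is just a second tag on the same local image. A minimal sketch of that tagging step, assuming the registry host follows the pattern shown in the listing; the helper name tag_for_ecr and its argument order are hypothetical.

```shell
# tag_for_ecr LOCAL_REPO TAG ACCOUNT_ID REGION
# Hypothetical helper: adds the ECR repository name as a second tag
# on an existing local image using 'docker tag'.
tag_for_ecr() {
    local_ref="$1:$2"
    remote_ref="$3.dkr.ecr.$4.amazonaws.com/$1:$2"
    docker tag "$local_ref" "$remote_ref"
}

# Example (values taken from the listing above):
# tag_for_ecr myrepo latest 123123123123 us-east-1
```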

Next I am attempting 'docker push'. My proxies look good.

docker push ${ECR_ID}.dkr.ecr.${REGION}.amazonaws.com/${IMAGE}:latest

The command pushes multiple layers successfully to the remote ECR repo. Output looks like:

The push refers to repository [123123123123.dkr.ecr.us-east-1.amazonaws.com/myrepo]
5cf889a10bb3: Pushed
b9d2d8033662: Pushed
a0eabe3f044d: Pushed
ed3c559e8570: Pushed
7dd2c6c87cb8: Pushed
879ed9fddabb: Pushed
5c0964ce8332: Pushed
2466533a61e0: Pushed
5c13fd4091c4: Pushed
424b600de233: Pushing ==> ] 634.5MB/11.14GB
0c62105c5e04: Pushed
74ef47ea2b76: Pushing ==================================> ] 212.8MB/307.9MB
742a3cb8ef5b: Pushed
9356a924d1e7: Pushing ==================================================>] 187.1MB
5f70bf18a086: Pushed
9b0885650d8b: Pushing =================================> ] 130.1MB/196.6MB

At this point, all the layers except the largest one (11.14GB) complete fairly quickly. The 11.14GB layer takes ~5 minutes in total.

When the 11.14GB layer finally reaches 100%, it will show this line and sit there for about 1 min.

424b600de233: Pushing ==================================================> 11.3GB

Then it will show:

424b600de233: Retrying in 5 seconds
424b600de233: Retrying in 4 seconds ...

and restart the push again for that large layer. It will repeatedly fail.

/var/log/docker shows: "Upload failed, retrying: EOF". I am completely baffled as to what is going on. I enabled verbose Docker logging (see link below), but this hasn't revealed anything deeper yet. I don't think I have exceeded any ECR limits.
Please help, thank you.

ecr limits: https://docs.aws.amazon.com/AmazonECR/latest/userguide/service_limits.html
docker debug mode: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-debug-mode.html


time="2018-12-10T17:04:25.037918492Z" level=debug msg="UnmountDevice START(hash=3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)" storage-driver=devicemapper
time="2018-12-10T17:04:25.037947493Z" level=debug msg="Unmount(/var/lib/docker/devicemapper/mnt/3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)" storage-driver=devicemapper
time="2018-12-10T17:04:26.208305666Z" level=debug msg="Unmount done" storage-driver=devicemapper
time="2018-12-10T17:04:26.208566225Z" level=debug msg="deactivateDevice START(3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)" storage-driver=devicemapper
time="2018-12-10T17:04:26.208671326Z" level=debug msg="devicemapper: RemoveDeviceDeferred START(docker-202:1-1576181-3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)"
time="2018-12-10T17:04:26.248795710Z" level=debug msg="devicemapper: RemoveDeviceDeferred END(docker-202:1-1576181-3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)"
time="2018-12-10T17:04:26.248824617Z" level=debug msg="deactivateDevice END(3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)" storage-driver=devicemapper
time="2018-12-10T17:04:26.248837221Z" level=debug msg="UnmountDevice END(hash=3917bed0c364c431655d753a41aa9da4e2db2d954497a3f0ed115ce52131ef95)" storage-driver=devicemapper
time="2018-12-10T17:04:27.838363250Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:04:39.835637837Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:04:51.860897072Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:04:51.866517706Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:05:03.838044567Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:05:15.836698076Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:05:27.835978356Z" level=debug msg="Calling GET /v1.25/containers/json?limit=0"
time="2018-12-10T17:05:35.645825351Z" level=error msg="Upload failed, retrying: EOF"


time="2018-12-06T17:33:40.903623164Z" level=error msg="Upload failed, retrying: EOF"
time="2018-12-06T17:43:01.903052850Z" level=error msg="Upload failed, retrying: EOF"
time="2018-12-06T17:52:18.852081979Z" level=error msg="Upload failed, retrying: EOF"
time="2018-12-06T18:01:42.852832787Z" level=error msg="Upload failed, retrying: EOF"
time="2018-12-06T18:11:17.852544006Z" level=error msg="Upload failed: EOF"
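
In the excerpt above the failures are roughly nine to ten minutes apart. A minimal sketch for pulling just those entries out of a noisy debug log, assuming the time="..." and msg="..." layout shown in the log lines here; the function name upload_failures is hypothetical.

```shell
# upload_failures
# Reads Docker daemon log lines on stdin and prints "TIMESTAMP MESSAGE"
# for every line whose msg field starts with "Upload failed".
# All other lines (debug chatter, devicemapper events) are dropped.
upload_failures() {
    sed -n 's/.*time="\([^"]*\)".*msg="\(Upload failed[^"]*\)".*/\1 \2/p'
}

# Example (log path taken from the question):
# upload_failures < /var/log/docker
```

Comparing the printed timestamps with the time each push attempt started shows how long an attempt runs before the connection is dropped.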

NOTE: I created a small image (<2GB) as a test and I am able to push it with no issues. I wonder why the larger image/layer will not push?

Thanks for the reply. I thought the same thing, so I recreated my Docker image and, by trimming down a .tar file that is added to the image, I was able to reduce the total size from 12.1GB to 9.09GB.

The largest layer is now only 8.268GB

ea66653845d4: Pushed
847e0c1e35ce: Pushing ==================================================>] 8.268GB
6be84415c641: Waiting

I attempted the 'docker push' again and the same symptom occurs.

time="2018-12-11T00:12:13.601180691Z" level=error msg="Upload failed, retrying: EOF"

It looks like you are hitting the maximum layer size limit (about 10GB) here:

Maximum layer size: 10,000 MiB

And those are hard limits.
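
As a rough check of the arithmetic, the sketch below converts the quoted 10,000 MiB limit to bytes and compares it with the two layer sizes reported in the question. It assumes the "GB" figures printed by Docker are decimal (10^9 bytes); the function name layer_exceeds_limit is hypothetical.

```shell
# Limit quoted above: 10,000 MiB, converted to bytes (10,485,760,000).
LIMIT_BYTES=$((10000 * 1024 * 1024))

# layer_exceeds_limit SIZE_MB
# SIZE_MB is a layer size in decimal megabytes.
# Assumption: Docker's "GB" means 10^9 bytes, so 11.14GB = 11140 MB.
# Returns success (0) when the layer is larger than the limit.
layer_exceeds_limit() {
    size_bytes=$(($1 * 1000 * 1000))
    [ "$size_bytes" -gt "$LIMIT_BYTES" ]
}

# The two largest-layer sizes from the question: 11.14GB and 8.268GB.
for size_mb in 11140 8268; do
    if layer_exceeds_limit "$size_mb"; then
        echo "${size_mb}MB: over the limit"
    else
        echo "${size_mb}MB: under the limit"
    fi
done
```

Under that assumption the 11.14GB layer is above the limit, while the 8.268GB layer from the follow-up is below it.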

I would suggest breaking that 11.14GB layer into two layers if you can.
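
After rebuilding, one way to confirm that no single layer is still too large is to list layer sizes in bytes and filter them. This is a minimal sketch, assuming 'docker history' with '--human=false' and a '--format' template prints one "SIZE<TAB>COMMAND" line per layer. The sizes it reports are local, uncompressed sizes and may differ from what is actually uploaded. The function name list_oversized_layers is hypothetical.

```shell
# list_oversized_layers LIMIT_BYTES
# Reads "SIZE_BYTES<TAB>CREATED_BY" lines on stdin and prints only
# the lines whose size is greater than LIMIT_BYTES.
list_oversized_layers() {
    awk -F '\t' -v limit="$1" '$1 + 0 > limit + 0 { print $1 "\t" $2 }'
}

# Example (10485760000 bytes = 10,000 MiB; image name from the question):
# docker history --human=false --format '{{.Size}}\t{{.CreatedBy}}' myrepo:latest |
#     list_oversized_layers 10485760000
```

An empty result means every layer is under the given threshold.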

answered 6 years ago
