Hello, we are using AWS Batch to run NVIDIA CUDA-based container images hosted on ECR. The image is over 12 GB, and our AWS Batch instances take too long to initialize (6-7 min).

Is there any way to preload the image, maybe on a custom AMI, so that the start time is considerably reduced? The AMI we currently use is not accessible from CloudShell or over SSH.

Thank you!

I asked a similar question at re:Invent and found that much of the time goes to cleanup and to downloading the image again at the start of every job. On the ECS cluster you can change the image pull policy so that it uses a cached image if one is available, or validates the cached image. This makes the pull FAR faster, at the risk of running an old image if you cache aggressively (if your nodes shut down completely when the batch is done, this may be OK). These settings aren't exposed in Batch itself, however.

Unless your nodes generally run only one job and then terminate, the only difference between caching and preloading is the time it takes to start the first job.
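
For what it's worth, here is a rough sketch of how the pull policy could be set for Batch-managed instances. It assumes the compute environment uses an ECS-optimized AMI with a launch template, that the ECS agent reads `ECS_IMAGE_PULL_BEHAVIOR` from `/etc/ecs/ecs.config`, and that Batch expects launch template user data in MIME multi-part format. The function names are made up for illustration, and I have not tested this against your setup.

```python
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Values accepted by the ECS agent for ECS_IMAGE_PULL_BEHAVIOR.
ALLOWED_PULL_BEHAVIORS = {"default", "always", "once", "prefer-cached"}


def build_user_data(pull_behavior: str = "prefer-cached") -> str:
    """Build MIME multi-part user data that sets the ECS agent pull behavior.

    Hypothetical helper: the script simply appends one line to
    /etc/ecs/ecs.config when the instance boots.
    """
    if pull_behavior not in ALLOWED_PULL_BEHAVIORS:
        raise ValueError(f"unsupported pull behavior: {pull_behavior!r}")

    script = (
        "#!/bin/bash\n"
        f"echo 'ECS_IMAGE_PULL_BEHAVIOR={pull_behavior}' >> /etc/ecs/ecs.config\n"
    )

    # Wrap the shell script as a single part of a multipart/mixed archive.
    archive = MIMEMultipart("mixed")
    archive.attach(MIMEText(script, "x-shellscript", "us-ascii"))
    return archive.as_string()


def encode_for_launch_template(user_data: str) -> str:
    """Base64-encode the user data, as EC2 launch templates expect."""
    return base64.b64encode(user_data.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(build_user_data())
```

The encoded string would go into the `UserData` field of the launch template attached to the compute environment. With `prefer-cached`, the agent only pulls when the image is not already on the instance, so a job can run a stale image if the tag was updated in ECR in the meantime.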

Good luck, hope it helps.

answered a year ago

