What is the max object-per-second throughput possible when copying S3 objects from Standard to Glacier Instant Retrieval with S3 Batch Operations?

Question

As in the title: what is the max object-per-second throughput possible when copying objects from Standard to Glacier Instant Retrieval with S3 Batch Operations?

In our dev/sandbox environments we have created jobs to copy objects between storage classes using a manifest (a pre-computed .csv.gz, not the manifest generator). So far we have only seen ~500 objects per second in dev, but in prod we are looking to move ~10 billion small objects (<1 MiB each), so we need it to go faster.
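For scale, here is a rough back-of-the-envelope calculation of what those rates mean for a job of this size. It is only a sketch: it assumes the rate is sustained for the whole run, and the function name `days_to_complete` is made up for illustration.

```python
def days_to_complete(total_objects: int, objects_per_second: float) -> float:
    """Wall-clock days needed to process total_objects at a sustained rate."""
    seconds = total_objects / objects_per_second
    return seconds / 86_400  # seconds per day


TOTAL_OBJECTS = 10_000_000_000  # ~10 billion objects, from the question

# Observed dev rate versus the hoped-for rate.
print(round(days_to_complete(TOTAL_OBJECTS, 500), 1))     # 231.5
print(round(days_to_complete(TOTAL_OBJECTS, 10_000), 1))  # 11.6
```

At ~500 objects per second a single job would run for most of a year, which is why the per-job rate matters here.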

I'm aware that we can split the job up, and that is something we will explore, but how can we boost throughput per job? Is there any way we can get the speed up to more like ~10k objects per second? Does the throughput depend on the target storage class (e.g. Intelligent-Tiering versus GIR)? Will it be faster or slower if the bucket has more "partitions" allocated?
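For reference, splitting the job up could look roughly like the sketch below, which deals the rows of one gzipped CSV manifest out round-robin into several smaller manifests, one per job. The helper name `split_manifest` is hypothetical, the manifest is assumed to be plain `bucket,key` rows with no header, and whether running the resulting jobs in parallel actually raises total throughput is exactly the open question here.

```python
import gzip
import io


def split_manifest(gz_bytes: bytes, shards: int) -> list[bytes]:
    """Split a gzipped CSV manifest into `shards` gzipped manifests, round-robin by row."""
    buffers = [io.StringIO() for _ in range(shards)]
    row_index = 0
    # newline="" keeps each row's original line ending untouched.
    with gzip.open(io.BytesIO(gz_bytes), "rt", encoding="utf-8", newline="") as src:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            buffers[row_index % shards].write(line)
            row_index += 1
    return [gzip.compress(buf.getvalue().encode("utf-8")) for buf in buffers]


# Tiny in-memory example; a real manifest would be read from a file or S3.
sample = gzip.compress(b"my-bucket,a/1\nmy-bucket,a/2\nmy-bucket,b/3\n")
for part in split_manifest(sample, 2):
    print(gzip.decompress(part).decode("utf-8"), end="---\n")
```

Round-robin keeps the shards the same size; grouping rows by prefix instead would be an equally reasonable choice if the jobs should target different parts of the keyspace.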

Thanks in advance,
James

Accepted Answer

Your achievable requests per second will depend on how you've organised your data into prefixes. The Amazon S3 performance guidelines in the S3 User Guide explain this in more detail, including the considerations around prefixes.
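To get a feel for how the keys in a manifest are spread across prefixes, something like the following sketch may help. It assumes a CSV manifest whose rows are `bucket,key` with URL-encoded keys and `/` as the delimiter; the function name `keys_per_prefix` and the sample rows are made up for illustration.

```python
import csv
from collections import Counter
from urllib.parse import unquote


def keys_per_prefix(manifest_lines, depth=1, delimiter="/"):
    """Count manifest keys under each prefix made of the first `depth` key segments."""
    counts = Counter()
    for row in csv.reader(manifest_lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        key = unquote(row[1])  # assumption: keys are URL-encoded in the manifest
        parts = key.split(delimiter)
        # Keys with no more than `depth` segments are counted under the empty prefix.
        if len(parts) > depth:
            prefix = delimiter.join(parts[:depth]) + delimiter
        else:
            prefix = ""
        counts[prefix] += 1
    return counts


rows = [
    "my-bucket,logs/2023/a.json",
    "my-bucket,logs/2024/b.json",
    "my-bucket,images/c.png",
    "my-bucket,top.txt",
]
print(keys_per_prefix(rows))           # logs/ has 2 keys, images/ 1, no prefix 1
print(keys_per_prefix(rows, depth=2))  # logs/2023/ and logs/2024/ have 1 each, no prefix 2
```

If nearly all keys land under one prefix, the copy requests are likely concentrated there too, which is the situation the prefix guidance is about.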

This YouTube video from re:Invent 2023 also goes deeper into this and is definitely worth a watch: https://youtu.be/sYDJYqvNeXU?list=PL2yQDdvlhXf83bp752n992F52HWaR_js3&t=967

