How can I optimize performance when I upload large files to Amazon S3?
I want to upload large files (1 GB or larger) to Amazon Simple Storage Service (Amazon S3). How can I optimize the performance of this upload?
When you upload large files to Amazon S3, it's a best practice to use multipart uploads. If you're using the AWS Command Line Interface (AWS CLI), then the high-level aws s3 commands, such as aws s3 cp and aws s3 sync, automatically perform a multipart upload when the object is larger than the multipart_threshold value (8 MB by default).
Consider the following options for improving the performance of uploads and optimizing multipart uploads:
If you're using the AWS CLI, customize the upload configurations.
Enable Amazon S3 Transfer Acceleration.
If you're using the AWS CLI, customize the upload configurations
max_concurrent_requests: This value sets the number of requests that can be sent to Amazon S3 at a time. The default value is 10. Note: Running more threads consumes more resources on your machine. Make sure that your machine has enough resources to support the maximum number of concurrent requests that you set.
max_queue_size: This value sets the maximum number of tasks in the queue. The default value is 1,000.
multipart_threshold: This value sets the size threshold for multipart uploads of individual files. The default value is 8 MB.
multipart_chunksize: This value sets the size of each part that the AWS CLI uploads in a multipart upload for an individual file. This setting allows you to break down a larger file (for example, 300 MB) into smaller parts for quicker upload speeds. The default value is 8 MB. Note: A multipart upload requires that a single file is uploaded in no more than 10,000 distinct parts. Make sure that the multipart_chunksize value that you set is large enough to keep the file within that limit. For example, 10,000 parts of 8 MB each cover a file of at most 80,000 MB (about 78 GB), so a larger file needs a larger chunk size.
max_bandwidth: This value sets the maximum bandwidth for uploading data to Amazon S3. There is no default value.
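The 10,000-part limit and the multipart_chunksize setting interact through simple arithmetic, which the following minimal Python sketch works through. The helper names (ceil_div, min_chunksize_mb, part_count, configure_commands) are hypothetical and introduced only for illustration. The sketch assumes that the AWS CLI reads the MB suffix as 1024 * 1024 bytes, and it only builds the aws configure set command strings without running them.

```python
MAX_PARTS = 10_000      # limit on parts per multipart upload (from the note above)
MB = 1024 * 1024        # assumption: the AWS CLI treats "MB" as 1024 * 1024 bytes
DEFAULT_CHUNK_MB = 8    # AWS CLI default for multipart_chunksize


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def part_count(file_size_bytes: int, chunk_mb: int) -> int:
    """Number of parts needed to upload a file with the given chunk size."""
    return ceil_div(file_size_bytes, chunk_mb * MB)


def min_chunksize_mb(file_size_bytes: int) -> int:
    """Smallest whole-MB chunk size, never below the default, that keeps
    the upload within MAX_PARTS parts."""
    bytes_per_part = ceil_div(file_size_bytes, MAX_PARTS)
    return max(ceil_div(bytes_per_part, MB), DEFAULT_CHUNK_MB)


def configure_commands(chunk_mb: int, max_concurrent_requests: int = 10) -> list:
    """Build (but do not run) the AWS CLI commands that store these settings
    in the default profile."""
    return [
        f"aws configure set default.s3.multipart_chunksize {chunk_mb}MB",
        f"aws configure set default.s3.max_concurrent_requests {max_concurrent_requests}",
    ]


if __name__ == "__main__":
    size = 100 * 1024 * MB  # a hypothetical 100 GB file
    chunk = min_chunksize_mb(size)
    print("parts at the default chunk size:", part_count(size, DEFAULT_CHUNK_MB))
    print("suggested chunk size (MB):", chunk)
    print("parts at the suggested chunk size:", part_count(size, chunk))
    for command in configure_commands(chunk):
        print(command)
```

In this sketch, a 1 GB file stays at the 8 MB default (128 parts), while the hypothetical 100 GB file would need 12,800 parts at the default and therefore gets a larger chunk size. The suggested value is only a lower bound. A larger chunk size might still perform better, depending on your network and the memory available on your machine.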
Enable Amazon S3 Transfer Acceleration
Amazon S3 Transfer Acceleration can provide fast and secure transfers over long distances between your client and Amazon S3. Transfer Acceleration uses Amazon CloudFront's globally distributed edge locations.
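As a rough illustration of what using Transfer Acceleration involves, the following Python sketch builds the accelerate endpoint for a bucket and the AWS CLI commands that turn the feature on. The function names and the bucket name example-bucket are hypothetical. The sketch assumes the standard accelerate endpoint format (bucketname.s3-accelerate.amazonaws.com) and the requirement that the bucket name contains no periods. It only builds strings and doesn't call AWS.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration endpoint URL for a bucket.

    Assumption: buckets used with Transfer Acceleration must have
    DNS-compliant names that contain no periods.
    """
    if not bucket or "." in bucket:
        raise ValueError("bucket name must be non-empty and contain no periods")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


def enable_acceleration_commands(bucket: str) -> list:
    """Build (but do not run) the AWS CLI commands that enable acceleration
    on the bucket and point the CLI's s3 commands at the accelerate endpoint."""
    accelerate_endpoint(bucket)  # reuse the bucket name validation
    return [
        "aws s3api put-bucket-accelerate-configuration "
        f"--bucket {bucket} --accelerate-configuration Status=Enabled",
        "aws configure set default.s3.use_accelerate_endpoint true",
    ]


if __name__ == "__main__":
    print(accelerate_endpoint("example-bucket"))
    for command in enable_acceleration_commands("example-bucket"):
        print(command)
```

Whether acceleration actually speeds up a given upload depends on the distance and network path between the client and the bucket's Region, so it's worth comparing transfer times with and without it before you rely on it.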