To potentially decrease the overall time that it takes to complete the transfer, split the transfer into multiple mutually exclusive operations. You can run multiple instances of aws s3 cp (copy), aws s3 mv (move), or aws s3 sync (synchronize) at the same time.
One way to split the transfer is to use the --exclude and --include parameters to separate the operations by file name. For example, suppose that you need to copy a large amount of data from one bucket to another bucket, and all of the file names begin with a number. You can run the following commands on two instances of the AWS CLI.
Note: The --exclude and --include parameters are processed on the client side. Because of this, the resources of your local machine might affect the performance of the operation.
Run this command to copy the files with names that begin with the numbers 0 through 4:
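As a sketch, assuming a source bucket and destination bucket named amzn-s3-demo-source-bucket and amzn-s3-demo-destination-bucket (placeholders for your own bucket names), the first AWS CLI instance might run:

```shell
# Copy only objects whose keys begin with 0 through 4.
# --exclude "*" first excludes everything; the --include filters
# then add back the matching prefixes. Filters are evaluated in order.
aws s3 cp s3://amzn-s3-demo-source-bucket/ s3://amzn-s3-demo-destination-bucket/ \
    --recursive \
    --exclude "*" \
    --include "0*" --include "1*" --include "2*" --include "3*" --include "4*"
```

On the second AWS CLI instance, run the equivalent command with --include "5*" through --include "9*" to cover the remaining file names.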
Important: If you must transfer a large number of objects (hundreds of millions), consider building a custom application using an AWS SDK to perform the copy. While the AWS CLI can perform the copy, a custom application might be more efficient at that scale.
Consider using AWS Snowball for transfers between your on-premises data centers and Amazon S3, particularly when the data exceeds 10 TB.
Note the following limitations:
AWS Snowball doesn't support bucket-to-bucket data transfers.
AWS Snowball doesn't support server-side encryption with keys that are managed by AWS Key Management Service (AWS KMS). For more information, see Encryption in AWS Snowball.
S3DistCp with Amazon EMR
Consider using S3DistCp with Amazon EMR to copy data across Amazon S3 buckets. S3DistCp copies large numbers of objects in parallel across the nodes of the cluster.
Important: Because this option requires you to launch an Amazon EMR cluster, be sure to review Amazon EMR pricing.
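As a minimal sketch, after you connect to the cluster's primary node, a basic S3DistCp invocation looks like the following (the bucket names are placeholders for your own buckets):

```shell
# Copy all objects from the source bucket to the destination bucket
# in parallel across the EMR cluster's nodes.
s3-dist-cp --src s3://amzn-s3-demo-source-bucket/ \
           --dest s3://amzn-s3-demo-destination-bucket/
```

You can also submit S3DistCp as a step on the cluster instead of running it interactively.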