Upload a large file across S3 accounts with a Boto3 multipart upload

0

I'm trying to upload a file larger than 100 GB using boto3's upload_fileobj with multipart upload enabled. The code runs for about 30 minutes and then throws the following error:

ClientError: An error occurred (InvalidArgument) when calling the UploadPart operation: Part number must be an integer between 1 and 10000, inclusive

Here is my code:

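import boto3
from boto3.s3.transfer import TransferConfig

# S3 client for the current (destination) account
s3_dst = boto3.client('s3')

# S3 client for the other (source) account, built from the temporary
# credentials of an assumed role. The values were left out of the post;
# the upper-case names below are placeholders.
s3_src = boto3.client(
    's3',
    aws_access_key_id=ACCESS_KEY_ID,
    aws_secret_access_key=SECRET_ACCESS_KEY,
    aws_session_token=SESSION_TOKEN,
)

# Open the source object; src_resp['Body'] is a file-like stream
src_resp = s3_src.get_object(
    Bucket='old-bucket',
    Key='src.csv'
)

# Multipart settings; sizes are in bytes, so 1024*250 is 250 KiB
config = TransferConfig(multipart_threshold=1024*250, max_concurrency=10,
                        multipart_chunksize=1024*250, use_threads=True)

# Stream the source object straight into the destination bucket
s3_dst.upload_fileobj(src_resp['Body'], 'new-bucket', Key='src.csv', Config=config)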

What do I have to do to get this to work? Are there other options besides upload_file?

asked 2 years ago, 1384 views
2 Answers
1
Accepted Answer

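I've done exactly what you are trying to do, and the issue is that your parts are not large enough. Specify a larger part size, such as multipart_chunksize=1024*1024*250 (250 MiB). The chunk size is in bytes, and a multipart upload is limited to 10,000 parts. At 250 KiB per part, a file of more than 100 GB needs far more than 10,000 parts, so the upload fails once the part number passes that limit.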
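To make the arithmetic concrete, here is a minimal sketch that counts the parts a given chunk size would need and works out the smallest chunk size that stays within the limit. The helper names (`part_count`, `min_chunksize`) are made up for illustration and are not part of boto3. The 10,000-part limit comes from the error message above, and the 5 MiB minimum part size is the documented S3 minimum for all parts except the last. The 100 GiB size is an assumed stand-in for the file in the question.

```python
import math

MAX_PARTS = 10_000             # S3 limit on parts per multipart upload
MIN_PART_SIZE = 5 * 1024 ** 2  # S3 minimum part size (5 MiB), last part excepted


def part_count(object_size: int, chunksize: int) -> int:
    """Number of parts needed to cover object_size bytes at chunksize bytes each."""
    return math.ceil(object_size / chunksize)


def min_chunksize(object_size: int, max_parts: int = MAX_PARTS) -> int:
    """Smallest chunk size in bytes that keeps the upload within max_parts parts."""
    return max(MIN_PART_SIZE, math.ceil(object_size / max_parts))


size = 100 * 1024 ** 3  # assumed object size: 100 GiB

print(part_count(size, 1024 * 250))         # 419431 parts: far over the limit
print(part_count(size, 1024 * 1024 * 250))  # 410 parts: well within the limit
print(min_chunksize(size))                  # 10737419 bytes, roughly 10.24 MiB
```

Any chunk size at or above the value returned by `min_chunksize` should work, so `multipart_chunksize=1024*1024*250` leaves plenty of headroom. Note that larger parts also mean more memory held per concurrent thread.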

answered 2 years ago
0

Since you are using multipart uploads, I want to be sure you know about this: I recently found out that the parts of failed (incomplete) multipart uploads stay stored in S3, and I pay for them. If this is happening, the bucket metrics will show more objects and more storage than you can see in your bucket. You can get rid of them with a lifecycle policy that automatically deletes these failed parts; here's an article on how to create one: https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/ I would also ask for a refund of those costs if you have them.
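As a rough sketch of what such a lifecycle policy could look like from boto3, the snippet below builds a rule that aborts incomplete multipart uploads a number of days after they start. It uses the `put_bucket_lifecycle_configuration` client call. The function names, the rule ID, and the 7-day window are assumptions chosen for illustration, not values from the article.

```python
def abort_incomplete_uploads_config(days: int = 7) -> dict:
    """Build a lifecycle configuration that aborts incomplete multipart uploads."""
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",  # illustrative name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_abort_rule(s3_client, bucket: str, days: int = 7) -> None:
    """Apply the rule to a bucket using an existing boto3 S3 client."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=abort_incomplete_uploads_config(days),
    )


# Example usage (requires boto3 and credentials for the bucket's account):
#   import boto3
#   apply_abort_rule(boto3.client("s3"), "new-bucket")
```

Be aware that this call replaces the bucket's entire lifecycle configuration, so if the bucket already has rules, they would need to be merged into the `Rules` list first.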

answered 2 years ago
