Upload a large file with multipart upload across S3 accounts using Boto3


I'm trying to upload a file larger than 100 GB with boto3's upload_fileobj, using multipart upload. The code runs for about 30 minutes and then throws the following error:

ClientError: An error occurred (InvalidArgument) when calling the UploadPart operation: Part number must be an integer between 1 and 10000, inclusive

Here is my code:

import boto3
from boto3.s3.transfer import TransferConfig

# s3 client for current account
s3_dst = boto3.client('s3')

# assume role to access s3 client in another account
# (credential values were omitted in the original post)
s3_src = boto3.client(
    's3',
    aws_access_key_id=...,
    aws_secret_access_key=...,
    aws_session_token=...,
)

src_resp = s3_src.get_object(
    Bucket='old-bucket',
    Key='src.csv'
)

config = TransferConfig(multipart_threshold=1024*250, max_concurrency=10,
                        multipart_chunksize=1024*250, use_threads=True)

s3_dst.upload_fileobj(src_resp['Body'], 'new-bucket', Key='src.csv', Config=config)


What do I have to do to get this to work? Are there other options besides upload_file?

Asked 2 years ago · Viewed 1400 times
2 Answers
1
Accepted Answer

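I've done exactly what you are trying to do, and the issue is that your parts are not large enough. Specify a larger part size, such as multipart_chunksize=1024*1024*250 (250 MiB). I believe the chunk size is in bytes, and a multipart upload is limited to 10,000 parts, so with small parts a file this size runs out of part numbers before the upload finishes. That's what is going on here.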
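To make the arithmetic concrete, here is a minimal sketch of how the part size relates to the 10,000-part limit. The helper names `min_chunksize` and `part_count` are made up for illustration, and the 100 GiB object size is an assumption based on the question. The 5 MiB figure is S3's minimum part size; as far as I can tell, a smaller configured chunk size is effectively raised to about that value, which is still too small for a file this large.

```python
MAX_PARTS = 10_000              # S3 limit on parts per multipart upload
MIN_PART_SIZE = 5 * 1024 ** 2   # S3 minimum part size (5 MiB), last part excepted


def part_count(object_size: int, chunksize: int) -> int:
    """Number of parts needed to upload object_size bytes in chunksize-byte parts."""
    return (object_size + chunksize - 1) // chunksize  # integer ceiling division


def min_chunksize(object_size: int) -> int:
    """Smallest part size in bytes that keeps the upload within MAX_PARTS parts."""
    needed = (object_size + MAX_PARTS - 1) // MAX_PARTS
    return max(needed, MIN_PART_SIZE)


# Assumed object size for illustration: 100 GiB
size = 100 * 1024 ** 3

parts_at_5_mib = part_count(size, MIN_PART_SIZE)        # exceeds the 10,000 limit
parts_at_250_mib = part_count(size, 250 * 1024 ** 2)    # well within the limit
smallest_ok = min_chunksize(size)                       # just over 10 MiB
```

Any value at or above `min_chunksize(size)` can be passed as `multipart_chunksize` to `TransferConfig`; 250 MiB leaves plenty of headroom, at the cost of more memory per in-flight part.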

Answered 2 years ago
0

Since you are using multipart uploads, I want to be sure you know about this: I recently found out that the parts of failed (incomplete) multipart uploads stay in S3, and you pay for that storage. If this is happening, the bucket metrics will show more objects and more storage than you can see when you list the bucket. I would ask for a refund of those costs if you have incurred them. Here's an article on how to create a lifecycle policy that automatically deletes these leftover parts: https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/
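If you prefer to set this up from code rather than the console, the following is a rough sketch using boto3's `put_bucket_lifecycle_configuration`. The function names, the rule ID, and the 7-day default are my own assumptions, not something from the article. Note that this call replaces the bucket's existing lifecycle configuration, so merge in any rules you already have first.

```python
def abort_incomplete_uploads_config(days: int = 7) -> dict:
    """Build a lifecycle configuration that aborts incomplete multipart uploads.

    The rule ID and the default of 7 days are illustrative choices.
    """
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to the whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_abort_rule(bucket: str, days: int = 7) -> None:
    """Apply the rule to a bucket. Overwrites any existing lifecycle configuration."""
    import boto3  # imported here so the config builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=abort_incomplete_uploads_config(days),
    )
```

Calling `apply_abort_rule('new-bucket')` would then have S3 abort uploads that are still incomplete 7 days after they were started and remove their parts.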

Answered 2 years ago
