Upload a large file with multipart upload across S3 accounts using Boto3

0

I'm trying to upload a file that is larger than 100 GB with boto3's upload_fileobj using the multipart upload option. The code runs for about 30 minutes and then throws the following error:

ClientError: An error occurred (InvalidArgument) when calling the UploadPart operation: Part number must be an integer between 1 and 10000, inclusive

Here is my code:

import boto3
from boto3.s3.transfer import TransferConfig

# s3 client for current account
s3_dst = boto3.client('s3')

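# assume role to access s3 client in another account
# (credential values omitted; set these variables from the assume-role response)
s3_src = boto3.client(
    's3',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    aws_session_token=aws_session_token,
)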
                        
src_resp = s3_src.get_object(
    Bucket='old-bucket',
    Key='src.csv'
)

# 1024*250 bytes = 250 KiB for both the multipart threshold and the part size
config = TransferConfig(multipart_threshold=1024*250, max_concurrency=10,
                        multipart_chunksize=1024*250, use_threads=True)

s3_dst.upload_fileobj(src_resp['Body'], 'new-bucket', Key='src.csv', Config=config)

What do I have to do to get this to work? Are there other options besides upload_file?

Asked 2 years ago, 1400 views
2 Answers
1
Accepted Answer

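I've done exactly what you are trying to do, and the issue is that your parts are not large enough. Specify a larger part size, such as multipart_chunksize=1024*1024*250. The chunk size is in bytes and a multipart upload is limited to 10,000 parts, so that's what is going on here: at 1024*250 bytes (250 KiB) per part, a file over 100 GB would need around 400,000 parts, while 250 MiB parts bring that down to a few hundred.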
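To make the arithmetic concrete, here is a minimal sketch of the part-count calculation. The helper names (parts_needed, min_chunksize) and the 100 GiB object size are assumptions for illustration, not part of boto3; the 10,000-part limit and the 5 MiB minimum part size are S3's documented multipart limits.

```python
MAX_PARTS = 10_000             # S3 limit on parts per multipart upload
MIN_PART_SIZE = 5 * 1024**2    # S3 minimum part size (5 MiB), last part excepted


def parts_needed(object_size: int, chunksize: int) -> int:
    """Number of parts when object_size bytes are split into chunksize pieces."""
    return (object_size + chunksize - 1) // chunksize  # integer ceiling division


def min_chunksize(object_size: int) -> int:
    """Smallest part size in bytes that keeps the upload within MAX_PARTS."""
    return max(MIN_PART_SIZE, (object_size + MAX_PARTS - 1) // MAX_PARTS)


size = 100 * 1024**3  # assumed object size: 100 GiB

print(parts_needed(size, 1024 * 250))         # 419431, far above the limit
print(parts_needed(size, MIN_PART_SIZE))      # 20480, still above the limit
print(parts_needed(size, 1024 * 1024 * 250))  # 410, well within the limit
print(min_chunksize(size))                    # 10737419 bytes, about 10.24 MiB
```

Any value at or above what min_chunksize returns can be passed as multipart_chunksize to TransferConfig. Larger parts mean fewer requests, but each in-flight part is buffered in memory, so part size multiplied by max_concurrency is worth keeping an eye on.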

Answered 2 years ago
0

Since you are using multipart uploads, I want to be sure you know about this. I recently found out that the parts of failed multipart uploads stay in S3 and I pay for them. If this is happening, you will see that the bucket metrics show more objects and more storage than you can see in your bucket. I would ask for a refund of those costs if you have them. Here's an article on how to create a lifecycle policy that automatically deletes these failed parts: https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/
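For anyone who would rather set this up from code than from the console, here is a rough sketch of such a lifecycle rule with boto3. The rule ID, the 7-day window, and the helper names are assumptions for illustration; 'new-bucket' is the destination bucket from the question. Note that put_bucket_lifecycle_configuration replaces the bucket's whole lifecycle configuration, so any existing rules need to be included in the same call.

```python
def abort_incomplete_uploads_rule(days: int = 7) -> dict:
    """Build a lifecycle rule that aborts multipart uploads left incomplete."""
    return {
        "ID": "abort-incomplete-multipart-uploads",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},                    # empty prefix = whole bucket
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
    }


def apply_rule(bucket: str, days: int = 7) -> None:
    """Apply the rule to a bucket. This overwrites existing lifecycle rules."""
    import boto3  # imported here so the rule builder above has no dependencies

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [abort_incomplete_uploads_rule(days)]},
    )


if __name__ == "__main__":
    apply_rule("new-bucket")  # destination bucket from the question
```

With the rule in place, S3 aborts any multipart upload that has not completed within the given number of days after it was started and removes its stored parts.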

Answered 2 years ago
