S3 Batch Operations + Lambda to bulk rename objects


Hi,

I need to rename 10 million S3 objects, all stored in a single bucket that is organized in a folder structure by {year}/{month}. I was thinking of using S3 Batch Operations to invoke a Lambda function to perform this task.

I was planning to use a custom manifest to specify the objects that I want to rename (not every object in the bucket should be renamed). Is there a way to include a {new_name} value in the CSV manifest, so that the batch job passes it in the JSON request to the Lambda function, which then uses it to rename the object? If not, how would you suggest supplying the list of new names that I need to assign? Please note that the new name cannot be computed from the current name: no function in Lambda could derive it using only the current name as input.

Thank you,
Camilo

Asked 5 years ago · 1443 views
3 Answers
Accepted Answer

There's no way to pass this additional piece of object-level information in the CSV manifest today. If, as you stated, it cannot be derived from the existing key, could the desired values be stored in DynamoDB or another location so that the Lambda function can look them up based on the existing key?
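As a rough illustration of the lookup idea, the sketch below fetches the new key for an existing key from a DynamoDB table. The table layout (partition key `origKey`, attribute `newKey`) and the function name `lookup_new_key` are assumptions made for this example, not something stated in the thread. The `table` argument is expected to behave like a boto3 DynamoDB `Table` resource.

```python
def lookup_new_key(table, orig_key):
    """Return the new key stored for orig_key, or None if no mapping exists.

    `table` is assumed to behave like a boto3 DynamoDB Table resource, for
    example boto3.resource("dynamodb").Table("rename-map"), where the table
    name "rename-map" and the attribute names are hypothetical.
    """
    response = table.get_item(Key={"origKey": orig_key})
    item = response.get("Item")  # "Item" is absent when the key is not found
    return item["newKey"] if item else None
```

The Lambda function would call this once per task and report a failure for any key that has no mapping.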

AWS
awsrwx
answered 5 years ago
Expert
reviewed 22 days ago

You could also consider adding a JSON structure like this in your manifest, which would eliminate the need to store the new keys elsewhere:
{"origKey": "object1key", "newKey": "newObject1Key"}

You could put that in the manifest like this:
bucket,{"origKey": "object1key", "newKey": "newObject1Key"}
bucket,{"origKey": "object2key", "newKey": "newObject2Key"}
bucket,{"origKey": "object3key", "newKey": "newObject3Key"}

We would then send that JSON to the Lambda function as the S3 key, and it can be parsed there to identify the original key and the new key for each copy. Please note that the JSON would need to be URL-encoded for this to work, and each "key" entry would need to be less than 1024 characters.
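On the Lambda side, the parsing step might look like the sketch below. The event and response field names (`tasks`, `taskId`, `s3Key`, `s3BucketArn`, `invocationId`, `resultCode`) follow my understanding of the Batch Operations invocation schema version 1.0 and should be checked against the current documentation. The `rename` callable is a hypothetical placeholder for the actual copy-then-delete logic, and decoding with `unquote_plus` is an assumption about how the key arrives.

```python
import json
from urllib.parse import unquote_plus


def parse_task_key(s3_key):
    """Decode the URL-encoded JSON carried in a task's s3Key field."""
    payload = json.loads(unquote_plus(s3_key))
    return payload["origKey"], payload["newKey"]


def handle_event(event, rename):
    """Process one S3 Batch Operations invocation.

    `rename(bucket, orig_key, new_key)` is a hypothetical callable; in a real
    function it would copy the object to new_key and delete orig_key.
    """
    results = []
    for task in event["tasks"]:
        # The bucket name is the last part of an ARN like arn:aws:s3:::bucket
        bucket = task["s3BucketArn"].split(":::")[-1]
        try:
            orig_key, new_key = parse_task_key(task["s3Key"])
            rename(bucket, orig_key, new_key)
            code, message = "Succeeded", new_key
        except Exception as exc:  # report the failure instead of crashing
            code, message = "PermanentFailure", str(exc)
        results.append(
            {"taskId": task["taskId"], "resultCode": code, "resultString": message}
        )
    return {
        "invocationSchemaVersion": "1.0",
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }
```

Passing `rename` in as a parameter keeps the parsing logic testable without AWS credentials; the real handler would wrap this with S3 client calls.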

The URL-encoded values you would actually use:

bucket,%7B%22origKey%22%3A%20%22object1key%22%2C%20%22newKey%22%3A%20%22newObject1Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object2key%22%2C%20%22newKey%22%3A%20%22newObject2Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object3key%22%2C%20%22newKey%22%3A%20%22newObject3Key%22%7D
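Manifest lines like these could be generated with a short script rather than encoded by hand. The following is a minimal sketch using only the Python standard library; the function name `manifest_line` is made up for illustration, and applying the length check to the unencoded JSON is my reading of the 1024-character note above, not a confirmed rule.

```python
import json
from urllib.parse import quote


def manifest_line(bucket, orig_key, new_key):
    """Build one CSV manifest row whose key column is URL-encoded JSON."""
    payload = json.dumps({"origKey": orig_key, "newKey": new_key})
    # Assumption: the limit applies to the JSON string before encoding.
    if len(payload) >= 1024:
        raise ValueError(f"key entry too long ({len(payload)} characters)")
    # safe="" so that every reserved character, including "/", is encoded
    return f"{bucket},{quote(payload, safe='')}"


if __name__ == "__main__":
    print(manifest_line("bucket", "object1key", "newObject1Key"))
```

Called with the first example pair, this reproduces the first encoded row shown above.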

AWS
awsrwx
answered 5 years ago

Thank you, Rob, for your guidance! I will attempt to implement this approach using the JSON structure in the CSV manifest. It looks like the perfect solution for my particular bulk rename case.

answered 5 years ago
