S3 Batch Operations + Lambda to bulk rename objects

Hi,

I need to rename 10 million S3 objects, all stored in a single bucket that is organized in a folder structure by {year}/{month}. I was thinking of using S3 Batch Operations to invoke a Lambda function that performs this task.

I was planning to use a custom manifest to specify the objects that I want to rename (not every object in the bucket should be renamed). Is there a way to include a {new_name} value in the CSV manifest, so that the batch job passes it in the JSON request to the Lambda function, which then uses it to rename the object? If not, how would you suggest supplying the list of new names that I need to assign? Please note that the new name cannot be computed from the current name by any function in Lambda that takes only the current name as input.

Thank you,
Camilo

asked 5 years ago, 1415 views
3 Answers
Accepted Answer

There's no way to pass this additional piece of object-level information in the CSV manifest today. If, as you stated, it cannot be derived from the existing key, could the desired values be stored in DynamoDB or another location so that the Lambda function could look them up based on the existing key?
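
As a rough illustration of the lookup idea, here is a minimal Python sketch of a Lambda handler for S3 Batch Operations. It is not a definitive implementation. The table name `rename-map` and the attribute names `old_key` and `new_key` are hypothetical, and the event fields assume invocation schema version 1.0 (`s3Key`, `s3BucketArn`). Since S3 has no native rename, the sketch copies the object to the new key and then deletes the original; a single `copy_object` call only handles objects up to 5 GB.

```python
import urllib.parse


def handle_task(task, lookup_new_key, rename):
    """Process one Batch Operations task and return its result entry."""
    # Batch Operations delivers the key URL-encoded in the event.
    old_key = urllib.parse.unquote_plus(task["s3Key"])
    bucket = task["s3BucketArn"].split(":::")[-1]
    new_key = lookup_new_key(old_key)
    if new_key is None:
        return {
            "taskId": task["taskId"],
            "resultCode": "PermanentFailure",
            "resultString": f"no new name found for {old_key}",
        }
    rename(bucket, old_key, new_key)
    return {
        "taskId": task["taskId"],
        "resultCode": "Succeeded",
        "resultString": new_key,
    }


def lambda_handler(event, context):
    import boto3  # imported here so handle_task can be exercised offline

    # Hypothetical table: partition key "old_key", attribute "new_key".
    table = boto3.resource("dynamodb").Table("rename-map")
    s3 = boto3.client("s3")

    def lookup(old_key):
        item = table.get_item(Key={"old_key": old_key}).get("Item")
        return item["new_key"] if item else None

    def rename(bucket, old_key, new_key):
        # "Rename" = copy to the new key, then delete the original.
        s3.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3.delete_object(Bucket=bucket, Key=old_key)

    return {
        "invocationSchemaVersion": "1.0",
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": [handle_task(t, lookup, rename) for t in event["tasks"]],
    }
```

Keeping the lookup and the rename as injected callables means the per-task logic can be tried locally with plain functions before wiring it to real AWS resources.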

AWS
awsrwx
answered 5 years ago
EXPERT
reviewed 12 days ago

You could also consider adding a JSON structure like this in your manifest, which would eliminate the need to store the new keys elsewhere:
{"origKey": "object1key", "newKey": "newObject1Key"}

You could put that in the manifest like this:
bucket,{"origKey": "object1key", "newKey": "newObject1Key"}
bucket,{"origKey": "object2key", "newKey": "newObject2Key"}
bucket,{"origKey": "object3key", "newKey": "newObject3Key"}

We would then send that JSON to the Lambda function as the S3 key, where it can be parsed to identify the original key and the new key for each copy. Please note that the JSON would need to be URL encoded for this to work, and each "key" entry would need to be less than 1024 characters.

The URL encoded values you would actually use:

bucket,%7B%22origKey%22%3A%20%22object1key%22%2C%20%22newKey%22%3A%20%22newObject1Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object2key%22%2C%20%22newKey%22%3A%20%22newObject2Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object3key%22%2C%20%22newKey%22%3A%20%22newObject3Key%22%7D
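
For a large manifest, rows like the ones above can be generated rather than written by hand. The following Python sketch is one possible way to do it, using only the standard library; the function names `manifest_line` and `parse_task_key` are made up for illustration. It assumes the Lambda function receives the manifest value in the task's `s3Key` field in URL-encoded form, so a single decode yields the JSON.

```python
import json
import urllib.parse


def manifest_line(bucket, orig_key, new_key):
    """Build one CSV manifest row whose key column is URL-encoded JSON."""
    payload = json.dumps({"origKey": orig_key, "newKey": new_key})
    # safe="" also encodes "/" so nothing in the JSON is left unescaped.
    return f"{bucket},{urllib.parse.quote(payload, safe='')}"


def parse_task_key(s3_key):
    """Inside the Lambda function: recover (origKey, newKey) from the task key."""
    data = json.loads(urllib.parse.unquote_plus(s3_key))
    return data["origKey"], data["newKey"]


if __name__ == "__main__":
    line = manifest_line("bucket", "object1key", "newObject1Key")
    print(line)
    # bucket,%7B%22origKey%22%3A%20%22object1key%22%2C%20%22newKey%22%3A%20%22newObject1Key%22%7D
    print(parse_task_key(line.split(",", 1)[1]))
    # ('object1key', 'newObject1Key')
```

Splitting the row on the first comma only is safe here because every comma inside the JSON is encoded as `%2C`.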

AWS
awsrwx
answered 5 years ago

Thank you, Rob, for your guidance! I will attempt to implement this approach using the JSON structure in the CSV manifest. It looks like the perfect solution for my particular bulk rename case.

answered 5 years ago
