S3 Batch Operations + Lambda to bulk rename objects


Hi,

I need to rename 10 million S3 objects, all stored in a single bucket that is arranged in a folder structure by {year}/{month}. I was thinking of using S3 Batch Operations to invoke a Lambda function for this task.

I was planning to use a custom manifest to specify the objects I want to rename (not every object in the bucket should be renamed). Is there a way to include a {new_name} value in the CSV manifest, so that the batch job passes it in the JSON request to the Lambda function, which then uses it to rename the object? If not, how would you suggest supplying the list of new names? Please note that the new name cannot be computed from the current name: no function in Lambda could derive it using only the current name as input.

Thank you,
Camilo

asked 5 years ago, 1443 views
3 Answers
Accepted Answer

There's no way to pass this additional piece of object-level information in the CSV manifest today. If, as you stated, it cannot be derived from the existing key, could the desired values be stored in DynamoDB or another location so that the Lambda function can look them up based on the existing key?
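For illustration, here is a minimal sketch of what such a lookup might look like in Python. The table name `rename-map` and the attribute names `origKey` and `newKey` are hypothetical; they are not part of the original suggestion, so adapt them to whatever schema you choose.

```python
def lookup_new_key(table, orig_key):
    """Return the new key stored for orig_key, or None if there is no entry.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    The partition key name "origKey" and the attribute "newKey" are
    assumptions made for this sketch.
    """
    response = table.get_item(Key={"origKey": orig_key})
    item = response.get("Item")
    return item["newKey"] if item else None


def get_rename_table(table_name="rename-map"):
    """Build the Table resource (hypothetical table name)."""
    import boto3  # imported lazily so lookup_new_key can be tested offline
    return boto3.resource("dynamodb").Table(table_name)
```

Inside the Lambda function you would call `lookup_new_key(get_rename_table(), orig_key)` for each task and report a failure for that task when it returns `None`.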

AWS
awsrwx
answered 5 years ago
EXPERT
reviewed 22 days ago

You could also consider adding a JSON structure like this in your manifest, which would eliminate the need to store the new keys elsewhere:
{"origKey": "object1key", "newKey": "newObject1Key"}

You could put that in the manifest like this:
bucket,{"origKey": "object1key", "newKey": "newObject1Key"}
bucket,{"origKey": "object2key", "newKey": "newObject2Key"}
bucket,{"origKey": "object3key", "newKey": "newObject3Key"}

We would then send that JSON to the Lambda function as the S3 key, and it can be parsed there to identify the original key and the new key for each copy. Please note that the JSON would need to be URL-encoded for this to work, and each "key" entry would need to be less than 1024 characters.
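On the Lambda side, the parsing could be sketched roughly as follows. This assumes the version 1.0 invocation schema (a `tasks` list with `taskId`, `s3Key` and `s3BucketArn`) and that `s3Key` arrives URL-encoded. The rename is done as a copy followed by a delete, since S3 has no rename operation. The function names and the optional `s3` parameter are my own, added so the logic can be exercised without AWS access.

```python
import json
from urllib.parse import unquote_plus


def parse_task_key(s3_key):
    """Decode the URL-encoded manifest entry and return (orig_key, new_key)."""
    payload = json.loads(unquote_plus(s3_key))
    return payload["origKey"], payload["newKey"]


def rename_object(s3, bucket, orig_key, new_key):
    """Copy the object to its new key, then delete the original."""
    s3.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": orig_key},
    )
    s3.delete_object(Bucket=bucket, Key=orig_key)


def lambda_handler(event, context, s3=None):
    if s3 is None:
        import boto3  # imported lazily so the handler can be tested offline
        s3 = boto3.client("s3")

    results = []
    for task in event["tasks"]:
        # The bucket name is the last part of "arn:aws:s3:::bucket-name".
        bucket = task["s3BucketArn"].split(":::")[-1]
        try:
            orig_key, new_key = parse_task_key(task["s3Key"])
            rename_object(s3, bucket, orig_key, new_key)
            code, message = "Succeeded", f"{orig_key} -> {new_key}"
        except Exception as exc:  # report the failure for this task only
            code, message = "PermanentFailure", str(exc)
        results.append(
            {"taskId": task["taskId"], "resultCode": code, "resultString": message}
        )

    return {
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }
```

Note that a single `copy_object` call only handles objects up to 5 GB; larger objects would need a multipart copy, which this sketch does not cover.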

The URL encoded values you would actually use:

bucket,%7B%22origKey%22%3A%20%22object1key%22%2C%20%22newKey%22%3A%20%22newObject1Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object2key%22%2C%20%22newKey%22%3A%20%22newObject2Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object3key%22%2C%20%22newKey%22%3A%20%22newObject3Key%22%7D
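With 10 million entries you would want to generate these lines with a script rather than by hand. A minimal sketch in Python, using only the standard library; the function names are hypothetical, and checking the limit against the encoded length is a conservative assumption on my part:

```python
import json
from urllib.parse import quote

MAX_KEY_LENGTH = 1024  # limit mentioned above; applied here to the encoded entry


def manifest_line(bucket, orig_key, new_key):
    """Build one CSV manifest line with the JSON payload in the key column."""
    payload = json.dumps({"origKey": orig_key, "newKey": new_key})
    encoded = quote(payload, safe="")  # percent-encode everything, including "/"
    if len(encoded) >= MAX_KEY_LENGTH:
        raise ValueError(f"manifest entry too long for {orig_key!r}")
    return f"{bucket},{encoded}"


def write_manifest(path, bucket, renames):
    """Write one manifest line per (orig_key, new_key) pair."""
    with open(path, "w", newline="\n") as fh:
        for orig_key, new_key in renames:
            fh.write(manifest_line(bucket, orig_key, new_key) + "\n")
```

For example, `manifest_line("bucket", "object1key", "newObject1Key")` produces the first encoded line shown above.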

AWS
awsrwx
answered 5 years ago

Thank you, Rob, for your guidance! I will attempt to implement this approach using the JSON structure in the CSV manifest. It looks like the perfect solution for my particular bulk rename case.

answered 5 years ago
