S3 Batch Operations + Lambda to bulk rename objects


Hi,

I need to rename 10 million S3 objects, all stored in a single bucket that is arranged in a folder structure by {year}/{month}. I was thinking of using S3 Batch Operations to invoke a Lambda function for this task.

I was planning to use a custom manifest to specify the objects I want to rename (not every object in the bucket should be renamed). Is there a way to include a {new_name} value in the CSV manifest, so that the batch job passes it in the JSON request to the Lambda function, which then uses it to rename the object? If not, how would you suggest supplying the list of new names? Please note that the new name cannot be computed from the current name: no function in Lambda could derive it using only the current name as input.

Thank you,
Camilo

asked 5 years ago · 1443 views
3 Answers
Accepted Answer

There's no way to pass this additional piece of object-level information in the CSV manifest today. If, as you stated, it cannot be derived from the existing key, could the desired values be stored in DynamoDB or another location so that the Lambda function could look them up based on the existing key?
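A minimal sketch of such a lookup, assuming a DynamoDB table whose partition key attribute is named origKey and whose items carry a newKey attribute. Both attribute names, and the table name in the comment, are hypothetical and not part of the original answer:

```python
def lookup_new_key(table, orig_key):
    """Return the new key stored for orig_key, or None if no mapping exists.

    `table` is expected to behave like a boto3 DynamoDB Table resource, e.g.
    boto3.resource("dynamodb").Table("rename-map")  # hypothetical table name
    """
    # get_item returns a dict that contains "Item" only when a match is found.
    response = table.get_item(Key={"origKey": orig_key})
    item = response.get("Item")
    return item["newKey"] if item else None
```

Inside the Lambda function, the existing key from each Batch Operations task would be passed as orig_key, and a None result could be reported back as a failed task.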

AWS
awsrwx
answered 5 years ago
EXPERT
reviewed 22 days ago

You could also consider adding a JSON structure like this in your manifest, which would eliminate the need to store the new keys elsewhere:
{"origKey": "object1key", "newKey": "newObject1Key"}

You could put that in the manifest like this:
bucket,{"origKey": "object1key", "newKey": "newObject1Key"}
bucket,{"origKey": "object2key", "newKey": "newObject2Key"}
bucket,{"origKey": "object3key", "newKey": "newObject3Key"}

We would then send that JSON to the Lambda function as the S3 key, where it can be parsed to identify the original key and the new key for each copy. Please note that the JSON needs to be URL-encoded for this to work, and each "key" entry needs to be less than 1024 characters.
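On the Lambda side, the parsing step might look like the sketch below. It assumes the Batch Operations invocation event carries tasks with taskId, s3Key and s3BucketArn fields, that s3Key arrives URL-encoded, and that a rename is done as a copy followed by a delete. The helper names and the optional s3 parameter are illustrative only:

```python
import json
from urllib.parse import unquote


def parse_task_key(s3_key):
    """Decode the task's s3Key and return (orig_key, new_key)."""
    payload = json.loads(unquote(s3_key))
    return payload["origKey"], payload["newKey"]


def rename_object(s3, bucket, orig_key, new_key):
    # S3 has no rename operation: copy to the new key, then delete the original.
    s3.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": orig_key},
    )
    s3.delete_object(Bucket=bucket, Key=orig_key)


def lambda_handler(event, context, s3=None):
    if s3 is None:
        import boto3  # imported lazily so the helpers can be exercised without AWS
        s3 = boto3.client("s3")

    results = []
    for task in event["tasks"]:
        # Assumption: the bucket name is the last part of "arn:aws:s3:::bucket".
        bucket = task["s3BucketArn"].split(":::")[-1]
        try:
            orig_key, new_key = parse_task_key(task["s3Key"])
            rename_object(s3, bucket, orig_key, new_key)
            code, message = "Succeeded", f"{orig_key} -> {new_key}"
        except Exception as exc:
            code, message = "PermanentFailure", str(exc)
        results.append(
            {"taskId": task["taskId"], "resultCode": code, "resultString": message}
        )

    return {
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }
```

A single copy_object call is limited to objects of up to 5 GB, so larger objects would need a multipart copy instead.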

The URL-encoded values you would actually use in the manifest:

bucket,%7B%22origKey%22%3A%20%22object1key%22%2C%20%22newKey%22%3A%20%22newObject1Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object2key%22%2C%20%22newKey%22%3A%20%22newObject2Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object3key%22%2C%20%22newKey%22%3A%20%22newObject3Key%22%7D
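For 10 million objects, manifest lines like these would be generated rather than written by hand. The sketch below shows one way to do that with the Python standard library. The function names are illustrative, and applying the length check to the un-encoded JSON (rather than the encoded form) is an assumption:

```python
import json
from urllib.parse import quote

MAX_KEY_LENGTH = 1024  # limit mentioned above for each "key" entry


def manifest_line(bucket, orig_key, new_key):
    """Build one CSV manifest line whose key column is URL-encoded JSON."""
    payload = json.dumps({"origKey": orig_key, "newKey": new_key})
    # Assumption: the limit applies to the UTF-8 bytes of the un-encoded JSON.
    if len(payload.encode("utf-8")) >= MAX_KEY_LENGTH:
        raise ValueError(f"JSON entry too long for a key: {orig_key!r}")
    # safe="" makes quote() percent-encode every reserved character, including "/".
    return f"{bucket},{quote(payload, safe='')}"


def build_manifest(bucket, renames):
    """renames is an iterable of (orig_key, new_key) pairs."""
    return "\n".join(manifest_line(bucket, o, n) for o, n in renames)


if __name__ == "__main__":
    print(build_manifest("bucket", [("object1key", "newObject1Key")]))
```

With the default json.dumps separators, the first pair reproduces the first encoded line shown above.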

AWS
awsrwx
answered 5 years ago

Thank you, Rob, for your guidance! I will attempt to implement this approach using the JSON structure in the CSV manifest. It looks like the perfect solution for my particular bulk rename case.

answered 5 years ago
