S3 Batch Operations + Lambda to bulk rename objects


Hi,

I need to rename 10 million S3 objects, all stored in a single bucket that is arranged in a folder structure by {year}/{month}. I was thinking of using S3 Batch Operations to invoke a Lambda function to perform this task.

I was planning to use a custom manifest to specify the objects that I want to rename (not every object in the bucket should be renamed). Is there a way to include a {new_name} value in the CSV manifest, so that the batch job passes it in the JSON request to the Lambda function, which then uses it to rename the object? If not, how would you suggest supplying the list of new names that I need to assign? Please note that the new name cannot be computed from the current name by any function in Lambda that takes only the current name as input.

Thank you,
Camilo

Asked 5 years ago, 1443 views
3 answers
Accepted answer

There's no way to pass this additional piece of object-level information in the CSV manifest today. If, as you stated, it cannot be derived from the existing key, could the desired values be stored in DynamoDB or another location so that the Lambda function can look them up based on the existing key?
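As a rough illustration of the lookup idea above, here is a minimal sketch of a helper the Lambda function could call. The table name `s3-rename-map` and the attribute names `old_key` and `new_key` are hypothetical, chosen only for this example; the `table` parameter exists so the function can be exercised with a stub instead of a real DynamoDB table.

```python
def lookup_new_key(old_key, table=None):
    """Return the new key stored for old_key, or None if no mapping exists."""
    if table is None:
        # Imported lazily so the helper can be tested without AWS access.
        import boto3
        # "s3-rename-map" is a hypothetical table name.
        table = boto3.resource("dynamodb").Table("s3-rename-map")
    # Assumes the partition key attribute is "old_key" and the
    # replacement name is stored in an attribute called "new_key".
    response = table.get_item(Key={"old_key": old_key})
    item = response.get("Item")
    return item["new_key"] if item else None
```

Returning None for a missing mapping lets the caller decide whether to report that task as failed rather than guessing a name.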

AWS
awsrwx
Answered 5 years ago
Expert
Reviewed 22 days ago

You could also consider adding a JSON structure like this in your manifest, which would eliminate the need to store the new keys elsewhere:
{"origKey": "object1key", "newKey": "newObject1Key"}

You could put that in the manifest like this:
bucket,{"origKey": "object1key", "newKey": "newObject1Key"}
bucket,{"origKey": "object2key", "newKey": "newObject2Key"}
bucket,{"origKey": "object3key", "newKey": "newObject3Key"}

We would then send that JSON to the Lambda function as the S3 key, and it can be parsed there to identify the original key and the new key for each copy. Please note that the JSON would need to be URL encoded for this to work, and each "key" entry would need to be less than 1024 characters.
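On the Lambda side, the parsing step could look roughly like the sketch below. It assumes the S3 Batch Operations invocation schema with `tasks`, `taskId`, `s3Key`, `s3BucketArn`, `invocationId` and `invocationSchemaVersion` fields, so please check those names against the current documentation. The `s3_client` parameter is an addition for testing only. A rename is done as a copy followed by a delete, and `copy_object` only handles objects up to 5 GB.

```python
import json
from urllib.parse import unquote


def lambda_handler(event, context, s3_client=None):
    if s3_client is None:
        # Imported lazily so the handler can be tested with a stub client.
        import boto3
        s3_client = boto3.client("s3")

    results = []
    for task in event["tasks"]:
        bucket = task["s3BucketArn"].split(":::")[-1]
        try:
            # The "key" column of the manifest carries URL-encoded JSON.
            names = json.loads(unquote(task["s3Key"]))
            s3_client.copy_object(
                Bucket=bucket,
                Key=names["newKey"],
                CopySource={"Bucket": bucket, "Key": names["origKey"]},
            )
            # Delete the original only after the copy has succeeded.
            s3_client.delete_object(Bucket=bucket, Key=names["origKey"])
            code, message = "Succeeded", names["newKey"]
        except Exception as exc:
            code, message = "PermanentFailure", str(exc)
        results.append(
            {"taskId": task["taskId"], "resultCode": code, "resultString": message}
        )

    return {
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }
```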

The URL encoded values you would actually use:

bucket,%7B%22origKey%22%3A%20%22object1key%22%2C%20%22newKey%22%3A%20%22newObject1Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object2key%22%2C%20%22newKey%22%3A%20%22newObject2Key%22%7D
bucket,%7B%22origKey%22%3A%20%22object3key%22%2C%20%22newKey%22%3A%20%22newObject3Key%22%7D
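For 10 million rows you would generate these lines rather than encode them by hand. A minimal sketch using only the Python standard library follows; the function name `manifest_line` is made up for this example.

```python
import json
from urllib.parse import quote


def manifest_line(bucket, orig_key, new_key):
    """Build one CSV manifest row whose key column is URL-encoded JSON."""
    payload = json.dumps({"origKey": orig_key, "newKey": new_key})
    # safe="" makes quote() percent-encode every reserved character,
    # including the commas that would otherwise split the CSV row.
    return f"{bucket},{quote(payload, safe='')}"


print(manifest_line("bucket", "object1key", "newObject1Key"))
```

With those arguments the printed row is the same as the first encoded line above.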

AWS
awsrwx
Answered 5 years ago

Thank you, Rob, for your guidance! I will attempt to implement this approach using the JSON structure in the CSV manifest. It looks like the perfect solution for my particular bulk rename case.

Answered 5 years ago
