S3: Efficient Way to List and Copy a Million-Object Collection

I am trying to list all the objects in my million-object S3 bucket that were uploaded within the last two years, and then sync them to a bucket in a different account and Region (for DR purposes). As a first step, I am using the following command:

aws s3api list-objects-v2 --bucket BUCKET_NAME --query "Contents[?contains(LastModified, 'YYYY-MM-DD')].Key"

However, this command is taking days to run. Other than breaking this command down and scanning month-by-month over the two-year period, is there a more effective / efficient way to obtain the desired list of objects (and then sync them cross-account and cross-region)?

2 Answers

You can have a look at S3 Inventory: https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory.html

S3 Inventory is designed for efficient, scalable reporting on your S3 objects. On a daily or weekly schedule, it delivers a CSV, Apache ORC, or Apache Parquet listing of your objects and their metadata, which you can then query outside of S3 (for example, with Amazon Athena).

Example configuration command:

aws s3api put-bucket-inventory-configuration --bucket BUCKET_NAME --id inventory-id \
  --inventory-configuration '{"Id": "inventory-id","IsEnabled": true,"Destination": {"S3BucketDestination": {"Bucket": "arn:aws:s3:::REPORT_BUCKET_NAME","Format": "CSV"}},"IncludedObjectVersions": "Current","Schedule": {"Frequency": "Daily"},"OptionalFields": ["Size","LastModifiedDate","ETag","StorageClass","IsMultipartUploaded","ReplicationStatus"]}'
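
Once a report has been delivered, narrowing it down to the last two years can be done locally instead of listing the bucket again. The following is only a rough sketch: `filter_recent` is a hypothetical helper name, and it assumes the CSV rows start with Bucket, Key, Size, LastModifiedDate in that order (which should match the configuration above, but check the `fileSchema` entry in the report's manifest.json). Keys are printed exactly as they appear in the report, which means they are still URL-encoded.

```shell
# filter_recent CUTOFF
# Reads S3 Inventory CSV rows on stdin and prints the key (column 2) of every
# row whose LastModifiedDate (assumed to be column 4) is on or after CUTOFF.
# ISO 8601 timestamps sort lexicographically, so a string comparison is enough.
filter_recent() {
  awk -F',' -v cutoff="$1" '
    {
      gsub(/"/, "")                          # drop the quotes; awk re-splits the fields
      if (($4 "") >= (cutoff "")) print $2   # concatenating "" forces a string comparison
    }'
}

# Example use (the path and the cutoff date are placeholders):
#   gunzip -c inventory-data/*.csv.gz | filter_recent 2023-01-01 > recent-keys.txt
```

Splitting on plain commas is only safe here because every field is quoted and the keys are URL-encoded, so a key cannot contain a literal comma or quote. For other CSV sources a real CSV parser would be the better choice.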
answered 5 months ago

Once the inventory has been generated (I've used the approach above for buckets with very large object counts), have you considered using Cross-Region Replication with S3 Replication Time Control (RTC)?

https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication-walkthrough-2.html (cross-account, cross-Region replication)
https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication-walkthrough-5.html (RTC)
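
For orientation, a replication rule with RTC might look roughly like the sketch below. Everything in capitals (SOURCE_BUCKET, DEST_BUCKET, the account IDs, REPLICATION_ROLE) is a placeholder, and the rule ID is made up. It assumes versioning is already enabled on both buckets and that the destination bucket policy allows the role to replicate into it, as the walkthroughs describe. Keep in mind that a live replication rule only picks up objects written after it is enabled; objects already in the bucket need S3 Batch Replication.

```shell
# Write a replication configuration (all names and IDs are placeholders).
cat > replication.json <<'EOF'
{
  "Role": "arn:aws:iam::SOURCE_ACCOUNT_ID:role/REPLICATION_ROLE",
  "Rules": [
    {
      "ID": "dr-replication",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": {
        "Bucket": "arn:aws:s3:::DEST_BUCKET",
        "Account": "DEST_ACCOUNT_ID",
        "AccessControlTranslation": { "Owner": "Destination" },
        "ReplicationTime": { "Status": "Enabled", "Time": { "Minutes": 15 } },
        "Metrics": { "Status": "Enabled", "EventThreshold": { "Minutes": 15 } }
      }
    }
  ]
}
EOF

# Attach the configuration to the source bucket.
aws s3api put-bucket-replication \
  --bucket SOURCE_BUCKET \
  --replication-configuration file://replication.json
```

The empty Filter applies the rule to the whole bucket; a prefix filter could narrow it if only part of the bucket needs to be covered for DR.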

AWS
KAS
answered 5 months ago