Query S3 objects' metadata


To compare and validate my data transfer to S3, I need to be able to query the metadata of the S3 objects in a given bucket.

Is there a way to query the S3 metadata, such as a catalogue? At the moment I have a script that goes through all the objects and extracts their metadata, which is saved in a DB so I can use it. In my case, we have a few million files and the script takes 5 hours to finish.

I am looking for an out-of-the-box solution or AWS product that keeps this "S3 metadata catalogue" up to date.

Could you recommend something?

  • Unfortunately, there is no out-of-the-box solution. You need to build it yourself.

Asked 3 months ago, 131 views
1 answer

Accepted answer

Depending on which object metadata you are interested in, you could consider enabling Amazon S3 Inventory. S3 Inventory produces output files in comma-separated values (CSV), Apache ORC (Optimized Row Columnar), or Apache Parquet format that list your objects and their corresponding metadata (such as object size, last modified date, encryption status, and other fields) on a daily or weekly basis, for an entire S3 bucket or for objects that share a prefix (objects whose names begin with the same string). Once the inventory data has been delivered to an S3 bucket, you can query these files (which contain metadata such as last_modified_date, e_tag, etc.) in Athena. For more details, please review the documentation and the blog:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory-athena-query.html
https://aws.amazon.com/blogs/storage/manage-and-analyze-your-data-at-scale-using-amazon-s3-inventory-and-amazon-athena/
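
As a rough illustration of the two steps above (enable the inventory, then query it in Athena), here is a minimal Python sketch using boto3. The bucket names, account ID, configuration ID, table name, and the `dt` partition value are all hypothetical placeholders, and the query assumes the Athena table was created as described in the linked documentation (partitioned by `dt`). Treat it as a starting point under those assumptions, not a definitive implementation.

```python
def build_inventory_config(dest_bucket_arn, account_id,
                           config_id="daily-metadata", prefix="inventory"):
    """Build the InventoryConfiguration dict for a daily Parquet report.

    All argument values are placeholders chosen for this example.
    """
    return {
        "Id": config_id,
        "IsEnabled": True,
        "IncludedObjectVersions": "Current",
        "Schedule": {"Frequency": "Daily"},  # "Weekly" is the other option
        "OptionalFields": ["Size", "LastModifiedDate", "ETag", "EncryptionStatus"],
        "Destination": {
            "S3BucketDestination": {
                "AccountId": account_id,
                "Bucket": dest_bucket_arn,  # ARN of the destination bucket
                "Format": "Parquet",        # "CSV" and "ORC" are also valid
                "Prefix": prefix,
            }
        },
    }


def build_metadata_query(table, dt):
    """Build Athena SQL that lists object metadata for one inventory snapshot.

    Assumes a table partitioned by `dt`, as in the linked documentation.
    """
    return (
        "SELECT key, size, last_modified_date, e_tag "
        f'FROM "{table}" '
        f"WHERE dt = '{dt}'"
    )


def enable_inventory(source_bucket, config):
    """Attach the inventory configuration to the source bucket."""
    import boto3  # imported here so the builders above work without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_inventory_configuration(
        Bucket=source_bucket,
        Id=config["Id"],
        InventoryConfiguration=config,
    )


def start_query(sql, database, output_location):
    """Submit the query to Athena and return its execution ID."""
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

The query results can then be compared against the metadata recorded at the source, for example by matching on `key`, `size`, and `e_tag`. Note that the destination bucket needs a bucket policy that allows Amazon S3 to write the inventory files to it.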

Answered 3 months ago by an AWS Support Engineer
Reviewed by an Expert 2 months ago
  • That seems to be what I am looking for. Real-time would be best, but once a day could also work in my case.

    Thanks!
