How to get S3 file information optimally

I need to get information such as file name, file size, and file type from a hierarchical S3 folder structure organized as business/sub_business/date/files_with_name_of_originator. I need this information for one particular sub_business and, within it, for one particular type of file (the type is contained in the file name) over a date range. One option is to query S3 under the sub_business prefix for each request. Another option is to capture S3 events when files arrive and store the metadata in RDS. I receive 2 million files daily. Is there a better way?

1 Answer

If the "turnover rate" of your files is quite rapid, then what you're doing is probably best. You might find that a different data store (such as DynamoDB) is less expensive and more performant, because DynamoDB works well as a key/value store, but that may require too many changes to your application.
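
As a rough sketch of what a key/value layout could look like, assuming object keys follow the business/sub_business/date/file_name layout from the question: the partition key combines the sub_business with the file type, and the sort key starts with the date, so a date range becomes a BETWEEN condition on the sort key. The attribute names `pk` and `sk`, the ISO `YYYY-MM-DD` date format, and the rule in `default_extract_type` (the type is the text after the last underscore in the file name) are assumptions for illustration only.

```python
from typing import Callable, Dict, Tuple, Union


def default_extract_type(file_name: str) -> str:
    """Hypothetical naming rule: 'acme_invoice.pdf' -> 'invoice'."""
    stem = file_name.rsplit(".", 1)[0]
    return stem.rsplit("_", 1)[-1]


def to_item(
    key: str,
    size: int,
    extract_type: Callable[[str], str] = default_extract_type,
) -> Dict[str, Union[str, int]]:
    """Turn an S3 key and size into a key/value item (assumed layout)."""
    business, sub_business, date, file_name = key.split("/", 3)
    return {
        # Everything a request filters on by equality goes in the partition key.
        "pk": f"{business}/{sub_business}#{extract_type(file_name)}",
        # The date leads the sort key so that a date range is a key range.
        "sk": f"{date}#{file_name}",
        "file_name": file_name,
        "size": size,
    }


def sort_key_range(start_date: str, end_date: str) -> Tuple[str, str]:
    """Inclusive bounds for a BETWEEN condition on sk.

    '$' is the character right after '#', so every sort key for end_date
    sorts below end_date + '$'. This relies on ISO dates sorting as strings.
    """
    return (f"{start_date}#", f"{end_date}$")
```

With this layout, one request for a sub_business, a file type, and a date range maps to a single query on `pk` with a range on `sk`, rather than a scan. Whether a single partition per sub_business and type copes with your write volume is something you would need to check against your own traffic.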

However, you might consider using S3 Inventory, noting that it may take up to 48 hours for files to appear in the inventory. That delay might not be appropriate for your requirements.
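
If the delay is acceptable, filtering an inventory report could look roughly like the sketch below. It assumes a CSV-format report, which as far as I know has no header row (the column order comes from the `fileSchema` field in the report's manifest.json) and stores keys URL-encoded. The function name, the `Bucket, Key, Size` schema, and the ISO date folder names are assumptions for illustration.

```python
import csv
import io
from typing import Dict, Iterator
from urllib.parse import unquote_plus


def filter_inventory(
    csv_text: str,
    file_schema: str,
    prefix: str,
    start_date: str,
    end_date: str,
) -> Iterator[Dict[str, str]]:
    """Yield inventory rows under prefix whose date folder is in range.

    file_schema is the comma-separated column list, e.g. "Bucket, Key, Size".
    prefix is expected to end with "/", e.g. "business/sub_business/".
    """
    columns = [c.strip() for c in file_schema.split(",")]
    for row in csv.reader(io.StringIO(csv_text)):
        record = dict(zip(columns, row))
        # Assumption: keys are URL-encoded in the CSV and need decoding.
        key = unquote_plus(record["Key"])
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].split("/", 1)
        if len(parts) != 2:
            continue  # not in the expected date/file_name shape
        date, file_name = parts
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if start_date <= date <= end_date:
            yield {
                "key": key,
                "file_name": file_name,
                "size": record.get("Size", ""),
            }
```

A file-type check on `file_name` can be added to the final condition in the same way as the date check.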

AWS EXPERT
answered 8 months ago
  • Hi Bret, thanks for the response. By "what you're doing is probably best", do you mean capturing the events and keeping the metadata in a database, or querying S3 through the S3 API when a request for this metadata comes in?

  • I think that using the event notifications and capturing that data into a database is a good way to go. But querying S3 could also work. It all depends on your use case, including how often the files change and how often you are going to want to query them.
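
To make the two options concrete, here is a minimal sketch of both, not a definitive implementation. `list_files` queries S3 on demand, one date prefix at a time, through the `list_objects_v2` paginator of a boto3 S3 client passed in by the caller (for example `boto3.client("s3")`). `rows_from_event` pulls the bucket, key, and size out of an S3 event notification so they can be written to a database. The function names, the ISO date folder names, and the substring match on the file type are assumptions for illustration.

```python
from datetime import date, timedelta
from typing import Any, Dict, Iterator, List
from urllib.parse import unquote_plus


def date_prefixes(base: str, start: date, end: date) -> List[str]:
    """One prefix per day, e.g. 'business/sub_business/2024-01-01/'."""
    days = (end - start).days
    return [
        f"{base}/{(start + timedelta(days=d)).isoformat()}/"
        for d in range(days + 1)
    ]


def list_files(
    s3_client: Any, bucket: str, prefixes: List[str], file_type: str
) -> Iterator[Dict[str, Any]]:
    """Option 1: query S3 per request, narrowed by date prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for prefix in prefixes:
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                name = obj["Key"].rsplit("/", 1)[-1]
                # Assumption: the file type appears as a substring of the name.
                if file_type in name:
                    yield {"key": obj["Key"], "file_name": name, "size": obj["Size"]}


def rows_from_event(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Option 2: extract metadata from an S3 event notification payload."""
    rows = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        rows.append(
            {
                "bucket": s3["bucket"]["name"],
                # Keys arrive URL-encoded in event notifications.
                "key": unquote_plus(s3["object"]["key"]),
                "size": s3["object"].get("size", 0),
            }
        )
    return rows
```

Listing still returns every object under each date prefix and filters on the client side, so with 2 million files a day the per-request cost grows with the size of the date range. The event-driven route moves that cost to write time.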
