Automatically tag S3 objects by prefix or help with lifecycle rules


We have large amounts of data written to S3 through an S3 File Gateway SMB share. Each day's data lands in a new folder/prefix with the date added to the folder name dynamically. We want to keep the data in some folders for 100 days and then purge it, and the data in other folders for a year and then purge it. While we are at it, we should probably purge empty prefixes/folder names as well. The application writing the data to the gateway can automatically add text to the start or end of the folder name, but apparently lifecycle rules can't use wildcards in prefix names.

In the scenario below, we want to keep data written in Weekly* prefixes for 100 days and data in Keep* prefixes for 365 days. Most of what I found on the subject focuses on tagging, which would work great, but I don't know how to tag dynamically based on text in a prefix name. Lifecycle rules would be easy if the objects carried the tags below, but I don't want a manual tagging process:

Weekly* tagged with key=retention, value=100

and

Keep* tagged with key=retention, value=365.

\SomeRoot\
    Weekly-2023-09-07\
        FileA.txt
        FileB.txt
        etcFile
    Weekly-2023-09-14\
    Weekly-2023-09-21\
    Keep-2023-09-28\
    Weekly-2023-10-05\
    Weekly-2023-10-12\
    Weekly-2023-10-19\
    Keep-2023-10-26\
    Weekly-2023-11-02\
    EtcPrefix…

Objects written in \SomeRoot and \SomeRoot* will be moved to S3 Glacier Instant Retrieval after 7 days. This part seems easy enough, but I wanted to mention it.

asked 8 months ago, 415 views
1 Answer

Hi,

You'll find examples of lifecycle rules on this page: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html. Here is one of them:

<LifecycleConfiguration>
  <Rule>
    <ID>Transition and Expiration Rule</ID>
    <Filter>
       <Prefix>tax/</Prefix>
    </Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>365</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>3650</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>

Lifecycle rules rely heavily on prefixes, so I would suggest a slight change to your directory structure:

/keep100/[data to keep for 100 days structure]
/keep365/[data to keep for 365 days structure]

Then you can easily set the <Expiration> XML element you need in a lifecycle rule for each of the distinct prefixes.
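
As a rough illustration of this suggestion, the two expiration rules could be generated in Python. This is only a sketch: the helper name `build_expiration_rules` and the bucket name in the comment are made up for illustration, and the dictionary layout follows what boto3's `put_bucket_lifecycle_configuration` call accepts.

```python
def build_expiration_rules(retention_days_by_prefix):
    """Build one S3 lifecycle expiration rule per key prefix.

    retention_days_by_prefix maps a prefix such as "keep100/" to the
    number of days after which objects under it should expire.
    """
    rules = []
    for prefix, days in sorted(retention_days_by_prefix.items()):
        rules.append({
            "ID": f"expire-{prefix.strip('/')}-after-{days}-days",
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Expiration": {"Days": days},
        })
    return rules


rules = build_expiration_rules({"keep100/": 100, "keep365/": 365})

# Applying the rules needs boto3 and AWS credentials, so it is left as a
# comment here. "my-gateway-bucket" is a placeholder bucket name.
#
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-gateway-bucket",
#       LifecycleConfiguration={"Rules": rules},
#   )
```

Keep in mind that this call replaces the bucket's whole lifecycle configuration, so any rule that already exists would have to be included in the same list. Also, a lifecycle prefix filter is matched against the beginning of the object key, so it may be worth testing whether prefixes such as SomeRoot/Weekly- and SomeRoot/Keep- already cover the existing folders without any restructuring.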

Best

Didier

AWS EXPERT, answered 8 months ago
  • Thank you. I was looking for a way to dynamically tag files located in prefixes, using text contained in the prefix name. There will be hundreds of prefixes with files that need to be tagged. Yes, I could rearrange the directory structure to make things a bit simpler, but I wanted to exhaust what can be done with the current file structure.
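
One possible approach to the dynamic tagging asked about here, sketched under assumptions: derive the retention value from the folder name in each object key and write it as an object tag. The function names, the `RETENTION_BY_FOLDER_START` mapping, and the forward-slash key layout (`SomeRoot/Weekly-2023-09-07/FileA.txt`) are assumptions for illustration. `put_object_tagging` is the boto3 S3 client call; the client is passed in as a parameter so the sketch does not depend on how it is triggered (for example from an S3 event notification, or from a batch run over a bucket listing).

```python
# Hypothetical mapping from the start of a folder name to a retention value.
RETENTION_BY_FOLDER_START = {"Weekly": "100", "Keep": "365"}


def retention_for_key(key, root="SomeRoot/"):
    """Return the retention tag value for an object key, or None.

    Assumes keys look like "SomeRoot/Weekly-2023-09-07/FileA.txt".
    """
    if not key.startswith(root):
        return None
    # First path component below the root, e.g. "Weekly-2023-09-07".
    folder = key[len(root):].split("/", 1)[0]
    for start, days in RETENTION_BY_FOLDER_START.items():
        if folder.startswith(start):
            return days
    return None


def tag_object(s3_client, bucket, key):
    """Tag one object with retention=<days>; return True if a tag was set.

    s3_client is expected to be a boto3 S3 client. Note that
    put_object_tagging replaces any tags already on the object.
    """
    days = retention_for_key(key)
    if days is None:
        return False
    s3_client.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={"TagSet": [{"Key": "retention", "Value": days}]},
    )
    return True
```

With objects tagged this way, a lifecycle rule could filter on the tag instead of the prefix, for example a rule whose filter is `{"Tag": {"Key": "retention", "Value": "100"}}` with an expiration of 100 days, and a second one for the value 365.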
