Automatically tag S3 objects by prefix or help with lifecycle rules

We write large amounts of data to S3 through an S3 File Gateway SMB share. Each day's data lands in a new folder (prefix) whose name includes the date. We want to keep the data in some folders for 100 days and then purge it, and the data in other folders for a year and then purge it. While we are at it, we should probably also purge empty prefixes (folder names). The application writing to the gateway can automatically add text to the start or end of the folder name, but apparently lifecycle rules can't use wildcards in prefix names.

In the scenario below, we want to keep data written to Weekly* prefixes for 100 days and data written to Keep* prefixes for 365 days. My research on the subject mostly turned up tagging, which would work great, but I don't know how to tag objects dynamically based on text in the prefix name. Lifecycle rules would be easy with tags like the following, but I don't want to set up a manual tagging process:

Weekly* tagged with key=retention, value=100

and

Keep* tagged with key=retention, value=365.

\SomeRoot\
    Weekly-2023-09-07\
        FileA.txt
        FileB.txt
        etcFile
    Weekly-2023-09-14\
    Weekly-2023-09-21\
    Keep-2023-09-28\
    Weekly-2023-10-05\
    Weekly-2023-10-12\
    Weekly-2023-10-19\
    Keep-2023-10-26\
    Weekly-2023-11-02\
    EtcPrefix…

Objects written to \SomeRoot and \SomeRoot* will be transitioned to S3 Glacier Instant Retrieval after 7 days. This part seems easy enough, but I wanted to mention it.
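For reference, here is a rough sketch of the tag-based rules described above, written as the dictionary that boto3's put_bucket_lifecycle_configuration accepts. It assumes the objects already carry the retention tag, that the S3 key prefix corresponding to \SomeRoot\ is "SomeRoot/", and that the target storage class is S3 Glacier Instant Retrieval (API value GLACIER_IR). The function name and rule IDs are made up for illustration.

```python
import json


def build_tag_based_lifecycle(root_prefix="SomeRoot/"):
    """Build a lifecycle configuration dict (assumed layout, see lead-in)."""
    rules = []
    # One expiration rule per retention tag value: 100 days and 365 days.
    for days in (100, 365):
        rules.append({
            "ID": f"expire-retention-{days}",  # hypothetical rule ID
            "Status": "Enabled",
            "Filter": {"Tag": {"Key": "retention", "Value": str(days)}},
            "Expiration": {"Days": days},
        })
    # Transition everything under the root prefix after 7 days.
    rules.append({
        "ID": "transition-after-7-days",  # hypothetical rule ID
        "Status": "Enabled",
        "Filter": {"Prefix": root_prefix},
        "Transitions": [{"Days": 7, "StorageClass": "GLACIER_IR"}],
    })
    return {"Rules": rules}


if __name__ == "__main__":
    print(json.dumps(build_tag_based_lifecycle(), indent=2))
```

The missing piece is still how the retention tag gets onto each object in the first place.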

Asked 8 months ago, viewed 459 times
1 answer

Hi,

You'll find examples of lifecycle rules on this page: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html such as this one:

<LifecycleConfiguration>
  <Rule>
    <ID>Transition and Expiration Rule</ID>
    <Filter>
       <Prefix>tax/</Prefix>
    </Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>365</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>3650</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>

Lifecycle rules rely heavily on prefixes, so I would suggest a slight change to your directory structure:

/keep100/[data to keep for 100 days structure]
/keep365/[data to keep for 365 days structure]

Then you can easily add the <Expiration> element you need to a lifecycle rule scoped to each distinct prefix.
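As a minimal sketch of that idea, the snippet below builds one expiration rule per top-level prefix in the form boto3 expects. The helper name, the rule IDs, and the bucket name in the commented-out call are placeholders, not anything from your setup.

```python
def build_prefix_lifecycle(retention_by_prefix):
    """Build one expiration rule per prefix from a {prefix: days} mapping."""
    rules = []
    for prefix, days in sorted(retention_by_prefix.items()):
        rules.append({
            # Hypothetical rule ID derived from the prefix name.
            "ID": f"expire-{prefix.strip('/')}-after-{days}-days",
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Expiration": {"Days": days},
        })
    return {"Rules": rules}


config = build_prefix_lifecycle({"keep100/": 100, "keep365/": 365})

# Applying it would look roughly like this (bucket name is a placeholder).
# Note that this call replaces the bucket's existing lifecycle configuration.
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=config
# )
```

With this layout, any number of dated folders can be created under keep100/ or keep365/ without touching the rules again.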

Best

Didier

AWS Expert, answered 8 months ago
  • Thank you. I was looking for a way to dynamically tag files located in prefixes using text contained in the prefix name. There will be hundreds of prefixes with files that need to be tagged. Yes, I could rearrange the directory structure to make things a bit simpler, but I wanted to exhaust what can be done with the current file structure.
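
One possible way to keep the current structure is to tag each object as it is created, for example from a Lambda function subscribed to the bucket's ObjectCreated event notifications. The sketch below is only an illustration under several assumptions: the root prefix is "SomeRoot/", folder names start with "Weekly" or "Keep", and the names RETENTION_BY_FOLDER_START, retention_for_key, and tag_object are invented here. It uses the S3 PutObjectTagging call, which replaces any tags already on the object.

```python
from urllib.parse import unquote_plus

# Hypothetical mapping from the start of the folder name to a retention value.
RETENTION_BY_FOLDER_START = {"Weekly": "100", "Keep": "365"}


def retention_for_key(key, root="SomeRoot/"):
    """Return the retention tag value for an object key, or None if no match."""
    if not key.startswith(root):
        return None
    # First path component below the root, e.g. "Weekly-2023-09-07".
    folder = key[len(root):].split("/", 1)[0]
    for start, days in RETENTION_BY_FOLDER_START.items():
        if folder.startswith(start):
            return days
    return None


def tag_object(s3_client, bucket, key):
    """Tag one object with retention=<days>; return True if a tag was written."""
    days = retention_for_key(key)
    if days is None:
        return False
    # put_object_tagging overwrites the object's whole tag set.
    s3_client.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={"TagSet": [{"Key": "retention", "Value": days}]},
    )
    return True


def handler(event, context):
    """Lambda entry point for S3 ObjectCreated event notifications."""
    import boto3  # imported here so the helpers above run without boto3

    s3 = boto3.client("s3")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        tag_object(s3, bucket, key)
```

This only covers objects written after the notification is in place. Objects that already exist would need a one-time pass, for instance by listing the bucket and calling tag_object for each key.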
