Automatically tag S3 objects by prefix or help with lifecycle rules

We have large amounts of data written to S3 through an S3 File Gateway SMB share. Each day's data lands in a new folder (prefix) whose name includes the date. We want to keep the data in some folders for 100 days and then purge it, and the data in other folders for a year and then purge it. While we are at it, we should probably also purge empty prefixes/folder names. The application writing to the gateway can automatically add text to the start or end of the folder name, but apparently lifecycle rules can't use wildcards in prefix names.

In the scenario below, we want to keep data written under Weekly* prefixes for 100 days and data written under Keep* prefixes for 365 days. Most of what I found while researching this focused on tagging, which would work well, but I don't know how to tag objects dynamically based on text in the prefix name. Lifecycle rules would be easy if the objects were tagged as in these examples, but I don't want to set up a manual tagging process:

Weekly* tagged with key=retention, value=100

and

Keep* tagged with key=retention, value=365.

\SomeRoot\
    Weekly-2023-09-07\
        FileA.txt
        FileB.txt
        etcFile
    Weekly-2023-09-14\
    Weekly-2023-09-21\
    Keep-2023-09-28\
    Weekly-2023-10-05\
    Weekly-2023-10-12\
    Weekly-2023-10-19\
    Keep-2023-10-26\
    Weekly-2023-11-02\
    EtcPrefix…

Objects written in \SomeRoot and \SomeRoot* will be moved to S3 Glacier Instant Retrieval after 7 days. This part seems easy enough, but I wanted to mention it.
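
For reference, the tag-based rules described above could look roughly like the sketch below. It only builds the dictionary that boto3's `put_bucket_lifecycle_configuration` accepts. The function name, the rule IDs, and the assumption that the share maps to a `SomeRoot/` key prefix are illustrative and not taken from the original setup.

```python
import json


def tag_lifecycle_rules(root_prefix="SomeRoot/"):
    """Build a lifecycle configuration dict (hypothetical helper).

    Two tag-based expiration rules (retention=100 and retention=365)
    plus one transition rule for everything under root_prefix.
    """
    rules = []
    for days in (100, 365):
        rules.append({
            "ID": f"expire-retention-{days}",
            "Status": "Enabled",
            # Match only objects under the root that carry retention=<days>.
            "Filter": {"And": {
                "Prefix": root_prefix,
                "Tags": [{"Key": "retention", "Value": str(days)}],
            }},
            "Expiration": {"Days": days},
        })
    rules.append({
        "ID": "to-glacier-ir-after-7-days",
        "Status": "Enabled",
        "Filter": {"Prefix": root_prefix},
        # GLACIER_IR is the API name for S3 Glacier Instant Retrieval.
        "Transitions": [{"Days": 7, "StorageClass": "GLACIER_IR"}],
    })
    return {"Rules": rules}


if __name__ == "__main__":
    print(json.dumps(tag_lifecycle_rules(), indent=2))
```

The resulting dictionary would be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`. Note that this call replaces the bucket's whole existing lifecycle configuration, so any rules already in place would need to be included.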

1 Answer

Hi,

You'll find examples of lifecycle rules on this page: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html, such as this one:

<LifecycleConfiguration>
  <Rule>
    <ID>Transition and Expiration Rule</ID>
    <Filter>
       <Prefix>tax/</Prefix>
    </Filter>
    <Status>Enabled</Status>
    <Transition>
      <Days>365</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>3650</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>

The rules are heavily based on prefix. So, I would suggest a slight change in your directory structure:

/keep100/[data to keep for 100 days structure]
/keep365/[data to keep for 365 days structure]

Then you can easily apply the <Expiration> XML element you need in a lifecycle rule for each distinct prefix.
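
As a rough sketch of the prefix-based approach, the same rules can be expressed as the dictionary that boto3's `put_bucket_lifecycle_configuration` expects. The helper names and rule IDs below are made up for illustration; only the `keep100/` and `keep365/` prefixes come from the suggestion above.

```python
import json


def expiration_rule(prefix, days):
    """One expiration rule for every key that starts with `prefix`."""
    return {
        "ID": f"expire-{prefix.strip('/').replace('/', '-')}-{days}d",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Expiration": {"Days": days},
    }


def prefix_lifecycle_configuration(retention_by_prefix):
    """Build a lifecycle configuration from a {prefix: days} mapping."""
    return {
        "Rules": [expiration_rule(p, d) for p, d in retention_by_prefix.items()]
    }


if __name__ == "__main__":
    config = prefix_lifecycle_configuration({"keep100/": 100, "keep365/": 365})
    print(json.dumps(config, indent=2))
```

The `Prefix` filter is compared against the beginning of the object key as a plain string, so it does not have to end at a folder boundary. Assuming the share maps folders to keys like `SomeRoot/Weekly-2023-09-07/FileA.txt`, an entry such as `"SomeRoot/Weekly-": 100` should therefore cover every dated Weekly folder without a wildcard. That is worth confirming on a test bucket before relying on it.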

Best

Didier

AWS Expert, answered 8 months ago
  • Thank you. I was looking for a way to dynamically tag files based on text contained in the name of the prefix they are in. There will be hundreds of prefixes with files that need to be tagged. Yes, I could rearrange the directory structure to make things a bit simpler, but I wanted to exhaust what can be done with the current file structure first.
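
One way the dynamic tagging asked about here might be approached is a small function triggered by S3 ObjectCreated notifications (for example as a Lambda handler) that derives the tag from the folder name. This is a minimal sketch, not a tested solution. The `SomeRoot/` key layout, the mapping table, and the function names are assumptions. `put_object_tagging` is the real boto3 call, and it replaces any tags the object already has.

```python
from urllib.parse import unquote_plus

# Hypothetical mapping from the start of the folder name to retention days.
RETENTION_BY_FOLDER_PREFIX = {"Weekly": "100", "Keep": "365"}


def retention_for_key(key, root="SomeRoot/"):
    """Return the retention value for an object key, or None if none applies."""
    if not key.startswith(root):
        return None
    folder, sep, _rest = key[len(root):].partition("/")
    if not sep:
        return None  # object sits directly in the root, not in a dated folder
    for name, days in RETENTION_BY_FOLDER_PREFIX.items():
        if folder.startswith(name):
            return days
    return None


def handler(event, context=None, s3=None):
    """Tag each object named in an S3 event notification; return tagged keys."""
    if s3 is None:
        import boto3  # imported lazily so the logic above runs without boto3
        s3 = boto3.client("s3")
    tagged = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        days = retention_for_key(key)
        if days is None:
            continue
        s3.put_object_tagging(
            Bucket=bucket,
            Key=key,
            Tagging={"TagSet": [{"Key": "retention", "Value": days}]},
        )
        tagged.append(key)
    return tagged
```

Objects that already exist would not generate events, so they would need a one-off pass, for instance by listing keys with the `list_objects_v2` paginator and applying `retention_for_key` to each.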
