Archive OpenZFS data based on access time

Is it possible to move OpenZFS data older than a certain age (say 6 months) to another, lower-cost storage tier such as S3 Glacier? Most of the data needs to be kept for compliance purposes, so I am checking whether any archival solution, native or custom, is available that selects files by atime or mtime (that is, by time since last access or modification).

Thomas
asked 2 months ago · 191 views
2 Answers

This process is not directly supported by OpenZFS and requires a custom solution. Here's a high-level approach to automate the migration of data older than a specified period, such as 6 months, to Amazon S3 Glacier:

  1. Identify Older Files: First, identify the files older than 6 months. You can use the find command on Unix-based systems to list them:
find /path/to/zfs/dataset -type f -mtime +180

This command lists files modified more than 180 days ago. To select by last access instead, use -atime +180; this is only meaningful if access times are actually being recorded, since atime updates are often disabled.

  2. Archive and Transfer: Before moving files to S3 Glacier, consider bundling them into an archive to reduce the number of objects and possibly save on costs. You can use tar or other compression tools for this purpose:
tar -czvf archive-name.tar.gz /path/to/older/files

  3. Upload to Amazon S3 Glacier: You can use the AWS CLI to upload the archive directly to an S3 bucket, specifying a Glacier storage class:
aws s3 cp archive-name.tar.gz s3://your-bucket-name/path/to/archive/ --storage-class DEEP_ARCHIVE

The DEEP_ARCHIVE storage class is the lowest-cost storage option in S3, but retrieval is slow: standard retrievals take up to 12 hours and bulk retrievals up to 48 hours.

  4. Automate the Process: To automate this, create a script that performs these steps and schedule it to run periodically using cron or another scheduling tool.
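
The four steps above could be combined into a single script along the following lines. This is a minimal sketch, not a tested production tool: the DATASET, BUCKET and AGE_DAYS defaults are the placeholder values from the commands above, the function names and the RUN_ARCHIVE switch are made up for illustration, and it assumes GNU find and tar plus a configured AWS CLI.

```shell
#!/usr/bin/env bash
# Sketch only: all names and defaults below are illustrative placeholders.
set -euo pipefail

DATASET="${DATASET:-/path/to/zfs/dataset}"
BUCKET="${BUCKET:-your-bucket-name}"
AGE_DAYS="${AGE_DAYS:-180}"

# Print NUL-separated paths (relative to directory $1) of regular files
# whose mtime is more than $2 days old. Swap -mtime for -atime to select
# by last access, if access times are being recorded.
list_old_files() {
    (cd "$1" && find . -type f -mtime +"$2" -print0)
}

# Pack the files selected by list_old_files into the gzip-compressed tarball $3.
archive_old_files() {
    list_old_files "$1" "$2" | tar -C "$1" --null -T - -czf "$3"
}

# Upload tarball $1 to bucket $2 using the DEEP_ARCHIVE storage class.
upload_archive() {
    aws s3 cp "$1" "s3://$2/path/to/archive/" --storage-class DEEP_ARCHIVE
}

# Nothing is archived or uploaded unless RUN_ARCHIVE=1 is set explicitly.
if [ "${RUN_ARCHIVE:-0}" = "1" ]; then
    archive="/tmp/archive-$(date +%Y%m%d).tar.gz"
    archive_old_files "$DATASET" "$AGE_DAYS" "$archive"
    upload_archive "$archive" "$BUCKET"
fi
```

The sketch deliberately does not delete the original files; for compliance data it is probably safer to verify the upload first and remove the originals in a separate, reviewed step. A cron entry could then run the script with RUN_ARCHIVE=1 on whatever schedule suits.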
EXPERT
answered 2 months ago
EXPERT
Artem
reviewed a month ago
EXPERT
reviewed 2 months ago

As the question is tagged with Amazon FSx for OpenZFS, what follows assumes that is where the data to be migrated is located (and not, say, in a third-party on-prem OpenZFS product). In that case, AWS DataSync is the way to go.

https://aws.amazon.com/datasync/faqs/#Data_movement

Q: Where can I move data to and from?

A: DataSync supports the following storage location types: .... Amazon Simple Storage Service (Amazon S3), .... Amazon FSx for OpenZFS file systems

Even if your data is currently on-prem it may still be worth looking into.

Q: How do I use AWS DataSync to migrate data to AWS?

A: You can use AWS DataSync to migrate data located on premises, at the edge, or in other clouds to Amazon S3

The above mentions "plain" S3, but Glacier also gets a call-out in the same section of the FAQ.

Q: How do I use AWS DataSync to archive cold data?

A: You can use AWS DataSync to move cold data from on-premises storage systems directly to durable and secure long-term storage, such as Amazon S3 Glacier Flexible Retrieval (formerly S3 Glacier) or Amazon S3 Glacier Deep Archive.

EXPERT
Steve_M
answered 2 months ago
EXPERT
reviewed a month ago
  • I had checked DataSync. While it allows moving data between FSx and S3 (I did not test it), I did not find any option to specify a rule. My requirement is not just to move data between FSx and S3, but to archive files older than a certain age. Please let me know if my understanding is incorrect. Thanks

  • I haven't tried it myself either, but according to https://docs.aws.amazon.com/datasync/latest/userguide/create-s3-location.html#using-storage-classes

    New objects copied to an S3 bucket are stored using the storage class that you specify when creating your Amazon S3 transfer location.

    The steps to create the S3 transfer location and specify the storage class are at https://docs.aws.amazon.com/datasync/latest/userguide/create-s3-location.html#create-s3-location-how-to

    To create an Amazon S3 location

    1. Open the AWS DataSync console at https://console.aws.amazon.com/datasync/

    2. In the left navigation pane, expand Data transfer, then choose Locations and Create location.

    3. For Location type, choose Amazon S3.

    4. For S3 bucket, choose the bucket that you want to use as a location. (When creating your DataSync task later, you specify whether this location is a transfer source or destination.)

    5. For S3 storage class, choose a storage class that you want your objects to use when Amazon S3 is a transfer destination.
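
    On the age requirement raised above: DataSync include filters match on path patterns rather than on file age, as far as I can tell, so one possible workaround is to build the include filter from find output and pass it when starting the task. The following is only a sketch under that assumption. MOUNT, TASK_ARN, AGE_DAYS, the function name and the RUN_DATASYNC switch are hypothetical; it assumes the file system is mounted on the machine running the script and that a DataSync task with an S3 destination already exists.

```shell
#!/usr/bin/env bash
# Sketch only: all variable and function names are illustrative placeholders.
set -euo pipefail

# Build a pipe-delimited include value such as "/a/old1.log|/old2.log"
# from regular files under directory $1 whose mtime is more than $2 days old.
# Paths are relative to $1, which is assumed to match the source location root.
# File names containing newlines, "|" or "," are not handled.
build_include_filter() {
    (cd "$1" && find . -type f -mtime +"$2" -print) \
        | sed 's|^\.||' \
        | LC_ALL=C sort \
        | paste -sd '|' -
}

# Nothing is started unless RUN_DATASYNC=1 is set explicitly.
if [ "${RUN_DATASYNC:-0}" = "1" ]; then
    filter=$(build_include_filter "${MOUNT:?set MOUNT}" "${AGE_DAYS:-180}")
    if [ -n "$filter" ]; then
        aws datasync start-task-execution \
            --task-arn "${TASK_ARN:?set TASK_ARN}" \
            --includes "FilterType=SIMPLE_PATTERN,Value=$filter"
    fi
fi
```

    The filter value has a maximum length, so for a large number of old files it may be necessary to filter at the directory level or to split the work across several task executions.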
