export changes from Lustre filesystem


I would like to automatically export changes from an FSx for Lustre file system to an S3 bucket, either continuously or at the time of cluster deletion. I would prefer a simple cluster configuration (YAML cluster configuration, plus OnNodeConfigured.Scripts if necessary). I am aware of the option to configure a "data repository association", but the Lustre file system does not exist at the time of cluster creation. I noticed the SharedStorage.FsxLustreSettings.ExportPath configuration setting and wonder whether it could help me. What is its intended use? I tested a configuration with DeploymentType: SCRATCH_2 and a specified ExportPath. Perhaps not surprisingly, none of the files created on the Lustre file system appeared in the S3 bucket, even after cluster deletion. Can automatic export be configured inside the cluster configuration YAML file with any of the DeploymentTypes? If so, how?

asked a year ago, 267 views
2 Answers

Currently, automatic export is only supported on the Persistent 2 deployment type. https://docs.aws.amazon.com/fsx/latest/LustreGuide/export-changed-data-meta-dra.html

On Persistent 1 and Scratch file systems, you can use a data repository task to export changes made on the file system to the S3 bucket. https://docs.aws.amazon.com/fsx/latest/LustreGuide/export-data-repo-task-dra.html
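As a rough illustration of the data repository task route, the sketch below uses the boto3 `create_data_repository_task` call with `Type="EXPORT_TO_REPOSITORY"`. The helper names, the file system ID, and the idea of running it from a node script or before deleting the cluster are assumptions for illustration, not something the linked guide prescribes.

```python
import uuid


def build_export_task_request(file_system_id, paths=None, report_path=None):
    """Build keyword arguments for the FSx CreateDataRepositoryTask API.

    paths:       optional list of directories/files (relative to the file
                 system root) to export; when omitted, no Paths key is sent.
    report_path: optional S3 URI for a completion report; when omitted,
                 reporting is disabled.
    """
    if report_path:
        report = {
            "Enabled": True,
            "Path": report_path,
            "Format": "REPORT_CSV_20191124",
            "Scope": "FAILED_FILES_ONLY",
        }
    else:
        report = {"Enabled": False}

    request = {
        "FileSystemId": file_system_id,
        "Type": "EXPORT_TO_REPOSITORY",
        "Report": report,
        # Idempotency token so a retried call does not start a second task.
        "ClientRequestToken": str(uuid.uuid4()),
    }
    if paths:
        request["Paths"] = list(paths)
    return request


def start_export(file_system_id, paths=None, report_path=None, region=None):
    """Start the export task and return its task ID (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("fsx", region_name=region)
    response = client.create_data_repository_task(
        **build_export_task_request(file_system_id, paths, report_path)
    )
    return response["DataRepositoryTask"]["TaskId"]
```

For example, `start_export("fs-0123456789abcdef0", paths=["results"])` (a placeholder ID) would request an export of the `results` directory. The task runs asynchronously, so a script that calls it just before cluster deletion would presumably also need to wait for the task to finish.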

AWS
answered a year ago

Hello,

from the SharedStorage.FsxLustreSettings.ExportPath documentation:

ExportPath (Optional, String)

The path in Amazon S3 where the root of your FSx for Lustre file system is exported. This setting is only supported when the ImportPath parameter is specified. The path must use the same Amazon S3 bucket as specified in ImportPath. You can provide an optional prefix to which new and changed data is to be exported from your FSx for Lustre file system. If an ExportPath value is not provided, FSx for Lustre sets a default export path, s3://import-bucket/FSxLustre[creation-timestamp]. The timestamp is in UTC format, for example s3://import-bucket/FSxLustre20181105T222312Z.

The Amazon S3 export bucket must be the same as the import bucket specified by ImportPath. If you only specify a bucket name, such as s3://import-bucket, you get a 1:1 mapping of file system objects to Amazon S3 bucket objects. This mapping means that the input data in Amazon S3 is overwritten on export. If you provide a custom prefix in the export path, such as s3://import-bucket/[custom-optional-prefix], FSx for Lustre exports the contents of your file system to that export prefix in the Amazon S3 bucket.
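The rules quoted above (same bucket as ImportPath, timestamped default prefix) can be sketched as a small check. The function names are hypothetical, and the code only mirrors the documented behaviour; it is not how the service itself is implemented.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def bucket_of(s3_uri):
    """Return the bucket name of an s3://bucket[/prefix] URI."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    return parsed.netloc


def resolve_export_path(import_path, export_path=None, created_at=None):
    """Mimic the documented ExportPath rules.

    - With no export_path, fall back to
      s3://<import-bucket>/FSxLustre<creation-timestamp> (UTC).
    - Otherwise the export bucket must equal the import bucket.
    """
    import_bucket = bucket_of(import_path)
    if export_path is None:
        created_at = created_at or datetime.now(timezone.utc)
        stamp = created_at.strftime("%Y%m%dT%H%M%SZ")
        return f"s3://{import_bucket}/FSxLustre{stamp}"
    if bucket_of(export_path) != import_bucket:
        raise ValueError("ExportPath must use the same bucket as ImportPath")
    return export_path
```

With a creation time of 2018-11-05 22:23:12 UTC and `ImportPath` set to `s3://import-bucket`, this reproduces the default from the documentation, `s3://import-bucket/FSxLustre20181105T222312Z`.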

So, the intended use of SharedStorage.FsxLustreSettings.ExportPath and SharedStorage.FsxLustreSettings.ImportPath is the following:

  • you can set up an ImportPath to populate your Lustre file system with data coming from an S3 bucket;
  • you can use ExportPath to automatically export data to the same bucket, either to a subfolder of the bucket (by default, a subfolder named with the creation timestamp; alternatively, you can provide a custom subfolder), or overwriting the original bucket data (by providing exactly the same S3 path as in ImportPath).
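To make the pairing of the two settings concrete, here is a minimal sketch that assembles the relevant SharedStorage section as a Python dictionary and prints it as JSON, which YAML parsers generally accept as well. The bucket name, mount directory, storage name, and capacity are placeholder assumptions; only the ImportPath/ExportPath relationship comes from the documentation quoted above.

```python
import json


def fsx_shared_storage(import_path, export_path, mount_dir="/fsx", capacity_gib=1200):
    """Return a SharedStorage section with ImportPath and ExportPath set together."""
    return {
        "SharedStorage": [
            {
                "MountDir": mount_dir,        # placeholder mount point
                "Name": "fsx-scratch",        # placeholder storage name
                "StorageType": "FsxLustre",
                "FsxLustreSettings": {
                    "StorageCapacity": capacity_gib,
                    "DeploymentType": "SCRATCH_2",
                    # ExportPath is only supported when ImportPath is set,
                    # and both must point at the same bucket.
                    "ImportPath": import_path,
                    "ExportPath": export_path,
                },
            }
        ]
    }


if __name__ == "__main__":
    section = fsx_shared_storage("s3://import-bucket", "s3://import-bucket/export")
    print(json.dumps(section, indent=2))
```

The point of the sketch is that a configuration with ExportPath but without ImportPath, as may have been the case in the test described in the question, does not match the documented requirement.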

This functionality is not available for PERSISTENT_2 file systems, because in that case there is another mechanism internal to the FSx for Lustre service.
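For PERSISTENT_2, that other mechanism is presumably the data repository association mentioned in the question. Since the file system only exists once the cluster has been created, one option would be to create the association afterwards with boto3, as in the sketch below. The helper names, IDs, and paths are placeholders, and the choice of where to run it (for example, a node configuration script) is an assumption.

```python
def build_dra_request(file_system_id, file_system_path, data_repository_path,
                      events=("NEW", "CHANGED", "DELETED")):
    """Build keyword arguments for the FSx CreateDataRepositoryAssociation API."""
    return {
        "FileSystemId": file_system_id,
        "FileSystemPath": file_system_path,          # directory on the file system
        "DataRepositoryPath": data_repository_path,  # s3://bucket/prefix
        # Automatically export new, changed, and deleted files to S3.
        "S3": {"AutoExportPolicy": {"Events": list(events)}},
    }


def create_auto_export_association(file_system_id, file_system_path,
                                   data_repository_path, region=None):
    """Create the association and return its ID (needs AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("fsx", region_name=region)
    response = client.create_data_repository_association(
        **build_dra_request(file_system_id, file_system_path, data_repository_path)
    )
    return response["Association"]["AssociationId"]
```

A call such as `create_auto_export_association("fs-0123456789abcdef0", "/export", "s3://my-bucket/export")` would link the `/export` directory to the given prefix; all three values here are hypothetical.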

AWS
answered a year ago
