Hello Stephen,
You're right, pausing applications for backups isn't ideal with 24/7 operations. Here's how AWS Backup with EFS handles potential inconsistencies:
AWS Backup for Amazon EFS may encounter inconsistencies if the file system is modified during a backup. These inconsistencies (like duplicated or missing data) are specific to that snapshot and won't be automatically resolved.
To mitigate this:
- Schedule backups during low-activity periods if possible.
- Regularly restore and validate backups to ensure data integrity.
- Implement application-level strategies to minimize modifications during backups.
For a walkthrough of backing up and restoring EFS with AWS Backup, see: https://aws.amazon.com/getting-started/hands-on/amazon-efs-backup-and-restore-using-aws-backup/
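As a rough illustration of the "restore and validate" step, here is a minimal Python sketch that checksums two directory trees and reports the differences. It is not an AWS tool; `build_manifest` and `compare_manifests` are hypothetical helper names, and the sketch assumes both the source data and the restored copy are mounted as ordinary directories. On a file system that is still being written to, the source manifest is only meaningful for files that did not change after the backup.

```python
import hashlib
from pathlib import Path


def build_manifest(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root_path = Path(root)
    manifest = {}
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files are not loaded into memory.
                for chunk in iter(lambda: fh.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root_path).as_posix()] = digest.hexdigest()
    return manifest


def compare_manifests(source: dict[str, str], restored: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from, extra in, or different in the restored copy."""
    shared = source.keys() & restored.keys()
    return {
        "missing": sorted(source.keys() - restored.keys()),
        "extra": sorted(restored.keys() - source.keys()),
        "changed": sorted(name for name in shared if source[name] != restored[name]),
    }
```

Calling `compare_manifests(build_manifest(source_dir), build_manifest(restored_dir))` gives three lists of relative paths. This only detects a mismatch; it does not repair anything.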
- Are these inconsistencies resolved in a subsequent backup or are they retained in an inconsistent state forever?
Per the documentation, it's not possible to have a backup that's consistent at a particular point in time. Changes that occur while a backup is running, and aren't included in that backup, would be picked up by a subsequent backup.
- Are there specific activities that can be triggered on the filesystem that would cause the files to be backed up safely, such as touch?
I don't believe so.
- Do you recommend instead that the backup is fully restored and then affected files are identified and resolved? I'd like to avoid over-engineering a solution
There isn't enough information here to advise on whether there's a workaround (for changes made after the backup start time potentially being included in a backup). If a consistent backup is a requirement, I suggest looking at our FSx services. Both ONTAP and OpenZFS can provide a similar service, and have backups based on file system snapshots.
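If you do go the route of restoring and then identifying affected files, one lighter way to narrow the candidates is to list files whose modification time falls inside the backup window. The sketch below is only an illustration under assumptions: `modified_in_window` is a hypothetical helper, the window bounds are Unix timestamps you would take from the backup job's start and completion times, and it assumes modification times are trustworthy on the tree you scan. A file modified during the window and again afterwards will show the later mtime on the live file system, so treat the result as a starting list, not a complete one.

```python
import os
from pathlib import Path


def modified_in_window(root: str, window_start: float, window_end: float) -> list[str]:
    """Return relative paths of files whose mtime lies in [window_start, window_end]."""
    root_path = Path(root)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root_path):
        for name in filenames:
            path = Path(dirpath, name)
            # st_mtime is seconds since the epoch, the same unit as the window bounds.
            mtime = path.stat().st_mtime
            if window_start <= mtime <= window_end:
                hits.append(path.relative_to(root_path).as_posix())
    return sorted(hits)
```

The returned paths could then be checked individually against the source data instead of verifying the whole volume.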

Thanks, I am aware of these but I'd like a slightly more nuanced answer if possible.
Let me be clear. The 24/7 activities I am referring to are long-running processes, often spanning multiple days, of batch processing by analysis engines. These are equally weighted read-write operations. We don't want to include substantial dead periods or miss regular backups.
You mention that the inconsistencies are "specific to that snapshot". Does that mean that the next snapshot will have a correct backup of the affected changes, assuming no further modifications, or will the data still be in the failed state if restoring from future backups?
Note that "regularly restoring and validating backups" only demonstrates the issue; it doesn't actually resolve anything. I'm looking for solutions that resolve it after the issue has occurred.
The other options would basically entail duplicating and verifying substantial volumes of data. We can do this, but I'm wondering if there's a lighter-touch approach.