AWS EFS - file delete and recreate not detected programmatically for 25 to


I am observing a very large delay in EFS detecting that a file has been re-created i.e. deleted and then created again.

In this simple test example, I have a single file that gets deleted and recreated around 5 seconds later. I have two EC2 instances mounted to the same EFS:

EC2-1: responsible for reading the file

EC2-2: responsible for deleting and creating the file.

The problem I am seeing is that when EC2-2 deletes the file, EC2-1 correctly updates to say it is no longer present. EC2-2 then recreates the file around 5 seconds later. EC2-1 does not detect the file has returned for another 25 to 30 seconds.

Now, if I run some sort of query on the file system on EC2-1 just after recreation (like an "ls" command), it DOES then immediately update to say the file is created.

To be clear, I can see the file on the EC2-1 file system immediately after it is created, just by running an "ls". It's reading it programmatically that fails. In my test case I have a Node.js script that literally just calls readFileSync() every second. I have also run the same test in Python, which leads me to conclude this is an EFS issue rather than a Node.js one.
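For anyone trying to reproduce this, a minimal sketch of such a polling reader in Python might look like the following. The function name `poll_file` and the path `/mnt/efs/test.txt` are illustrative assumptions, not the script used in the original test.

```python
import time
from pathlib import Path


def poll_file(path, interval=1.0, attempts=5):
    """Try to read `path` every `interval` seconds.

    Returns a list of (timestamp, found) pairs, one per attempt.
    """
    results = []
    for attempt in range(attempts):
        try:
            Path(path).read_bytes()
            found = True
        except FileNotFoundError:
            found = False
        now = time.time()
        results.append((now, found))
        print(f"{now:.0f} {'present' if found else 'missing'}")
        # Sleep between attempts, but not after the last one
        if attempt < attempts - 1:
            time.sleep(interval)
    return results


# Example (hypothetical mount path):
# poll_file("/mnt/efs/test.txt", interval=1.0, attempts=60)
```

Running this on EC2-1 while EC2-2 deletes and recreates the file should show how long the "missing" state persists after recreation.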

If I run the same script on EC2-2, I see the expected results, i.e. the file is missing for a few seconds and then is available immediately once recreated. So reading the file on the instance that does the delete and create works as expected.

It's as if EC2-1 is not detecting the file recreation at all for those 25 to 30 seconds.

The OS is Ubuntu Server 18.04 on both EC2 VMs. Tested on new EFS file systems of type "General Purpose" and "High I/O".

wonka
asked 2 years ago, 121 views
3 Answers

Hello!

Based on your description, you are very likely observing the effects of the negative lookup cache in your NFS client. When you try to access a non-existing file on EC2-1, the local NFS client caches the non-existence of this file for a number of seconds. The exact timeout depends on your NFS mount options, but by default it is in the 30 second range. The file that was created by EC2-2 does exist on EFS immediately after EC2-2 created it; its existence is just hidden by the cache on EC2-1.

There are a few ways around this problem, each of them with a different tradeoff:

* You can mount your file system with "lookupcache=pos" to disable negative lookup caching. The benefit is that your use case will work without further changes. The disadvantage is if your application frequently accesses non-existing files, you may see a performance degradation.

* You can do a readdir on the directory containing the file, which will return the new file and invalidate any negative cache entries in that directory. This is what you are observing in your post when you do an "ls" on the directory. Readdir operations are relatively expensive, and I would not recommend this if you have to do it frequently or if your directory contains more than a handful of files.

* You can try to structure your application so that it never accesses a file unless it knows it exists.
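The readdir approach from the second option could be sketched roughly as follows in Python. This is an illustration only: the helper name `read_after_dir_refresh` is made up, and it assumes, as described above, that listing the parent directory causes the NFS client to drop its negative cache entries for that directory.

```python
import os


def read_after_dir_refresh(path):
    """List the parent directory, then read the file.

    Returns the file contents as bytes, or None if the file is absent.
    """
    parent = os.path.dirname(os.path.abspath(path))
    # os.listdir performs a readdir on the parent directory. On an NFS
    # mount this is assumed to invalidate negative lookup cache entries.
    names = os.listdir(parent)
    if os.path.basename(path) not in names:
        return None
    try:
        with open(path, "rb") as f:
            return f.read()
    except FileNotFoundError:
        # The file was removed between the listing and the open
        return None
```

Because every call lists the whole directory, this only makes sense for small directories and infrequent reads, which matches the caveat in the answer.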

answered 2 years ago

This was solved using the noac option. This sums up the options nicely.

https://forums.aws.amazon.com/thread.jspa?threadID=245354

wonka
answered 2 years ago

For others who may read this thread in the future.

The 'noac' mount option resolves the issue by disabling the client's attribute caching, which carries a significant performance penalty, while 'lookupcache=pos' is more targeted and only disables negative dentry caching. For this specific issue, 'lookupcache=pos' is preferred as it will have less performance impact.
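To confirm which of these options is actually in effect on a client, one possibility is to inspect the mount table. The sketch below parses text in the `/proc/mounts` format; the function name `nfs_mount_options`, the sample line, and the `/mnt/efs` mount point are hypothetical, and it is an assumption that your kernel reports 'lookupcache=pos' and 'noac' in this form.

```python
def nfs_mount_options(mounts_text, mount_point):
    """Return the options of an NFS mount as a dict, or None if not found.

    Flags without a value (e.g. "hard") map to True.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, target, fstype, options = fields[:4]
        if target == mount_point and fstype.startswith("nfs"):
            parsed = {}
            for opt in options.split(","):
                key, _, value = opt.partition("=")
                parsed[key] = value or True
            return parsed
    return None


# Hypothetical sample line, shortened for illustration
SAMPLE = (
    "fs-example.efs.us-east-1.amazonaws.com:/ /mnt/efs nfs4 "
    "rw,relatime,vers=4.1,hard,timeo=600,retrans=2,lookupcache=pos 0 0"
)

opts = nfs_mount_options(SAMPLE, "/mnt/efs")
print(opts.get("lookupcache", "all (default)"))
print("noac" in opts)
```

On a real client, the first argument would be the contents of `/proc/mounts`, for example `open("/proc/mounts").read()`.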

answered 2 years ago
