EFS high metadata IO

I have an EFS with <100 GB of data, mounted onto ~5 ECS tasks. In the monitoring tab of the EFS, I can see that the throughput utilization is always above 75% and >99% of the throughput is metadata. How can I find out the cause and reduce the throughput utilization so my EFS can have better performance?

asked 2 years ago · 1794 views
1 Answer

With Amazon EFS, you can monitor network throughput, I/O for read, write, and metadata operations, client connections, and burst credit balances for your file systems. If performance falls outside your established baseline, you might need to change the size of your file system or the number of connected clients to optimize the file system for your workload.

To establish a baseline, you should, at a minimum, monitor the following items:

* Your file system's network throughput.
* The number of client connections to a file system.
* The number of bytes for each file system operation, including data read, data write, and metadata operations.

You can use the following automated monitoring tools to watch Amazon EFS and report when something is wrong:

* Amazon CloudWatch Alarms (an example alarm is sketched after this list)
* Amazon CloudWatch Logs 
* Amazon CloudWatch Events 
* AWS CloudTrail Log Monitoring
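
For example, a CloudWatch alarm can notify you when the file system approaches its General Purpose mode I/O limit, which metadata-heavy workloads tend to hit first. The following is a minimal boto3 sketch; the alarm name, region, file system ID, and 75% threshold are placeholder assumptions, and you would add an SNS topic ARN to `AlarmActions` to actually receive notifications.

```python
import boto3

# Placeholder region; use the region where your file system lives.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="efs-percent-io-limit-high",  # hypothetical alarm name
    Namespace="AWS/EFS",
    MetricName="PercentIOLimit",
    Dimensions=[{"Name": "FileSystemId", "Value": "fs-0123456789abcdef0"}],  # placeholder ID
    Statistic="Average",
    Period=300,               # evaluate in 5-minute windows
    EvaluationPeriods=3,      # alarm after 15 minutes above the threshold
    Threshold=75.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmDescription="EFS is close to its General Purpose mode I/O limit",
    AlarmActions=[],          # add an SNS topic ARN here to get notified
)
```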

For your use case, I suggest monitoring Amazon EFS with Amazon CloudWatch. For more details on how to use CloudWatch to monitor EFS, refer to the following article:

https://docs.aws.amazon.com/efs/latest/ug/monitoring-cloudwatch.html
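
To quantify the symptom you describe (most of the throughput being metadata), you can compare the MetadataIOBytes and TotalIOBytes metrics directly. Below is a minimal boto3 sketch that computes the metadata share of I/O over the last hour; the region and file system ID are placeholder assumptions.

```python
import boto3
from datetime import datetime, timedelta, timezone

REGION = "us-east-1"                       # placeholder region
FILE_SYSTEM_ID = "fs-0123456789abcdef0"    # placeholder file system ID

cloudwatch = boto3.client("cloudwatch", region_name=REGION)
dimensions = [{"Name": "FileSystemId", "Value": FILE_SYSTEM_ID}]

end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

def metric_query(query_id, metric_name):
    """Build a GetMetricData query that sums one EFS metric over 5-minute periods."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/EFS",
                "MetricName": metric_name,
                "Dimensions": dimensions,
            },
            "Period": 300,
            "Stat": "Sum",
        },
    }

response = cloudwatch.get_metric_data(
    MetricDataQueries=[
        metric_query("meta", "MetadataIOBytes"),
        metric_query("total", "TotalIOBytes"),
    ],
    StartTime=start,
    EndTime=end,
)

results = {r["Id"]: sum(r["Values"]) for r in response["MetricDataResults"]}
if results.get("total"):
    pct = 100.0 * results.get("meta", 0.0) / results["total"]
    print(f"Metadata share of total IO over the last hour: {pct:.1f}%")
else:
    print("No TotalIOBytes data points found for this file system in the last hour.")
```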

Amazon EFS delivers more than 10 gibibytes per second (GiBps) of throughput and over 500,000 IOPS, with sub-millisecond or low single-digit millisecond latencies. The following documentation provides an overview of Amazon EFS performance, describes how your file system configuration affects key performance dimensions, and offers important tips and recommendations for optimizing the performance of your file system.

https://docs.aws.amazon.com/efs/latest/ug/performance.html

To improve read performance:

The easiest way to increase read performance is to implement a cache on the instance, so that read requests to the EFS are served from the instance's local disk. This improves performance because cached reads incur no network or metadata-access latency, and it also reduces the I/O sent to the EFS, which helps overall EFS performance. Such a cache is known as a file system cache (FS-Cache), and a common implementation is cachefilesd. However, note that the cache must be checked before the EFS, so it will not help if the application is constantly accessing new files from the EFS.
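
FS-Cache/cachefilesd is configured at the operating-system level (the cachefilesd daemon plus the NFS `fsc` mount option) rather than in application code. Purely to illustrate the same read-through caching idea at the application level, here is a minimal Python sketch that serves repeat reads from a local-disk copy; the paths are hypothetical and this is not a replacement for cachefilesd.

```python
import shutil
from pathlib import Path

# Hypothetical paths for illustration: the EFS mount point and a local cache directory.
EFS_ROOT = Path("/mnt/efs")
CACHE_ROOT = Path("/var/cache/efs-local")

def cached_read(relative_path: str) -> bytes:
    """Read a file from EFS, serving repeat reads from a copy on local disk.

    The local copy is refreshed only when the EFS file looks newer or has a
    different size, so repeat reads of unchanged files avoid pulling data
    over the network again.
    """
    source = EFS_ROOT / relative_path
    cached = CACHE_ROOT / relative_path

    src_stat = source.stat()  # one metadata call to check freshness
    if (not cached.exists()
            or cached.stat().st_mtime < src_stat.st_mtime
            or cached.stat().st_size != src_stat.st_size):
        cached.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, cached)  # copy2 preserves mtime for the freshness check
    return cached.read_bytes()
```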

To improve write performance:

The most effective way to improve write performance is to issue writes in parallel. You can do this with multiple EC2 instances (that is, multiple clients) or with utilities such as GNU parallel, msrsync, or fpsync, as illustrated in the sketch below.
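
Those utilities parallelize transfers at the shell level. As a rough illustration of the same idea from a single client, the following Python sketch copies a directory tree onto an EFS mount with a thread pool; the source and destination paths and the worker count are assumptions you would tune for your workload.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical source and destination; adjust to your layout.
SOURCE_DIR = Path("/data/staging")
EFS_DIR = Path("/mnt/efs/ingest")
WORKERS = 16  # EFS rewards parallelism; tune this to your instance and workload

def copy_one(source: Path) -> Path:
    """Copy a single file into the EFS mount, preserving its relative path."""
    destination = EFS_DIR / source.relative_to(SOURCE_DIR)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)
    return destination

files = [p for p in SOURCE_DIR.rglob("*") if p.is_file()]

# Issue many writes concurrently instead of copying one file at a time.
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    for done in pool.map(copy_one, files):
        print(f"copied {done}")
```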

For more EFS performance tips, refer to the following article:

https://docs.aws.amazon.com/efs/latest/ug/performance-tips.html

If you would like to dive deeper into the root cause of the performance issue, please raise a case with AWS Support, as a support engineer can look into your account's ECS/EFS performance metrics for further analysis.

answered 2 years ago by Sathya, AWS Support Engineer
