How do I improve the performance of my FSx for Lustre file system?

I want to improve the performance of my Amazon FSx for Lustre file system.

Resolution

Increase file system size

The throughput that an FSx for Lustre file system supports is proportional to its storage capacity.

Throughput capacity (MBps) = Storage capacity (TiB) * Per unit storage throughput (MBps per TiB)

For example, a persistent file system with 4.8 TiB of storage capacity and a per unit storage throughput of 50 MBps per TiB provides:

  • An aggregate baseline disk throughput of 240 MBps
  • A burst disk throughput of 1.152 GBps (1,152 MBps)
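The arithmetic behind the example can be checked with a short shell helper. This is a minimal sketch: the function name baseline_throughput_mbps is hypothetical, and awk is used only because POSIX shell arithmetic can't multiply fractional values such as 4.8.

```shell
# Throughput (MBps) = storage capacity (TiB) * per unit storage throughput (MBps per TiB).
# baseline_throughput_mbps is a hypothetical helper name, not an AWS or Lustre command.
baseline_throughput_mbps() {
  awk -v tib="$1" -v per_tib="$2" 'BEGIN { printf "%g\n", tib * per_tib }'
}

# The 4.8 TiB, 50 MBps per TiB example from above.
baseline_throughput_mbps 4.8 50
```

Working backward from the example, the burst figure of 1,152 MBps divided by 4.8 TiB gives 240 MBps per TiB. That rate is derived from the numbers above and isn't stated in this article, so confirm it against the throughput table for your deployment type.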

If the object storage targets (OSTs) are almost full, then reads and writes to the file system might hang. In this case, increase the storage capacity of the file system.

Troubleshoot unbalanced OSTs

FSx for Lustre is a distributed file system that stores file data on multiple OSTs. To see the number of OSTs and the size and usage of each OST, run the following command from a client that has the file system mounted:

lfs df -h

If the file system is unbalanced, then the output looks similar to the following example, where OST:2 is 99% full:

UUID                 bytes   Used  Available Use%  Mounted on
testfs-MDT0000_UUID  4.4G   214.5M   3.9G     4%   /mnt/testfs[MDT:0]
testfs-MDT0001_UUID  4.4G   144.5M   4.0G     4%   /mnt/testfs[MDT:1]
testfs-OST0000_UUID  2.0T   751.3G   1.1T    37%   /mnt/testfs[OST:0]
testfs-OST0001_UUID  2.0T   755.3G   1.1T    37%   /mnt/testfs[OST:1]
testfs-OST0002_UUID  2.0T     1.9T  55.1M    99%   /mnt/testfs[OST:2] <-
testfs-OST0003_UUID  2.0T   751.3G   1.1T    37%   /mnt/testfs[OST:3]
testfs-OST0004_UUID  2.0T   747.3G   1.1T    37%   /mnt/testfs[OST:4]
testfs-OST0005_UUID  2.0T   743.3G   1.1T    36%   /mnt/testfs[OST:5]

filesystem summary: 11.8T     5.5T   5.7T    46%  /mnt/testfs
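On a file system with many OSTs, it can be easier to filter this output than to read it. The following is a minimal sketch that prints only the OSTs at or above a usage threshold. The function name flag_full_osts and the default threshold of 90 are assumptions chosen for illustration. The filter assumes the column layout shown above, with Use% in the fifth column.

```shell
# Print the OSTs whose Use% is at or above a threshold (default 90).
# Reads the output of "lfs df" or "lfs df -h" on standard input.
# flag_full_osts is a hypothetical helper name.
flag_full_osts() {
  awk -v limit="${1:-90}" '
    $1 ~ /-OST[0-9a-fA-F]+_UUID$/ {
      pct = $5
      sub(/%/, "", pct)            # "99%" becomes "99"
      if (pct + 0 >= limit + 0) print $1, $5
    }'
}

# Usage on a client (the mount point is an assumption):
#   lfs df -h /mnt/testfs | flag_full_osts 90
```

MDT lines and the summary line don't match the OST pattern, so they are ignored.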

The available storage on a single OST might be much smaller or larger than on the other OSTs. This happens for one of the following reasons:

  • New OSTs were added, and the storage optimization isn't complete.
  • A large file was written with the default stripe count of 1, so the entire file is stored on a single OST.

If most of the OSTs are full, then increase the storage capacity of your file system. If only a few OSTs are full, then rebalance the data across the OSTs.
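One way to rebalance is to find the large files that have data on the full OST and migrate them so that Lustre allocates their objects again. The following is a sketch only, not the documented AWS procedure. The function name rebalance_ost, the DRY_RUN switch, the default stripe count of 1, and the 1G size floor are all assumptions. By default the function only prints the lfs migrate commands that it would run.

```shell
# Migrate large files that have data on one nearly full OST.
# rebalance_ost and DRY_RUN are hypothetical names; DRY_RUN defaults to 1 (print only).
# Arguments: directory, OST index, stripe count (default 1), minimum file size (default 1G).
rebalance_ost() {
  dir="$1"; ost_index="$2"; stripe_count="${3:-1}"; min_size="${4:-1G}"
  # List regular files above the size floor that have at least one object on the OST.
  lfs find "$dir" --type f --size "+$min_size" --ost "$ost_index" |
  while IFS= read -r file; do   # note: breaks on file names that contain newlines
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "lfs migrate -c $stripe_count $file"
    else
      lfs migrate -c "$stripe_count" "$file"
    fi
  done
}

# Preview, then run for real (mount point and OST index are assumptions):
#   rebalance_ost /mnt/testfs 2 4
#   DRY_RUN=0 rebalance_ost /mnt/testfs 2 4
```

Review the dry-run output before you set DRY_RUN=0, because migration rewrites file data and can take a long time on large files.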

Also, tune the striping configuration to free up space on full OSTs and improve performance. You can set up a progressive file layout (PFL) configuration so that the layout of a file changes as the file grows. For example, use the lfs setstripe command with one -E option for each layout component to set the offset where that component ends, followed by -c to set the stripe count for that component:

lfs setstripe -E 100M -c 1 -E 10G -c 8 -E 100G -c 16 -E -1 -c 32 /mountname/directory

Note:

  • PFL might not help with smaller files.
  • The lfs setstripe command sets the stripe configuration only for new files and directories. To restripe existing files, use the lfs migrate command.
  • Sequential reads might not benefit from striping.
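To see how the example PFL layout treats files of different sizes, the following sketch maps a file size in bytes to the stripe count of the last component that the file reaches. The function name pfl_widest_stripe_count is hypothetical. The sketch assumes that the M and G suffixes in the lfs setstripe command mean MiB and GiB.

```shell
# For the example layout (-E 100M -c 1 -E 10G -c 8 -E 100G -c 16 -E -1 -c 32),
# print the stripe count of the last component that a file of the given size reaches.
# pfl_widest_stripe_count is a hypothetical helper; sizes are in bytes.
pfl_widest_stripe_count() {
  size="$1"
  mib=$((1024 * 1024))
  gib=$((1024 * mib))
  if   [ "$size" -le $((100 * mib)) ]; then echo 1    # fits in the first component
  elif [ "$size" -le $((10 * gib)) ];  then echo 8
  elif [ "$size" -le $((100 * gib)) ]; then echo 16
  else echo 32                                         # extends into the final component
  fi
}

pfl_widest_stripe_count $((50 * 1024 * 1024))   # a 50 MiB file stays on 1 OST
```

This is also why PFL might not help with smaller files: under this layout, any file of 100 MiB or less is stored on a single OST.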

Use larger instances for compute-intensive workloads

For compute-intensive workloads, choose client instances with more memory or compute capacity.

The following are some tuning best practices:

1.    Tune large client instances for optimal performance:

For client instance types with memory of more than 64 GiB, apply the following tuning:

sudo lctl set_param ldlm.namespaces.*.lru_max_age=600000

For client instance types with more than 64 CPU cores, apply the following tuning:

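# Append the module options as root (a plain >> redirect fails for a non-root user)
echo "options ptlrpc ptlrpcd_per_cpt_max=32" | sudo tee -a /etc/modprobe.d/modprobe.conf
echo "options ksocklnd credits=2560" | sudo tee -a /etc/modprobe.d/modprobe.conf

# Reboot so that the kernel modules reload with the two settings above
sudo reboot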

2.    After you mount the file system on the client, apply the following tuning:

sudo lctl set_param osc.*OST*.max_rpcs_in_flight=32  
sudo lctl set_param mdc.*.max_rpcs_in_flight=64  
sudo lctl set_param mdc.*.max_mod_rpcs_in_flight=50
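To check which of the size-dependent tunings in step 1 apply to a client, you can compare its memory and CPU count with the two thresholds. The following is a minimal sketch. The function name needed_tunings and its output labels are hypothetical. It assumes that the count reported by nproc is the right measure of CPU cores, and that MemTotal in /proc/meminfo is close enough to the instance's memory size.

```shell
# Print which step 1 tunings apply, given memory in KiB and a CPU count.
# needed_tunings and the labels it prints are hypothetical.
needed_tunings() {
  mem_kib="$1"; cores="$2"
  # More than 64 GiB of memory (64 GiB = 64 * 1024 * 1024 KiB)
  [ "$mem_kib" -gt $((64 * 1024 * 1024)) ] && echo "lru_max_age"
  # More than 64 CPU cores
  [ "$cores" -gt 64 ] && echo "ptlrpcd_per_cpt_max and credits"
  return 0
}

# Usage on a client:
#   needed_tunings "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" "$(nproc)"
```

An empty result means that neither threshold is exceeded, so only the step 2 settings apply.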

Note:

Settings that you apply with the lctl set_param command don't persist across reboots, and you can't set these parameters permanently from the client side. Therefore, it's a best practice to implement a boot cron job that reapplies the recommended tunings.
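A boot job has to wait until the file system is mounted, because the osc and mdc parameters exist only after the mount. The following is a sketch of such a script under stated assumptions: the script path /usr/local/sbin/fsx-lustre-tune.sh, the mount point /mnt/testfs, the function names, the LCTL override, and the 300-second timeout are all hypothetical.

```shell
#!/bin/sh
# Hypothetical boot-time script, for example /usr/local/sbin/fsx-lustre-tune.sh.
# LCTL can be overridden for a dry run (for example, LCTL=echo).
LCTL="${LCTL:-lctl}"

# Wait until a Lustre file system is mounted at the mount point.
# Returns 1 if the timeout (in seconds, default 300) is reached first.
wait_for_lustre_mount() {
  mount_point="$1"; timeout="${2:-300}"; waited=0
  until awk -v mp="$mount_point" \
      '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' /proc/mounts; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 5
    waited=$((waited + 5))
  done
}

# Reapply the runtime settings from step 2. The patterns are quoted so that
# lctl expands the wildcards, not the shell.
apply_lustre_tuning() {
  "$LCTL" set_param "osc.*OST*.max_rpcs_in_flight=32" &&
  "$LCTL" set_param "mdc.*.max_rpcs_in_flight=64" &&
  "$LCTL" set_param "mdc.*.max_mod_rpcs_in_flight=50"
}

# End the real script with a call such as:
#   wait_for_lustre_mount /mnt/testfs && apply_lustre_tuning
# and register it in root's crontab (sudo crontab -e):
#   @reboot /usr/local/sbin/fsx-lustre-tune.sh
```

Running the job from root's crontab avoids the need for sudo inside the script. If your client needs the lru_max_age setting from step 1, add it to apply_lustre_tuning in the same way.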

Related information

Aggregate baseline and burst throughput

Performance tips

AWS OFFICIAL
Updated a year ago