Why do EC2 and ECS instances with EFS mounted perform poorly when reading and writing a large number of small files, and how can this be optimized?


When I use EFS mounted on EC2 (or on ECS) as the storage space for my Nix packages, building packages and environments with Nix is very slow. I noticed that CPU usage is very low and the CPU busy time is short, with a large part of the time spent waiting for file I/O (a sketch of how this can be checked is shown below). I then ran fio tests inside the EFS mount point.

My EC2 environment: the AMI is Amazon Linux 2023, on a t3.large (2 vCPU, 4 GB RAM).

My EFS settings are: Performance mode: General Purpose; Throughput mode: Elastic.
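For reference, here is a minimal sketch of how the I/O wait can be checked (assuming the standard sysstat and nfs-utils packages, which provide iostat and nfsiostat; the actual output is omitted):

# overall CPU utilization vs. %iowait, refreshed every 5 seconds
iostat -x 5

# per-operation NFS latency and ops/s for the EFS mount point
nfsiostat 5 /home/ec2-user/test_efs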

The mount options of the EFS mount point are:

/home/ec2-user/test_efs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=tcp,port=20305,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1)

The fio command below randomly reads and writes many small files using synchronous I/O:

sudo fio --name=smallfile_test2 --directory=./  --numjobs=4 --size=4k --ioengine=sync --rw=randrw --rwmixread=50 --bs=4k --iodepth=1 --fsync_on_close=1 --nrfiles=10000 --time_based --runtime=60s --group_reporting

The result is:

smallfile_test2: (groupid=0, jobs=4): err= 0: pid=897064: Wed Dec 27 06:56:52 2023
  read: IOPS=0, BW=53B/s (53B/s)(16.0KiB/304188msec)
    clat (usec): min=1101, max=2232, avg=1664.57, stdev=575.46
     lat (usec): min=1102, max=2233, avg=1665.67, stdev=575.64
    clat percentiles (usec):
     |  1.00th=[ 1106],  5.00th=[ 1106], 10.00th=[ 1106], 20.00th=[ 1106],
     | 30.00th=[ 1237], 40.00th=[ 1237], 50.00th=[ 1237], 60.00th=[ 2089],
     | 70.00th=[ 2089], 80.00th=[ 2245], 90.00th=[ 2245], 95.00th=[ 2245],
     | 99.00th=[ 2245], 99.50th=[ 2245], 99.90th=[ 2245], 99.95th=[ 2245],
     | 99.99th=[ 2245]
   bw (  KiB/s): min=   15, max=   15, per=100.00%, avg=15.00, stdev= 0.00, samples=2
   iops        : min=    3, max=    3, avg= 3.00, stdev= 0.00, samples=2
  write: IOPS=0, BW=53B/s (53B/s)(16.0KiB/304188msec); 0 zone resets
    clat (nsec): min=9222, max=26049, avg=18860.50, stdev=7037.13
     lat (nsec): min=9583, max=27054, avg=19946.25, stdev=7398.49
    clat percentiles (nsec):
     |  1.00th=[ 9280],  5.00th=[ 9280], 10.00th=[ 9280], 20.00th=[ 9280],
     | 30.00th=[19328], 40.00th=[19328], 50.00th=[19328], 60.00th=[20864],
     | 70.00th=[20864], 80.00th=[25984], 90.00th=[25984], 95.00th=[25984],
     | 99.00th=[25984], 99.50th=[25984], 99.90th=[25984], 99.95th=[25984],
     | 99.99th=[25984]
   bw (  KiB/s): min=   16, max=   16, per=100.00%, avg=16.00, stdev= 0.00, samples=2
   iops        : min=    4, max=    4, avg= 4.00, stdev= 0.00, samples=2
  lat (usec)   : 10=12.50%, 20=12.50%, 50=25.00%
  lat (msec)   : 2=25.00%, 4=25.00%
  cpu          : usr=0.14%, sys=0.62%, ctx=310807, majf=0, minf=50
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=4,4,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=53B/s (53B/s), 53B/s-53B/s (53B/s-53B/s), io=16.0KiB (16.4kB), run=304188-304188msec
  WRITE: bw=53B/s (53B/s), 53B/s-53B/s (53B/s-53B/s), io=16.0KiB (16.4kB), run=304188-304188msec

Obviously, the IOPS is far too low:

   iops        : min=    3, max=    3, avg= 3.00, stdev= 0.00, samples=2
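As a rough back-of-the-envelope check (assuming each of the 4 x 10,000 test files needs a few NFS round trips to create, open, and close, at roughly the 1-2 ms latencies shown in the clat figures above):

40,000 files x ~3 round trips/file x ~2.5 ms/round trip ≈ 300 s
sync I/O at iodepth=1, per thread ≈ 1 / 0.002 s ≈ 500 IOPS at best

which would be consistent with the nominally 60 s run stretching to ~304 s while issuing almost no data I/O.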

But when running the same command on EBS, the result is:

smallfile_test2: (groupid=0, jobs=4): err= 0: pid=906077: Wed Dec 27 07:21:22 2023
  read: IOPS=9316, BW=36.4MiB/s (38.2MB/s)(2192MiB/60232msec)
    clat (nsec): min=879, max=24011k, avg=8541.61, stdev=226653.49
     lat (nsec): min=927, max=24011k, avg=8906.03, stdev=231034.76
    clat percentiles (nsec):
     |  1.00th=[    1432],  5.00th=[    1688], 10.00th=[    1768],
     | 20.00th=[    1880], 30.00th=[    1944], 40.00th=[    2008],
     | 50.00th=[    2064], 60.00th=[    2128], 70.00th=[    2224],
     | 80.00th=[    2320], 90.00th=[    2512], 95.00th=[    2640],
     | 99.00th=[    4896], 99.50th=[   13248], 99.90th=[  749568],
     | 99.95th=[ 4816896], 99.99th=[10158080]
   bw (  KiB/s): min=17662, max=72812, per=100.00%, avg=52569.89, stdev=1466.62, samples=338
   iops        : min= 4414, max=18203, avg=13142.01, stdev=366.69, samples=338
  write: IOPS=9327, BW=36.4MiB/s (38.2MB/s)(2195MiB/60232msec); 0 zone resets
    clat (nsec): min=1295, max=30046k, avg=6890.12, stdev=190530.56
     lat (nsec): min=1367, max=30046k, avg=7381.93, stdev=197984.30
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],
     | 70.00th=[    4], 80.00th=[    4], 90.00th=[    5], 95.00th=[    5],
     | 99.00th=[   15], 99.50th=[   16], 99.90th=[   37], 99.95th=[   81],
     | 99.99th=[10159]
   bw (  KiB/s): min=13886, max=72164, per=100.00%, avg=52618.68, stdev=1505.34, samples=338
   iops        : min= 3470, max=18041, avg=13154.18, stdev=376.36, samples=338
  lat (nsec)   : 1000=0.09%
  lat (usec)   : 2=19.52%, 4=73.20%, 10=5.57%, 20=1.33%, 50=0.09%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.06%, 750=0.04%, 1000=0.02%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.02%, 50=0.01%
  cpu          : usr=31.83%, sys=3.86%, ctx=144003, majf=0, minf=1521
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=561165,561837,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=36.4MiB/s (38.2MB/s), 36.4MiB/s-36.4MiB/s (38.2MB/s-38.2MB/s), io=2192MiB (2299MB), run=60232-60232msec
  WRITE: bw=36.4MiB/s (38.2MB/s), 36.4MiB/s-36.4MiB/s (38.2MB/s-38.2MB/s), io=2195MiB (2301MB), run=60232-60232msec

Disk stats (read/write):
  nvme0n1: ios=1475/152800, merge=0/2487, ticks=686/103513, in_queue=104199, util=99.70%

Why does EFS perform so much worse than EBS when handling a large number of small files, and is there room for optimization here?

1 Answer

Hi,

You may want to read and apply this guidance in detail: https://docs.aws.amazon.com/efs/latest/ug/performance.html

In particular, I recommend the sections "NFS client mount settings" and "Optimizing small-file performance".
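As a concrete illustration of what that guidance amounts to (a sketch only; the filesystem ID fs-12345678, the mount point /mnt/efs, and the fio parameters below are placeholders rather than values from your setup): mount through the EFS mount helper so the recommended NFS client options are applied, and keep many operations in flight so that the per-operation network latency of EFS is overlapped rather than paid serially.

# install and use the EFS mount helper (amazon-efs-utils) on Amazon Linux 2023
sudo dnf install -y amazon-efs-utils
sudo mkdir -p /mnt/efs
sudo mount -t efs -o tls fs-12345678:/ /mnt/efs    # fs-12345678 is a placeholder ID

# benchmark with asynchronous I/O, larger blocks, and more concurrency,
# so that many requests are outstanding at once instead of one at a time
sudo fio --name=efs_parallel --directory=/mnt/efs --numjobs=16 --size=64m \
  --ioengine=libaio --iodepth=32 --direct=1 --rw=randrw --rwmixread=50 \
  --bs=64k --time_based --runtime=60s --group_reporting

The same idea applies to the Nix workload: EFS rewards parallelism and larger I/Os, while a single synchronous stream of small-file operations is bounded by the round-trip latency of each NFS call.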

Best,

Didier

