Why do EC2 and ECS instances with EFS mounted perform poorly when reading and writing many small files, and how can this be optimized?


When I use EFS mounted on EC2 (or on ECS) as the storage for my Nix packages, building packages and environments with Nix is very slow. I noticed that CPU usage is low and most of the wall-clock time is spent waiting for file I/O, so I ran fio tests against the EFS mount point.

My EC2 environment:

AMI: Amazon Linux 2023
Instance type: t3.large (2 vCPU, 4 GB RAM)

EFS settings:

Performance mode: General Purpose
Throughput mode: Elastic

The mount options are:

/home/ec2-user/test_efs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=tcp,port=20305,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1)

The following fio command randomly reads and writes small files (10,000 per job) using synchronous I/O:

sudo fio --name=smallfile_test2 --directory=./  --numjobs=4 --size=4k --ioengine=sync --rw=randrw --rwmixread=50 --bs=4k --iodepth=1 --fsync_on_close=1 --nrfiles=10000 --time_based --runtime=60s --group_reporting

The result is:

smallfile_test2: (groupid=0, jobs=4): err= 0: pid=897064: Wed Dec 27 06:56:52 2023
  read: IOPS=0, BW=53B/s (53B/s)(16.0KiB/304188msec)
    clat (usec): min=1101, max=2232, avg=1664.57, stdev=575.46
     lat (usec): min=1102, max=2233, avg=1665.67, stdev=575.64
    clat percentiles (usec):
     |  1.00th=[ 1106],  5.00th=[ 1106], 10.00th=[ 1106], 20.00th=[ 1106],
     | 30.00th=[ 1237], 40.00th=[ 1237], 50.00th=[ 1237], 60.00th=[ 2089],
     | 70.00th=[ 2089], 80.00th=[ 2245], 90.00th=[ 2245], 95.00th=[ 2245],
     | 99.00th=[ 2245], 99.50th=[ 2245], 99.90th=[ 2245], 99.95th=[ 2245],
     | 99.99th=[ 2245]
   bw (  KiB/s): min=   15, max=   15, per=100.00%, avg=15.00, stdev= 0.00, samples=2
   iops        : min=    3, max=    3, avg= 3.00, stdev= 0.00, samples=2
  write: IOPS=0, BW=53B/s (53B/s)(16.0KiB/304188msec); 0 zone resets
    clat (nsec): min=9222, max=26049, avg=18860.50, stdev=7037.13
     lat (nsec): min=9583, max=27054, avg=19946.25, stdev=7398.49
    clat percentiles (nsec):
     |  1.00th=[ 9280],  5.00th=[ 9280], 10.00th=[ 9280], 20.00th=[ 9280],
     | 30.00th=[19328], 40.00th=[19328], 50.00th=[19328], 60.00th=[20864],
     | 70.00th=[20864], 80.00th=[25984], 90.00th=[25984], 95.00th=[25984],
     | 99.00th=[25984], 99.50th=[25984], 99.90th=[25984], 99.95th=[25984],
     | 99.99th=[25984]
   bw (  KiB/s): min=   16, max=   16, per=100.00%, avg=16.00, stdev= 0.00, samples=2
   iops        : min=    4, max=    4, avg= 4.00, stdev= 0.00, samples=2
  lat (usec)   : 10=12.50%, 20=12.50%, 50=25.00%
  lat (msec)   : 2=25.00%, 4=25.00%
  cpu          : usr=0.14%, sys=0.62%, ctx=310807, majf=0, minf=50
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=4,4,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=53B/s (53B/s), 53B/s-53B/s (53B/s-53B/s), io=16.0KiB (16.4kB), run=304188-304188msec
  WRITE: bw=53B/s (53B/s), 53B/s-53B/s (53B/s-53B/s), io=16.0KiB (16.4kB), run=304188-304188msec

Clearly, the IOPS are far too low:

   iops        : min=    3, max=    3, avg= 3.00, stdev= 0.00, samples=2

But running the same test on an EBS volume (nvme0n1), the result is:

smallfile_test2: (groupid=0, jobs=4): err= 0: pid=906077: Wed Dec 27 07:21:22 2023
  read: IOPS=9316, BW=36.4MiB/s (38.2MB/s)(2192MiB/60232msec)
    clat (nsec): min=879, max=24011k, avg=8541.61, stdev=226653.49
     lat (nsec): min=927, max=24011k, avg=8906.03, stdev=231034.76
    clat percentiles (nsec):
     |  1.00th=[    1432],  5.00th=[    1688], 10.00th=[    1768],
     | 20.00th=[    1880], 30.00th=[    1944], 40.00th=[    2008],
     | 50.00th=[    2064], 60.00th=[    2128], 70.00th=[    2224],
     | 80.00th=[    2320], 90.00th=[    2512], 95.00th=[    2640],
     | 99.00th=[    4896], 99.50th=[   13248], 99.90th=[  749568],
     | 99.95th=[ 4816896], 99.99th=[10158080]
   bw (  KiB/s): min=17662, max=72812, per=100.00%, avg=52569.89, stdev=1466.62, samples=338
   iops        : min= 4414, max=18203, avg=13142.01, stdev=366.69, samples=338
  write: IOPS=9327, BW=36.4MiB/s (38.2MB/s)(2195MiB/60232msec); 0 zone resets
    clat (nsec): min=1295, max=30046k, avg=6890.12, stdev=190530.56
     lat (nsec): min=1367, max=30046k, avg=7381.93, stdev=197984.30
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],
     | 70.00th=[    4], 80.00th=[    4], 90.00th=[    5], 95.00th=[    5],
     | 99.00th=[   15], 99.50th=[   16], 99.90th=[   37], 99.95th=[   81],
     | 99.99th=[10159]
   bw (  KiB/s): min=13886, max=72164, per=100.00%, avg=52618.68, stdev=1505.34, samples=338
   iops        : min= 3470, max=18041, avg=13154.18, stdev=376.36, samples=338
  lat (nsec)   : 1000=0.09%
  lat (usec)   : 2=19.52%, 4=73.20%, 10=5.57%, 20=1.33%, 50=0.09%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.06%, 750=0.04%, 1000=0.02%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.02%, 50=0.01%
  cpu          : usr=31.83%, sys=3.86%, ctx=144003, majf=0, minf=1521
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=561165,561837,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=36.4MiB/s (38.2MB/s), 36.4MiB/s-36.4MiB/s (38.2MB/s-38.2MB/s), io=2192MiB (2299MB), run=60232-60232msec
  WRITE: bw=36.4MiB/s (38.2MB/s), 36.4MiB/s-36.4MiB/s (38.2MB/s-38.2MB/s), io=2195MiB (2301MB), run=60232-60232msec

Disk stats (read/write):
  nvme0n1: ios=1475/152800, merge=0/2487, ticks=686/103513, in_queue=104199, util=99.70%
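A back-of-the-envelope model (my own estimate, not from AWS documentation) makes the gap plausible: with synchronous I/O at queue depth 1, each thread must wait for one operation's full round trip before issuing the next, so per-thread IOPS is bounded by 1/latency. The latency figures below are illustrative assumptions.

```python
# Rough model: with sync I/O at iodepth=1, each thread issues one op at a
# time, so aggregate throughput is bounded by jobs / per-op latency.
def max_sync_iops(latency_s: float, jobs: int = 4) -> float:
    """Upper bound on aggregate IOPS for `jobs` threads of synchronous I/O."""
    return jobs / latency_s

# Assumed latencies: ~2 ms per NFS round trip to EFS (matching the ~1.6 ms
# clat above), ~0.1 ms for a local NVMe-backed EBS volume.
efs_bound = max_sync_iops(2e-3)    # 2,000 IOPS
ebs_bound = max_sync_iops(0.1e-3)  # 40,000 IOPS
print(efs_bound, ebs_bound)
```

Even this bound overstates what the test above can reach, because each small file also incurs extra open/fsync/close round trips over NFS, and fio must first lay out 40,000 files.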

Why does EFS exhibit significantly lower performance than EBS when reading and writing numerous small files, and is there potential for optimization here?

storm
Asked 5 months ago · 361 views
1 Answer

Hi,

You may want to read and apply this guidance in detail: https://docs.aws.amazon.com/efs/latest/ug/performance.html

In particular, I recommend the sections NFS client mount settings and Optimizing small-file performance.
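Following that guidance, one lever worth trying for metadata-heavy, small-file workloads is the NFS attribute-cache lifetime. A sketch of a remount, where the filesystem ID, region, and the actimeo value of 60 seconds are all placeholders rather than a tested recommendation:

```shell
# Hypothetical remount: keep the default large rsize/wsize, but raise the
# attribute-cache timeout (actimeo) to cut GETATTR round trips.
# fs-xxxxxxxx and us-east-1 are placeholders for your own filesystem.
sudo mount -t nfs4 \
  -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,actimeo=60 \
  fs-xxxxxxxx.efs.us-east-1.amazonaws.com:/ /home/ec2-user/test_efs
```

Attribute caching trades consistency for latency when multiple clients touch the same files; the EFS performance guide discusses when that trade-off is acceptable.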

Best,

Didier

AWS
Expert
Answered 5 months ago
