Why is there a high I/O wait, an increased queue length, and a spike in latency with my Amazon EBS volume?

There's a high I/O wait, an increased queue length, and a spike in latency with my Amazon Elastic Block Store (Amazon EBS) volume.

Resolution

You experience an increased queue length and a high I/O wait with your Amazon EBS volumes when I/O operations take longer to complete. The following are some of the common reasons for increased latency.

The volume is reaching its throughput or IOPS quota

If you're reaching your throughput or IOPS quotas, then you might experience latency. To determine your throughput and IOPS quotas, see How can I calculate the maximum IOPS and throughput for an Amazon EBS volume? Then, check whether the EBS volumes of your Amazon Elastic Compute Cloud (Amazon EC2) instance are reaching the throughput or IOPS quotas.

If you frequently reach your throughput or IOPS quota, then change the volume type or size to one that meets your application's needs. To determine what volume types to use, it's a best practice to benchmark your EBS volumes against your workload in a test environment.
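The check described above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes that you already retrieved the Sum statistic of the CloudWatch metrics VolumeReadOps, VolumeWriteOps, VolumeReadBytes, VolumeWriteBytes, and VolumeIdleTime for one collection period. The function names and the sample numbers are hypothetical, and the quotas in the sample (3,000 IOPS and 125 MiBps) are illustrative inputs that you replace with the quotas of your own volume.

```python
def estimate_volume_usage(read_ops, write_ops, read_bytes, write_bytes,
                          idle_seconds, period_seconds):
    """Estimate average IOPS and throughput (MiBps) for the time that the
    volume was active during one collection period.

    All inputs are Sum statistics for the same period.
    """
    active_seconds = period_seconds - idle_seconds
    if active_seconds <= 0:
        # The volume was idle for the whole period.
        return 0.0, 0.0
    iops = (read_ops + write_ops) / active_seconds
    mibps = (read_bytes + write_bytes) / active_seconds / (1024 * 1024)
    return iops, mibps


def fraction_of_quota(observed, quota):
    """Return the observed value as a fraction of the quota (1.0 = at quota)."""
    return observed / quota


# Hypothetical sample for a 60-second period with 10 seconds of idle time.
iops, mibps = estimate_volume_usage(
    read_ops=90_000, write_ops=60_000,
    read_bytes=3_145_728_000, write_bytes=1_572_864_000,
    idle_seconds=10, period_seconds=60,
)

# Illustrative quotas; use the values for your own volume.
iops_fraction = fraction_of_quota(iops, 3_000)
throughput_fraction = fraction_of_quota(mibps, 125)
```

A fraction that stays close to 1.0 over many periods suggests that the volume frequently reaches that quota.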

The instance throughput or IOPS quota is reached

EBS-optimized EC2 instances have a maximum aggregated throughput and IOPS across all EBS volumes that are attached to the instance. You might see a high I/O wait and increased latency even though your volume doesn't reach its own throughput or IOPS quotas. If this happens, then check whether the combined throughput or IOPS of the attached volumes reaches the instance's throughput or IOPS quota.

For example, you have a gp3 volume of 1 TiB with 16,000 provisioned IOPS and 700 MiBps throughput that's attached to a t3.medium instance. A t3.medium instance can achieve a maximum performance of 260.57 MiBps throughput and 11,800 IOPS that are aggregated across all attached volumes. The instance can sustain this maximum for only 30 minutes in a 24-hour period. Then, performance is throttled to a baseline of 43.43 MiBps throughput and 2,000 IOPS that are aggregated across all the attached volumes. Although the volume alone can manage up to 700 MiBps and 16,000 IOPS, the instance can't achieve this performance.
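The arithmetic in this example comes down to taking the lower of the volume quota and the instance quota. The following Python sketch uses the numbers from the example; the function and variable names are hypothetical, and the sketch assumes that only this one volume is attached to the instance.

```python
def effective_limit(volume_limit, instance_limit):
    """The achievable performance is capped by the lower of the two quotas."""
    return min(volume_limit, instance_limit)


# Quotas from the example: gp3 volume attached to a t3.medium instance.
volume = {"iops": 16_000, "mibps": 700}
instance_burst = {"iops": 11_800, "mibps": 260.57}
instance_baseline = {"iops": 2_000, "mibps": 43.43}

# Achievable performance while the instance can burst (30 minutes per 24 hours).
burst = {k: effective_limit(volume[k], instance_burst[k]) for k in volume}

# Achievable performance after the instance is throttled to its baseline.
baseline = {k: effective_limit(volume[k], instance_baseline[k]) for k in volume}

print("burst:", burst)
print("baseline:", baseline)
```

In both cases the instance quota is the lower value, so the instance, not the volume, limits performance.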

If your application performance needs exceed the capabilities of your instance, then change the instance type to one that can manage your workload.

Microbursting occurs in the volume

Microbursting happens when a volume bursts IOPS or throughput for a significantly shorter period than the Amazon CloudWatch collection period. Because the burst is averaged over the whole collection period, CloudWatch metrics don't show microbursting. For more information, see How can I identify if my Amazon EBS volume is micro-bursting and then prevent this from happening?
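To illustrate why a period average can hide a short burst, the following Python sketch compares per-second IOPS with the average over a 60-second period. All numbers are hypothetical: a volume with an assumed quota of 3,000 IOPS runs at its quota for 5 seconds and at 100 IOPS for the remaining 55 seconds.

```python
def average_iops(per_second_iops):
    """Average IOPS over the whole period, as a period-level metric reports it."""
    return sum(per_second_iops) / len(per_second_iops)


def seconds_at_limit(per_second_iops, iops_limit):
    """Count the seconds in which the volume ran at or above its quota."""
    return sum(1 for sample in per_second_iops if sample >= iops_limit)


IOPS_LIMIT = 3_000  # assumed volume quota for this illustration

# Hypothetical per-second samples: a 5-second burst, then 55 quiet seconds.
samples = [3_000] * 5 + [100] * 55

period_average = average_iops(samples)
throttled_seconds = seconds_at_limit(samples, IOPS_LIMIT)

print(f"average over period: {period_average:.1f} IOPS")
print(f"seconds at quota:    {throttled_seconds}")
```

The period average is far below the quota even though the volume ran at its quota for several seconds, which is when I/O requests can queue and latency can spike.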

You restored the volume from a snapshot and the volume is initializing

When you restore a volume from a snapshot, the volume must initialize the data. The first time that you access each block of data, you might experience increased latency because the volume must download the data from Amazon Simple Storage Service (Amazon S3).

To minimize latency, you can force the initialization of the volume. You can also turn on Amazon EBS fast snapshot restore so that the volume is fully initialized when you create it.
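Forcing initialization means reading every block of the volume once before your application needs it. The following Python sketch shows the idea by reading a device sequentially from start to end. It's a minimal sketch rather than the documented procedure: the function name and the 1 MiB chunk size are assumptions, and the device path in the usage comment is hypothetical.

```python
def read_all_blocks(path, chunk_size=1024 * 1024):
    """Read a device or file sequentially and return the number of bytes read.

    Reading each block once triggers the download of that block, so later
    reads of the same block don't incur the first-access latency.
    """
    total_bytes = 0
    # buffering=0 opens the raw stream so that each read maps to one request.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:
                break  # end of device reached
            total_bytes += len(chunk)
    return total_bytes


# Usage (requires read permission on the device; the path is hypothetical):
#   read_all_blocks("/dev/xvdf")
```

A single sequential reader is the simplest approach. On large volumes the read can take a long time, so run it before you put the volume into production use.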

There's an issue with the underlying storage subsystems of the volume

If you tried all the preceding troubleshooting steps and continue to experience high latency, then contact AWS Support.

Related information

How do I use CloudWatch metrics to calculate the average throughput and average number of IOPS that my EBS volume provides?

Addressing I/O latency when restoring Amazon EBS volumes from EBS snapshots

AWS OFFICIAL. Updated 4 months ago