How do I troubleshoot an Amazon EBS volume that’s in the "warning," "impaired," or "insufficient-data" state?

I want to troubleshoot an Amazon Elastic Block Store (Amazon EBS) volume that’s in the “warning,” “impaired,” or “insufficient-data” state.

Resolution

Check the status of the volume

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Use the Amazon Elastic Compute Cloud (Amazon EC2) console, the AWS CLI, or AWS Tools for PowerShell to check volume status. Then, troubleshoot the issue based on the status that the volume is in.

Note: If your volume's status is ok, then the volume passed all status checks, is performing as expected, and requires no action.
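The status check can also be scripted. The following is a minimal sketch, assuming a boto3 EC2 client is passed in and that the response follows the documented DescribeVolumeStatus shape. The helper names get_volume_status and next_step are hypothetical, and the volume ID in the usage comment is a placeholder.

```python
def get_volume_status(ec2_client, volume_id):
    """Return (status, details, event_descriptions) for one EBS volume.

    ec2_client is assumed to be a boto3 EC2 client (or any object that
    exposes a compatible describe_volume_status method).
    """
    response = ec2_client.describe_volume_status(VolumeIds=[volume_id])
    statuses = response.get("VolumeStatuses", [])
    if not statuses:
        raise LookupError(f"No status returned for volume {volume_id}")
    item = statuses[0]
    status = item["VolumeStatus"]["Status"]
    # Details carry the individual checks, for example io-enabled.
    details = {
        d["Name"]: d["Status"]
        for d in item["VolumeStatus"].get("Details", [])
    }
    events = [e.get("Description", "") for e in item.get("Events", [])]
    return status, details, events


def next_step(status):
    """Map a volume status to the troubleshooting section that applies."""
    steps = {
        "ok": "No action required.",
        "warning": "Review events, create a snapshot, and monitor the volume.",
        "impaired": "The status check failed; see the impaired section.",
        "insufficient-data": "Confirm attachment and verify I/O activity.",
    }
    return steps.get(status, "Unrecognized status; check the EC2 console.")


# Example usage (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   status, details, events = get_volume_status(
#       boto3.client("ec2"), "vol-0123456789abcdef0")
#   print(status, details, events, next_step(status))
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub before you point them at a real account.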

Troubleshoot volumes with the "warning" status

When your volume is in the warning status, the volume's I/O performance is declining. The warning status applies only to Provisioned IOPS SSD (io1 and io2) and General Purpose SSD (gp3) volumes.

To troubleshoot what's causing the decline in I/O performance, complete the following steps:

  1. Review the volume's events to identify the cause.
  2. Verify that the attached EC2 instance functions correctly.
  3. Create a snapshot of the volume as a backup.
  4. Monitor the volume's status to determine if the condition improves or worsens.
  5. If the volume remains in the warning status, then create a new volume from the snapshot and replace the declining volume.
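Steps 3 and 5 above can be sketched in Python. This is an illustration under stated assumptions, not a complete replacement procedure: it assumes a boto3 EC2 client, the function names are hypothetical, and detaching the old volume and attaching the new one are left out.

```python
def snapshot_volume(ec2_client, volume_id,
                    description="Backup before troubleshooting"):
    """Start a snapshot of the volume and return the snapshot ID."""
    response = ec2_client.create_snapshot(
        VolumeId=volume_id, Description=description
    )
    return response["SnapshotId"]


def create_replacement_volume(ec2_client, snapshot_id, availability_zone,
                              volume_type="gp3", iops=None):
    """Wait for the snapshot to complete, then create a volume from it.

    The new volume must be in the same Availability Zone as the instance
    that will use it. io1 and io2 volumes need an explicit IOPS value.
    """
    # The built-in waiter polls until the snapshot state is "completed".
    # Large snapshots can outlast the waiter's default timeout.
    ec2_client.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])
    kwargs = {
        "SnapshotId": snapshot_id,
        "AvailabilityZone": availability_zone,
        "VolumeType": volume_type,
    }
    if iops is not None:
        kwargs["Iops"] = iops
    return ec2_client.create_volume(**kwargs)["VolumeId"]


# Example usage (placeholder IDs; requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   snap = snapshot_volume(ec2, "vol-0123456789abcdef0")
#   new_vol = create_replacement_volume(ec2, snap, "us-east-1a", "io2", iops=3000)
```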

Note: During the initialization of io1 and io2 volumes that you restored from a snapshot, performance might drop below 50% of the expected level. After initialization completes, the volume's performance returns to normal. To minimize the effects of volume initialization, manually initialize the volumes.

Troubleshoot volumes with the "impaired" status

If the volume is in the impaired status, then the volume's status check failed. To prevent potential data inconsistencies, Amazon EBS deactivates I/O to the volume and the volume becomes unavailable. To keep the volume available when it's impaired, activate the Auto-Enabled IO attribute of the volume.

Note: If you activated Auto-Enabled IO, then monitor for Auto-Enabled IO events on the volume.
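As an illustration, the Auto-Enabled IO attribute can be checked and activated programmatically. The sketch below assumes a boto3 EC2 client and uses the autoEnableIO volume attribute. The function name ensure_auto_enable_io is hypothetical.

```python
def ensure_auto_enable_io(ec2_client, volume_id):
    """Activate autoEnableIO on the volume if it is not already active.

    Returns True when the attribute was changed, False when it was
    already active.
    """
    attribute = ec2_client.describe_volume_attribute(
        VolumeId=volume_id, Attribute="autoEnableIO"
    )
    if attribute.get("AutoEnableIO", {}).get("Value"):
        return False
    ec2_client.modify_volume_attribute(
        VolumeId=volume_id, AutoEnableIO={"Value": True}
    )
    return True


# Example usage (placeholder ID; requires boto3 and AWS credentials):
#   import boto3
#   changed = ensure_auto_enable_io(boto3.client("ec2"), "vol-0123456789abcdef0")
```

With this attribute active, I/O continues on an impaired volume, so plan your own data consistency checks.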

If the volume shows the impaired status, then take one of the following actions:

Troubleshoot volumes with the "insufficient-data" status

Newly created volumes

When the status of a newly created volume is insufficient-data, the checks might still be in progress on the volume. Wait for the checks to complete.

Existing volumes that previously reported ok

When an existing volume's status changes from ok to insufficient-data, confirm that you correctly attached the volume to a running instance.

Then, use one of the following methods to verify that I/O operations are occurring on the volume:

  • Check Amazon CloudWatch metrics for the Amazon EBS volume. Review VolumeReadOps and VolumeWriteOps to confirm that read and write operations occur.
  • To check volume activity at the operating system (OS) level, use iostat for Linux or perfmon for Windows.
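The CloudWatch check can be sketched as follows. The example assumes a boto3 CloudWatch client, the AWS/EBS namespace, and the VolumeId dimension. The function name total_volume_ops, the 60-minute window, and the 5-minute period are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def total_volume_ops(cloudwatch_client, volume_id, minutes=60, now=None):
    """Sum VolumeReadOps and VolumeWriteOps over the last `minutes` minutes.

    Returns a dict such as {"VolumeReadOps": 120.0, "VolumeWriteOps": 0}.
    A total of zero for both metrics suggests that no I/O reached the volume.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    totals = {}
    for metric in ("VolumeReadOps", "VolumeWriteOps"):
        response = cloudwatch_client.get_metric_statistics(
            Namespace="AWS/EBS",
            MetricName=metric,
            Dimensions=[{"Name": "VolumeId", "Value": volume_id}],
            StartTime=start,
            EndTime=end,
            Period=300,  # 5-minute buckets
            Statistics=["Sum"],
        )
        totals[metric] = sum(p["Sum"] for p in response.get("Datapoints", []))
    return totals


# Example usage (placeholder ID; requires boto3 and AWS credentials):
#   import boto3
#   print(total_volume_ops(boto3.client("cloudwatch"), "vol-0123456789abcdef0"))
```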

If the volume is attached and I/O operations occur, but the insufficient-data status persists, then contact AWS Support.

Configure a CloudWatch alarm for volume status

To detect when a volume becomes impaired, set a CloudWatch alarm for the VolumeStalledIOCheck metric.

Note: The VolumeStalledIOCheck metric is available only for volumes that you attached to Nitro-based EC2 instances. The metric isn't available for volumes that you attached to Amazon Elastic Container Service (Amazon ECS) or AWS Fargate tasks.

To configure the alarm, complete the following steps:

  1. Open the CloudWatch console.
  2. In the navigation pane, choose Alarms, and then choose Create alarm.
  3. Choose Select metric, and then choose EBS, Per-Volume Metrics with Instance ID.
  4. Select VolumeStalledIOCheck for your volume ID and instance ID combination.
  5. For Statistic, choose Maximum.
  6. For Period, choose 1 minute.
  7. For Threshold, set the value to >= 1.
  8. For Datapoints to alarm, set the value to 10 out of 10.
  9. Configure notification actions based on your requirements.
  10. Choose Create alarm.
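The same alarm settings can be expressed through the CloudWatch API. The following is a sketch, assuming a boto3 CloudWatch client and assuming that the per-volume metric with instance ID uses the VolumeId and InstanceId dimensions in the AWS/EBS namespace. The function names, the alarm name pattern, and the optional Amazon SNS topic ARN are hypothetical.

```python
def build_stalled_io_alarm(volume_id, instance_id, sns_topic_arn=None):
    """Build put_metric_alarm parameters that mirror the console steps."""
    params = {
        "AlarmName": f"ebs-stalled-io-{volume_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EBS",
        "MetricName": "VolumeStalledIOCheck",
        # Assumed dimension names for "Per-Volume Metrics with Instance ID".
        "Dimensions": [
            {"Name": "VolumeId", "Value": volume_id},
            {"Name": "InstanceId", "Value": instance_id},
        ],
        "Statistic": "Maximum",
        "Period": 60,                # 1 minute
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",  # >= 1
        "EvaluationPeriods": 10,     # 10 out of 10 datapoints
        "DatapointsToAlarm": 10,
    }
    if sns_topic_arn:
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_stalled_io_alarm(cloudwatch_client, volume_id, instance_id,
                            sns_topic_arn=None):
    """Create or update the alarm and return the parameters that were sent."""
    params = build_stalled_io_alarm(volume_id, instance_id, sns_topic_arn)
    cloudwatch_client.put_metric_alarm(**params)
    return params


# Example usage (placeholder IDs; requires boto3 and AWS credentials):
#   import boto3
#   create_stalled_io_alarm(boto3.client("cloudwatch"),
#                           "vol-0123456789abcdef0", "i-0123456789abcdef0")
```

Building the parameters in a separate function lets you review or log them before the alarm is created.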

Use AWS FIS to test your alarm and failover workflows

You can use the following AWS Fault Injection Service (AWS FIS) actions to validate your CloudWatch alarms and failover workflows:

  • Use Pause I/O to pause all I/O operations on a volume and simulate complete storage impairment.
  • Use latency injection to inject configurable latency into a percentage of read or write operations. You can use preconfigured patterns, including sustained, increasing, intermittent, and decreasing latency.

Note: AWS FIS fault injection works only with volumes that you attached to Nitro-based EC2 instances. Instance store volumes aren't eligible. All Nitro-based instance types support latency injection, except P4d, P5, P5e, Trn2u, G6, G6f, Gr6, Gr6f, M8i, M8i-flex, C8i-flex, R8i, R8i-flex, I8ge, Mac-m4pro, and Mac-m4. For more information, see Fault testing on Amazon EBS.
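For illustration, a Pause I/O experiment template might be built as shown below. This sketch rests on several assumptions that you should verify against the AWS FIS documentation: the action ID aws:ebs:pause-volume-io, the resource type aws:ec2:ebs-volume, the Volumes target key, and an ISO 8601 duration parameter. The function name, the target and action labels, and the ARNs are hypothetical.

```python
import uuid


def build_pause_io_template(volume_arn, role_arn, duration="PT5M"):
    """Build parameters for an AWS FIS experiment template that pauses I/O.

    duration is an ISO 8601 duration (PT5M is five minutes). No stop
    condition is set, so the experiment runs for the full duration and
    the CloudWatch alarm under test has time to fire.
    """
    return {
        "clientToken": str(uuid.uuid4()),
        "description": "Pause I/O on one EBS volume to test alarms",
        "roleArn": role_arn,
        "stopConditions": [{"source": "none"}],
        "targets": {
            "test-volume": {  # hypothetical target label
                "resourceType": "aws:ec2:ebs-volume",  # assumed resource type
                "resourceArns": [volume_arn],
                "selectionMode": "ALL",
            }
        },
        "actions": {
            "pause-io": {  # hypothetical action label
                "actionId": "aws:ebs:pause-volume-io",  # assumed action ID
                "parameters": {"duration": duration},
                "targets": {"Volumes": "test-volume"},
            }
        },
    }


# Example usage (placeholder ARNs; requires boto3 and AWS credentials):
#   import boto3
#   template = build_pause_io_template(
#       "arn:aws:ec2:us-east-1:111122223333:volume/vol-0123456789abcdef0",
#       "arn:aws:iam::111122223333:role/fis-ebs-role")
#   boto3.client("fis").create_experiment_template(**template)
```

Run experiments like this only against volumes whose workloads can tolerate paused I/O, such as those in a test environment.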

Related information

Monitoring Amazon EBS volume disruptions using Amazon EventBridge

Amazon EventBridge events for Amazon EBS

AWS OFFICIAL · Updated 24 days ago