- Newest
- Most votes
- Most comments
Performance Implications of High Storage Utilization in Redshift
When a Redshift cluster reaches 95% storage capacity, several performance issues can arise:
Potential Performance Impacts
-
Vacuum Operations: Redshift requires temporary space for vacuum operations. At high utilization, vacuums may:
- Run less efficiently
- Take longer to complete
- Fail to reclaim space effectively
-
Query Performance: Limited free space can impact:
- Sort operations that require temp space
- Complex joins and aggregations
- Concurrent query performance
-
Data Loading: ETL processes may slow down or fail due to insufficient space for staging
Key Metrics to Monitor
To determine if storage utilization is affecting performance, monitor these metrics:
1. Vacuum-Related Metrics
- SVV_VACUUM_PROGRESS: Check duration and effectiveness of vacuum operations
- SVV_VACUUM_SUMMARY: Review historical vacuum performance
- STL_VACUUM: Examine detailed vacuum logs for failures or extended runtimes
2. Performance Metrics
- PercentageDiskSpaceUsed: Track this CloudWatch metric over time
- QueryDuration: Monitor if query performance is degrading
- WLM Queue Wait Times: Check if queries are waiting longer
3. System Table Queries
sql -- Check for vacuum effectiveness SELECT table_name, unsorted_rows, vacuum_sort_benefit FROM svv_table_info ORDER BY vacuum_sort_benefit DESC;
-- Monitor tables with high deleted space SELECT "table", size, tbl_rows, unsorted_rows, deleted_rows, (deleted_rows::float / tbl_rows::float) 100 as pctdeleted FROM svv_table_info WHERE deleted_rows > 0 ORDER BY pct_deleted DESC;
-- Check for long-running vacuums SELECT FROM stlvacuum ORDER BY xid DESC LIMIT 10;
4. CloudWatch Metrics
- CPUUtilization: Higher values may indicate system strain
- ReadIOPS/WriteIOPS: Increased I/O can indicate vacuum overhead
- ReadLatency/WriteLatency: Higher latencies suggest storage pressure
Recommendations
If metrics indicate performance issues but increasing capacity isn't an option:
-
Optimize Vacuum Strategy:
- Schedule vacuums during low-usage periods
- Use targeted vacuums on critical tables
- Consider VACUUM BOOST for one-time space recovery
-
Data Management:
- Archive historical data to S3 using Redshift Spectrum
- Implement more aggressive data retention policies
- Use column encoding to improve compression
-
Table Optimization:
- Review and adjust distribution and sort keys
- Consider table partitioning for large tables
- Analyze skew in data distribution
Documentation Sources
These recommendations are supported by AWS documentation:
-
Vacuum Operations and Disk Space:
- VACUUM Command Documentation states: "VACUUM DELETE temporarily blocks update and delete operations" and "By default, VACUUM skips the sort phase for any table where more than 95 percent of the table's rows are already sorted."
-
Storage Utilization Impact:
- Vacuuming Tables notes: "Maintenance operations like VACUUM and COPY use intermediate temporary storage space for their sort operations, so a spike in disk usage is expected."
-
Performance Monitoring:
- SVV_TABLE_INFO provides the
vacuum_sort_benefitcolumn which shows "The estimated maximum percentage improvement of scan query performance when you run vacuum sort." - Monitoring Cluster Performance recommends monitoring "Percentage disk space used" to track disk utilization.
- SVV_TABLE_INFO provides the
-
Vacuum Optimization:
- Minimizing Vacuum Times advises: "Run vacuum operations on a regular schedule" and "If you need to load a new table with multiple COPY operations, run the largest load first."
Yes, there can be significant performance implications when a Redshift cluster is using 95% of its storage capacity. While this specific threshold isn't mentioned for Redshift, it's generally recommended to keep storage utilization below 80% for optimal performance in file systems. When storage utilization is very high, it can negatively impact various aspects of cluster performance, including query execution and system maintenance tasks.
To determine if the high storage utilization is impacting performance, you can review several metrics:
-
CPU Utilization: Monitor for unusual spikes in CPU usage, which could indicate the system is working harder to manage limited space.
-
Disk Usage: Keep an eye on the PercentageDiskSpaceUsed metric. At 95%, you're likely to see impacts on performance.
-
Query Performance: Compare query execution times, especially for queries that were previously performing well. Look for increased durations in query phases like Planning, Compiling, and Execution.
-
VACUUM and ANALYZE operations: These background maintenance tasks may take longer or fail to complete effectively due to limited space. Monitor the duration and success rate of these operations.
-
Commit Queue Length: A growing commit queue could indicate that the system is struggling to write data efficiently due to limited space.
-
Query Planning and Waiting Times: Increased times in these metrics could suggest that the system is having difficulty optimizing query execution due to space constraints.
-
Database Connections: An unusual increase might indicate that queries are taking longer to complete, causing more concurrent connections.
-
Unsorted Rows: Check the percentage of unsorted rows in your tables. High storage utilization may impact the system's ability to sort data efficiently.
To gather this information, you can use the Redshift console's Cluster Performance tab and Query Monitoring tab. Additionally, you can set up CloudWatch alarms to monitor these metrics and alert you when they exceed certain thresholds.
If you consistently see degradation in these metrics correlating with the high storage utilization, it would be strong evidence that the limited storage is impacting performance. In such a case, increasing the cluster's capacity would likely lead to significant performance improvements.
Sources
Amazon Redshift: Monitoring and troubleshooting query performance using system tables | AWS re:Post
Performance for Amazon FSx for OpenZFS - FSx for OpenZFS
Amazon Redshift Serverless Monitoring and troubleshooting query performance using system tables | AWS re:Post
Relevant content
- asked 2 years ago
- AWS OFFICIALUpdated 10 months ago
- AWS OFFICIALUpdated a year ago

Thank you! I will start implementing these recommendations asap.