When an OpenSearch Service domain gets stuck in the "Modifying" state, it's typically related to issues during the blue/green deployment process, particularly with shard relocation. Even though your cluster health is green and you have sufficient memory and CPU, there are several potential causes to investigate:
Common causes for being stuck in "Modifying" state:
- Large shard sizes that take longer to relocate
- Index or search load on the cluster
- Insufficient disk space
- Reaching a shard count quota
Troubleshooting steps:
- Monitor key CloudWatch metrics to identify the underlying issue:
- FreeStorageSpace - Check if disk space is running low
- ClusterStatus (published as ClusterStatus.green, ClusterStatus.yellow, and ClusterStatus.red) - Verify it remains healthy throughout the process
- ClusterIndexWritesBlocked - Check if writes are being blocked
- Monitor shard migration progress using the _cat/recovery command to see if shards are actually migrating, even if the console shows 0% progress.
- Take corrective actions based on findings:
- If FreeStorageSpace is low, delete unused indexes
- If ClusterIndexWritesBlocked reports 1 (writes are blocked), delete unused indexes to free disk space, or reduce traffic to the domain if JVM memory pressure is high
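The monitoring steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: the helper names `metric_query` and `active_recoveries` are made up for illustration, the 5-minute period and 3-hour window are arbitrary, and the choice of statistic per metric is a judgment call. The `AWS/ES` namespace and the `DomainName`/`ClientId` dimensions are the ones OpenSearch Service publishes under; `ClientId` is your AWS account ID.

```python
from datetime import datetime, timedelta, timezone

# Statistic to request per metric: the worst case within each period.
METRIC_STATS = {
    "FreeStorageSpace": "Minimum",           # reported in MiB
    "ClusterIndexWritesBlocked": "Maximum",  # 1 means writes are blocked
    "ClusterStatus.red": "Maximum",          # 1 means the cluster is red
}


def metric_query(metric, domain_name, account_id, hours=3, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ES",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # assumed 5-minute granularity
        "Statistics": [METRIC_STATS[metric]],
    }


def active_recoveries(rows):
    """Return (index, shard, type, stage, bytes_percent) for unfinished recoveries.

    `rows` is the parsed JSON from GET _cat/recovery?format=json.
    """
    return [
        (r["index"], r["shard"], r["type"], r["stage"], r["bytes_percent"])
        for r in rows
        if r["stage"] != "done"
    ]
```

With boto3 installed and credentials configured, the first helper would be used as `boto3.client("cloudwatch").get_metric_statistics(**metric_query("FreeStorageSpace", "my-domain", "123456789012"))`, where the domain name and account ID are placeholders. If `active_recoveries` keeps returning rows whose `bytes_percent` grows between calls, shards are still moving even when the console shows no progress.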
Best practices to prevent future issues:
- Reindex larger shards to smaller sizes
- Initiate configuration changes during periods of low cluster usage
- Test updates on non-production domains first
- Ensure sufficient resources to handle the increased load during updates
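To decide which indexes are candidates for reindexing into smaller shards, you could list the primary shards above a size limit. The sketch below works on the parsed output of `GET _cat/shards?format=json&bytes=b`. The function name `oversized_shards` and the 50 GiB default are assumptions for illustration, not values from the answer above, so pick a limit that suits your workload.

```python
GIB = 1024 ** 3


def oversized_shards(rows, max_gib=50):
    """Return (index, shard, size_gib) for primary shards larger than max_gib.

    `rows` is the parsed JSON from GET _cat/shards?format=json&bytes=b,
    where `store` is the shard size in bytes (null for unassigned shards).
    """
    found = []
    for r in rows:
        # Skip replicas and shards that report no size yet.
        if r["prirep"] != "p" or not r.get("store"):
            continue
        size_gib = int(r["store"]) / GIB
        if size_gib > max_gib:
            found.append((r["index"], r["shard"], round(size_gib, 1)))
    # Largest shards first, since they take longest to relocate.
    return sorted(found, key=lambda t: t[2], reverse=True)
```

Only primaries are counted because replicas have the same size and would simply duplicate each entry.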
If the domain has failed validation checks, check the domain description for any failed activities or validation-related error messages that might provide more specific guidance.
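One way to read those messages programmatically is through the domain's change-progress information. The sketch below formats a response from the boto3 OpenSearch client's `describe_domain_change_progress` call. The field names (`ChangeProgressStatus`, `ChangeProgressStages`, `Name`, `Status`, `Description`) reflect my understanding of that response shape, so check them against the current boto3 documentation; `stage_report` is a hypothetical helper name.

```python
def stage_report(response):
    """Format each stage of a configuration change as 'name [status]: description'."""
    progress = response.get("ChangeProgressStatus", {})
    lines = [f"overall [{progress.get('Status', 'UNKNOWN')}]"]
    for stage in progress.get("ChangeProgressStages", []):
        lines.append(
            f"{stage.get('Name', '?')} [{stage.get('Status', '?')}]: "
            f"{stage.get('Description', '')}"
        )
    return lines
```

Assuming boto3 and credentials are available, the input would come from `boto3.client("opensearch").describe_domain_change_progress(DomainName="my-domain")`, with `my-domain` as a placeholder. Printing the returned lines shows which stage the change is sitting in and any description attached to it.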
Sources
Troubleshoot stuck or failed upgrades in OpenSearch Service | AWS re:Post
Troubleshooting Amazon OpenSearch Service - Amazon OpenSearch Service
