1 Answer
RDS recovery can be triggered for several reasons:
- Hardware failure: The underlying hardware hosting your RDS instance may have experienced a failure
- Storage issues: Problems with the storage volume attached to your instance
- Network connectivity issues: Network disruptions between components
- Database engine crashes: The database process itself crashed due to bugs or resource constraints
- Monitoring system detection: AWS's automated monitoring detected an unhealthy state
Investigate Root Cause
To address the recurring reboots:
- Check CloudWatch metrics for unusual patterns (CPU, memory, storage, connections)
- Review RDS logs for errors before the recovery events
- Check if you're hitting resource limits (connections, storage)
- Consider upgrading your instance type if it's resource-constrained
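As a rough illustration of the first three checks, the sketch below uses boto3 to list recent RDS events for an instance and to pull the matching CloudWatch metrics from the AWS/RDS namespace. The function names, the 24-hour window, and the 5-minute period are assumptions chosen for the example, not values from the original question.

```python
from datetime import datetime, timedelta, timezone

# AWS/RDS metrics that correspond to CPU, memory, storage, and connections.
METRICS = ["CPUUtilization", "FreeableMemory", "FreeStorageSpace", "DatabaseConnections"]


def build_metric_query(instance_id, metric_name, end_time, hours=24):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": 300,  # 5-minute datapoints (assumed granularity)
        "Statistics": ["Minimum", "Maximum"],
    }


def print_recent_activity(instance_id, hours=24):
    """Print RDS events and metric extremes for the last `hours` hours."""
    import boto3  # imported here so the helper above works without AWS access

    rds = boto3.client("rds")
    cloudwatch = boto3.client("cloudwatch")

    # Duration is expressed in minutes.
    events = rds.describe_events(
        SourceIdentifier=instance_id,
        SourceType="db-instance",
        Duration=hours * 60,
    )
    for event in events["Events"]:
        print(event["Date"], event["Message"])

    now = datetime.now(timezone.utc)
    for name in METRICS:
        stats = cloudwatch.get_metric_statistics(
            **build_metric_query(instance_id, name, now, hours)
        )
        points = stats["Datapoints"]
        if not points:
            print(name, "no data")
            continue
        low = min(p["Minimum"] for p in points)
        high = max(p["Maximum"] for p in points)
        print(name, "min:", low, "max:", high)
```

Calling `print_recent_activity("my-db-instance")` (a placeholder identifier) shortly after a recovery event makes it easier to see whether memory, storage, or connections were near their limits just before the reboot.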
The most effective way to limit downtime from these recovery events is to enable Multi-AZ for your RDS instance. This creates a standby replica in a different Availability Zone that automatically takes over during failures:
- During planned maintenance or instance failures, RDS automatically fails over to the standby
- Failover typically completes within 60-120 seconds
- Your application connects to the same endpoint, so no code changes are needed
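Multi-AZ can be enabled from the console, or programmatically. The following is a minimal sketch using boto3's `modify_db_instance`; the function names and the default of deferring the change to the next maintenance window are assumptions for illustration.

```python
def build_multi_az_request(instance_id, apply_immediately=False):
    """Build keyword arguments for the RDS modify_db_instance call."""
    return {
        "DBInstanceIdentifier": instance_id,
        "MultiAZ": True,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def enable_multi_az(instance_id, apply_immediately=False):
    """Enable Multi-AZ if it is not already on; return True if a change was requested."""
    import boto3  # imported here so the helper above works without AWS access

    rds = boto3.client("rds")
    instance = rds.describe_db_instances(
        DBInstanceIdentifier=instance_id
    )["DBInstances"][0]
    if instance["MultiAZ"]:
        return False  # already a Multi-AZ deployment

    rds.modify_db_instance(**build_multi_az_request(instance_id, apply_immediately))
    return True
```

For example, `enable_multi_az("my-db-instance", apply_immediately=True)` (placeholder identifier) requests the conversion right away rather than waiting for the maintenance window.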
If the issue persists after implementing Multi-AZ, contact AWS Support for deeper investigation.
answered a year ago
