Additional Troubleshooting Steps:
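- Check CloudWatch metrics: Even while the instance shows "Starting", look at the I/O metrics (ReadIOPS and WriteIOPS) and, where your instance class publishes them, the EBS balance metrics (EBSByteBalance%, EBSIOBalance%). If you see activity, the DB is likely performing recovery or block hydration, so do not interrupt it. If activity stays flat at zero, the instance may be stuck, possibly on degraded hardware.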
- The "Post-Upgrade Stop" trap: Stopping an instance immediately after a minor upgrade can interrupt internal post-upgrade work (such as system catalog updates or WAL activity). That may be what triggered a lengthy recovery on restart.
- Actionable alternative: Since you are on Basic Support, don't wait indefinitely. Use Point-in-Time Recovery (PITR) to restore a new instance to a time just before the upgrade. This is often faster than waiting for a "stuck" instance to resolve. Note that the restore creates a separate instance with a new endpoint, so connection strings need to be updated afterwards.
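- CLI trigger: Try modifying a low-risk setting (such as the attached security groups) via the AWS CLI instead of the console. This sometimes prompts the RDS control plane to refresh the instance state when the console is unresponsive, although the API may simply reject the call with an InvalidDBInstanceState error while the instance is in a non-modifiable state.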
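The metrics check and the PITR fallback above could be sketched with boto3 roughly as follows. This is a sketch under assumptions, not a verified procedure: it assumes ReadIOPS and WriteIOPS are still being published for the instance while it is "Starting", the 30-minute window and the 1.0 IOPS threshold are arbitrary choices, and the function names are made up for illustration. The clients are passed in as arguments so the logic can be tried without AWS access.

```python
from datetime import datetime, timedelta, timezone


def io_activity(cloudwatch, instance_id, minutes=30):
    """Return the peak 5-minute average of ReadIOPS and WriteIOPS."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    peaks = {}
    for metric in ("ReadIOPS", "WriteIOPS"):
        resp = cloudwatch.get_metric_statistics(
            Namespace="AWS/RDS",
            MetricName=metric,
            Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
            StartTime=start,
            EndTime=end,
            Period=300,
            Statistics=["Average"],
        )
        values = [point["Average"] for point in resp.get("Datapoints", [])]
        # No datapoints at all is treated as zero activity.
        peaks[metric] = max(values, default=0.0)
    return peaks


def looks_active(peaks, threshold=1.0):
    """Assumed heuristic: any metric at or above `threshold` IOPS counts as activity."""
    return any(value >= threshold for value in peaks.values())


def restore_before_upgrade(rds, source_id, target_id, restore_time):
    """Start a point-in-time restore into a NEW instance (it gets a new endpoint)."""
    return rds.restore_db_instance_to_point_in_time(
        SourceDBInstanceIdentifier=source_id,
        TargetDBInstanceIdentifier=target_id,
        RestoreTime=restore_time,  # timezone-aware UTC datetime before the upgrade
    )
```

With credentials configured, `io_activity(boto3.client("cloudwatch"), "my-staging-db")` returns the two peaks (the instance name here is hypothetical). If `looks_active` stays false for an extended period, `restore_before_upgrade(boto3.client("rds"), ...)` starts the restore; the restore time must fall inside the instance's backup retention window.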
I understand this is a concerning situation. While I don't have specific information about instances being stuck in a "Starting" state after a minor version upgrade, I can provide some general guidance based on RDS PostgreSQL upgrade best practices.
Regarding your specific questions:
- Silent recovery vs. hardware issues: Without access to the underlying infrastructure logs, it's difficult to determine the exact cause. However, extended "Starting" states can sometimes indicate the database is performing recovery operations, especially after a stop/start cycle following an upgrade.
- Forcing state changes: Unfortunately, when an RDS instance is in a non-modifiable state, your options through the console are limited. The instance needs to complete whatever internal process it's executing before it becomes responsive again.
- Precautions for production: For your production upgrade, I strongly recommend:
  - Schedule the upgrade during a maintenance window with low traffic
  - Avoid stopping the instance immediately after the upgrade completes; allow it to stabilize and verify full functionality first
  - Consider RDS Extended Support if you need more time to plan and test major version upgrades: it is a paid option that lets you keep running a major engine version past its end of standard support date instead of being upgraded automatically
  - Take a manual snapshot before beginning the upgrade process
  - Test database connectivity and application functionality thoroughly on staging before proceeding to production
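As an illustration of the manual snapshot step in the list above, a minimal boto3 sketch might look like this. The helper name and identifiers are hypothetical; the client is passed in so the function can be exercised without AWS access.

```python
def snapshot_before_upgrade(rds, instance_id, snapshot_id):
    """Create a manual snapshot and block until it is available."""
    rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=instance_id,
    )
    # boto3's built-in waiter polls describe_db_snapshots until the
    # snapshot status is "available" (or raises if it gives up).
    waiter = rds.get_waiter("db_snapshot_available")
    waiter.wait(DBSnapshotIdentifier=snapshot_id)
    return snapshot_id
```

For example, `snapshot_before_upgrade(boto3.client("rds"), "my-prod-db", "my-prod-db-pre-upgrade")` would return only once the snapshot can be used for a restore, so the upgrade is not started before a fallback exists.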
Important considerations:
When performing minor version upgrades, schedule them during periods of low traffic and make sure your applications retry failed connections with exponential backoff and jitter, so they ride out the brief connectivity loss during the upgrade. The AWS SDKs already apply this kind of retry to AWS API calls, but database connections need it in the application or driver.
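For database connections, exponential backoff with jitter can be sketched in a few lines. This is an illustrative helper rather than any library's API: `connect_with_retry` and its defaults are assumptions, and `connect` stands for whatever zero-argument callable opens your connection. The "full jitter" variant is used here, where each wait is drawn uniformly between zero and an exponentially growing cap.

```python
import random
import time


def connect_with_retry(connect, attempts=6, base=0.5, cap=30.0,
                       sleep=time.sleep, retry_on=(OSError,)):
    """Call `connect()` until it succeeds, backing off between failures.

    `retry_on` should list the driver's transient errors, for example
    (psycopg2.OperationalError,) when using psycopg2.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time in [0, min(cap, base * 2^attempt)].
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The jitter matters when many application instances lose their connections at the same moment: without it they would all retry in lockstep and hit the database together as it comes back.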
Given that you're on Basic Support and cannot access technical support, you may need to wait for the instance to complete its internal processes. If the instance remains stuck beyond a reasonable timeframe (typically several hours), you may need to consider restoring from your pre-upgrade snapshot.
For future upgrades, avoiding the stop/start cycle immediately after an upgrade and allowing the instance to fully stabilize first may help prevent this issue.
Sources
Performing a minor version upgrade - Amazon Aurora
