My Postgres database is rebooting forever


My database stopped responding, so I rebooted it. Now it is stuck in a reboot loop, and the log shows these messages:

    2023-01-11 15:39:16.131 GMT [30125] LOG: skipping missing configuration file "/rdsdbdata/config/recovery.conf"
    2023-01-11 15:39:16 UTC::@:[30125]:LOG: database system is shut down

asked a year ago · 587 views
1 Answer

Based on my understanding, the RDS PostgreSQL instance appears to be running out of memory. Memory pressure on the instance seems to have prevented the host from contacting the monitoring system, and the instance was restarted to relieve that pressure.

The memory allotted to the underlying host is shared by the operating system, the RDS processes, the monitoring agent, and the database engine. When the customer workload needs memory, the engine process takes precedence over all other processes on the host. With little free memory left, the monitoring agent on the RDS host could not send heartbeat data to the monitoring application. Because the monitoring application received no heartbeats, it assumed there was a problem and started the reboot procedure.
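To make the memory pressure described above more concrete, here is a rough back-of-the-envelope sketch in Python. It is not from the original answer: it assumes the standard PostgreSQL parameters shared_buffers, work_mem, and max_connections, and the function names and the fixed overhead figure are hypothetical. A single query can use work_mem more than once, so treat the result as an optimistic estimate rather than a hard limit.

```python
def estimate_peak_memory_mib(shared_buffers_mib, work_mem_mib,
                             max_connections, overhead_mib=512):
    """Rough peak estimate in MiB.

    Assumes one work_mem allocation per connection plus a fixed,
    made-up overhead for the OS and agent processes. Real usage can
    be higher, because one query may use work_mem several times.
    """
    return shared_buffers_mib + work_mem_mib * max_connections + overhead_mib


def headroom_mib(instance_memory_mib, **settings):
    """Memory left on the instance after the estimated peak (negative = overcommitted)."""
    return instance_memory_mib - estimate_peak_memory_mib(**settings)


# Illustrative numbers only: an 8 GiB instance with generous per-connection memory.
example = headroom_mib(
    8192,
    shared_buffers_mib=2048,
    work_mem_mib=64,
    max_connections=200,
)
```

With these illustrative settings the headroom is negative, meaning the configuration could ask for more memory than the instance has if every connection were busy at once.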

Steps to try:

  1. Use Performance Insights to find the workloads and queries that consume the most memory, then tune those queries.
  2. Review the memory-related parameters of the RDS PostgreSQL instance and tune them if needed.
  3. Set up CloudWatch alarms on the RDS metrics and monitor them proactively; see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/creating_alarms.html. A CloudWatch alarm can send an alert when an RDS metric crosses a threshold you choose. CloudWatch uses Amazon SNS to send the email.
  4. Scale the RDS instance up to the next larger instance class so that it has more memory. Load test the change on a non-production instance first, because the impact of the new workload on the instance is not yet known.
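The alarm from step 3 could be sketched in Python roughly as follows. This is a minimal sketch, not the answer's own method: it assumes the boto3 library, the FreeableMemory metric in the AWS/RDS namespace, and an SNS topic that already exists. The instance identifier, threshold, topic ARN, and function names are hypothetical placeholders.

```python
def freeable_memory_alarm(db_instance_id, threshold_bytes, sns_topic_arn):
    """Build the arguments for a CloudWatch alarm on low freeable memory.

    The period and evaluation count are illustrative choices: the alarm
    fires when the 5-minute average stays below the threshold for three
    consecutive periods.
    """
    return {
        "AlarmName": f"{db_instance_id}-low-freeable-memory",
        "Namespace": "AWS/RDS",
        "MetricName": "FreeableMemory",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": float(threshold_bytes),
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)


# Hypothetical usage; replace the placeholders with your own values:
# params = freeable_memory_alarm(
#     "my-db-instance", 512 * 1024**2,
#     "arn:aws:sns:us-east-1:123456789012:db-alerts")
# create_alarm(params)
```

Keeping the parameter builder separate from the API call makes it possible to review or log the alarm definition before anything is created in the account.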

If none of these steps help, please open a support case with AWS: https://console.aws.amazon.com/support/home#/case/create

AWS
answered a year ago
  • Thank you for your answer. However, the database is not running anymore and I can't modify the instance. I read some logs and realized that the database is corrupted on RDS.
