AWS RDS PostgreSQL Dump from EC2 in the Same Region Stops After About 30 Minutes or 11 GB


I am facing an issue when taking a PostgreSQL dump from an AWS RDS instance to an EC2 instance in the same region using the pg_dump command. The dump process runs smoothly for about 30 minutes and stops abruptly once it reaches 11 GB in size. Here’s the command I am using:

pg_dump -h <hostname> -p 5432 -U <username> -Fc -b -v -f <dumpfilelocation.sql> -d <database_name>

I have made the following changes to my DB parameter group on RDS to try and resolve the issue:

  • Increased work_mem
  • Increased statement_timeout
  • Adjusted memory and WAL settings

Despite these changes, the issue persists: the dump always stops at 11 GB or after about 30 minutes. I also tried attaching a custom DB parameter group to the RDS instance, but that didn’t help.

1 Answer

Possible Areas to Investigate:

  1. EC2 Disk Storage: Ensure the EC2 instance has enough disk space to accommodate the full dump. Running out of disk space could cause the process to fail once the dump reaches a certain size (in this case, 11 GB).
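
    A quick way to check from the EC2 instance, comparing free space on the target filesystem against the database size (the directory, host, and names below are placeholders):

      df -h <dump_directory>
      psql -h <hostname> -U <username> -d <database_name> \
          -c "SELECT pg_size_pretty(pg_database_size(current_database()));"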

  2. Network Timeout: Verify if any network components have timeout settings that could be interrupting the connection. Adjust the tcp_keepalives settings in PostgreSQL and on the EC2 instance to keep the connection active:

    • On PostgreSQL (note that SET only affects the current session, not pg_dump’s own connection; to cover pg_dump, set these in the DB parameter group or in the client connection string, as sketched below):
      SET tcp_keepalives_idle = 60;
      SET tcp_keepalives_interval = 10;
    • On the EC2 instance, adjust net.ipv4.tcp_keepalive_time via sysctl (see the sketch below).
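
    A minimal sketch of both client-side approaches; the 60/10 values are illustrative starting points, not tuned recommendations:

      # On the EC2 instance: start sending TCP keepalive probes after 60s of idle
      # (the Linux default is 7200s, long enough for intermediate devices to drop the connection)
      sudo sysctl -w net.ipv4.tcp_keepalive_time=60
      sudo sysctl -w net.ipv4.tcp_keepalive_intvl=10

      # Alternatively, libpq accepts keepalive parameters in the connection string,
      # which applies them to pg_dump's own connection:
      pg_dump "host=<hostname> port=5432 user=<username> dbname=<database_name> keepalives_idle=60 keepalives_interval=10" -Fc -b -v -f <dumpfilelocation.dump>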
  3. pg_dump Compression: Compression reduces the dump size and can shorten the transfer. Note that the custom format (-Fc) is already compressed by default, so piping it through gzip gains little; use -Z to control the compression level directly, or gzip a plain-format dump instead:

    pg_dump -h <hostname> -p 5432 -U <username> -d <database_name> -Fc -Z 9 -b -v -f <dumpfilelocation.dump>
    pg_dump -h <hostname> -p 5432 -U <username> -d <database_name> -Fp -b -v | gzip > <dumpfilelocation.sql.gz>
  4. Splitting the Dump: For larger databases, consider splitting the dump by schema or table. This can be done using the -n option (for schema) or the -t option (for table) with pg_dump:

    • Example for schema:
      pg_dump -h <hostname> -p 5432 -U <username> -Fc -b -n <schema_name> -v -f <dumpfile.sql> -d <database_name>
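    • Similarly for a single table (the table name is a placeholder):
      pg_dump -h <hostname> -p 5432 -U <username> -Fc -b -t <table_name> -v -f <dumpfile.sql> -d <database_name>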
  5. Check Logs: Review the logs of both the PostgreSQL instance and the EC2 instance for any errors or warnings around the time when the dump process halts. Logs might provide insights into whether resource limits or other issues are causing the interruption.
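
    For the RDS side, the PostgreSQL log can be listed and pulled with the AWS CLI; the instance identifier and log file name below are placeholders:

      aws rds describe-db-log-files --db-instance-identifier <db_instance_id>
      aws rds download-db-log-file-portion --db-instance-identifier <db_instance_id> \
          --log-file-name error/postgresql.log.2024-01-01-00 --output text > rds-postgres.log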

  6. DB Parameter Group Confirmation: Double-check that your custom DB parameter group is correctly applied to the RDS instance and that the instance was properly rebooted after attaching the group.
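
    One way to confirm this from the AWS CLI (placeholder identifier; ParameterApplyStatus should read in-sync rather than pending-reboot):

      aws rds describe-db-instances --db-instance-identifier <db_instance_id> \
          --query 'DBInstances[0].DBParameterGroups'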

  7. Disk I/O or Network Bottlenecks: Monitor CloudWatch metrics to identify whether your EC2 instance is hitting EBS or network I/O limits during the dump process.
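
    As a sketch, one such metric pulled via the AWS CLI; the instance ID and time window are placeholders, and other metrics (e.g., NetworkIn, EBSWriteBytes) can be substituted for the one shown:

      aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name NetworkOut \
          --dimensions Name=InstanceId,Value=<instance_id> \
          --start-time 2024-01-01T00:00:00Z --end-time 2024-01-01T01:00:00Z \
          --period 300 --statistics Average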

  8. T2/T3 Instance Throttling: If you are using a T2 or T3 instance, it’s possible that CPU credits are being depleted, causing the instance to throttle CPU performance. You can either:

    • Enable T2/T3 Unlimited mode to allow the instance to burst beyond its baseline CPU performance without throttling, though this may incur additional charges.
    • Use CloudWatch to monitor CPU credit consumption and determine if throttling is occurring during the dump process.
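    A sketch of both options via the AWS CLI (the instance ID is a placeholder; unlimited mode may incur extra charges, as noted above):

      aws ec2 modify-instance-credit-specification \
          --instance-credit-specifications 'InstanceId=<instance_id>,CpuCredits=unlimited'
      aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUCreditBalance \
          --dimensions Name=InstanceId,Value=<instance_id> \
          --start-time 2024-01-01T00:00:00Z --end-time 2024-01-01T01:00:00Z \
          --period 300 --statistics Average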
  9. Instance Type Consideration: If your dump process is resource-intensive, it might be worth considering upgrading to a larger or different instance type (e.g., m-series), which provides more consistent performance compared to burstable instances like T2/T3.
