DMS homogeneous PostgreSQL RDS→RDS migration slows after initially working

Hi folks, I have a ~200 GB PostgreSQL database on an m6g.xlarge RDS instance that I am trying to migrate to a new instance. The goal of the migration is to shrink our overprovisioned volume from the current 5 TB to under 400 GB to reduce RDS cost.

The DMS migration task starts well and very quickly transfers some of the largest tables, around 100 GB+ of data, then slows to a crawl. Looking at the monitoring tab for the new RDS instance, the slowdown seems to correlate with the "EBSByteBalance" chart dropping very quickly from 100 to 0 and then staying pegged at 0.
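For anyone who wants to check the same correlation outside the console, here is a minimal sketch in Python of pulling that chart's data from CloudWatch with boto3. It assumes the metric is published in the `AWS/RDS` namespace under the name `EBSByteBalance%` with a `DBInstanceIdentifier` dimension; the instance name `my-new-instance` and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ebs_byte_balance_query(db_instance_id, hours=24, period_seconds=300):
    """Build the arguments for CloudWatch get_metric_statistics.

    Assumption: the RDS burst-balance metric is named "EBSByteBalance%"
    in the "AWS/RDS" namespace.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "EBSByteBalance%",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        # Minimum shows the worst balance reached in each period.
        "Statistics": ["Minimum"],
    }


def fetch_ebs_byte_balance(db_instance_id, hours=24):
    """Return datapoints sorted by time. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the query builder works without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **ebs_byte_balance_query(db_instance_id, hours=hours)
    )
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])


# Hypothetical usage (requires credentials):
# for point in fetch_ebs_byte_balance("my-new-instance"):
#     print(point["Timestamp"], point["Minimum"])
```

Comparing the timestamps where the balance reaches 0 against the DMS task's table statistics should show whether the two really line up.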

I have left the task running for a full day and see FreeStorageSpace continuing to decline at about 30GB/day (it is a straight line), so some additional data is being written. The tables that have data in them and are queryable in the new instance also seem to have replication working. But it's unclear whether the task is going to eventually succeed.

Is there a way to configure a new RDS instance to allow for a larger EBSByteBalance so I can complete the migration more quickly (if that is in fact the cause of the slowdown)? Or should I just wait? At the current rate (30 GB/day) it would take a few days for the rest to finish.
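The "few days" figure is rough arithmetic, sketched below in Python. It assumes the decline in FreeStorageSpace tracks the copy rate one-to-one and that about 100 GB of the ~200 GB has already been copied; both are approximations from the numbers above, and the function name is hypothetical.

```python
def days_remaining(total_gb, copied_gb, rate_gb_per_day):
    """Estimate days left, assuming a constant copy rate."""
    if rate_gb_per_day <= 0:
        raise ValueError("rate_gb_per_day must be positive")
    remaining_gb = max(total_gb - copied_gb, 0)
    return remaining_gb / rate_gb_per_day


# ~200 GB total, ~100 GB copied so far, ~30 GB/day observed
print(round(days_remaining(200, 100, 30), 1))  # 3.3
```

This is only a linear extrapolation, so it will be off if the write rate to disk does not reflect the actual copy progress.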

  • Update: It showed no visible progress for 36 hours, and then suddenly jumped from "79%" to "91%" then 98% then "Complete".

    EBSByteBalance remains at zero. After the 36 hour period of very slowly declining free disk space, it jumped as the final table(s) came online in the new replica. OldestReplicationSlotLag on the original RDS instance had crept up to 20GB over that slow period, then dropped to near-zero and now remains near zero as the replication continues. In case anybody else runs into something similar, seems like it was actually working as it slowly copied over these large tables.

1 Answer
Accepted Answer

The sudden jump in progress was most likely caused by the copy of the remaining large tables completing, which matches the behavior described in the update. Other possible contributors include optimization mechanisms, fluctuations in resource allocation, changes in replication lag, and other external factors.

EXPERT
answered 24 days ago
