I want to troubleshoot high storage consumption on my source PostgreSQL database when I run an AWS Database Migration Service (AWS DMS) change data capture (CDC) task.
Short description
When you use PostgreSQL as a source for a CDC task, AWS DMS uses PostgreSQL logical replication slots to get changes from the source database. The slots retain write-ahead logs (WALs) that AWS DMS requires, even when you don't connect WALs to the PostgreSQL source. AWS DMS only removes WALs from PostgreSQL after it fetches the required changes from the replication slot and advances the restart_lsn of the replication slot.
For more information, see Logical decoding concepts on the PostgreSQL website.
Storage volume issues can occur for the following reasons:
- You stop the AWS DMS CDC task for a long time. If you don't connect AWS DMS to the source database, then it doesn't consume changes from the replication slot on the source. So, PostgreSQL continuously retains WALs.
- You have a heavy workload that causes excessive WAL generation. When you use logical replication slots, PostgreSQL retains WALs if the replication slot requires the log sequence numbers (LSNs).
- You have idle replication slots. Even when a replication slot is inactive, the WAL that the replication slot retains still contains information about the table, schema or database. This fills up storage on the source, even when the tables have no transactions.
Resolution
Check if replication slots cause high disk space usage on the PostgreSQL source
To check if replication slots are causing high disk space usage in your PostgreSQL database, run one of the following queries:
For PostgreSQL v9:
psql=> SELECT slot_name, pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(),restart_lsn)) AS replicationSlotLag, active FROM pg_replication_slots ;
For PostgreSQL v10 and later:
psql=> SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),restart_lsn)) AS replicationSlotLag, active FROM pg_replication_slots ;
Example output:
slot_name | replicationslotlag | active---------------------------------------------------------------+--------------------+--------
xc36ujql35djp_00013322_907c1e0a_9f8b_4c13_89ea_ef0ea1cf143d | 129 GB | f
7pajuy7htthd7sqn_00013322_a27bcebf_7d0f_4124_b336_92d0fb9f5130 | 704 MB | t
zp2tkfo4ejw3dtlw_00013322_03e77862_689d_41c5_99ba_021c8a3f851a | 624 MB | t
If a replication slot has an active state set to f (false), then your database isn't consuming that slot.
To drop unused slots, run the following query:
psql=> SELECT pg_drop_replication_slot('YOUR_SLOTNAME');
Note: Replace YOUR_SLOTNAME with the slot name.
For more information, see Why did I receive a "No space left on device" or "DiskFull" error on Amazon Relational Database Service (Amazon RDS) for PostgreSQL?
Turn on the WAL heartbeat feature
To reduce your storage consumption on a PostgreSQL source database, turn on the heartbeatEnable extra connection attribute (ECA). This attribute helps prevent storage full scenarios on the PostgreSQL source.
To turn on the WAL heartbeat feature, add the following ECA to the PostgreSQL source endpoint:
heartbeatEnable=Y;
Note: The heartbeat transaction runs on the source only if an AWS DMS task is running. If you stop your AWS DMS tasks, then the WAL heartbeat feature has no effect.
You can also specify the following ECAs:
heartbeatFrequency=frequency;heartbeatSchema=schemaname;
The heartbeatFrequency attribute determines how often in minutes that the heartbeat transaction runs on the PostgreSQL source. Replace frequency with how often you want the transaction to run. For example, if you set heartbeatFrequency to a value of 15, then AWS DMS runs the heartbeat transaction every 15 minutes on the source.
The heartbeatSchema attribute specifies what database schema AWS DMS creates database objects in to generate the heartbeat transaction. Replace schemaname with your database schema.
Set the maximum size of a slot in PostgreSQL
You can apply max_slot_wal_keep_size on the source database in PostgreSQL version 13 and later. This sets the maximum amount of WALs that replication slots can retain. For more information, see max_slot_wal_keep_size on the PostgreSQL website.
Note: The max_slot_wal_keep_size setting helps you avoid storage full issues on the PostgreSQL source. But, it can cause the PostgreSQL source database to purge WALs before the AWS DMS CDC task reads the changes from the replication slot. This can cause the task to fail.
Related information
Types of PostgreSQL replication slots on the Hevo website