For your blue/green deployment switchover issue with an inactive replication slot on the empty 'postgres' database, you have a few options to resolve this without rebuilding the entire deployment:
Option 1: Drop the Inactive Replication Slot
Since the 'postgres' database is empty with no user tables or data, you can safely drop the inactive replication slot. To do this:
- First, identify the specific slot name by running:
SELECT * FROM pg_replication_slots WHERE slot_type = 'logical';
- Then drop the identified slot:
SELECT pg_drop_replication_slot('slot_name');
Replace slot_name with the actual name of the inactive replication slot, keeping the single quotes, because the function takes the name as a string.
This should resolve the replication lag issue on the empty database and allow your switchover to proceed.
Option 2: If DDL Changes Were Made
If you've made any DDL changes or large object modifications during the blue/green deployment process, Amazon RDS will block the switchover. In this case, the system changes the replication state to "Replication degraded" and prevents switchover. Unfortunately, if this is the case, you would need to delete and recreate the blue/green deployment.
However, since you mentioned only the 'postgres' database has the issue and it's empty, dropping the inactive replication slot should be sufficient to resolve your problem without the 8-hour rebuild process.
After dropping the slot, verify that all replication slots are active and have minimal lag before attempting the switchover again.
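A minimal sketch of that verification, assuming it is run on the blue instance. The pg_replication_slots view is cluster-wide, so one query covers the slots for every database. What counts as "minimal" lag is a judgment call and is not defined here.

```sql
-- On the BLUE instance: list every logical slot with its state and lag.
-- pg_wal_lsn_diff returns the distance in bytes between two WAL positions.
SELECT slot_name,
       database,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical'
ORDER BY active, lag_bytes DESC;  -- inactive slots sort first
```

Any row with active = false, or with a lag_bytes value that keeps growing between runs, is a slot to look at before retrying the switchover.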
Sources
Troubleshoot Amazon RDS for PostgreSQL blue/green deployment errors | AWS re:Post
Switching a blue/green deployment in Amazon RDS - Amazon Relational Database Service
As you clearly know from your comment above, you can't drop the default postgres database replication slot yourself.
For a situation like this, the best course of action is to open a support case as the support engineers have backend access to help you. If that's not an option or you don't want to wait, you could try to see if you can activate the subscription on the green side. You may end up seeing the same type of permission issues though, since you're altering subscriptions in the rds namespace, in which case you're back to the support case (or starting over). I don't have a quick way to test.
Connect to the green instance's postgres database and check the subscription status:
-- On the GREEN instance, connected to the 'postgres' database
SELECT subname, subenabled, subslotname, subconninfo
FROM pg_subscription;
If there's a disabled subscription, enable it:
ALTER SUBSCRIPTION <subscription_name> ENABLE;
If the subscription exists but is stuck, try refreshing it:
ALTER SUBSCRIPTION <subscription_name> REFRESH PUBLICATION;
Monitor the slot on the blue side to see if it becomes active:
-- On the BLUE instance
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn,
pg_current_wal_lsn() - confirmed_flush_lsn AS lag_bytes
FROM pg_replication_slots
WHERE database = 'postgres';
If you do end up rebuilding, you can either do what you mentioned and create a dummy table with a record in it in the postgres database before creating the blue/green deployment, or you could rename (or drop) the postgres database before starting.
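The dummy-table workaround could look roughly like the following. The table name bg_placeholder is hypothetical, and the primary key is included on the assumption that the blue/green deployment expects one for logical replication. I have not tested this against a blue/green deployment.

```sql
-- On the BLUE instance, connected to the 'postgres' database,
-- BEFORE creating the blue/green deployment.
-- bg_placeholder is a made-up name; any small table should do.
CREATE TABLE IF NOT EXISTS bg_placeholder (
    id integer PRIMARY KEY
);

-- One row, so the database has something to replicate.
INSERT INTO bg_placeholder (id)
VALUES (1)
ON CONFLICT (id) DO NOTHING;
```

Both statements are safe to re-run, and the table can be dropped once the switchover has completed.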

You can't drop a replication slot on RDS. I'm not sure why the AI keeps suggesting this: SQL Error [42501]: ERROR: must be superuser or replication role to use logical replication slots in the 'rds' namespace
Error position: