Blue/green deployment won't switch over because of an empty database and an inactive replication slot

I have an active Blue/Green deployment on RDS PostgreSQL 13 → 18.

Switchover is failing due to an inactive replication slot on the empty default 'postgres' database:

Status: inactive
Lag: 45 GB

All other slots are active with 56 bytes of lag. The postgres database has no user tables or data. I cannot advance the slot manually because pg_replication_slot_advance is blocked in the rds namespace. I am requesting assistance to advance or clear this slot so the switchover can proceed without rebuilding the deployment, which took 8 hours.

I assume I can create a table in it and redo the blue/green deployment if I have to, but I'd love a solution that doesn't take 8 hours.

2 Answers
For your blue/green deployment switchover issue with an inactive replication slot on the empty 'postgres' database, you have a few options to resolve this without rebuilding the entire deployment:

Option 1: Drop the Inactive Replication Slot

Since the 'postgres' database is empty with no user tables or data, you can safely drop the inactive replication slot. To do this:

  1. First, identify the specific slot name by running:
SELECT * FROM pg_replication_slots WHERE slot_type = 'logical';
  2. Then drop the identified slot:
SELECT pg_drop_replication_slot('slot_name');

Replace slot_name with the actual name of the inactive replication slot.

This should resolve the replication lag issue on the empty database and allow your switchover to proceed.

Option 2: If DDL Changes Were Made

If you've made any DDL changes or large object modifications during the blue/green deployment process, Amazon RDS will block the switchover. In this case, the system changes the replication state to "Replication degraded" and prevents switchover. Unfortunately, if this is the case, you would need to delete and recreate the blue/green deployment.

However, since you mentioned only the 'postgres' database has the issue and it's empty, dropping the inactive replication slot should be sufficient to resolve your problem without the 8-hour rebuild process.

After dropping the slot, verify that all replication slots are active and have minimal lag before attempting the switchover again.
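
The verification step could be sketched as the query below. It is only an illustration: it assumes you are connected to the blue instance and it measures lag against confirmed_flush_lsn, which may not match exactly how RDS computes the lag it reports.

```sql
-- On the BLUE instance: list every logical slot with its state and lag.
-- Inactive slots sort first so a stuck slot is easy to spot.
SELECT slot_name,
       database,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
       ) AS lag
FROM pg_replication_slots
WHERE slot_type = 'logical'
ORDER BY active, slot_name;
```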
Sources
Troubleshoot Amazon RDS for PostgreSQL blue/green deployment errors | AWS re:Post
Switching a blue/green deployment in Amazon RDS - Amazon Relational Database Service

answered 2 months ago
  • You can't drop a replication slot on RDS. I'm not sure why the AI keeps suggesting this: SQL Error [42501]: ERROR: must be superuser or replication role to use logical replication slots in the 'rds' namespace

As you clearly know from your comment above, you can't drop the default postgres database replication slot yourself.

For a situation like this, the best course of action is to open a support case, since the support engineers have backend access and can help you. If that's not an option or you don't want to wait, you could check whether you can activate the subscription on the green side. You may run into the same kind of permission issue, though, since you'd be altering subscriptions in the rds namespace, in which case you're back to the support case (or starting over). I don't have a quick way to test this.

Connect to the green instance's postgres database and check the subscription status:

-- On the GREEN instance, connected to the 'postgres' database
SELECT subname, subenabled, subslotname, subconninfo 
FROM pg_subscription;

If there's a disabled subscription, enable it:

ALTER SUBSCRIPTION <subscription_name> ENABLE;

If the subscription exists but is stuck, try refreshing it:

ALTER SUBSCRIPTION <subscription_name> REFRESH PUBLICATION;

Monitor the slot on the blue side to see if it becomes active:

-- On the BLUE instance
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn,
       pg_current_wal_lsn() - confirmed_flush_lsn AS lag_bytes
FROM pg_replication_slots
WHERE database = 'postgres';
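
To complement the blue-side slot check, a rough sketch of the matching check on the green side is to look at the standard pg_stat_subscription view, which shows whether an apply worker is actually running. This is untested on RDS and the view may be subject to the same permission limits mentioned above.

```sql
-- On the GREEN instance, connected to the 'postgres' database.
-- A NULL pid means no worker process is currently running for that subscription.
SELECT subname,
       pid,
       received_lsn,
       latest_end_lsn,
       last_msg_receipt_time
FROM pg_stat_subscription;
```

If pid stays NULL after enabling the subscription, the slot on the blue side will most likely remain inactive as well.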

If you do end up rebuilding, you can either do what you mentioned and create a dummy table with a record in it in the postgres database before creating the blue/green deployment, or you could rename (or drop) the postgres database before starting.
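
A minimal sketch of the dummy-table approach follows. The table name bg_placeholder is hypothetical, and whether this actually keeps the slot active is an assumption that has not been tested here. The primary key is included because logical replication needs a replica identity to replicate updates and deletes.

```sql
-- Run on the source instance, connected to the 'postgres' database,
-- BEFORE creating the blue/green deployment.
-- bg_placeholder is a hypothetical name chosen for illustration.
CREATE TABLE bg_placeholder (
    id integer PRIMARY KEY
);

-- One row so the database has some replicated data.
INSERT INTO bg_placeholder (id) VALUES (1);
```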

AWS
answered 2 months ago
EXPERT
reviewed 2 months ago
