Skip to content

How do I troubleshoot autovaccum when it runs the "to prevent wraparound" flag in Amazon RDS for PostgreSQL or Aurora PostgreSQL-Compatible databases?

6 minute read
0

I want to resolve performance issues when autovaccum runs the "to prevent wraparound" flag in my Amazon Relational Database Service (Amazon RDS) for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition database.

Short description

When you query pg_stat_activity, autovacuum might continually run the "to prevent wraparound" flag.

Autovacuum workers generally don't block other commands. When a process tries to acquire a lock that conflicts with the autocvacuum's SHARE UPDATE EXCLUSIVE lock, the lock acquisition can interrupt the autovacuum. However, when the autovacuum is running to prevent transaction ID wraparound, the system doesn't automatically interrupt the autovacuum.

Aggressive autovacuum with the "to prevent wraparound" flag can happen in the following scenarios:

  • The table's relfrozenxid value is greater than vacuum_freeze_table_age transactions, so the system performs an aggressive vacuum to freeze old tuples and advance relfrozenxid.
  • A table reaches its autovacuum_freeze_max_age.

Database performance issues can occur when autovacuum runs aggressively. For more information, see Automatic vacuuming on the PostgreSQL website.

Resolution

Determine whether autovacuum is running

To determine whether autovacuum is running, how long it's running, or whether it's waiting, run the following query:

SELECT datname, usename, pid, state, wait\_event, current\_timestamp - xact\_start AS xact\_runtime, query  
FROM pg\_stat\_activity  
WHERE upper(query) LIKE '%VACUUM%'  
ORDER BY xact\_start;

In the following example output, the "to prevent wraparound" flag is running on mydb for mytable1 and mytable2:

datname | usename  | pid   | state  | wait\_event |       xact\_runtime      | query  
\--------+----------+-------+--------+------------+-------------------------+----------------------------------------------------------------------------------------------  
mydb    | rdsadmin | 16473 | active |            | 33 days 16:32:11.600656 | autovacuum: VACUUM ANALYZE public.mytable1 (to prevent wraparound)  
mydb    | rdsadmin | 22553 | active |            | 14 days 09:15:34.073141 | autovacuum: VACUUM ANALYZE public.mytable2 (to prevent wraparound)  
mydb    | rdsadmin | 41909 | active |            | 3 days 02:43:54.203349  | autovacuum: VACUUM ANALYZE public.mytable3  
mydb    | rdsadmin | 618   | active |            | 00:00:00                | SELECT datname, usename, pid, state, wait\_event, current\_timestamp - xact\_start AS xact\_runtime, query+  
        |          |       |        |            |                         | FROM pg\_stat\_activity +  
        |          |       |        |            |                         | WHERE query like '%VACUUM%'+  
        |          |       |        |            |                         | ORDER BY xact\_start; +

The autovaccum always targets the tables that have a relfrozenxid that's greater than the number of transactions in autovaccum_freeze_max_age.

Identify unfrozen transactions

To identify unfrozen transactions in your database, run the following query:

SELECT datname, age(datfrozenxid) FROM pg\_database ORDER BY age(datfrozenxid) desc limit 20;

In the following example output, the autovacuum deamon prioritizes the database with a datfrozenxid that's greater than 200 million:

datname   | age  
mydb      | 1771757888  
template0 | 1721757888  
template1 | 1721757888  
rdsadmin  | 1694008527  
postgres  | 1693881061  
(5 rows)

The datfrozenxid column of a database's pg_database row is at the end of the normal transaction IDs in the database. The datfrozenxid column is the minimum of the relfrozenxid values for each table within the database.

Freeze the tables

If your table is reaching the maximum relfrozenxid table age, then manually vacuum freeze the tables.

After the manual vacuum freeze completes, view the verbose output to determine whether there are blockers, such as old replication slots and prepared transactions. If you receive the "DETAIL: xxx dead row versions cannot be removed yet, oldest xmin: xxx" message, then the vacuum is running but can't clean up the dead tuples.

To find the vacuum blockers, you can use the postgres_get_av_diag() function. For more information, see Identify and resolve aggressive vacuum blockers in RDS for PostgreSQL.

Increase the number of autovacuum workers and remove unused indexes

When the databases or tables that require vacuuming exceed the autovacuum_max_workers quota, the engine processes the next database or table only when a worker becomes available. In environments with high transaction rates, wraparound vacuum issues can occur.

To resolve this issue, increase the autovacuum_max_workers parameter so that there are more concurrent autovacuum workers.

If your table has large indexes, then it's a best practice to remove unused indexes.

Increase the vacuum memory

To make sure that the engine applies the adjusted memory allocation to new autovacuums, increase the autovacuum_work_mem parameter.

To increase the maintenance_work_mem for manual vacuum operations, such as a vacuum freeze, run the following command:

SET maintenance\_work\_mem TO '2GB';

Note: The default value for maintenance_work_mem is 64 MB. After the vacuum completes, change the value to its previous value. If you don't change the value, then memory issues might occur for smaller instance sizes.

To start the vacuum operation, run the following command:

VACUUM FREEZE VERBOSE table\_name;

If there are large indexes that require cleanup, then autovacuum must perform multiple passes. In PostgreSQL 16 and earlier, there's a 1 GB maintenance_work_mem quota that the vacuum operation uses. PostgreSQL 17 and later use TidStore that dynamically allocates memory and doesn't use a single-allocation array so that the vacuum can more efficiently use maintenance_work_mem.

Activate parallel vacuuming

For a manual vacuum, you can activate parallel vacuuming so that PostgreSQL assigns one vacuum worker to each index and vacuums them in parallel.

To activate the parallel vacuum, adjust the max_parallel_maintenance_workers setting based on the number of available vCPUs for your DB instance and number of indexes on the table.

For more information, see Parallel vacuuming in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL.

Monitor the number of transactions that your database used

When the autovacuum operation can't complete a vacuum, the database might continue to increase. You can use the MaximumUsedTransactionIDs Amazon CloudWatch metric to get the number of maximum transactions that the database used.

Also, create a CloudWatch alarm for the MaximumUsedTransactionIDs metric to notify you when the number of transactions reaches the value that you specified. For more information, see Implement an early warning system for transaction ID wraparound in Amazon RDS for PostgreSQL.

To monitor MaximumUsedTransactionIDs on the Amazon RDS console, complete the following steps:

  1. Open the Amazon RDS console.
  2. In the navigation pane, choose Databases, and then select your database.
  3. In the Logs and Events tab, you can view MaximumUsedTransactionIDs in the Recent events section.

If you're reaching the MaximumUsedTransactionIDs quota, then you must vacuum the table.

To identify the vacuum blockers, see Identify and resolve aggressive vacuum blockers in RDS for PostgreSQL.