RDS Postgres archive/WAL Error Logs


We recently modified a Postgres RDS instance in US-EAST-1 from db.m5.4xl to db.m6i.4xl. Since the change, we have noticed archive/WAL errors in the Postgres error log. The instance is running Postgres 10.21.

Examples:

2022-10-30 22:10:48 UTC::@:[359]:LOG: archive command failed with exit code 1
2022-10-30 22:10:48 UTC::@:[359]:DETAIL: The failed archive command was: /etc/rds/dbbin/pgscripts/rds_wal_archive pg_wal/0000000100000AE000000082

2022-10-30 22:28:04 UTC::@:[359]:LOG: archive command failed with exit code 1
2022-10-30 22:28:04 UTC::@:[359]:DETAIL: The failed archive command was: /etc/rds/dbbin/pgscripts/rds_wal_archive pg_wal/0000000100000AE000000086

(Of course, the filename changes each time.) These errors occur anywhere from every few minutes to every few hours. Is this a problem with the RDS instance itself, or something we should address?

Asked 2 years ago, viewed 606 times
2 Answers
Accepted Answer

Through AWS support, I learned that the "archive command failed with exit code 1" log entry is a symptom of not having enough IOPS to write the WAL to S3. Because the filename advances each time, it is not a problem per se, but it does indicate back-pressure within the storage subsystem. If the filename stayed the same, that would mean the WAL is not being written to S3, which would start to affect the latest restorable time.

In short, the RDS instance needs more IOPS.
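
For anyone who wants to check whether the failing filename is advancing or staying the same, here is a minimal sketch that scans a downloaded copy of the error log. It assumes the DETAIL lines look like the ones in the question. The function names and the `stuck_threshold` of 3 consecutive repeats are my own illustrative choices, not anything AWS support specified.

```python
import re

# Captures the 24-hex-digit WAL segment name from a failed-archive DETAIL line.
# The pattern is based only on the log lines quoted in the question.
FAILED_ARCHIVE = re.compile(
    r"The failed archive command was: \S+ pg_wal/([0-9A-F]{24})"
)


def failed_segments(log_text):
    """Return the WAL segment names of failed archive commands, in log order."""
    return FAILED_ARCHIVE.findall(log_text)


def trailing_repeats(segments):
    """Count how many times the most recent segment repeats at the end of the list."""
    if not segments:
        return 0
    last = segments[-1]
    count = 0
    for seg in reversed(segments):
        if seg != last:
            break
        count += 1
    return count


def classify(log_text, stuck_threshold=3):
    """Label the log as 'no failures', 'advancing', or 'stuck'.

    stuck_threshold is a hypothetical cutoff: the same segment failing this
    many times in a row is treated as the archiver not making progress.
    """
    segments = failed_segments(log_text)
    if not segments:
        return "no failures"
    if trailing_repeats(segments) >= stuck_threshold:
        return "stuck"
    return "advancing"


SAMPLE = """\
2022-10-30 22:10:48 UTC::@:[359]:LOG: archive command failed with exit code 1
2022-10-30 22:10:48 UTC::@:[359]:DETAIL: The failed archive command was: /etc/rds/dbbin/pgscripts/rds_wal_archive pg_wal/0000000100000AE000000082
2022-10-30 22:28:04 UTC::@:[359]:LOG: archive command failed with exit code 1
2022-10-30 22:28:04 UTC::@:[359]:DETAIL: The failed archive command was: /etc/rds/dbbin/pgscripts/rds_wal_archive pg_wal/0000000100000AE000000086
"""

print(classify(SAMPLE))  # prints "advancing"
```

With the two entries from the question, the segment moves from ...82 to ...86, so the result is "advancing", which matches the "not an issue, per se" case described above. A result of "stuck" would be the case worth escalating.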

Answered 1 year ago

Hi there,

These log entries are generally not something to be concerned about. They tend to appear when there are transient network issues between the primary and secondary instances of a Multi-AZ deployment. They should resolve on their own, if they haven't already :)

Brandon, AWS Support Engineer
Answered 1 year ago
  • Thanks for the follow-up. I had forgotten about this post and added an accepted answer based on my interaction with AWS support.
