RDS Aurora PostgreSQL crashes with segmentation fault


Hey guys. I have the following issue. During normal operation, after the initial load has completed, the pglogical apply worker crashes about once a day on the destination side, which is RDS Aurora PostgreSQL 14.8 with pglogical 2.4.2:

# Destination side
2024-01-25 13:17:09 UTC::@:[537]:LOG: background worker "pglogical apply 131082:4047160452" (PID 6709) was terminated by signal 11: Segmentation fault
2024-01-25 13:17:09 UTC::@:[537]:LOG: terminating any other active server processes
2024-01-25 13:17:09 UTC::@:[537]:FATAL: Can't handle storage runtime process crash
2024-01-25 13:17:09 UTC::@:[537]:LOG: database system is shut down

After this initial error, the cluster enters a continuous reboot-and-crash loop, which causes significant CPU and resource usage.

On the source side, some queries run a couple of seconds before the crash, but they don't seem to be the cause: after re-creating the environment and re-executing the same queries, the problem doesn't occur.

On the source cluster we see these errors right after the initial error on the destination:

2024-01-25 13:17:09 UTC:(63772):user@database_name:[26536]:LOG: could not receive data from client: Connection reset by peer
2024-01-25 13:17:09 UTC:(63772):user@database_name:[26536]:STATEMENT: START_REPLICATION SLOT "replication_slot_name" LOGICAL 12/28C9A430 (expected_encoding 'UTF8', min_proto_version '1', max_proto_version '1', startup_params_format '1', "binary.want_internal_basetypes" '1', "binary.want_binary_basetypes" '1', "binary.basetypes_major_version" '1400', "binary.sizeof_datum" '8', "binary.sizeof_int" '4', "binary.sizeof_long" '8', "binary.bigendian" '0', "binary.float4_byval" '0', "binary.float8_byval" '1', "binary.integer_datetimes" '0', "hooks.setup_function" 'pglogical.pglogical_hooks_setup', "pglogical.forward_origins" '"all"', "pglogical.replication_set_names" 'tenant_service', "relmeta_cache_size" '-1', pg_version '140008', pglogical_version '2.4.2', pglogical_version_num '20402', pglogical_apply_pid '6709')
2024-01-25 13:17:09 UTC:*(63772):user@database_name:[26536]:LOG: unexpected EOF on standby connection
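Note that the PID of the crashed apply worker on the destination (6709) is the same value the source reports as pglogical_apply_pid in the START_REPLICATION statement, so both excerpts describe the same replication session. Below is a minimal Python sketch of how that correlation could be done across larger log files. The regular expressions and function names are my own assumptions about the log format shown above, not part of pglogical or any RDS tooling, and the sample source line is abbreviated.

```python
import re

# Assumed pattern for the postmaster line reporting a crashed apply worker.
CRASH_RE = re.compile(
    r'background worker "pglogical apply [^"]+" \(PID (\d+)\) '
    r'was terminated by signal 11'
)
# Assumed patterns for the walsender statement logged on the source.
SLOT_RE = re.compile(
    r'START_REPLICATION SLOT "([^"]+)" LOGICAL ([0-9A-Fa-f]+/[0-9A-Fa-f]+)'
)
APPLY_PID_RE = re.compile(r"pglogical_apply_pid '(\d+)'")


def crashed_apply_pids(dest_log):
    """Return the set of apply-worker PIDs that died with signal 11."""
    return set(CRASH_RE.findall(dest_log))


def sessions_for_pids(source_log, pids):
    """Return (slot, start_lsn, apply_pid) for source statements whose
    pglogical_apply_pid is one of the crashed PIDs."""
    matches = []
    for line in source_log.splitlines():
        slot = SLOT_RE.search(line)
        pid = APPLY_PID_RE.search(line)
        if slot and pid and pid.group(1) in pids:
            matches.append((slot.group(1), slot.group(2), pid.group(1)))
    return matches


# Sample lines taken from the excerpts above (source statement abbreviated).
dest_log = (
    '2024-01-25 13:17:09 UTC::@:[537]:LOG: background worker '
    '"pglogical apply 131082:4047160452" (PID 6709) was terminated '
    'by signal 11: Segmentation fault'
)
source_log = (
    '2024-01-25 13:17:09 UTC:(63772):user@database_name:[26536]:STATEMENT: '
    'START_REPLICATION SLOT "replication_slot_name" LOGICAL 12/28C9A430 '
    "(pglogical_version '2.4.2', pglogical_apply_pid '6709')"
)

print(sessions_for_pids(source_log, crashed_apply_pids(dest_log)))
```

The LSN in the matched statement (12/28C9A430) is the position the subscriber asked to resume from, which may help narrow down which changes were being applied when the worker crashed.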

Not sure if it's important, but pglogical replicates between two different AWS accounts.

Source and Destination: RDS Aurora PostgreSQL 14.8
pglogical: 2.4.2

Source: 1 Writer 1 Reader

Destination: 1 Writer

Julien
Asked 3 months ago, 347 views
No answers
