Lambda and RDS Postgres Multi-AZ

Hi,

I have an RDS Multi-AZ Postgres database used by Lambda functions.

Today AWS performed a minor version upgrade on the database. According to the RDS logs, the instance was shut down for the upgrade at July 14, 2023, 07:50 (UTC+03:00):

  • July 14, 2023, 07:41 (UTC+03:00): Backing up DB instance
  • July 14, 2023, 07:47 (UTC+03:00): Finished DB Instance backup
  • July 14, 2023, 07:50 (UTC+03:00): DB instance shutdown
  • July 14, 2023, 07:50 (UTC+03:00): DB instance restarted
  • July 14, 2023, 07:52 (UTC+03:00): DB instance restarted
  • July 14, 2023, 07:55 (UTC+03:00): Database instance patched
  • July 14, 2023, 07:56 (UTC+03:00): Backing up DB instance
  • July 14, 2023, 07:58 (UTC+03:00): Finished DB Instance backup
  • July 14, 2023, 07:59 (UTC+03:00): Performance Insights has been enabled
  • July 14, 2023, 07:59 (UTC+03:00): Monitoring Interval changed to 0

The Lambda functions started to get network errors at 2023-07-14T07:50:29.969+03:00 and continued to get them until 2023-07-14T10:01:29.000+03:00.

The impact on Lambda lasted two whole hours longer than the database upgrade itself. Additionally, I am using Multi-AZ specifically to minimize the impact on Lambda.

Lambda is using Python 3.10 with the pg8000 package. [edit] The DB initialization is outside the Lambda handler.

The actual exception was:

[ERROR] InterfaceError: network error
Traceback (most recent call last):
  File "/var/task/main.py", line 126, in main
    return lambda_run.run()
  File "/var/task/hyper/lambda_run/lambda_run.py", line 24, in run
    return self.func(self.event, self.context)
  File "/var/task/main.py", line 161, in prospector
    auth = get_auth_for_key(api_key)
  File "/var/task/main.py", line 636, in get_auth_for_key
    cursor.execute(sql, {"api_key": api_key})
  File "/opt/python/lib/python3.10/site-packages/pg8000/legacy.py", line 253, in execute
    self._context = self._c.execute_unnamed(
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 665, in execute_unnamed
    self.send_PARSE(NULL_BYTE, statement, oids)
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 643, in send_PARSE
    self._send_message(PARSE, val)
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 730, in _send_message
    _write(self._sock, code)
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 172, in _write
    raise InterfaceError("network error") from e

Any recommendation on what I should do to minimize the impact of similar operations? Any idea why Lambda kept getting errors for two more hours?

Any help would be greatly appreciated.

Thank you very much,

Felix.

1 Answer

This is expected behavior. Minor version upgrades include only changes that are backward-compatible with existing applications. If your DB instance is in a Multi-AZ deployment, the writer and any standby instance are upgraded simultaneously. Therefore, your DB instance might not be available until the upgrade is complete. For more details, see Automatic minor version upgrades for PostgreSQL.

If your PostgreSQL DB instance is using read replicas, you must first upgrade all of the read replicas before upgrading the primary instance.

answered 10 months ago
  • I understand that the writer and any standby instance are upgraded simultaneously, but why did Lambda continue to get errors for two whole hours after the upgrade was completed?
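
The question notes that the DB initialization is outside the Lambda handler and that the failure surfaces as pg8000's InterfaceError inside cursor.execute. One possibility, not confirmed anywhere in this thread, is that warm Lambda execution environments kept reusing a connection that was opened before the restart. Under that assumption, a minimal sketch of a reconnect-on-failure wrapper could look like the following. The class name ReconnectingConnection and its run method are hypothetical, and the connect function and error type are passed in, so nothing here depends on a specific driver.

```python
from typing import Any, Callable, Optional, Type


class ReconnectingConnection:
    """Hypothetical helper: keeps one connection at module scope and
    replaces it when a query fails with the given error type."""

    def __init__(self, connect: Callable[[], Any],
                 error_type: Type[BaseException]) -> None:
        self._connect = connect        # factory that opens a new connection
        self._error_type = error_type  # error that signals a dead connection
        self._conn: Optional[Any] = None

    def run(self, query: Callable[[Any], Any]) -> Any:
        """Run query(conn); on a connection error, reconnect and retry once."""
        if self._conn is None:
            self._conn = self._connect()
        try:
            return query(self._conn)
        except self._error_type:
            # The cached connection is assumed to be stale (for example,
            # the server was restarted). Drop it and try a fresh one.
            self._discard()
            self._conn = self._connect()
            return query(self._conn)

    def _discard(self) -> None:
        conn, self._conn = self._conn, None
        try:
            conn.close()
        except Exception:
            pass  # closing a broken socket may itself fail; ignore it
```

With pg8000, the factory would presumably be something like `lambda: pg8000.connect(...)` and the error type `pg8000.InterfaceError`, with the wrapper created once outside the handler. A single blind retry is only reasonable for idempotent statements such as the read in get_auth_for_key; writes inside a transaction would need more care.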
