Hi,
I have an RDS Multi-AZ PostgreSQL database used by Lambda functions.
Today AWS performed a minor upgrade to the database. According to the RDS logs, the upgrade started on July 14, 2023 at 07:50 (UTC+03:00):
- July 14, 2023, 07:41 (UTC+03:00): Backing up DB instance
- July 14, 2023, 07:47 (UTC+03:00): Finished DB Instance backup
- July 14, 2023, 07:50 (UTC+03:00): DB instance shutdown
- July 14, 2023, 07:50 (UTC+03:00): DB instance restarted
- July 14, 2023, 07:52 (UTC+03:00): DB instance restarted
- July 14, 2023, 07:55 (UTC+03:00): Database instance patched
- July 14, 2023, 07:56 (UTC+03:00): Backing up DB instance
- July 14, 2023, 07:58 (UTC+03:00): Finished DB Instance backup
- July 14, 2023, 07:59 (UTC+03:00): Performance Insights has been enabled
- July 14, 2023, 07:59 (UTC+03:00): Monitoring Interval changed to 0
The Lambda functions started to get network errors at 2023-07-14T07:50:29.969+03:00 and continued to get them until 2023-07-14T10:01:29.000+03:00.
The impact on Lambda lasted two whole hours longer than the database upgrade itself. Additionally, I am using Multi-AZ specifically to minimize the impact on Lambda.
Lambda is using Python 3.10 with the pg8000 package.
[edit] The DB initialization is outside the Lambda handler.
The actual exception was:
[ERROR] InterfaceError: network error
Traceback (most recent call last):
  File "/var/task/main.py", line 126, in main
    return lambda_run.run()
  File "/var/task/hyper/lambda_run/lambda_run.py", line 24, in run
    return self.func(self.event, self.context)
  File "/var/task/main.py", line 161, in prospector
    auth = get_auth_for_key(api_key)
  File "/var/task/main.py", line 636, in get_auth_for_key
    cursor.execute(sql, {"api_key": api_key})
  File "/opt/python/lib/python3.10/site-packages/pg8000/legacy.py", line 253, in execute
    self._context = self._c.execute_unnamed(
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 665, in execute_unnamed
    self.send_PARSE(NULL_BYTE, statement, oids)
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 643, in send_PARSE
    self._send_message(PARSE, val)
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 730, in _send_message
    _write(self._sock, code)
  File "/opt/python/lib/python3.10/site-packages/pg8000/core.py", line 172, in _write
    raise InterfaceError("network error") from e
Any recommendation on what I should do to minimize the impact of similar operations?
Any idea why Lambda kept getting errors for two more hours?
Any help would be greatly appreciated.
Thank you very much,
Felix.
Was the DB initialization outside the Lambda handler?