I have a Lambda function and a Postgres Serverless V2 RDS instance in us-west-2. The Lambda function frequently takes 6 seconds to establish a database connection to the RDS instance using IAM authentication. I have profiled the code and confirmed that the time is spent in the connection process itself, not in cold starts or in creating the IAM authentication token. Below is a sample trace from X-Ray:
You can see that the connection function takes 6 seconds. One subsegment fetches the IAM auth token in 150 ms; the rest of the time is spent connecting. This is a relatively new trend (the last couple of days): it has only just started tripping my CloudWatch alarms.
My backend code is Python 3.11, using psycopg2. After the IAM auth token is obtained, I immediately call psycopg2.connect, and the function completes once it returns. Nothing else is taking time, so I am fairly confident that the problem lies in the network or the database.
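One way to tell the network apart from the database would be to time the lower-level phases on their own. The following is only a rough sketch using the standard library, not my actual handler: the helper names timed and probe_network are made up for illustration, and port 5432 is assumed to be the database port. It measures DNS resolution and the raw TCP connect to the endpoint separately.

```python
import socket
import time
from contextlib import contextmanager


@contextmanager
def timed(name, results):
    """Record the wall-clock duration of the enclosed block in results[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[name] = time.perf_counter() - start


def probe_network(host, port=5432, timeout=10.0):
    """Time DNS resolution and a bare TCP connect to host:port.

    Hypothetical helper: returns a dict of phase name -> seconds.
    It does not speak the Postgres protocol, so TLS and authentication
    time are deliberately excluded.
    """
    results = {}

    # Phase 1: resolve the endpoint name to one or more socket addresses.
    with timed("dns", results):
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)

    family, socktype, proto, _canonname, sockaddr = infos[0]

    # Phase 2: open and immediately close a plain TCP connection.
    with timed("tcp_connect", results):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        finally:
            sock.close()

    return results


# Example use inside the handler (the hostname is a placeholder):
#   phases = probe_network("my-db-endpoint.example.com")
#   with timed("psycopg2_connect", phases):
#       conn = psycopg2.connect(...)
#   print(phases)
```

If dns and tcp_connect come back fast while psycopg2.connect still takes seconds, that would suggest the time is going into the TLS handshake or the authentication step on the database side rather than into basic network reachability, though I have not verified that.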
How can I debug this? How can I improve this connection time?