Hi Alexei,
Can you clarify what is printing this line:
2022-02-14T17:38:23.740+02:00 DEBUG 2022-02-14T15:38:23.739Z 2ea5db18-c9b5-4df8-b3ef-dfc01f9ede00 Starting new HTTPS connection (1): cognito-idp.us-east-1.amazonaws.com:443
It seems that a connection is being made to the Cognito endpoint here and nothing else is printed until Lambda times out.
This can happen when Lambda is unable to connect to an external endpoint before the Lambda timeout occurs. From what I can see in the logs, Lambda is trying to connect to the Cognito endpoint, but the Lambda timeout is reached before it can establish the connection or receive a response. In most cases, if there is no response within a few seconds, it is best to retry the HTTPS request.
Whether you see an error or a retry in your Lambda logs depends on the timeout settings and the retry behavior of the library you are using.
For example, the Boto3 SDK has a default connection timeout of 60 seconds. If it tries to connect to Cognito, it will only time out after 60 seconds. If your Lambda function has a timeout of 3 or 30 seconds, it will appear as if your Lambda has "frozen" when it fails to connect to the Cognito endpoint, until the Lambda itself times out.
Please have a look here in this knowledge center article for more information about this -> https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-retry-timeout-sdk/
You may need to change the connection timeout of the library you are using to reach the Cognito endpoint.
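As a rough sketch of what that could look like if the client turns out to be boto3 (an assumption on my side; the timeout values and the `make_cognito_client` helper name are only illustrative and not taken from your function):

```python
# Assumed values for illustration; pick ones that fit inside your Lambda timeout.
CONNECT_TIMEOUT_S = 2   # seconds allowed to establish the connection
READ_TIMEOUT_S = 2      # seconds allowed to wait for a response
TOTAL_ATTEMPTS = 2      # the first attempt plus one retry


def make_cognito_client(region_name="us-east-1"):
    """Build a Cognito client that gives up after seconds instead of a minute."""
    # Imported inside the function so this sketch loads even where boto3
    # is not installed.
    import boto3
    from botocore.config import Config

    config = Config(
        connect_timeout=CONNECT_TIMEOUT_S,
        read_timeout=READ_TIMEOUT_S,
        retries={"total_max_attempts": TOTAL_ATTEMPTS, "mode": "standard"},
    )
    return boto3.client("cognito-idp", region_name=region_name, config=config)
```

Creating the client once, outside the handler, lets warm invocations reuse it.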
If the issue occurs consistently, you can temporarily update your timeout settings using the following formula (found in the article above):
First attempt (connection timeout + socket timeout) + Number of retries x (connection timeout + socket timeout) + 20 seconds additional code runtime margin = Required Lambda function timeout
This should ideally show that the HTTPS connection is being retried.
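To make the formula above concrete, here is a small sketch that evaluates it. The function name `required_lambda_timeout` and the sample values are mine, chosen only for illustration:

```python
def required_lambda_timeout(connect_timeout, socket_timeout, retries, margin=20):
    """Evaluate the knowledge center formula; all values are in seconds."""
    per_attempt = connect_timeout + socket_timeout
    # First attempt + retries * per-attempt cost + margin for the rest of the code.
    return per_attempt + retries * per_attempt + margin


# 60 s connection and socket timeouts with 2 retries (retry count assumed).
print(required_lambda_timeout(60, 60, retries=2))  # 380

# Much shorter timeouts with a single retry.
print(required_lambda_timeout(2, 2, retries=1))    # 28
```

The first result shows why a function with a 3 or 30 second timeout never gets far enough to log a retry when the client timeouts are 60 seconds.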
It's definitely possible that there are other issues, but this is what I'd like to rule out first based on the Lambda logs provided.
I see that at the same time other instances reach the same Cognito endpoint in 200 milliseconds at most, so the endpoint isn't down. I can't increase this Lambda timeout; it's an authorizer ffs, it can't make another API wait 60 seconds while it figures out whether to grant or deny access. If it fails, it should at least fail quickly. I'm not invoking it with boto, it's invoked by API Gateway. Is there a way to modify the network timeouts and retries for this case?
Hi,
Apologies for the delay. To clarify, I'm not saying that the endpoint is down; that will rarely be the case. The issue here is the client that is trying to connect to the endpoint. A client can experience intermittent or transient issues that prevent it from reaching an external endpoint. This can happen to a small percentage of Lambda invocations, although it can definitely happen to other clients as well.
That is correct, and it is what I am getting at: the request that Lambda is making needs to fail fast in cases like this, where it is unable to reach the endpoint. You will need to check the library you are using to invoke the external endpoint and change its timeout so that it fails fast instead of waiting for the default timeout, which seems to be greater than your Lambda timeout. This needs to be configured in the client library. The knowledge center article shows how it can be done for boto3, and as another example, here is how you can set the timeout in the requests library -> https://docs.python-requests.org/en/master/user/advanced/#timeouts
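For the requests case, a minimal sketch could look like the following. The `fetch_or_none` helper, the URL passed to it, and the timeout values are hypothetical; only the `timeout=(connect, read)` argument and the `requests.exceptions.Timeout` exception come from the requests library itself:

```python
# (connect timeout, read timeout) in seconds; assumed values for illustration.
FAIL_FAST_TIMEOUT = (1.0, 2.0)


def fetch_or_none(url, timeout=FAIL_FAST_TIMEOUT):
    """Return the response, or None if the endpoint did not answer in time."""
    # Imported inside the function so this sketch loads even where requests
    # is not installed.
    import requests

    try:
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect and read timeouts; the caller decides whether
        # to retry or to deny the request right away.
        return None
    response.raise_for_status()
    return response
```

With something like this in place, a stuck connection surfaces within a few seconds as a `None` result that the authorizer can log and act on, rather than as a silent Lambda timeout.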
Apologies if the above formula caused confusion. As mentioned, I meant that you could update the timeout settings temporarily so that you can see the HTTPS request being retried and confirm whether that is indeed the issue.
Can you clarify how exactly the Cognito endpoint is being invoked here? What is printing this line:
Starting new HTTPS connection (1): cognito-idp.us-east-1.amazonaws.com:443
Nothing firm to offer, but on the off chance that there is a memory issue here: have you tried increasing the memory to (say) 512 MB to see whether it still fails? I've seen Python Lambda functions try to allocate memory and fail (silently) while Lambda doesn't show that it is using all of the memory. Complete guess, but I'm trying to read the runes.