Understanding our load testing data on Lambdas using provisioned concurrency


Hi,

We have an API Gateway with endpoints that trigger a Lambda function. The function's response time is 1-5 ms when warm, and it does nothing but return a string. The function also has a version configured with a provisioned concurrency of 15.

We're running load tests against the API Gateway endpoint to see the effect high traffic will have. Each load test applies a constant load with no ramp-up and runs for 2 minutes, followed by 8 minutes of downtime before the next test starts. Our load tests are 1000 TPS, 2000 TPS, and 5000 TPS.

Below is a screenshot of our concurrent executions (taken from the monitoring tab of the Lambda console):

[Screenshot: concurrent executions graph from the Lambda console]

As you can see, the far-left data points correspond to the 1000 TPS load test, which results in 450 concurrent executions. However, the middle and right data points, which correspond to the 2000 TPS and 5000 TPS load tests, each show fewer than 150 concurrent executions.

It looks like API Gateway and Lambda are suffering from cold starts during the initial moments of the 1000 TPS load test. Why aren't the 2000 TPS and 5000 TPS tests following the same pattern?

asked a year ago · 561 views
1 Answer
Accepted Answer

Lambda execution environments stay warm for a few minutes after the last invocation, so it may be that they were still warm from the previous test. You can verify this by looking at CloudWatch Logs to check whether the invocations have an Init Duration (which indicates a cold start) or not.
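One way to check this at scale is a CloudWatch Logs Insights query over the function's log group (a sketch; it assumes the default log group name `/aws/lambda/<function-name>`). Lambda's REPORT log lines include the `@initDuration` field only for invocations that were cold starts:

```
fields @timestamp, @initDuration, @duration
| filter @type = "REPORT" and ispresent(@initDuration)
| sort @timestamp desc
| limit 20
```

If this query returns rows during a load test's first moments but none later, the initial spike was cold starts; if it returns nothing for the 2000/5000 TPS tests, those ran entirely on warm environments.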

Also, make sure that you set the statistic to Maximum for the ConcurrentExecutions metric, and not Count.

AWS EXPERT
Uri
answered a year ago
  • Hi Uri, is provisioned concurrency considered a warm start? We configured 1 provisioned concurrency for one of our Lambda functions that requires low latency, and we make sure only 1 request is sent at any time. But the first request's response time is still very long (about 3s), while subsequent requests' response times are only around 500ms. This pattern repeats after a long break between requests. The 1 provisioned concurrency is kept active all the time. Why are we getting the long response time for the first request even with provisioned concurrency? How can we reduce it? Thanks.

  • I am guessing that your function has some init code that happens inside the handler. You should move all initialization code (e.g., reading config, opening database connections) outside the handler. This way, it runs when the execution environment is provisioned, not during the first invocation.

    Also, if your init code is outside the handler and you are using Node.js, use top-level await to make sure that all the initialization completes during provisioning. Otherwise, the actual init will still happen with the first invoke.

  • Thank you Uri! We'll try to optimise our code as you suggested.
