
When Lambda functions are warm or provisioned, what is the expected latency between the Lambda Service and the Lambda Execution environment?


From my understanding, when a Lambda function is invoked, the Lambda Service dispatches a network request to the Lambda execution environment. This adds some latency between the moment the client invokes the function and the moment the function actually receives the event payload. When I invoke an empty "hello world" function, processing the request takes less than 1 ms. However, invoking the function from the same AWS region typically takes 15-30 ms for the full request/response cycle.
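For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a timing helper. It is not the method used above; `measure_ms` and `summarize` are hypothetical names, and the callable you pass in is an assumption (for example a wrapper around boto3's Lambda client `invoke` call with your own function name).

```python
import math
import statistics
import time
from typing import Callable


def measure_ms(call: Callable[[], object], samples: int = 50) -> list[float]:
    """Time `call` repeatedly and return each wall-clock duration in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        call()  # e.g. a wrapper around a synchronous Lambda invocation
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings: list[float]) -> dict[str, float]:
    """Return min, median and nearest-rank 95th percentile of the timings."""
    ordered = sorted(timings)
    p95_index = math.ceil(0.95 * len(ordered)) - 1
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
    }
```

Comparing the client-side median against the duration the function reports for itself gives a rough estimate of the overhead outside your code. Discard the first few samples if you want to exclude cold starts and connection setup.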

Is there any official documentation on what type of latency we should expect?

I've also heard of this overhead getting better. Is there any plan to continue reducing or removing this overhead?

My use case: within a web request I need to send many (100+) synchronous requests to Lambda to source data and make calculations. I would like to keep this process below 1 second, but at 30ms per invocation it would take 3 seconds to process 100. Ideally I would need Lambda invocations to respond in 10ms or less.
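The arithmetic behind those numbers, as a small sketch. It assumes the invocations run strictly one after another and that each costs the same fixed round trip; the function names are made up for illustration.

```python
def total_latency_ms(invocations: int, per_call_ms: float) -> float:
    """Total time for strictly sequential calls with a fixed per-call cost."""
    return invocations * per_call_ms


def max_per_call_ms(invocations: int, budget_ms: float) -> float:
    """Largest per-call latency that still fits the overall budget."""
    return budget_ms / invocations


print(total_latency_ms(100, 30))   # 100 calls at 30 ms each -> 3000 ms
print(max_per_call_ms(100, 1000))  # 1 s budget over 100 calls -> 10.0 ms
```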


1 Answer

We don't document the latency for invoking a Lambda function because many variables affect it: the size of the function, how the code initialises, and the runtime in use, since the various runtimes start and operate differently.

The Lambda team (like every other AWS team) is always looking for ways to make improvements to the service and those will arrive when they are ready.

To your use case: Latency comes in many forms - code, network, back-end systems, etc. While it's good to focus on the Lambda function latency, there are other places where performance gains can be found. In particular, I would look at whether you can send at least some of those requests in parallel to reduce the overall time required. I'd also look at combining the requests into a single call to Lambda, as that reduces the number of round trips between the front and back ends.
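To illustrate the difference between the two suggestions, here is a rough sketch that simulates the round trips rather than calling Lambda. `FakeBackend`, the fixed `ROUND_TRIP_S` delay, and the doubling "work" are all assumptions made up for the example; real numbers will vary.

```python
import asyncio

ROUND_TRIP_S = 0.01  # assumed fixed overhead per invocation (illustrative only)


class FakeBackend:
    """Hypothetical stand-in for a Lambda function; counts round trips."""

    def __init__(self) -> None:
        self.round_trips = 0

    async def invoke(self, items: list[int]) -> list[int]:
        self.round_trips += 1
        await asyncio.sleep(ROUND_TRIP_S)  # simulated network + service overhead
        return [item * 2 for item in items]  # placeholder for the real work


async def sequential(backend: FakeBackend, items: list[int]) -> list[int]:
    """One call per item, each waiting for the previous one to finish."""
    results: list[int] = []
    for item in items:
        results += await backend.invoke([item])
    return results


async def parallel(backend: FakeBackend, items: list[int]) -> list[int]:
    """One call per item, all in flight at once; order is preserved."""
    responses = await asyncio.gather(*(backend.invoke([item]) for item in items))
    return [response[0] for response in responses]


async def batched(backend: FakeBackend, items: list[int]) -> list[int]:
    """All items combined into a single call."""
    return await backend.invoke(items)
```

In this model, parallel calls still make one round trip per item but overlap the waiting, while batching makes a single round trip. Parallelism only helps for requests that do not depend on each other's results.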

Finally: Please engage with your local AWS Solutions Architect. They can give you much more detailed advice than is available here on the forum because you can talk privately about your code, the architecture and what you're trying to achieve.

answered 2 months ago
  • Thanks for the response. Isn't a managed runtime with the out-of-the-box "hello world" code example already the smallest possible function size? If so, I don't think there is anything else I can do in that respect to decrease latency. The code executes in 1 ms or less, so I'm not sure what to optimize. It seems the only area left to optimize is outside my control, such as the networking in the Lambda Service.

    In this scenario, I'm not able to send parallel requests as each invocation reveals information that is necessary for a subsequent request. Is this a problem the Lambda team is working on?

  • Yes, the "Hello, World" function is going to execute the fastest, which means that any other code you add to it is going to increase latency (in this case, code execution time).

    I'm not sure I'd describe this as "a problem". There are always going to be overheads and (as we've proven in the last few years) we are always going to improve the service. But I don't think you can bank on that happening in a timeframe that you require nor will it necessarily give you the time savings you require.

    Again, I strongly recommend you have a conversation with your local SA. Or consider an architecture where the back end can do some of the subsequent work for you without the client side having to send multiple requests.
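As a rough sketch of that last idea: a single handler could run the whole dependent chain itself, so the client pays for one round trip instead of one per step. Everything here is hypothetical (the event fields `start`, `target` and `max_steps`, and the `next_step` logic); it only shows the shape, not your actual calculation.

```python
def next_step(state: dict) -> dict:
    """Hypothetical dependent step: its output is the input to the next step."""
    value = state["value"] + 1
    return {"value": value, "target": state["target"], "done": value >= state["target"]}


def handler(event: dict, context: object = None) -> dict:
    """Run the dependent steps inside one invocation instead of one call each."""
    state = {
        "value": event["start"],
        "target": event["target"],
        "done": event["start"] >= event["target"],
    }
    max_steps = event.get("max_steps", 1000)  # guard against runaway loops
    steps = 0
    while not state["done"] and steps < max_steps:
        state = next_step(state)
        steps += 1
    return {"value": state["value"], "steps": steps}
```

With this shape, 100 dependent steps cost one invocation overhead plus the execution time of the steps, rather than 100 separate round trips from the client.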
