
How to shorten the API response time in a API GW + Lambda solution


We are building a REST API using API Gateway + Lambda with Node.js. In a nutshell, the API extracts a large payload (up to 1 MB) from the request, stores it in S3, and then returns a response. We'd like to shorten the API response time but are finding this quite difficult with Lambda. What we have thought of:

  1. Make storing the payload in S3 asynchronous via SNS/SQS. However, the payload is too big to put into an SNS/SQS message directly, so we would still have to put it somewhere first (e.g. S3) and include a reference in the message. Making it async therefore doesn't seem to help here.
  2. We also tried returning the response to API Gateway before the S3 put completed, hoping that the Lambda would continue running until the S3 call finished. However, the Lambda stops executing immediately after the return. Any pending operations are "frozen" and only continue when the Lambda instance is triggered by a new incoming request, which is what the AWS documentation describes. Setting the Lambda context's callbackWaitsForEmptyEventLoop to true or false doesn't help either.

Any ideas are appreciated.

Update on 1 Jun 2022:

Thanks all for your answers. I've uploaded two X-Ray traces, one for the first invocation and one for the second invocation, on the Lambda with provisioned concurrency. I tried putting the same payload (1.3 MB) into ElastiCache, S3, and DynamoDB (after slicing) at the same time.

As you can see, putting the payload into S3 isn't that slow. I haven't tried the other approaches, such as writing to Firehose, yet, but I doubt they will be significantly faster. I reckon the Lambda extension approach is more reasonable, since it would let the Lambda continue running after returning the response to API Gateway.

3 Answers

The only way to have code running after you return the response from the handler is to use Lambda extensions. You will need to create an extension that waits for the main function to send it the file (this can be done over an internal socket or via a file in /tmp). Once the function hands the file to the extension, it can return. The extension can then save the file to S3.

You can find a deep-dive session here.

answered a month ago
  • Thanks Uri. Will give it a go whenever I get a chance.


This isn't going to seem helpful, but: Things take time and sometimes there aren't good solutions to that. However, some suggestions:

CPU allocation for Lambda functions scales with memory allocation. So if you allocate more memory to your Lambda function (even if you don't use it), it will get more CPU and will (probably) execute faster. There is a balancing point beyond which adding more memory no longer decreases execution time.

You may find that other storage types are faster than S3, but you will need to test. You could try using EFS with your Lambda function; that will add some cost to your solution, but it may meet your performance requirements. For example, you might store the request object in EFS, send an SQS message that triggers another Lambda function to copy it to S3 for longer-term (and less expensive) storage, and then delete the file from EFS.

If cold-start timing is a challenge (it doesn't appear to be here, but just in case), consider using Lambda provisioned concurrency, noting again that there is a cost.

answered a month ago
  • Thanks Brettski for your suggestions.

    Yes, we increased memory to 2048 MB even though only around 128 MB is used, and it didn't really improve the S3 operation or SNS event publishing times that much. We have implemented provisioned concurrency, and it does eliminate the Lambda initialisation time on cold starts. We haven't tried EFS but did try ElastiCache. One problem with ElastiCache is that a provisioned Lambda doesn't keep TCP connections open, so on the first API call the Lambda has to reconnect to ElastiCache, which makes it almost as slow as S3. In any case, ElastiCache was ruled out by management as it is not serverless infrastructure.
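The warm-instance connection reuse being discussed can be sketched like this; `createClient` stands in for a real ElastiCache/Redis client factory, and the counter just makes the handshake cost visible:

```javascript
// Sketch: anything created in module scope survives between invocations of
// the same warm Lambda instance, so only the first (cold) invocation pays
// the TCP/TLS connection cost.
let connectCount = 0;

function createClient() {
  connectCount += 1; // stands in for a TCP connect + TLS handshake
  return { id: connectCount };
}

let client; // module scope: reused across warm invocations

async function handler(event) {
  if (!client) client = createClient(); // only the cold start pays this
  return { clientId: client.id };
}
```

On a second warm invocation the `client` check short-circuits, so no new connection is made, which matches the behaviour seen in the X-Ray traces.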

  • The delay you're experiencing connecting to ElastiCache will also be present when connecting to S3 and SNS. Those services likewise rely on TCP connections being created and TLS being negotiated, and both lead to delays. As above: things take time, and those things don't really change much with increased CPU.

  • Hi Brettski, you are right that connections to ElastiCache / S3 / SNS all go over TCP and TLS; the difference is that while a Lambda instance is warm, the TCP connection to ElastiCache is maintained. I've uploaded two X-Ray traces for your info.


It is possible to upload files to S3 directly from API Gateway without a Lambda in between. The following shows how that can be done.
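As an illustrative sketch of such a direct integration (the bucket name, path, region, and role ARN below are placeholders, not values from this thread), a REST API method can be wired to S3 `PutObject` via an OpenAPI extension:

```yaml
# Illustrative only: replace bucket, region, and role ARN with your own.
paths:
  /payloads/{key}:
    put:
      parameters:
        - name: key
          in: path
          required: true
          schema: { type: string }
      x-amazon-apigateway-integration:
        type: aws
        httpMethod: PUT
        uri: arn:aws:apigateway:us-east-1:s3:path/my-payload-bucket/{key}
        credentials: arn:aws:iam::123456789012:role/apigw-s3-put-role
        requestParameters:
          integration.request.path.key: method.request.path.key
        responses:
          default:
            statusCode: "200"
```

Note that API Gateway REST APIs cap request payloads at 10 MB, so a 1 MB object fits comfortably.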

If you have to use a Lambda for other reasons and want to send the data to S3 asynchronously, you can use one of the following approaches, given your maximum file size of 1 MB:

a) From the Lambda, write to Kinesis Data Firehose, which in turn writes to S3
b) From the Lambda, write to Kinesis Data Streams and have another Lambda consume from the stream and write to S3
c) From the Lambda, write to Kinesis Data Streams and have Kinesis Data Firehose read from the stream and write to S3
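One caveat for the 1.3 MB payload mentioned in the update: Kinesis Data Streams caps a single record at 1 MB, so larger payloads need to be sliced into per-record chunks and reassembled by the consumer. A minimal sketch of that slicing (the function name and chunk size are illustrative):

```javascript
// Sketch: split a payload Buffer into chunks that each fit within the
// Kinesis Data Streams 1 MB per-record limit. The consumer reassembles
// them with Buffer.concat in order.
function chunkPayload(buf, maxBytes = 1024 * 1024) {
  const chunks = [];
  for (let offset = 0; offset < buf.length; offset += maxBytes) {
    // subarray is a zero-copy view; each chunk is at most maxBytes long
    chunks.push(buf.subarray(offset, offset + maxBytes));
  }
  return chunks;
}
```

Using the same partition key for all chunks of one payload keeps them on one shard and in order, which makes reassembly on the consumer side straightforward.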

answered a month ago
  • Thanks Indranil. The scenario in my question is a simplified version; our business requirements go beyond just uploading the payload. I will try Firehose when I get a chance, but I doubt it will be faster than writing the file to S3 directly.
