Frequent lambda cold starts

Hello,

I've been scratching my head trying to understand how long it takes for lambda to become deallocated, resulting in a cold start in a subsequent call. Most benchmarks I've found point to lambdas staying up for around 45 - 60 minutes based on their memory allocation.

Meanwhile it seems that my lambdas do not stay up longer than a minute or two most of the time, which is quite troublesome, because it at least doubles the response times.

Is there some simple explanation for this behaviour, maybe a configuration issue on my side? Are there any steps I can take to understand why this is happening and prevent it?

These are the CDK props I deploy lambdas with:

runtime: cdk.aws_lambda.Runtime.NODEJS_20_X,
handler: "index.handler",
logRetention: cdk.aws_logs.RetentionDays.ONE_DAY,
memorySize: 128,
timeout: cdk.Duration.seconds(10),

Here is a chunk of CloudWatch logs for a single lambda with timestamps, so you can see what I mean. Notice that the third cold start happens only 45 seconds after the previous call! Bonus question: why would the Lambda runtime vary that much? These are all identical requests, yet duration fluctuates between 200 and 1200 ms. That seems like a lot.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   timestamp              |                                                                                  message                                                                                   |                    logStreamName                     |
|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|
| 2024-11-27T14:24:46.440Z | REPORT RequestId: 5161d2b4-0122-49df-a9a0-371b91e9bb8c Duration: 1532.67 ms Billed Duration: 1533 ms Memory Size: 128 MB Max Memory Used: 94 MB Init Duration: 486.94 ms   | 2024/11/27/[$LATEST]cbb90b99838c4046b8d81e6b4394a7ab |
| 2024-11-27T14:32:26.655Z | REPORT RequestId: 9667b1e2-913b-417e-915d-969724cb7cba Duration: 1656.41 ms Billed Duration: 1657 ms Memory Size: 128 MB Max Memory Used: 93 MB Init Duration: 483.07 ms   | 2024/11/27/[$LATEST]7d9c0bbf44df4fdfba1e56fd527dd3a5 |
| 2024-11-27T14:32:47.055Z | REPORT RequestId: e6d0e225-aef2-4720-9911-bddeb57a0e4d Duration: 783.08 ms Billed Duration: 784 ms Memory Size: 128 MB Max Memory Used: 94 MB                              | 2024/11/27/[$LATEST]7d9c0bbf44df4fdfba1e56fd527dd3a5 |
| 2024-11-27T14:33:31.792Z | REPORT RequestId: 7eaa05a5-8807-4c37-9bac-dae2ff7e7135 Duration: 1496.20 ms Billed Duration: 1497 ms Memory Size: 128 MB Max Memory Used: 93 MB Init Duration: 514.09 ms   | 2024/11/27/[$LATEST]989cc7f410fa4f4a8757419aec6acd5a |
| 2024-11-27T14:33:55.832Z | REPORT RequestId: 6883e600-a003-4af1-8289-1d195c263d7d Duration: 232.59 ms Billed Duration: 233 ms Memory Size: 128 MB Max Memory Used: 93 MB                              | 2024/11/27/[$LATEST]989cc7f410fa4f4a8757419aec6acd5a |
| 2024-11-27T14:34:12.912Z | REPORT RequestId: 4a6341ab-c922-40a4-9a99-d817c08c8ca7 Duration: 728.88 ms Billed Duration: 729 ms Memory Size: 128 MB Max Memory Used: 93 MB                              | 2024/11/27/[$LATEST]989cc7f410fa4f4a8757419aec6acd5a |
| 2024-11-27T14:34:52.432Z | REPORT RequestId: 9d87bbb7-bd6a-4e27-8921-56f2f11aae65 Duration: 253.00 ms Billed Duration: 253 ms Memory Size: 128 MB Max Memory Used: 93 MB                              | 2024/11/27/[$LATEST]989cc7f410fa4f4a8757419aec6acd5a |
| 2024-11-27T14:35:19.892Z | REPORT RequestId: 1840a9af-9203-4b52-afcb-99238da94441 Duration: 863.19 ms Billed Duration: 864 ms Memory Size: 128 MB Max Memory Used: 93 MB                              | 2024/11/27/[$LATEST]989cc7f410fa4f4a8757419aec6acd5a |
| 2024-11-27T14:37:08.012Z | REPORT RequestId: 55ddab52-f987-4c3b-9635-322a73ad8018 Duration: 975.56 ms Billed Duration: 976 ms Memory Size: 128 MB Max Memory Used: 93 MB                              | 2024/11/27/[$LATEST]989cc7f410fa4f4a8757419aec6acd5a |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
asked 6 days ago · 38 views
2 Answers
The behavior you're experiencing with frequent cold starts in your Lambda function is not typical, as Lambda execution environments are usually retained for a non-deterministic period, often around 45-60 minutes. However, there are several factors that could contribute to this situation:

  1. Concurrency and scaling: If your function is receiving multiple concurrent requests, Lambda may create new execution environments to handle the load, resulting in more cold starts. Each execution environment handles one request at a time, so simultaneous invocations can lead to multiple cold starts.

  2. Memory allocation: You've set the memory size to 128 MB, which is the minimum. Lower memory allocations can lead to slower performance and potentially more frequent cold starts. Increasing the memory allocation might help, as it also increases the CPU power allocated to your function.

  3. Function updates: If you're frequently updating your function code or configuration, this can lead to more cold starts as new execution environments are created.

  4. Service optimizations: Lambda optimizes execution across all customers, which might sometimes lead to unexpected behavior for individual functions.

To address these issues and understand why this is happening, you can try the following:

  1. Increase memory allocation: Try increasing the memorySize to 256 MB or higher to see if it improves performance and reduces cold starts.

  2. Monitor invocation patterns: Use CloudWatch metrics to analyze your function's invocation patterns and concurrency.

  3. Consider using Provisioned Concurrency: If predictable start times are crucial, you might want to use Provisioned Concurrency to keep a specified number of execution environments warm and ready.

  4. Implement a custom warm-up strategy: You could create an Amazon EventBridge (formerly CloudWatch Events) rule to periodically invoke your function to keep it warm, though this doesn't guarantee elimination of all cold starts.

  5. Use Lambda SnapStart: If you're using Java, consider using Lambda SnapStart, which can significantly reduce cold start times.
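As a rough CDK sketch of points 1, 3, and 4 above (construct names and the asset path are illustrative, not taken from your stack), increasing memory, adding Provisioned Concurrency via an alias, and wiring a scheduled warm-up ping could look something like this:

```typescript
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";

export class WarmLambdaStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const fn = new cdk.aws_lambda.Function(this, "Fn", {
      runtime: cdk.aws_lambda.Runtime.NODEJS_20_X,
      handler: "index.handler",
      code: cdk.aws_lambda.Code.fromAsset("lambda"), // illustrative path
      memorySize: 256, // more memory also means proportionally more CPU
      timeout: cdk.Duration.seconds(10),
    });

    // Option A: keep one execution environment initialized at all times
    new cdk.aws_lambda.Alias(this, "LiveAlias", {
      aliasName: "live",
      version: fn.currentVersion,
      provisionedConcurrentExecutions: 1,
    });

    // Option B: ping the function every few minutes to keep it warm
    new cdk.aws_events.Rule(this, "WarmupRule", {
      schedule: cdk.aws_events.Schedule.rate(cdk.Duration.minutes(5)),
      targets: [new cdk.aws_events_targets.LambdaFunction(fn)],
    });
  }
}
```

Note that Provisioned Concurrency has a cost even when idle, and warm-up pings only keep as many environments warm as the ping's own concurrency, so they are complements rather than equivalents.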

Regarding the variation in execution times (200-1200ms), this can be due to several factors:

  1. Cold starts vs. warm starts: Cold starts will naturally take longer.
  2. External dependencies: If your function interacts with other services or databases, response times can vary.
  3. Lambda service variability: There can be some inherent variability in Lambda execution times.
  4. Code execution path: Different code paths within your function might lead to different execution times.

To optimize performance, consider profiling your code to identify any bottlenecks, optimize any external calls, and ensure efficient use of resources. Remember that Lambda's behavior can be influenced by various factors, and some degree of variability is normal.
Sources
Lambda execution environments - AWS Lambda
Factors that may slow down cold starts of Neptune Gremlin Lambda functions - Amazon Neptune
Troubleshoot invocation issues in Lambda - AWS Lambda
Improving startup performance with Lambda SnapStart - AWS Lambda
Community | Lambda Monitoring Made Simple: Observe, Learn, and Optimize

answered 6 days ago
Short answer: You have no control over how long a Lambda execution environment stays warm.

Under the hood, Lambda is a multi-tenant service running at massive scale. It decides how long to keep execution environments warm based on load (both yours and the system's as a whole) as well as many other parameters.

As per the re:Post Agent answer, the best course of action is to minimise cold start times in your code if the variation between warm and cold starts is an issue.
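One common way to do that is to move expensive setup out of the handler into module scope, so it runs once during Init and is reused by every warm invocation. A sketch (not taken from the thread; the counter exists only to make the behaviour observable):

```typescript
// Expensive setup at module scope runs once per cold start (it is what
// shows up as "Init Duration" in the REPORT logs); warm invocations reuse it.

let initCount = 0;

export function getInitCount(): number {
  return initCount;
}

function createClient(): { id: number } {
  // stand-in for e.g. constructing an AWS SDK client or loading config
  initCount += 1;
  return { id: initCount };
}

// Module scope: executed during the Init phase, not on every request
const client = createClient();

export const handler = async (_event: unknown) => {
  // Warm invocations reuse `client`; createClient() is not called again.
  return { statusCode: 200, clientId: client.id };
};
```

The same idea applies to reading config, opening database connections, and parsing large JSON files: pay the cost once in Init rather than on every request.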

AWS
EXPERT
answered 5 days ago
