AWS Lambda does not choose warm environments


I have a script to keep two AWS Lambda functions warm:

import json
import time
from datetime import datetime

import boto3

lambda_client = boto3.client('lambda')

PING_DELAY = 300  # seconds between warm-up pings

while True:
    try:
        start = time.monotonic()  # stopwatch for the round-trip time
        timestamp_str = datetime.now().strftime("%d.%m.%Y %H:%M:%S.%f")[:-3]
        print(f"Ping at {timestamp_str} ", end='', flush=True)
        response = lambda_client.invoke(
            FunctionName='myfunc',
            InvocationType='RequestResponse',
            Payload=json.dumps({'command': 'ping'}))
        payload = json.load(response['Payload'])
        if 'errorMessage' in payload:
            raise Exception(payload['errorMessage'])
        my_time = int((time.monotonic() - start) * 1000)
        stats = payload['stats']
        print(f"took {my_time}ms.   n_cold: {stats['n_cold']}   total_init: {stats['total_init']}ms", flush=True)
    except Exception as e:
        print(f"AWS Lambda submit failed: {e}", flush=True)

    time.sleep(PING_DELAY)

Every five minutes, it invokes the myfunc Lambda function, which in turn invokes 109 instances of another Lambda function in parallel. Based on their responses, myfunc reports, among other measurements, how many of those instances experienced a cold start (n_cold). Here is part of the output of this script:

Ping at 28.11.2023 08:58:06.605 took 586ms.   n_cold: 0   total_init: 0ms
Ping at 28.11.2023 09:03:07.291 took 7133ms.   n_cold: 4   total_init: 24215ms
Ping at 28.11.2023 09:08:14.524 took 54809ms.   n_cold: 0   total_init: 0ms
Ping at 28.11.2023 09:14:09.433 took 10211ms.   n_cold: 13   total_init: 77080ms
Ping at 28.11.2023 09:19:19.744 took 694ms.   n_cold: 19   total_init: 118006ms
Ping at 28.11.2023 09:24:20.538 took 4617ms.   n_cold: 0   total_init: 0ms
Ping at 28.11.2023 09:29:25.185 took 2042ms.   n_cold: 0   total_init: 0ms
Ping at 28.11.2023 09:34:27.327 took 566ms.   n_cold: 4   total_init: 24450ms
Ping at 28.11.2023 09:39:27.993 took 602ms.   n_cold: 8   total_init: 47270ms
Ping at 28.11.2023 09:44:28.695 took 658ms.   n_cold: 11   total_init: 66350ms

As the output shows, at least 90 of the 109 environments remain warm at all times. For the actual computation that I am interested in, myfunc invokes 50 instances of that same Lambda function in parallel. Given that at least 90 warm environments are available, I would expect all 50 instances to run in warm environments. However, the logs show that some of those 50 invocations result in a cold start. I checked that the computation did not coincide in time with the warm-up invocations.
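For context, the n_cold counting described above can be implemented with a module-level flag in the worker function. This is a minimal sketch of one way to do it; the handler name and return shape are assumptions, not the author's actual code:

```python
import time

# Module-level code runs once per execution environment (the init phase),
# so state defined here distinguishes cold from warm invocations.
_INIT_START = time.monotonic()
_is_cold = True

def handler(event, context):
    global _is_cold
    cold, _is_cold = _is_cold, False
    # Report init time only for cold starts; warm invocations skip init entirely.
    init_ms = int((time.monotonic() - _INIT_START) * 1000) if cold else 0
    return {'cold': cold, 'init_ms': init_ms}
```

The coordinator (myfunc) can then sum the `cold` flags and `init_ms` values over all responses to produce n_cold and total_init.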

Why does AWS Lambda not choose a warm environment when an abundant number of such environments is available?

P.S. The same question on SO

asked 5 months ago · 155 views
1 Answer

Lambda keeps execution environments warm for a few minutes after an invocation, so that if another request comes in, it gets a warm start. It also recycles environments after some time, even if they are still within that grace period; this is done for various reasons. So it may well be that some of the instances you warmed reached their recycle time, and this is why you get cold invocations.

This is exactly why we created Provisioned Concurrency: it makes sure that you have warm environments when you need them. Using warmers like this to keep environments alive can be more cost-effective than Provisioned Concurrency, but it has limitations, as you can see for yourself. Another limitation is that when you try to warm environments, you don't really know how many active environments there are, so you may be creating new ones. The same can happen with actual requests: they may not find a warm environment because the warmer is currently occupying them.
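For reference, Provisioned Concurrency is configured on a published version or an alias (not $LATEST). A minimal boto3 sketch, where the 'live' alias is a placeholder and real AWS credentials would be required to run it:

```python
import boto3

lambda_client = boto3.client('lambda')

# Pre-initialize 50 execution environments for the 'live' alias of myfunc.
# Billing for provisioned environments starts as soon as they are allocated.
lambda_client.put_provisioned_concurrency_config(
    FunctionName='myfunc',
    Qualifier='live',
    ProvisionedConcurrentExecutions=50)

# The config reports Status 'IN_PROGRESS' until the environments are
# initialized, then 'READY'.
cfg = lambda_client.get_provisioned_concurrency_config(
    FunctionName='myfunc', Qualifier='live')
```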

Uri (AWS EXPERT)
answered 5 months ago
reviewed 25 days ago
  • As the question states, I have 109 warmed instances. It is highly improbable that 60 of them became stale. The question also states that the warmer was not executing at the time.

  • Uri is spot on here - one of the reasons we launched Provisioned Concurrency was because customers were trying to keep environments warm and not getting the results they were looking for. There's no way to predict what the Lambda service is doing "under the hood" - you may think you understand it but it is a massive, multi-tenant service which is dealing with millions of events per second from many different customers.

  • I see a lot of questions from you in this space. What are you trying to do? Rather than getting into the weeds of trying to get something working in a specific way, it would be great to understand the bigger problem here.

  • @Brettski-AWS The computation involves computing hundreds of dense vector embeddings, and this needs to be done in real time in response to each request from a user of my application. For a relatively low rate of user requests, splitting the job between multiple concurrently running instances of an AWS Lambda function seems to be the most economical approach. The low rate of requests makes Provisioned Concurrency financially infeasible. My strategy for dealing with cold starts is to keep many more warm environments than the number of instances that I need to run in parallel. It works most of the time, and the phenomenon I reported in this question is rather an exception. However, since I recall reading somewhere (I couldn't find it when writing the question) that AWS Lambda will use a warm environment if one exists, I did not expect such an exception to occur...
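The fan-out strategy described in this comment thread can be sketched as follows. This is an illustration, not the author's actual code; the chunking helper and the payload shape are assumptions:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def chunk(items, n_workers):
    """Split items into at most n_workers contiguous, near-equal slices."""
    k, r = divmod(len(items), n_workers)
    slices, start = [], 0
    for i in range(n_workers):
        end = start + k + (1 if i < r else 0)
        if end > start:
            slices.append(items[start:end])
        start = end
    return slices

def fan_out(lambda_client, function_name, items, n_workers=50):
    """Invoke one Lambda instance per chunk in parallel and collect payloads."""
    def invoke(part):
        resp = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType='RequestResponse',
            Payload=json.dumps({'texts': part}))
        return json.load(resp['Payload'])
    # One thread per chunk; each thread blocks on its synchronous invocation.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(invoke, chunk(items, n_workers)))
```

Each user request is split into up to 50 chunks, and each chunk is handed to one concurrent Lambda invocation; whether a given invocation lands on a warm environment is decided by the Lambda service, as the answer explains.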
