I have a script to keep two AWS Lambda functions warm:
import json
import time
from datetime import datetime

import boto3

lambda_client = boto3.client('lambda')

PING_DELAY = 300  # seconds between pings
last_ping = -PING_DELAY  # unused in this excerpt

while True:
    try:
        timer = Timer()  # elapsed-time helper, defined elsewhere
        timestamp_str = datetime.now().strftime("%d.%m.%Y %H:%M:%S.%f")[:-3]
        print(f"Ping at {timestamp_str} ", end='', flush=True)
        response = lambda_client.invoke(
            FunctionName='myfunc',
            InvocationType='RequestResponse',
            Payload=json.dumps({'command': 'ping'}))
        payload = json.load(response['Payload'])
        if 'errorMessage' in payload:
            raise Exception(payload['errorMessage'])
        my_time = timer.stop()
        stats = payload['stats']
        print(f"took {my_time}ms. n_cold: {stats['n_cold']} "
              f"total_init: {stats['total_init']}ms", flush=True)
    except Exception as e:
        print(f"AWS Lambda submit failed: {e}", flush=True)
    time.sleep(PING_DELAY)
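(The Timer class is not shown in the script above. A minimal sketch of what such a helper could look like, assuming it simply measures elapsed wall-clock milliseconds:)

```python
import time

class Timer:
    """Minimal elapsed-time helper (assumed; the original Timer is not shown)."""

    def __init__(self):
        # Start the clock at construction time
        self._start = time.perf_counter()

    def stop(self):
        # Return elapsed milliseconds, rounded to the nearest integer
        return round((time.perf_counter() - self._start) * 1000)
```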
Every five minutes, it invokes the myfunc AWS Lambda function, which in turn invokes 109 instances of another Lambda function in parallel. Based on their responses, myfunc reports how many of those instances experienced a cold start (n_cold), among other measurements. Here is part of the output of this script:
Ping at 28.11.2023 08:58:06.605 took 586ms. n_cold: 0 total_init: 0ms
Ping at 28.11.2023 09:03:07.291 took 7133ms. n_cold: 4 total_init: 24215ms
Ping at 28.11.2023 09:08:14.524 took 54809ms. n_cold: 0 total_init: 0ms
Ping at 28.11.2023 09:14:09.433 took 10211ms. n_cold: 13 total_init: 77080ms
Ping at 28.11.2023 09:19:19.744 took 694ms. n_cold: 19 total_init: 118006ms
Ping at 28.11.2023 09:24:20.538 took 4617ms. n_cold: 0 total_init: 0ms
Ping at 28.11.2023 09:29:25.185 took 2042ms. n_cold: 0 total_init: 0ms
Ping at 28.11.2023 09:34:27.327 took 566ms. n_cold: 4 total_init: 24450ms
Ping at 28.11.2023 09:39:27.993 took 602ms. n_cold: 8 total_init: 47270ms
Ping at 28.11.2023 09:44:28.695 took 658ms. n_cold: 11 total_init: 66350ms
As the output shows, at least 90 of the 109 environments remain warm at all times. For the actual computation I am interested in, myfunc invokes 50 instances of that same Lambda function in parallel. Given that at least 90 warm environments are available, I would expect all 50 instances to run in warm environments. However, the logs show that some of those 50 invocations result in a cold start. I verified that the computation did not coincide in time with the invocations that keep the environments warm.
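(For readers wondering how n_cold and total_init can be measured at all: cold starts are typically detected with a module-level flag that is initialized only when a new execution environment is created. A hedged sketch of that pattern, not the actual code from the question; the names handler, fan_out, and invoke_one are illustrative:)

```python
import time
from concurrent.futures import ThreadPoolExecutor

# --- Worker side (illustrative): module-level state is evaluated only when
# a new execution environment is created, so _is_cold is True exactly once
# per environment and marks its cold start.
_env_created = time.time()
_is_cold = True

def handler(event, context):
    global _is_cold
    was_cold, _is_cold = _is_cold, False
    # Approximate init time as the age of the environment on first use
    init_ms = round((time.time() - _env_created) * 1000) if was_cold else 0
    return {'cold': was_cold, 'init_ms': init_ms}

# --- Orchestrator side (illustrative): invoke n workers in parallel and
# aggregate their per-instance reports into n_cold / total_init.
def fan_out(invoke_one, n):
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(invoke_one, range(n)))
    return {'n_cold': sum(r['cold'] for r in results),
            'total_init': sum(r['init_ms'] for r in results)}
```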
Why does AWS Lambda not choose a warm environment when an abundant number of such environments is available?
P.S. The same question on SO
As the question states, I have 109 warmed instances. It is highly improbable that 60 of them became stale. The question also states that the warmer was not executing at the time.
Uri is spot on here - one of the reasons we launched Provisioned Concurrency was because customers were trying to keep environments warm and not getting the results they were looking for. There's no way to predict what the Lambda service is doing "under the hood" - you may think you understand it but it is a massive, multi-tenant service which is dealing with millions of events per second from many different customers.
That said, I see a lot of questions from you in this space. What are you trying to do? Rather than getting into the weeds of trying to get something working in a specific way, it'd be great to understand the bigger problem here.
@Brettski-AWS The computation involves computing hundreds of dense vector embeddings, and this needs to happen in real time in response to each request from a user of my application. For a relatively low rate of user requests, splitting the job between multiple concurrently running instances of an AWS Lambda function seems to be the most economical approach. That same low request rate makes Provisioned Concurrency financially infeasible. My strategy for dealing with cold starts is to keep many more warm environments than the number of instances I need to run in parallel. It works most of the time, and the phenomenon I reported in this question is rather an exception. However, since I recall reading somewhere (I couldn't find it while writing the question) that AWS Lambda will use a warm environment if one exists, I did not expect such an exception to occur...