Some invocations of a Lambda function take much longer despite a warm start


Here is my AWS Lambda function:

import json
import time
import pickle

def ms_now():
    # Current wall-clock time in milliseconds.
    return time.time_ns() // 1_000_000

class Timer:
    """Simple wall-clock timer measuring elapsed milliseconds."""

    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

timer = Timer()

# Heavy initialization at import time: this runs once per cold start,
# while the execution environment is being created.
from punctuators.models import PunctCapSegModelONNX
model_name = "pcs_en"
model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)

with open('model_embeddings.pkl', 'rb') as file:
    model_embeddings = pickle.load(file)

cold_start = True
init_time = timer.stop()
print("Time to initialize:", init_time, flush=True)

def segment_text(texts):
    # Split each text into sentences and pair every sentence with its
    # token count under the embedding model's tokenizer.
    sentences = model_sentences.infer(texts)
    sentences = [
        [(s, len(model_embeddings.tokenizer.encode(s))) for s in el]
        for el in sentences]
    return sentences

def get_embeddings(texts):
    # One embedding vector per input text.
    return model_embeddings.encode(texts)

def compute(body):
    command = body['command']

    if command == 'ping':
        return 'success'

    texts = body['texts']

    if command == 'embeddings':
        result = get_embeddings(texts)
        return [el.tolist() for el in result]

    if command == 'sentences':
        return segment_text(texts)

    raise ValueError(f"Unknown command: {command}")

def lambda_handler(event, context):
    # cold_start/init_time are reported once per container, then reset so
    # warm invocations show cold_start == False and init_time == 0.
    global cold_start
    global init_time

    stats = {'cold_start': cold_start, 'init_time': init_time}
    cold_start = False
    init_time = 0

    stats['started'] = ms_now()
    result = compute(event['body'])
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'body': {'result': result, 'stats': stats}
    }

This Lambda function, along with the packages and the models (so that those don't need to be downloaded), is deployed as a Docker image.

In addition to the timestamps of when the function started and finished (not including the cold-start initialization), the response includes whether the invocation was a cold start and how long initialization took. I have another function, which invokes this function 15 times in parallel.
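The parallel driver described above can be sketched roughly like this (names are illustrative; the real invoker would call the Lambda API, e.g. via boto3's `invoke()`, instead of the stand-in below):

```python
from concurrent.futures import ThreadPoolExecutor

def invoke_once(payload):
    # Stand-in for a real Lambda invocation; echoes the payload back.
    return {"result": payload, "stats": {}}

def invoke_parallel(payloads):
    # One thread per payload so all requests are in flight at once.
    with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
        return list(pool.map(invoke_once, payloads))

responses = invoke_parallel([{"command": "ping"}] * 15)
print(len(responses))  # 15
```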

The anomaly happens with the first of these parallel invocations. Usually it takes ~300 ms (computed as the difference of the timestamps in the response), but sometimes it takes 900 ms or longer, with the same input.

This is not caused by a cold start, since init_time == 0 in the response (when a cold start occurs, init_time > 6000). It happens with both command == 'embeddings' and command == 'sentences'.

What could be the explanation for these spikes? With a warm start, what can cause a Lambda function to take much longer than usual?

P.S. The question at SO

asked 5 months ago · 184 views
1 Answer

Accepted Answer

It's probably Python garbage collection. On a warm start the container is reused, so over a series of invocations the garbage collector is likely to kick in at some point and make that invocation take longer. I was at a presentation last night about using Rust in Lambda where graphs were shown with this exact effect: comparing Lambda execution times of Rust against a garbage-collected language (TypeScript in that case), there was a spike in execution time every two minutes caused by the collector.
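One way to check this hypothesis in the function itself is to time every collection pass with `gc.callbacks` and compare recorded pauses against the latency spikes. A minimal sketch (the names here are illustrative, not part of any AWS API):

```python
import gc
import time

gc_pauses_ms = []   # durations of completed GC passes
_gc_started = [0.0]

def _on_gc(phase, info):
    # The gc module calls this with phase "start" before a pass
    # and "stop" after it finishes.
    if phase == "start":
        _gc_started[0] = time.perf_counter()
    else:  # phase == "stop"
        gc_pauses_ms.append((time.perf_counter() - _gc_started[0]) * 1000.0)

gc.callbacks.append(_on_gc)

gc.collect()  # force one pass so a pause is recorded
print(f"recorded {len(gc_pauses_ms)} GC pause(s)")
```

Logging `gc_pauses_ms` alongside the `stats` timestamps would show whether a slow invocation coincides with a collection.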

answered 5 months ago
reviewed 5 months ago
  • Disabling automatic garbage collection with gc.disable() helped! But can you explain why this almost always happened on the first invocation?
