Some invocations of a Lambda function take much longer despite a warm start


Here is my AWS Lambda function:

import json
import time
import pickle

def ms_now():
    return time.time_ns() // 1_000_000

class Timer:
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

timer = Timer()

from punctuators.models import PunctCapSegModelONNX
model_name = "pcs_en"
model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)

with open('model_embeddings.pkl', 'rb') as file:
    model_embeddings = pickle.load(file)

cold_start = True
init_time = timer.stop()
print("Time to initialize:", init_time, flush=True)

def segment_text(texts):
    sentences = model_sentences.infer(texts)
    sentences = [
        [(s, len(model_embeddings.tokenizer.encode(s))) for s in el]
         for el in sentences]
    return sentences

def get_embeddings(texts):
    return model_embeddings.encode(texts)

def compute(body):
    command = body['command']
    
    if command == 'ping':
        return 'success'

    texts = body['texts']

    if command == 'embeddings':
        result = get_embeddings(texts)
        return [el.tolist() for el in result]
    
    if command == 'sentences':
        return segment_text(texts)
    
    raise ValueError(f"unknown command: {command}")

def lambda_handler(event, context):
    global cold_start
    global init_time
    
    stats = {'cold_start': cold_start, 'init_time': init_time}
    cold_start = False
    init_time = 0

    stats['started'] = ms_now()
    result = compute(event['body'])
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'body': {'result': result, 'stats': stats}
    }

This Lambda function, along with the packages and the models (so they don't need to be downloaded at runtime), is deployed as a Docker image.

In addition to the timestamps of when the function started and finished (not including the cold start initialization), the response contains information about whether it was a cold start and how long initialization took. I have another function, which invokes this function 15 times in parallel.
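For reference, the fan-out looks roughly like this. It is a self-contained sketch: `invoke_once` is a placeholder for the real `boto3` `lambda_client.invoke(...)` call, stubbed out here so the snippet runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor

def invoke_once(payload):
    # Placeholder for the real call, something like:
    #   lambda_client.invoke(FunctionName=..., Payload=json.dumps(payload))
    # Stubbed so this sketch is self-contained.
    return {"body": {"result": "success", "stats": {"cold_start": False}}}

def invoke_parallel(n=15):
    payload = {"body": {"command": "ping"}}
    # Fire all n invocations at once and gather the responses.
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(invoke_once, payload) for _ in range(n)]
        return [f.result() for f in futures]

responses = invoke_parallel()
```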

The anomaly happens with the first of these parallel invocations. Usually it takes ~300ms (computed as the difference of the timestamps in the response). But sometimes it takes 900ms or longer (with the same input).

This does not happen due to a cold start, since I have init_time==0 in the response (when a cold start occurs, init_time>6000). It happens both with command == 'embeddings' and with command == 'sentences'.

What could be the explanation for these spikes? With a warm start, what can cause a Lambda function to take much longer than usual?

P.S. The question at SO

Asked 6 months ago · 194 views
1 Answer

Accepted Answer

It's probably Python garbage collection. On a warm start the container is reused, so over a series of invocations the garbage collector is likely to kick in at some point and make that invocation take longer. I was at a presentation last night about using Rust in Lambda where graphs showed exactly this effect, comparing Lambda execution times of Rust against a garbage-collected language. In that case the other language was TypeScript, and there was a spike in execution time every 2 minutes caused by the garbage collector.
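You can check this hypothesis directly by timing collections with `gc.callbacks`, which fires a callback at the start and stop of every collection. A minimal sketch (the `gc_pauses` list and callback name are just illustrative):

```python
import gc
import time

gc_pauses = []       # (generation, pause duration in seconds)
_start = [0.0]

def _gc_callback(phase, info):
    # phase is "start" or "stop"; info["generation"] says which
    # generation was collected. Record the wall-clock pause.
    if phase == "start":
        _start[0] = time.perf_counter()
    else:
        gc_pauses.append((info["generation"], time.perf_counter() - _start[0]))

gc.callbacks.append(_gc_callback)

# Force a full (generation 2) collection to demonstrate the callback.
gc.collect()
```

If the slow invocations line up with entries in `gc_pauses`, GC is the culprit.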

EXPERT
Answered 6 months ago
Verified by AWS 6 months ago
  • Disabling automatic garbage collection with gc.disable() helped! But can you come up with an explanation for why this almost always happened on the first invocation?
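For reference, the pattern I used is roughly the following. It is a sketch, not the exact handler: `handler_wrapper` and `do_work` are placeholders, and the idea is to disable automatic collection at init time and collect explicitly at a moment of our choosing, outside the timed section.

```python
import gc

gc.disable()   # no invocation pays an unpredictable GC pause
gc.freeze()    # move objects created during init into the permanent
               # generation so they are never scanned again

def handler_wrapper(do_work):
    # do_work stands in for the real per-request computation.
    result = do_work()
    # Collect explicitly, after the work whose latency we measure.
    gc.collect()
    return result
```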
