Some invocations of a Lambda function take much longer despite a warm start


Here is my AWS Lambda function:

import json
import time
import pickle

def ms_now():
    # Integer division avoids float precision loss on large nanosecond values
    return time.time_ns() // 1_000_000

class Timer:
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

timer = Timer()

from punctuators.models import PunctCapSegModelONNX
model_name = "pcs_en"
model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)

with open('model_embeddings.pkl', 'rb') as file:
    model_embeddings = pickle.load(file)

cold_start = True
init_time = timer.stop()
print("Time to initialize:", init_time, flush=True)

def segment_text(texts):
    sentences = model_sentences.infer(texts)
    sentences = [
        [(s, len(model_embeddings.tokenizer.encode(s))) for s in el]
         for el in sentences]
    return sentences

def get_embeddings(texts):
    return model_embeddings.encode(texts)

def compute(body):
    command = body['command']
    
    if command == 'ping':
        return 'success'

    texts = body['texts']

    if command == 'embeddings':
        result = get_embeddings(texts)
        return [el.tolist() for el in result]
    
    if command == 'sentences':
        return segment_text(texts)
    
    raise ValueError(f"Unknown command: {command!r}")

def lambda_handler(event, context):
    global cold_start
    global init_time
    
    stats = {'cold_start': cold_start, 'init_time': init_time}
    cold_start = False
    init_time = 0

    stats['started'] = ms_now()
    result = compute(event['body'])
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'body': {'result': result, 'stats': stats}
    }

This Lambda function is deployed as a Docker image together with the packages and the models (so that they don't need to be downloaded at runtime).
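For context, such a container deployment typically looks something like the Dockerfile below. This is a sketch only: the base image tag, `requirements.txt`, and file names are assumptions, not taken from the question.

```dockerfile
# Sketch of a Lambda container image; tag and file names are assumed.
FROM public.ecr.aws/lambda/python:3.12

# Bake the dependencies and model artifacts into the image so nothing
# has to be downloaded at invocation time.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model_embeddings.pkl app.py ${LAMBDA_TASK_ROOT}/

CMD ["app.lambda_handler"]
```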

In addition to the timestamps of when the handler started and finished (excluding cold-start initialization), the response reports whether the invocation was a cold start and how long initialization took. I have another function that invokes this one 15 times in parallel.

The anomaly happens with the first of these parallel invocations. It usually takes ~300 ms (computed as the difference between the timestamps in the response), but sometimes it takes 900 ms or longer (with the same input).

This is not due to a cold start, since init_time == 0 in those responses (when a cold start occurs, init_time > 6000). It happens with both command == 'embeddings' and command == 'sentences'.

What could be the explanation for these spikes? With a warm start, what can cause a Lambda function to take much longer than usual?

P.S. The question at SO

Asked 6 months ago · 195 views

1 Answer

Accepted Answer

It's probably Python garbage collection. On a warm start the container is reused, so over a series of invocations the garbage collector is bound to kick in at some point and make that invocation take longer. I was at a presentation last night about using Rust in Lambda where graphs showed exactly this effect, comparing Lambda execution times of Rust against a garbage-collected language. In that case the other language was TypeScript, and there was a spike in execution time every two minutes due to the garbage collector.
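One way to confirm this theory from inside the function is to register a GC callback and record the duration of each collection pass, then log the recorded pauses alongside the invocation stats. A minimal sketch (the cycle-creating code at the bottom only exists to demonstrate the callback firing):

```python
import gc
import time

# Record the duration of every cyclic-GC pass so spikes can be
# correlated with slow invocations.
gc_pauses = []
_gc_start = [0.0]

def _on_gc(phase, info):
    if phase == "start":
        _gc_start[0] = time.perf_counter()
    else:  # phase == "stop"
        elapsed_ms = (time.perf_counter() - _gc_start[0]) * 1000
        gc_pauses.append((info["generation"], elapsed_ms))

gc.callbacks.append(_on_gc)

# Create some collectable reference cycles and force a collection
# purely to demonstrate that the callback records a pause.
junk = [{"self": None} for _ in range(10_000)]
for d in junk:
    d["self"] = d
del junk
gc.collect()

gc.callbacks.remove(_on_gc)
print(f"observed {len(gc_pauses)} GC passes, "
      f"longest {max(ms for _, ms in gc_pauses):.2f} ms")
```

In the Lambda above, the recorded pauses could be appended to the `stats` dict to see whether the slow invocations coincide with full (generation-2) collections.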

EXPERT · Answered 6 months ago
Reviewed by AWS 6 months ago
  • Disabling automatic garbage collection with gc.disable() helped! But can you explain why this almost always happened on the first invocation?
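For anyone landing here later, the pattern the comment refers to can be sketched as follows. `compute()` is a stand-in for the real work, not the function from the question. The idea: collect once at startup, freeze the surviving long-lived objects (the loaded models) so the cyclic GC never rescans them, disable automatic collection, and optionally run a short manual collection at a predictable point in each invocation instead. `gc.freeze()` requires Python 3.7+.

```python
import gc

# Module initialization (cold start): clear startup garbage, then move
# the survivors to a permanent generation the GC will not rescan, and
# stop automatic collections from firing mid-invocation.
gc.collect()
gc.freeze()
gc.disable()

def compute(body):
    # Stand-in for the real compute() from the question.
    return {"echo": body}

def lambda_handler(event, context):
    result = compute(event["body"])
    # Optionally reclaim short-lived cycles at a predictable point,
    # rather than letting an automatic collection interrupt a request.
    gc.collect(generation=0)
    return result
```

The trade-off is that with automatic collection disabled, uncollected cycles accumulate unless they are reclaimed manually, so the per-invocation `gc.collect(generation=0)` (or an occasional full collection) keeps memory bounded in a long-lived warm container.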
