Some invocations of a Lambda function take much longer despite a warm start


Here is my AWS Lambda function:

import json
import time
import pickle

def ms_now():
    # Current time in milliseconds
    return time.time_ns() // 1_000_000

class Timer:
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

timer = Timer()

from punctuators.models import PunctCapSegModelONNX
model_name = "pcs_en"
model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)

with open('model_embeddings.pkl', 'rb') as file:
    model_embeddings = pickle.load(file)

cold_start = True
init_time = timer.stop()
print("Time to initialize:", init_time, flush=True)

def segment_text(texts):
    sentences = model_sentences.infer(texts)
    sentences = [
        [(s, len(model_embeddings.tokenizer.encode(s))) for s in el]
         for el in sentences]
    return sentences

def get_embeddings(texts):
    return model_embeddings.encode(texts)

def compute(body):
    command = body['command']
    
    if command == 'ping':
        return 'success'

    texts = body['texts']

    if command == 'embeddings':
        result = get_embeddings(texts)
        return [el.tolist() for el in result]
    
    if command == 'sentences':
        return segment_text(texts)
    
    raise ValueError(f"Unknown command: {command}")

def lambda_handler(event, context):
    global cold_start
    global init_time
    
    stats = {'cold_start': cold_start, 'init_time': init_time}
    cold_start = False
    init_time = 0

    stats['started'] = ms_now()
    result = compute(event['body'])
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'body': {'result': result, 'stats': stats}
    }

This Lambda function, along with the packages and the models (so that those don't need to be downloaded), is deployed as a Docker image.

In addition to the timestamps of when the function started and finished (not including the cold-start initialization), the response indicates whether the invocation was a cold start and how long initialization took. I have another function that invokes this one 15 times in parallel.
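The parallel invoker can be sketched roughly like this; note that `invoke` below is a placeholder for the real call (e.g. boto3's `lambda_client.invoke()`), not the actual invoker code:

```python
from concurrent.futures import ThreadPoolExecutor

def invoke(payload):
    # Placeholder for boto3's lambda_client.invoke(); a real invoker would
    # send `payload` to the Lambda function and parse the JSON response body.
    return {'result': 'success', 'stats': {'cold_start': False, 'init_time': 0}}

def invoke_parallel(n=15):
    # Fire n invocations concurrently and collect the responses.
    payload = {'body': {'command': 'ping'}}
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(invoke, payload) for _ in range(n)]
        return [f.result() for f in futures]

responses = invoke_parallel()
cold = [r for r in responses if r['stats']['init_time'] > 0]  # cold starts, if any
```

Comparing the `stats` of the 15 responses is how the per-invocation timing anomaly described below shows up.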

The anomaly occurs on the first of these parallel invocations. Usually it takes ~300 ms (computed as the difference between the timestamps in the response), but sometimes it takes 900 ms or longer (with the same input).

This is not caused by a cold start, since init_time == 0 in the response (when a cold start does occur, init_time > 6000). It happens both with command == 'embeddings' and with command == 'sentences'.

What could explain these spikes? With a warm start, what can cause a Lambda function to take much longer than usual?

P.S. The question at SO

Asked 6 months ago · 195 views
1 answer

Accepted Answer

It's probably Python garbage collection. On a warm start the container is reused, so over a series of invocations the garbage collector is likely to kick in at some point and make that invocation take longer. I was at a presentation last night about using Rust in Lambda where graphs showed exactly this effect, comparing Lambda execution times of Rust against a garbage-collected language. In that case the other language was TypeScript, and there was a spike in execution time every two minutes caused by the garbage collector.
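One way to confirm this hypothesis is CPython's `gc.callbacks` hook, which fires at the start and stop of every collection; logging from it inside the handler would show whether a slow invocation coincided with a collection. A minimal sketch:

```python
import gc

gc_events = []

def on_gc(phase, info):
    # CPython invokes this at the start and stop of every collection;
    # `info` carries the generation being collected.
    gc_events.append((phase, info['generation']))

gc.callbacks.append(on_gc)
gc.collect()          # force a full (generation 2) collection
gc.callbacks.remove(on_gc)

print(gc_events)      # e.g. [('start', 2), ('stop', 2)]
```

In a Lambda, the callback could append timestamps to a module-level list that gets attached to the response `stats`, making GC pauses visible alongside the existing timings.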

EXPERT
Answered 6 months ago · Reviewed 6 months ago by AWS
  • Disabling automatic garbage collection with gc.disable() helped! But can you explain why this almost always happened on the first invocation?
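A plausible explanation for the first-invocation bias: the heavy initialization (loading the ONNX model and the pickled embeddings) leaves a large population of long-lived objects, so the first collection triggered after init has far more objects to traverse than later, incremental ones. The pattern that helped can be sketched as follows; `handler_sketch` and its body are placeholders, and where you call `gc.collect()` (if at all) is a design choice:

```python
import gc

# After the expensive module-level initialization, stop automatic
# collections from pausing the handler partway through a request.
gc.disable()

def handler_sketch(event):
    result = {'echo': event}   # stand-in for the real compute() work
    gc.collect()               # reclaim garbage deliberately, outside the
                               # latency-sensitive part of the invocation
    return result
```

The trade-off is that memory grows between explicit collections, so with `gc.disable()` it is safer to collect at a chosen point (e.g. after the timed section) rather than never.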
