Intermittently slow CPU performance in AWS Lambda

Consider the following AWS Lambda function:

import time
import pickle
import gc

def ms_now():
    return time.time_ns() // 1_000_000  # nanoseconds to milliseconds

class Timer:
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

with open('model_embeddings.pkl', 'rb') as file:
    model_embeddings = pickle.load(file)

def get_embeddings(texts):
    timer = Timer()
    embeddings = model_embeddings.encode(texts)   # The line of interest
    print(f"Time: {timer.stop()}ms")
    return embeddings

def lambda_handler(event, _):
    gc.disable()                                  # Disable garbage collection
    result = get_embeddings(event['texts']).tolist()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'result': result[0][0], 
    }

The function is deployed as a container image and is allocated 10,240 MB of RAM, of which only 800 MB is used. The function outputs the timing of a single line that performs a CPU-intensive task: running input strings through a neural network (a HuggingFace sentence transformer) that has already been initialized and loaded into memory.

I run this function repeatedly with the same input from the AWS Lambda console. The output time is usually around 350ms, but some runs are as slow as 2100ms. Note that these spikes cannot be explained by cold starts (since only one line's performance is being measured) or by garbage collection kicking in (since garbage collection is disabled).
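One way to verify the cold-start claim empirically is to tag each invocation with a module-scope flag, since module-level code runs once per execution environment. This is a minimal self-contained sketch (the flag name and handler body are illustrative, not from the original function); slow timings in CloudWatch Logs could then be correlated against `cold_start=True` lines:

```python
# Module scope runs once per Lambda execution environment,
# so this flag is True only for the first invoke in that environment.
COLD_START = True

def lambda_handler(event, _):
    global COLD_START
    was_cold = COLD_START
    COLD_START = False  # every later invoke in this environment is warm
    print(f"cold_start={was_cold}")  # shows up next to the timing line in logs
    return {'cold_start': was_cold}
```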

EDIT: I ran it locally (under WSL on Windows) and never saw such a high spike. So it must be something AWS Lambda-specific...
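For comparing local and Lambda behaviour, it helps to look at the spread of timings rather than single runs. A small benchmarking sketch (the workload below is a stand-in; in the real test it would be `model_embeddings.encode(texts)`):

```python
import statistics
import time

def bench(fn, runs=50):
    """Time fn() repeatedly and return (min, median, max) in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        samples.append((time.perf_counter_ns() - start) / 1e6)
    return min(samples), statistics.median(samples), max(samples)

# Stand-in CPU-bound workload; replace with the model's encode call.
lo, mid, hi = bench(lambda: sum(i * i for i in range(100_000)))
print(f"min={lo:.1f}ms median={mid:.1f}ms max={hi:.1f}ms")
```

A large max-to-median ratio locally would point away from Lambda; a tight local spread alongside Lambda spikes supports the platform-specific hypothesis.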

What can cause such spikes?

P.S. The question at SO

asked 4 months ago · 158 views
1 Answer
A few things to consider:

  1. You say it is not a cold start issue. Are you sure? Check your CloudWatch Logs group: the first invoke in each log stream is a cold start, and the rest are warm starts. Can you correlate the slow runs with those? I saw on SO that you say the issue goes away if you first prime the function with a small input. That actually points to some sort of cold-start effect, where the first invoke requires extra processing.
  2. You allocated 10 GB to the function but are only using 800 MB. That is an enormous waste unless your code is multithreaded and can utilize more than one core; otherwise, reduce the memory to 1,769 MB (the point at which Lambda allocates one full vCPU).
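The priming idea from point 1 can be sketched as a throwaway encode at module scope, so any one-time lazy setup inside the model is paid during init rather than on the first measured invoke. The stub model below is an assumption standing in for the pickled sentence transformer (which cannot be loaded here); only the priming pattern itself is the point:

```python
class _StubModel:
    """Stand-in for the pickled sentence transformer. Assumption: the
    real model performs one-time lazy initialization on its first encode."""
    def __init__(self):
        self._initialized = False

    def encode(self, texts):
        if not self._initialized:
            self._initialized = True  # one-time lazy setup happens here
        return [[0.0] for _ in texts]

model_embeddings = _StubModel()

# Prime at module scope (init phase), so the first real invoke
# does not absorb the lazy-setup cost.
model_embeddings.encode(["warm-up"])

def lambda_handler(event, _):
    return model_embeddings.encode(event['texts'])
```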
Uri · AWS · EXPERT
answered 4 months ago
    1. Yes, I have checked that it is not a cold start. And even if it were, how could priming speed up a cold start?
    2. The function gets proportionately faster as memory increases. I would increase it further if I could.
