How to speed up file loading in AWS Lambda

0

I have developed a Lambda function using a Docker image. The container contains code and data files. The code reads these files and loads the data into memory, but this step is slow: it takes more than 30 seconds to load a 400 MB file. Is there any way to reduce this loading time? My Lambda function is configured with 10 GB of memory.

  • Where are you loading the files from? What is the size of the image?

  • The total size of the image is 800 MB, and the file is stored under /opt in the image.

asked a year ago, 1.2K views
3 Answers
0

In the question you haven't said how much memory is allocated to the Lambda function. Note that the CPU power (which can affect transfer speed on the network) allocated to each Lambda function is proportional to the memory given to it.

So you might try increasing the amount of memory and see if you get an increase in network transfer performance.

AWS
EXPERT
answered a year ago
  • updated the question

0

Hi, you don't want to upload such a big file via Lambda. Uploading directly to S3 is much better for performance, reliability, etc.

This post explains how to do that for a web or mobile application: https://aws.amazon.com/blogs/compute/uploading-to-amazon-s3-directly-from-a-web-or-mobile-application/

If you upload from a laptop or a server, the AWS CLI is the simplest option: use aws s3 cp (or sync). See https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/cp.html

Best,

Didier

AWS
EXPERT
answered a year ago
0

Given the size of your Lambda package, I assume you are using a container image and not a ZIP file. For container images, to reduce cold start time, we use a lazy loading mechanism that loads image blocks into memory only when they are needed, rather than all upfront as with ZIP files. For most use cases this improves performance. In your case, because you try to read almost the entire package into memory at init time, this loading mechanism is actually working against you.
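To check how much of the 30 seconds is raw file I/O rather than parsing, it may help to time a plain sequential read of the file. The following is a minimal sketch, not a definitive benchmark; the helper name `timed_read` and the 8 MB chunk size are assumptions made for illustration.

```python
import time


def timed_read(path, chunk_size=8 * 1024 * 1024):
    """Read a file sequentially and return (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        # Read in fixed-size chunks so the file is touched end to end
        # without holding all of it in memory.
        while chunk := f.read(chunk_size):
            total += len(chunk)
    return total, time.perf_counter() - start
```

Calling it twice on the same path in one execution environment and comparing the two timings is one way to see this: if the first pass is much slower than the second, that would be consistent with the first read paying for blocks that had not been loaded yet.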

I think there are a few options:

  1. Use provisioned concurrency. This will initialize the execution environment upfront, but it comes with extra cost.
  2. Load the file from S3 or EFS instead (you will need to test this to check the performance). If you use S3, you can issue ranged GET requests to fetch many smaller portions of the file in parallel, reducing the total load time.
  3. Also try loading the file in the init stage, i.e., outside the handler.
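Option 2 could look roughly like the sketch below, which splits the object into byte ranges and fetches them concurrently. It assumes a boto3 S3 client is passed in; the function names, the 16 MB part size, and the worker count are illustrative choices rather than tuned values.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(size, part_size):
    """Split [0, size) into inclusive (start, end) pairs for HTTP Range headers."""
    return [(start, min(start + part_size, size) - 1)
            for start in range(0, size, part_size)]


def fetch_object_parallel(s3_client, bucket, key,
                          part_size=16 * 1024 * 1024, max_workers=8):
    """Download one S3 object with concurrent ranged GETs and return its bytes."""
    size = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]

    def fetch(byte_range):
        start, end = byte_range
        response = s3_client.get_object(
            Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
        return response["Body"].read()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the parts join back correctly.
        return b"".join(pool.map(fetch, byte_ranges(size, part_size)))


# Hypothetical usage at module scope (init stage, outside the handler);
# the bucket and key names are placeholders:
#   import boto3
#   DATA = fetch_object_parallel(boto3.client("s3"), "my-bucket", "data/file.bin")
#   def handler(event, context):
#       return {"size": len(DATA)}
```

Joining the parts briefly holds two copies of the data in memory, which should be acceptable for a 400 MB file on a function with several GB of memory. Whether this beats reading from the image still needs to be measured.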

Also, you configured your function with 10 GB of memory, which gives you 6 cores. Unless you really need that many cores or that much memory, I would recommend reducing the memory. Unless your application is multithreaded, going above ~1.8 GB will not give you any performance boost.

AWS
EXPERT
answered a year ago
