
How to speed up file loading in AWS Lambda


I have developed a Lambda function using a Docker image. The container includes some code and data files; the code reads these files and loads the data into memory. However, I've encountered performance issues during this process: it takes more than 30 seconds to load a 400 MB file. Is there any way to optimize this loading time? My Lambda function is configured with 10 GB of memory.

  • Where are you loading the files from? What is the size of the image?

  • The total size of the image is 800MB, and the file is stored at /opt in the image.

Asked 2 years ago · 1.9K views
3 Answers

In the question you haven't said how much memory is allocated to the Lambda function. Note that the CPU power allocated to a Lambda function (which can affect network transfer speed) is proportional to the memory assigned to it.

So you might try increasing the amount of memory and see if you get an increase in network transfer performance.
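As a rough sketch of the proportionality described above (the ~1,769 MB-per-vCPU figure comes from the Lambda documentation; the helper itself is illustrative, not an official API):

```python
# Rough illustration of Lambda's memory-to-CPU proportionality.
# Assumption: ~1,769 MB of memory corresponds to one full vCPU,
# capped at 6 vCPUs at the 10,240 MB maximum.

def approx_vcpus(memory_mb: int) -> float:
    """Approximate vCPUs allocated to a function of this memory size."""
    return min(memory_mb / 1769, 6.0)

# Doubling the memory roughly doubles the CPU (and network) headroom:
# approx_vcpus(1769) -> 1.0, approx_vcpus(3538) -> 2.0
```

So a low memory setting throttles not just RAM but also the CPU available for decompressing and parsing the file.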

AWS
EXPERT
answered 2 years ago
  • updated the question


Hi, you don't want to upload such a big file via Lambda. Uploading directly to S3 is much better for performance, reliability, etc.

This post explains how to do it for a web or mobile application: https://aws.amazon.com/blogs/compute/uploading-to-amazon-s3-directly-from-a-web-or-mobile-application/

If you upload from a laptop or a server, the CLI is the simplest option: aws s3 cp (or aws s3 sync). See https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/cp.html
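The same direct-to-S3 upload can be done programmatically; a minimal sketch, assuming boto3 is available and AWS credentials are configured (the bucket and key names are placeholders):

```python
# Sketch: direct upload to S3 from Python instead of pushing the file
# through Lambda. Bucket/key names below are placeholder assumptions.

def s3_uri(bucket: str, key: str) -> str:
    """Build the s3:// URI for logging or for a later `aws s3 cp`."""
    return f"s3://{bucket}/{key}"

def upload_to_s3(local_path: str, bucket: str, key: str) -> str:
    import boto3  # imported here so the URI helper stays dependency-free

    s3 = boto3.client("s3")
    # upload_file switches to multipart transfers automatically for large
    # objects, which is what makes it suitable for files in the 400 MB range.
    s3.upload_file(local_path, bucket, key)
    return s3_uri(bucket, key)

# upload_to_s3("data/large-file.bin", "my-bucket", "data/large-file.bin")
```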

Best,

Didier

EXPERT
answered 2 years ago

Given the size of your Lambda package, I assume you are using a container image rather than a ZIP file. For container images, to reduce cold-start time, we use a lazy-loading mechanism that loads image blocks into memory only when needed, rather than all upfront as with ZIP packages. For most use cases this improves performance; in your case, because you read almost the entire package into memory at init time, you are actually penalized by this loading mechanism.

I think there are a few options:

  1. Use provisioned concurrency. This initializes the execution environment upfront, but it comes at extra cost.
  2. Load the file from S3 or EFS instead (test both to compare performance). If you use S3, you can issue ranged GETs to fetch many smaller portions of the file in parallel, reducing the total load time.
  3. Also try loading the file in the init stage, i.e., outside the handler.
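Options 2 and 3 can be sketched together: parallel ranged GETs from S3, performed once at module scope so the data survives across warm invocations. This is a sketch under assumptions, not a definitive implementation; the bucket/key names, chunk size, and worker count are placeholders to tune:

```python
# Sketch: load a large S3 object via parallel ranged GETs during the
# Lambda init phase. Assumes boto3 is available in the runtime; bucket,
# key, chunk size, and worker count are illustrative choices.
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 32 * 1024 * 1024  # 32 MiB per ranged GET (tune for your file)

def byte_ranges(total_size: int, chunk_size: int = CHUNK_SIZE):
    """Split [0, total_size) into inclusive (start, end) HTTP Range pairs."""
    return [(start, min(start + chunk_size, total_size) - 1)
            for start in range(0, total_size, chunk_size)]

def load_object_parallel(bucket: str, key: str, workers: int = 8) -> bytes:
    import boto3  # imported here so byte_ranges stays dependency-free

    s3 = boto3.client("s3")
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]

    def fetch(rng):
        start, end = rng
        resp = s3.get_object(Bucket=bucket, Key=key,
                             Range=f"bytes={start}-{end}")
        return resp["Body"].read()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(fetch, byte_ranges(size)))
    return b"".join(parts)

# Init-stage load (module scope, outside the handler): runs once per
# cold start, and warm invocations reuse DATA from memory.
# DATA = load_object_parallel("my-bucket", "data/large-file.bin")

def handler(event, context):
    ...  # use DATA here
```

Because `pool.map` preserves input order, concatenating the parts reassembles the object correctly even though the GETs complete out of order.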

Also, you configured your function with 10 GB of memory, which gives you 6 cores. Unless you really need that many cores, or you actually need that much memory, I would recommend reducing the memory. Unless your application is multithreaded, going above ~1.8 GB will not give you any performance boost.

AWS
EXPERT
answered 2 years ago
