Scaling behavior of S3 prefix (503 errors)

Issue: I have a file stored in S3 with a prefix like /main_folder/#{RandomHash}/#{fileName}. I have an SNS trigger attached to a Lambda function that downloads this file. The scenario is that I need this same file in N Lambda functions, say 2000 Lambdas. I am getting 503 "Please reduce your request rate." exceptions (on the GET operation against the S3 object) when the file size is larger (more than 20MB).

I am able to download a 20MB file into 2000 Lambdas without any exceptions, but when I try with larger file sizes like 100MB, only about 1200 Lambdas succeed and the others get 503 exceptions.

Below are the different approaches I tried (a rough sketch of the default-client approach is shown after this list):

  • Used the default S3 clients for the download
  • Used the transfer manager for the download
  • Implemented the logic in Python, Node.js, and Java
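
For reference, here is roughly what the default-client approach looks like (a minimal sketch, assuming the SNS message carries the bucket and key; the names here are placeholders, not my actual values):

```python
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # SNS-triggered Lambda: the message is assumed to contain bucket/key.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    bucket = message["bucket"]
    key = message["key"]  # e.g. "main_folder/<RandomHash>/<fileName>"

    # Each of the ~2000 concurrent invocations performs this GET.
    obj = s3.get_object(Bucket=bucket, Key=key)
    body = obj["Body"].read()
    return {"downloaded_bytes": len(body)}
```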

As per the AWS docs, we can make 5,500 GET requests per second per prefix, but for some reason in my case I am getting 503 errors, and they depend on the file size even though I trigger only 2000 Lambdas. I would like to get some opinions on solutions, or help finding the root cause (RCA) of this issue.

Thanks in advance.

krish
asked 9 months ago · 281 views
1 Answer

Hi,

You probably reach the 5,500 GET requests per second limit in your use case: each of your 2,000 Lambdas reads fast since everything runs inside the AWS cloud. So each Lambda triggers multiple reads per second, and together they reach the limit.

To overcome this limit, I would suggest placing some form of cache between S3 and your Lambdas (like Amazon MemoryDB for Redis): the first Lambda that gets the trigger reads the file from S3 and writes it into the cache. Then all the other Lambdas read the file from the cache. Of course, you need some form of semaphore (via ad hoc Redis primitives) to make sure that only one Lambda reads from S3 and the others wait until it has finished before reading from the cache.
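
As a rough illustration of this pattern (a minimal sketch, not a production implementation: the Redis endpoint, bucket and key names are placeholders, and it assumes the Lambda runs in a VPC that can reach the cache), something like the following, using a SET NX key as the semaphore:

```python
import os
import time

import boto3
import redis

s3 = boto3.client("s3")
cache = redis.Redis(host=os.environ["CACHE_HOST"], port=6379)

BUCKET = os.environ["BUCKET"]       # placeholder bucket name
KEY = os.environ["OBJECT_KEY"]      # placeholder object key
LOCK_KEY = f"lock:{KEY}"
DATA_KEY = f"data:{KEY}"

def handler(event, context):
    # Fast path: the object is already in the cache.
    data = cache.get(DATA_KEY)
    if data is not None:
        return process(data)

    # Try to become the single Lambda that reads from S3; SET NX acts as the
    # semaphore and the TTL guards against a crashed lock holder.
    if cache.set(LOCK_KEY, "1", nx=True, ex=60):
        body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
        cache.set(DATA_KEY, body, ex=3600)  # keep the file cached for an hour
        cache.delete(LOCK_KEY)
        return process(body)

    # Another Lambda is fetching: poll the cache until the object shows up.
    while data is None:
        time.sleep(0.5)
        data = cache.get(DATA_KEY)
    return process(data)

def process(data: bytes):
    # Placeholder for whatever each Lambda actually does with the file.
    return {"bytes": len(data)}
```

One caveat with this sketch: a 100MB value is a large single cache entry, so the cache node sizing and TTLs would need to account for that.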

Best,

Didier

AWS
EXPERT
answered 9 months ago
  • Hi Didier, thanks for the reply.

    But I am a bit confused by the "each Lambda triggers multiple reads per second and you reach the limit" part: each Lambda fetches a single object, so 2000 Lambdas should only send out 2000 GET requests, even if they are all sent at the same time. Could you please explain how the limit might be reached?

    On the Redis part, I guess it would involve higher pricing. I am looking for solutions that can be done with S3 itself, as I would like to keep all components of my application serverless. I looked into EFS, but it also has higher pricing and it's not serverless.
