Lambda parallel invocation at high volume


Hi team,

I have a Lambda function that runs a virus scan on every file arriving in an S3 bucket.

The bucket can receive a huge number of uploads, which translates to roughly 200,000 requests per second for the virus scan Lambda.

Reading this article, I saw that we only get 1,000 ("Lambda provides your account with a total concurrency limit of 1,000 across all functions in a region."):

https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html

Is there a way for Lambda to scale to that many parallel invocations?

Do we really have to go through a quota increase request?

Thank you team!

  • Yes, if you want to stay with a Lambda-based architecture, you should request a quota increase.

2 Answers

Account-level Lambda concurrency is the maximum number of concurrent executions that can run across all functions in the account (per region). However, if any function has reserved concurrency set, the other functions may be throttled sooner, because the concurrency left for them = account-level concurrency - (total reserved concurrency).

Increasing the account-level concurrency would help, but with this many parallel requests it would be very hard for Lambda to keep up at any realistic account-level limit. You may consider reserved concurrency, which guarantees that the function can always reach that many concurrent executions. Keep in mind that whatever you reserve for a function is subtracted from the shared pool: remaining concurrency = total account-level concurrency - reserved concurrency for this function.

I'd like to explain it this way. Let's say:

Total number of lambda functions in the account: 500

Account level concurrency: 1400

Seven of these Lambda functions have reserved concurrency set, each at 200 concurrent executions (7 x 200 = 1,400). That leaves zero account-level concurrency for everything else, so all the other Lambda functions would be throttled even when these seven are not running. (In practice Lambda insists on keeping 100 units unreserved, so it would not accept the last of these reservations, but the arithmetic shows the effect.)

Reserved concurrency is often used for mission-critical applications whose Lambda functions cannot afford throttling, but it simply takes the reservation out of the account-level concurrency.
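The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and `MIN_UNRESERVED = 100` reflects the documented rule that Lambda keeps at least 100 units of concurrency unreserved.

```python
# Lambda keeps at least this much concurrency available to functions
# that have no reserved concurrency of their own.
MIN_UNRESERVED = 100


def unreserved_concurrency(account_limit: int, reservations: list[int]) -> int:
    """Concurrency left over for functions without a reservation."""
    return account_limit - sum(reservations)


def can_reserve(account_limit: int, reservations: list[int]) -> bool:
    """True if this set of reservations respects the unreserved floor."""
    return unreserved_concurrency(account_limit, reservations) >= MIN_UNRESERVED


account_limit = 1400          # account-level concurrency from the example
reservations = [200] * 7      # seven functions reserving 200 each

print(unreserved_concurrency(account_limit, reservations))  # 0
print(can_reserve(account_limit, reservations))             # False
```

With six reservations of 200 instead of seven, 200 units would remain for the other functions and the check would pass.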

Refer to this AWS documentation, which covers this topic.

Another thing to consider is the number of requests per second. See the "Invocation request quota" entry under "Lambda API requests" on the Lambda quotas page. Given what you have described, this limit would also come into play.
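A rough way to see how both limits interact is the usual estimate that required concurrency is about requests per second multiplied by average duration. The sketch below uses that estimate. The 0.5 second scan duration is purely an assumption for illustration, and the factor of 10 requests per second per unit of concurrency is what the Lambda quotas page lists, as far as I know, so check the current value before relying on it.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Estimate concurrent executions needed to sustain the given load."""
    return math.ceil(requests_per_second * avg_duration_s)


def max_invocations_per_second(concurrency_quota: int, rps_per_unit: int = 10) -> int:
    """Invocation request ceiling implied by a concurrency quota.

    rps_per_unit=10 is an assumption taken from the Lambda quotas page.
    """
    return concurrency_quota * rps_per_unit


rps = 200_000            # load described in the question
avg_scan_seconds = 0.5   # assumed average scan time, not from the question

print(required_concurrency(rps, avg_scan_seconds))  # 100000
print(max_invocations_per_second(1_000))            # 10000
```

Under these assumptions, the default quota of 1,000 falls short on both counts: it caps invocations well below 200,000 per second, and the concurrency needed is two orders of magnitude higher.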

Hope you find this helpful.

Comment here if you have additional questions, happy to assist.

Abhishek

AWS EXPERT, answered 8 months ago
  • Thank you for your answer. So, since it would affect the rest of the Lambdas in the account, do you advise using EC2 with auto scaling for the virus scan instead of a Lambda function?

  • Even if you have sufficient account-level Lambda concurrency, if this Lambda consumes all or most of it then yes, other Lambda functions may not get a chance to run: there would be no concurrency left at the account level, and those functions would start getting throttled. You may need to consider a fan-out mechanism (queueing the events). That way you avoid sudden spikes in this Lambda's executions and leave the other Lambda functions' executions intact. Or you can go the other route, i.e. EC2, but make sure you analyze the cost of both approaches. Hope this explanation helps. Comment here if you have additional questions, happy to assist.

  • As per the article you shared in your answer, max reserved concurrency = 1000. That would not help with the number of requests per second that my Lambda needs to handle.

  • That's right, you would need to request an increase of the account-level Lambda concurrency. At the same time, you should reconsider whether the increased limit would be enough for the load you are expecting, and whether this approach would still scale if the number of uploaded files per second grows in the future. As I mentioned, this is not the only limit that matters: the "Invocation request quota" under "Lambda API requests" would also play a role here.

    So I'd suggest getting the account-level concurrency increased and seeing how that works. Most likely, in the given scenario, it would not be able to handle a burst of this size, so you'd need to consider an alternative such as fan-out (a queueing mechanism) or EC2.

    Hope this helps, comment here if you have additional questions, happy to help.

  • Thank you for your answer! Could you please elaborate on how you see the fan-out mechanism in this scenario?

    Do you mean having many Lambda functions in my account that all do the virus scan, and then calling them in parallel?

    ==> Would each of those virus scan Lambdas consume load from a different SQS queue?
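One possible reading of the queueing suggestion, sketched here under assumptions rather than as the answerer's confirmed design, is a single SQS queue in front of a single scan function instead of many functions. Uploads land in the queue, and the function drains it in batches at whatever concurrency it is allowed. The sketch assumes each SQS message body is an S3 event notification document (S3 publishing straight to SQS); `scan_object` is a hypothetical placeholder for the real scanner.

```python
import json
from urllib.parse import unquote_plus


def scan_object(bucket: str, key: str) -> str:
    """Hypothetical stand-in for the real virus scanner."""
    # Assumption: a real implementation would fetch the object and scan it.
    return "clean"


def handler(event, context=None):
    """Process one SQS batch and report only the messages that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            # Assumption: each SQS body is an S3 event notification document.
            notification = json.loads(record["body"])
            for s3_record in notification.get("Records", []):
                bucket = s3_record["s3"]["bucket"]["name"]
                # Object keys arrive URL-encoded in S3 notifications.
                key = unquote_plus(s3_record["s3"]["object"]["key"])
                verdict = scan_object(bucket, key)
                print(f"{bucket}/{key}: {verdict}")
        except Exception:
            # Leave this message on the queue so that SQS redelivers it.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

For the partial-failure response to be honored, the SQS event source mapping needs `ReportBatchItemFailures` enabled. The queue absorbs bursts, while reserved concurrency or the mapping's maximum concurrency setting caps how many scans run at once, so the other functions in the account keep their share.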


Hi,

While the default quota for concurrent executions for AWS Lambda is 1000, this limit can be increased to tens of thousands, see here.

However, are you sure that the rest of your architecture is able to handle 200k requests per second?

Furthermore, what is the use case to scan the object for every download and not once when it's uploaded?

AWS EXPERT, answered 8 months ago
  • Yes, sorry: the file is scanned once, when it's uploaded to the bucket. I then have an EventBridge rule that triggers the virus scan Lambda function.

    file uploaded to S3 --> S3 sends event to EventBridge --> EventBridge triggers a Lambda function for the file virus scan

    Is that a good architecture that can handle 200K per second?
