Inconsistent Number of Parallel Executions for Kinesis-triggered Lambda

Hi,

I created a Kinesis Data Stream with 50 pre-defined shards and set up a Lambda function triggered by this stream. I configured the trigger as follows: batch size = 1, batch window = 0, and concurrent batches per shard (parallelization factor) = 10. Then, on a high-performance EC2 instance, I used Python's multiprocessing module to run 2,000 processes concurrently, with each process writing a record to the stream in a loop 10 times, for a total of roughly 20,000 records written in about 10 seconds.
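Roughly, my producer looks like this (simplified sketch; the stream name, payload format, and process count are placeholders for the real test harness):

```python
# Simplified sketch of the load test: many processes, each writing a few
# records to the Kinesis Data Stream with boto3. "my-test-stream" is a
# placeholder stream name, not the real one.
import json
import time
import uuid
from multiprocessing import Pool

import boto3

STREAM_NAME = "my-test-stream"   # placeholder
PROCESSES = 2000                 # number of concurrent producer processes
RECORDS_PER_PROCESS = 10

def produce(worker_id: int) -> None:
    client = boto3.client("kinesis")  # one client per process
    for seq in range(RECORDS_PER_PROCESS):
        client.put_record(
            StreamName=STREAM_NAME,
            Data=json.dumps({"worker": worker_id, "seq": seq}).encode(),
            PartitionKey=str(uuid.uuid4()),  # random key per record
        )

if __name__ == "__main__":
    start = time.time()
    with Pool(processes=PROCESSES) as pool:
        pool.map(produce, range(PROCESSES))
    total = PROCESSES * RECORDS_PER_PROCESS
    print(f"wrote {total} records in {time.time() - start:.1f}s")
```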

Based on my trigger settings, I expected 50 (shards) * 10 (concurrent batches per shard) = 500 Lambda instances to be launched simultaneously (this does not exceed the default limit of 1,000 concurrent Lambda executions). Each invocation of my function takes about 0.7 seconds, so 20,000 records spread across 500 instances works out to 40 records per instance, or a total execution time of around 30 seconds. However, according to the Log Streams in CloudWatch Logs, processing took about 30 minutes in total.

In CloudWatch Logs, I can see only 9 Log Streams, which leads me to believe that only 9 Lambda instances were launched to process the data. Is there any mistake in my configuration?

1 Answer

You need to make sure that you are using enough partition keys so that your records are distributed across the shards, and so that within each shard the parallelization factor can actually be utilized. You should have at least 500 distinct partition keys; to be on the safe side, give every record a unique partition key.
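If you want to sanity-check the distribution, something like the sketch below estimates which shard each partition key lands on, using the documented MD5-based mapping of partition keys to shard hash-key ranges (the stream name and key format are just placeholders):

```python
# Estimate how a set of partition keys spreads over the stream's shards.
# Kinesis maps a partition key to a shard by MD5-hashing the key and
# checking which shard's hash-key range contains the result.
import hashlib
from collections import Counter

import boto3

def shard_for_key(partition_key: str, shards: list) -> str:
    """Return the shard whose hash-key range contains this partition key."""
    h = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    for shard in shards:
        r = shard["HashKeyRange"]
        if int(r["StartingHashKey"]) <= h <= int(r["EndingHashKey"]):
            return shard["ShardId"]
    raise ValueError("no shard matched")

client = boto3.client("kinesis")
shards = client.list_shards(StreamName="my-test-stream")["Shards"]  # placeholder name
counts = Counter(shard_for_key(f"key-{i}", shards) for i in range(20000))
print(counts)  # roughly even counts => records spread across all 50 shards
```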

Uri (AWS, EXPERT)
answered 9 months ago
  • Thank you for the reply! I am using a random string as the partition key, so I have 20,000 distinct partition keys. My Lambda accesses an RDS Proxy in front of RDS SQL Server (Express edition), and according to the Microsoft docs this edition only allows 10 DB connections. So, is the DB connection limit restricting the number of Lambda instances? And if so, why? ... Best Regards.

  • I do not think the number of connections is limiting Lambda. First, if it did, you would see errors in the function; it would not cause lower concurrency. Second, you are using RDS Proxy, whose role is exactly this: to proxy a large number of connections (from Lambda) down to a small number towards the database.

    I am not sure what is limiting your concurrency. I would do the following things:

    1. Increase the batch size to something much higher. It will perform better and cost you less.
    2. Look at the Lambda CloudWatch metrics to see if there are any errors (throttling, timeouts, etc.). Also check the iterator age to see whether it actually increases.
    3. Look at the stream metrics in CloudWatch to make sure that your test clients are actually able to write records to the stream at the rate you think they do. Remember that each shard supports up to 1,000 records or 1 MB per second. (A sketch of pulling these metrics with boto3 follows after this comment.)
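For example, something along these lines pulls the relevant Lambda and Kinesis metrics with boto3 (the function and stream names are placeholders):

```python
# Pull the CloudWatch metrics mentioned above for the last hour.
from datetime import datetime, timedelta, timezone

import boto3

cw = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

def stat(namespace, metric, dim_name, dim_value, stat_name):
    """Return one-minute datapoints for a single metric/dimension pair."""
    resp = cw.get_metric_statistics(
        Namespace=namespace,
        MetricName=metric,
        Dimensions=[{"Name": dim_name, "Value": dim_value}],
        StartTime=start,
        EndTime=end,
        Period=60,
        Statistics=[stat_name],
    )
    return sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])

# Lambda side: actual concurrency, throttles, and how far behind the stream the function is.
print(stat("AWS/Lambda", "ConcurrentExecutions", "FunctionName", "my-function", "Maximum"))
print(stat("AWS/Lambda", "Throttles", "FunctionName", "my-function", "Sum"))
print(stat("AWS/Lambda", "IteratorAge", "FunctionName", "my-function", "Maximum"))

# Stream side: confirm the test clients actually achieved the expected write rate.
print(stat("AWS/Kinesis", "IncomingRecords", "StreamName", "my-test-stream", "Sum"))
print(stat("AWS/Kinesis", "WriteProvisionedThroughputExceeded", "StreamName", "my-test-stream", "Sum"))
```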
  • Hi Uri, Thank you very much! I finally found out that the unreserved account concurrency of my Lambda functions is 10... Thanks again! Best Regards.
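
For reference, the account-level concurrency limits that turned out to be the bottleneck can be checked like this (the printed values are illustrative):

```python
# Check the account-wide Lambda concurrency limits with boto3.
import boto3

limits = boto3.client("lambda").get_account_settings()["AccountLimit"]
print(limits["ConcurrentExecutions"])            # total account concurrency
print(limits["UnreservedConcurrentExecutions"])  # left over for functions without reserved concurrency
```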
