
Set group.id for a Kafka event source for Lambda?


We have a self-managed Kafka cluster, and I would like to configure a topic as an event source for Lambda. The documentation says this is possible, but I don't see anywhere to set the group.id. https://kafka.apache.org/documentation/#consumerconfigs_group.id

Is it possible to configure this value?

1 Answer

Can you shed some light on what you are trying to achieve by setting group.id in the Lambda context?

Setting group.id makes perfect sense when consumers run on containers or EC2, because you may want traditional queue functionality within the group, where each message is delivered to only one member of the group.

In the case of Lambda, the service can spin up a large number of instances to consume messages from Kafka, and a particular Lambda function acts like the group in the containers/EC2 world. Each message is delivered to one instance of that function, so in practice it behaves just like a group of consumers.

If you define another Lambda function as a consumer of the same Kafka topic, that function will act as its own group.
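
To make the group semantics described above concrete, here is a minimal Python sketch. It is a toy simulation, not Kafka client code and not how Lambda is implemented: the `deliver` function, the group names `function-a` and `function-b`, and the member names are all made up for illustration. It assumes each partition is owned by exactly one member within a group, and that every group sees every message.

```python
def deliver(messages, groups):
    """Simulate Kafka-style consumer group delivery (illustrative only).

    messages: list of (partition, payload) tuples.
    groups:   dict mapping a group name to its list of member names.
    Returns {group: {member: [payloads received]}}.
    """
    received = {g: {m: [] for m in members} for g, members in groups.items()}
    for partition, payload in messages:
        # Every group gets its own copy of each message...
        for group, members in groups.items():
            # ...but within a group, one member owns a given partition.
            owner = members[partition % len(members)]
            received[group][owner].append(payload)
    return received


# Eight messages spread over four partitions (hypothetical data).
messages = [(i % 4, f"msg-{i}") for i in range(8)]

# Two Lambda functions modeled as two independent groups.
groups = {
    "function-a": ["a-1", "a-2"],  # two concurrent instances
    "function-b": ["b-1"],         # a single instance
}

result = deliver(messages, groups)
for group, members in result.items():
    for member, payloads in members.items():
        print(group, member, payloads)
```

In this model the two instances of `function-a` split the messages between them with no overlap, while `function-b` independently receives all of them, which mirrors the "each function acts as its own group" behavior described in the answer.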

EXPERT
answered 3 months ago
  • Thank you for your response. There are actually two cases I'm curious about, and I think you may have already answered most of how the first one works. The docs suggest that, when scaling, one consumer is created initially that processes all partitions, and that one consumer will result in multiple Lambdas as needed. But if it gets overloaded, additional *consumers* will be spun up that each target a specific partition, and additional Lambdas will be spun up by each of those consumers as needed. Am I understanding that correctly? What would I expect to see as active clients/groups on the Kafka cluster when this happens? (That might explain what is happening without a group.id.)

    Now, if the group.id isn't really needed, I might just need another plan for multi-region redundancy. I was thinking I could simply set the group.id on the event source, which would allow me to deploy the exact same code to multiple regions and consume the topic actively in all of them. The shared group would let each region consume, but no region would consume messages already processed elsewhere. Without it, I expect that a multi-region deployment of the same code would really look like multiple separate consumers, and each region would consume every message. Is that correct?

  • This blog post from Confluent might explain it. It says consumers map to Lambdas one to one. The event source mechanism must be responsible for divvying up the partitions when needed, and must do so without a group.id. https://www.confluent.io/blog/serverless-event-stream-processing/#scaling-and-performance
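
The scaling behavior discussed in the comments above (one consumer initially covering all partitions, then more consumers that each take a share) can be sketched in a few lines of Python. This is only an illustration of the general idea: `assign_partitions` is a hypothetical helper, the round-robin split is an assumption, and none of this is the actual algorithm the Lambda event source uses. The one general Kafka property it relies on is that, within a group, a partition is read by at most one consumer, so consumers beyond the partition count would sit idle.

```python
def assign_partitions(num_partitions, num_consumers):
    """Split partitions across consumers round-robin (illustrative only).

    Returns {consumer_index: [partition numbers]}. Consumers beyond the
    partition count get nothing, so they are left out of the result.
    """
    if num_partitions < 1 or num_consumers < 1:
        raise ValueError("need at least one partition and one consumer")

    # A partition has at most one owner, so cap the active consumers.
    active = min(num_consumers, num_partitions)
    assignment = {consumer: [] for consumer in range(active)}
    for partition in range(num_partitions):
        assignment[partition % active].append(partition)
    return assignment


# One consumer handles every partition of a 4-partition topic.
print(assign_partitions(4, 1))
# Scaling out to two consumers splits the partitions between them.
print(assign_partitions(4, 2))
# Asking for more consumers than partitions still yields only four owners.
print(assign_partitions(4, 10))
```

Under these assumptions, a 4-partition topic would never have more than four active consumers, regardless of how many function instances are invoked downstream.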
