Lambda MSK Event Source Unexpected Retry Behaviour


Hi,

I have an MSK topic that is an event source for a C# Lambda function. The documentation at https://docs.aws.amazon.com/lambda/latest/dg/with-msk.html says:

"Lambda reads the messages sequentially for each partition. After Lambda processes each batch, it commits the offsets of the messages in that batch. If your function returns an error for any of the messages in a batch, Lambda retries the whole batch of messages until processing succeeds or the messages expire."
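
To make my reading of that paragraph concrete, here is a minimal Python sketch of the behaviour I expected. It is only a model of the documented semantics, not the real Lambda poller: `poll_partition`, `handler` and `max_attempts` are hypothetical names, and `max_attempts` is just a stand-in for "the messages expire" so the loop terminates.

```python
from typing import Callable, Sequence, Tuple


def poll_partition(
    messages: Sequence[object],
    batch_size: int,
    handler: Callable[[Sequence[object]], None],
    max_attempts: int,
) -> Tuple[int, int]:
    """Model of the documented per-partition behaviour (an assumption,
    not the actual Lambda implementation).

    Returns (committed_offset, invocation_count).
    """
    committed = 0      # offset of the next uncommitted message
    invocations = 0    # how many times the handler was invoked

    while committed < len(messages):
        batch = messages[committed:committed + batch_size]
        for _ in range(max_attempts):
            invocations += 1
            try:
                handler(batch)
            except Exception:
                # Any error means the whole batch is retried.
                continue
            # Offsets are committed only after the batch succeeds.
            committed += len(batch)
            break
        else:
            # Hypothetical stand-in for "the messages expire":
            # give up without committing the failed batch.
            break

    return committed, invocations
```

Under this model, a handler that always times out is invoked until the expiry limit is reached and the offset never advances, which is why I expected many more than 1 to 5 invocations.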

However, this differs from the behaviour I'm seeing.

My function is currently throwing a timeout exception, and I can see from the logs that it bubbles up to the Lambda runtime as I intended. The function is re-invoked somewhere between 1 and 5 times, and then it is not attempted again. In CloudWatch I can see a log line like

START RequestId: 21b8cbda-8e66-4d1d-9812-9d2661985a73 Version: $LATEST

for each invocation - which is how I'm determining the number of times it's being retried.
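
For reference, this is roughly how I am counting invocations from exported log lines. It is a small sketch that assumes the lines are already available as plain strings; `started_request_ids` is a hypothetical helper name, and the regular expression only matches the START line format shown above.

```python
import re
from typing import Iterable, List

# Matches lines such as:
# START RequestId: 21b8cbda-8e66-4d1d-9812-9d2661985a73 Version: $LATEST
START_RE = re.compile(r"^START RequestId: ([0-9a-fA-F-]{36})\b")


def started_request_ids(log_lines: Iterable[str]) -> List[str]:
    """Return the RequestId of every START line, in log order."""
    ids = []
    for line in log_lines:
        match = START_RE.match(line.strip())
        if match:
            ids.append(match.group(1))
    return ids
```

The length of the returned list is the number of invocations I am reporting, so a count of 1 to 5 means 1 to 5 START lines for the same failing batch.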

I was expecting the function to be retried indefinitely until it succeeded. Have I misunderstood the documentation, or is the limited number of retries I'm seeing the expected behaviour?

Also, what is meant by "or the messages expire"? Does that refer to the retention period for the topic (mine is currently 1 week), or is there some other message expiry mechanism specific to MSK?

Thank you!

No Answers
