How to set the starting position for a Kinesis Delivery Stream


When hooking up a Lambda function to a Kinesis stream, the event source mapping lets you choose a starting position. Settings like TRIM_HORIZON and timestamps make it possible to replay data that is already in the stream. With a delivery stream this setting does not seem to be available, and it only reads data from the latest record onward. Even creating a brand new Firehose does not allow the existing data to be replayed. This is a problem when the data is difficult or expensive to generate again. For example, a couple of times I did not have the Firehose set up properly and had to restart the whole expensive backend job to process the data again. Am I missing something?

edgy42
asked 2 years ago · 1617 views
1 Answer

Yes, you are right. When Kinesis Data Streams (KDS) is used as the source for Kinesis Data Firehose (KDF), KDF starts reading from the latest position in the stream. This is documented here: https://docs.aws.amazon.com/firehose/latest/dev/writing-with-kinesis-streams.html - "Kinesis Data Firehose starts reading data from the LATEST position of your Kinesis stream."

It is also mentioned in the FAQ: https://aws.amazon.com/kinesis/data-firehose/faqs/

Q: From where does Kinesis Data Firehose read data when my Kinesis Data Stream is configured as the source of my delivery stream?

Kinesis Data Firehose starts reading data from the LATEST position of your Kinesis Data Stream when it’s configured as the source of a delivery stream.

If you want to read records in the KDS from the beginning or from a particular position, one option is to configure a Lambda function that reads from the position you are interested in and then puts the records to KDF using the SDK.
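As a rough illustration of that option, here is a minimal sketch of such a Lambda function in Python. It assumes the function is attached to the Kinesis stream through an event source mapping whose starting position is set to TRIM_HORIZON (or a timestamp), and that the target delivery stream accepts direct puts; as far as I know, PutRecord and PutRecordBatch are disabled on a delivery stream that has a Kinesis stream as its source, so the replay would have to go to a Direct PUT delivery stream. The function names and the DELIVERY_STREAM_NAME environment variable are made up for this example.

```python
import base64

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def forward_to_firehose(event, firehose, delivery_stream_name):
    """Forward the Kinesis records in a Lambda event to Firehose.

    `firehose` is a boto3 Firehose client (or any object with the same
    put_record_batch method). Returns the number of records sent.
    """
    # Lambda delivers Kinesis record payloads base64-encoded.
    payloads = [
        base64.b64decode(record["kinesis"]["data"])
        for record in event.get("Records", [])
    ]
    sent = 0
    for start in range(0, len(payloads), MAX_BATCH):
        batch = [{"Data": p} for p in payloads[start:start + MAX_BATCH]]
        response = firehose.put_record_batch(
            DeliveryStreamName=delivery_stream_name, Records=batch
        )
        # Simplification: fail the whole invocation so Lambda retries it.
        # A retry can resend records that already succeeded (duplicates).
        if response.get("FailedPutCount", 0):
            raise RuntimeError("Firehose rejected some records")
        sent += len(batch)
    return sent


def handler(event, context):
    # Imported here so the helper above can be exercised without boto3.
    import os
    import boto3

    # DELIVERY_STREAM_NAME is a hypothetical environment variable.
    stream_name = os.environ["DELIVERY_STREAM_NAME"]
    return forward_to_firehose(event, boto3.client("firehose"), stream_name)
```

The starting position itself is not set in this code. It is chosen when the event source mapping between the Kinesis stream and the function is created, which is the setting the question refers to.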

AWS
Expert
answered 2 years ago
  • @edgy42 - If my response has helped you, could you please accept my answer? Thanks.
