How to set the starting position for a Kinesis Delivery Stream

0

When hooking up a lambda to a kinesis stream, the event source can choose a starting position. Settings like TRIM_HORIZON and timestamps allow to replay data that is already in the stream. When using a delivery stream this setting does not seem available, and it only plays data from the latest record. Even creating a brand new firehose does not allow to replay the data. This is a problem when the data is difficult or expensive to generate again. For example I had the firehose not set up properly a couple times, but had to restart the whole expensive backend job to process the data again. Am I missing something?

edgy42
질문됨 2년 전1617회 조회
1개 답변
1

Yes you are right. If Kinesis Data Streams is used as a source for Kinesis Data Firehose, then KDF starts reading from the latest position from KDS. This is documented here - https://docs.aws.amazon.com/firehose/latest/dev/writing-with-kinesis-streams.html - "Kinesis Data Firehose starts reading data from the LATEST position of your Kinesis stream."

It is also mentioned in the FAQ - https://aws.amazon.com/kinesis/data-firehose/faqs/

Q: From where does Kinesis Data Firehose read data when my Kinesis Data Stream is configured as the source of my delivery stream?

Kinesis Data Firehose starts reading data from the LATEST position of your Kinesis Data Stream when it’s configured as the source of a delivery stream.

If you want to read records in the KDS from the beginning or from a particular position, one option could be to configure a lambda function to read from the position you are interested in and then put the records to KDF using the SDK.

profile pictureAWS
전문가
답변함 2년 전
  • @edgy42 - If my response has helped you, can I request you to please accept my answer. Thanks

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠