When you add data to your Kinesis Data Stream, every message is persisted in the stream as a single record. With PutRecord() you are adding exactly one record to the stream per call.
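As a sketch of that one-message-one-record behavior, here is a minimal producer helper. The stream name, payload, and the `StubKinesis` class are hypothetical; with real AWS access the client would come from `boto3.client("kinesis")`, whose `put_record` call takes the same `StreamName`/`Data`/`PartitionKey` parameters used here.

```python
import json

def put_record(client, stream_name, payload, partition_key):
    """Send one message as one Kinesis record (hypothetical helper).
    Each call produces exactly one record in the stream."""
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload).encode("utf-8"),
        PartitionKey=partition_key,
    )

# Stub standing in for boto3.client("kinesis"), so the sketch runs offline.
class StubKinesis:
    def __init__(self):
        self.records = []

    def put_record(self, **kwargs):
        self.records.append(kwargs)
        return {"ShardId": "shardId-000000000000",
                "SequenceNumber": str(len(self.records))}

client = StubKinesis()
put_record(client, "my-stream", {"event": "click"}, "user-1")
put_record(client, "my-stream", {"event": "view"}, "user-2")
# two calls -> two separate records in the stream
```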
Kinesis Data Firehose is the easiest way to load streaming data into data stores. With Kinesis Data Firehose you create Delivery Streams. A Delivery Stream has a Source to read data from and a Destination to deliver the streaming data to. In your case the Kinesis Data Stream is the Source and S3 is the Destination. But Firehose does not write the data from the Data Stream as one file per record.
The frequency of data delivery to Amazon S3 is determined by the S3 buffer size and buffer interval values you configured for your Delivery Stream. Kinesis Data Firehose buffers incoming data before delivering it to Amazon S3. You can configure the S3 buffer size (1 MB to 128 MB) and buffer interval (60 to 900 seconds), and whichever condition is satisfied first triggers data delivery to Amazon S3. Firehose then writes 1 file to S3 containing all the records in the buffer, and starts buffering records again until the buffer size or buffer interval condition is met once more.
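The "whichever condition is satisfied first" rule can be sketched as a tiny simulation (the `Buffer` class and its limits are illustrative, not the Firehose API; real buffering hints are set on the Delivery Stream configuration):

```python
class Buffer:
    """Toy model of Firehose buffering: flush when either the size
    limit or the time limit is reached, whichever happens first."""

    def __init__(self, max_bytes, max_seconds):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.records = []
        self.size = 0
        self.started = 0.0  # arrival time of the first buffered record

    def add(self, record, now):
        if not self.records:
            self.started = now
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes or now - self.started >= self.max_seconds:
            return self.flush()  # one S3 object delivered
        return None              # still buffering

    def flush(self):
        obj = b"".join(self.records)  # records are concatenated into one object
        self.records, self.size = [], 0
        return obj

buf = Buffer(max_bytes=10, max_seconds=60)
first = buf.add(b"aaaa", now=0.0)      # 4 bytes, under both limits -> buffered
obj = buf.add(b"bbbbbbb", now=1.0)     # 11 bytes total -> size-triggered flush
```

Note the `flush` step: all buffered records end up concatenated in a single object, which is exactly the behavior discussed in the comment below the answer.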
You can find more information on Kinesis Data Firehose streaming concepts and how it uses Data Sources and Data Delivery in the Kinesis Data Firehose FAQs.
OK, I see, that makes sense. But there seems to be a flaw in the way the Kinesis Firehose buffer behaves: it simply appends all received messages together, without any delimiter, breaking their structure. For text files this might not be a critical issue, but sending binaries would make them all invalid when saved to S3, because their integrity is lost when they are combined together.
Any advice on how to tackle this situation?
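One common approach to the concatenation problem described in this comment (a sketch, not from the thread) is to frame each binary record on the producer side, e.g. with a length prefix, so the concatenated S3 object can be split back into the original records. The `frame`/`unframe` helpers below are hypothetical names:

```python
import struct

def frame(record: bytes) -> bytes:
    """Length-prefix a binary record (4-byte big-endian header) before
    sending it, so concatenation by Firehose stays reversible."""
    return struct.pack(">I", len(record)) + record

def unframe(blob: bytes) -> list:
    """Split a concatenated S3 object back into the original records."""
    records, i = [], 0
    while i < len(blob):
        (n,) = struct.unpack_from(">I", blob, i)
        records.append(blob[i + 4 : i + 4 + n])
        i += 4 + n
    return records

a, b = b"\x00\x01binary", b"payload\xff"
blob = frame(a) + frame(b)   # what Firehose would concatenate into one object
restored = unframe(blob)
```

For newline-safe text records, simply appending `\n` to each record before calling PutRecord is a lighter-weight variant of the same idea.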