You are correct that if all records are identical in size and no transformations are applied, there should be a linear relationship between the number of records and the total bytes delivered. However, there are a few factors that can cause the relationship to be non-linear, such as:
Record size variation: If record sizes vary, the total bytes delivered will not scale linearly with the number of records. In that case, the average record size may simply differ between t1 and t2.
Data compression: Amazon Kinesis Data Firehose can be configured to compress data (for example with GZIP or Snappy) before delivering it to the S3 bucket; compression is optional, not automatic. The compression ratio depends on the nature of the data, so if the data delivered between t1 and t2 compress differently, that alone can produce the non-linear relationship you observed.
Aggregation and batching: Kinesis Data Firehose aggregates multiple records into a single object before delivering it to the S3 bucket. The size of the aggregated object might not be a simple sum of the sizes of the individual records, as there can be some overhead associated with the aggregation process.
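To illustrate the compression point above: two payloads of identical uncompressed size can compress to very different sizes depending on their content. This sketch uses Python's gzip module as a stand-in for whatever compression the delivery stream is configured with; the payloads are made up for the demo.

```python
import gzip
import os

# Two payloads of identical uncompressed size but different entropy.
repetitive = b"event_type=click,user=123;" * 1000   # highly compressible
random_like = os.urandom(len(repetitive))           # barely compressible

for name, payload in [("repetitive", repetitive), ("random", random_like)]:
    compressed = gzip.compress(payload)
    ratio = len(compressed) / len(payload)
    print(f"{name}: {len(payload)} -> {len(compressed)} bytes (ratio {ratio:.2f})")
```

The repetitive payload shrinks dramatically while the random one stays roughly the same size, so a shift in data characteristics between t1 and t2 changes delivered bytes independently of record count.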
To estimate the average size of a record, you can calculate the difference in bytes and the difference in the number of records between t1 and t2, then divide the bytes difference by the records difference:
(2300 KB - 1600 KB) / (746 records - 363 records) ≈ 1.83 KB per record
However, keep in mind that the actual average record size might still vary due to the factors mentioned above.
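The estimate above can be written out as a short sketch. The byte and record counts come from the question; the unit conversion assumes the reported figures are in binary kilobytes (1 KB = 1024 bytes).

```python
# Delivered bytes and record counts at two points in time (from the question).
bytes_t1, bytes_t2 = 1600 * 1024, 2300 * 1024   # KB -> bytes
records_t1, records_t2 = 363, 746

delta_bytes = bytes_t2 - bytes_t1        # 700 KB delivered in the interval
delta_records = records_t2 - records_t1  # 383 records in the interval

avg_record_size = delta_bytes / delta_records
print(f"~{avg_record_size / 1024:.2f} KB per record")  # roughly 1.83 KB
```

Because of compression, batching overhead, and size variation, treat this as an average over the interval rather than the size of any individual record.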
Thank you for answering! Hmm, ok, so for:
So it would seem I made an incorrect assumption on 1?