Hi,
We want to send a JSON object containing a nested list of records from an IoT Core rule into Kinesis Firehose. The list should be split into multiple records, each enriched with a new attribute, before Firehose batches the individual records and writes them to S3 as Parquet.
Can the splitting be done in a Firehose Lambda transformation function, or do we need to split the list before Firehose and pass each record in separately? The issue I think we'd hit with the former is that the incoming payload has a single recordId, and Firehose won't like us passing the same ID back for multiple records.
Example incoming payload from IoT Core:
{ "records": [ { "id": "1", "name": "Name 1" }, { "id": "2", "name": "Name 2" } ] }
So in the ideal scenario, the two items in the records array would be enriched separately and returned to Firehose as multiple records.
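To make the question concrete, here is a rough sketch of the only workaround I can think of for the transformation-function route. It keeps the single incoming recordId and packs the enriched items into one output record as newline-delimited JSON, rather than returning several records. The enrich helper and the "enriched" attribute are placeholders I made up for illustration, and I have not verified that the Parquet conversion step accepts several JSON objects inside one record, so treat that as an assumption.

```python
import base64
import json


def enrich(item):
    # Placeholder enrichment: the real attribute is not decided yet.
    return {**item, "enriched": True}


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        # Firehose delivers each record's payload base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]))

        # Split the nested list and enrich each item separately.
        items = [enrich(item) for item in payload.get("records", [])]

        # Pack all items into ONE output record as newline-delimited JSON,
        # reusing the incoming recordId instead of inventing new ones.
        data = "".join(json.dumps(item) + "\n" for item in items)

        output.append({
            "recordId": record["recordId"],
            # Drop the record if the list was empty, otherwise pass it on.
            "result": "Ok" if items else "Dropped",
            "data": base64.b64encode(data.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```

With the example payload above, this returns one Firehose record whose data holds two JSON lines, one per item. Whether that counts as "multiple records" once it reaches S3 is exactly what I am unsure about.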
I think this is also causing another issue when submitting multiple records to OpenSearch. I detailed it in my comment here: https://repost.aws/questions/QUH6MmoTQbQpOUaNmfIKrqvA/use-lambda-kinesis-data-firehose-opensearch-when-kdf-send-demo-data-to-opensearch-data-set-index#COZwYC_Kl_RxyAt4iHv1II8Q