
JQ Parsing a json stream input - Amazon Firehose


Hello,

I have a JSON stream coming into my AWS Firehose that contains either metric or report key-value pairs, depending on the data source. The records look like this:

metrics stream

{ "metric": { "k1" : "v1", "k2" : "v2", ... } }

report stream

{ "report": { "k1" : "v1", "k2" : "v2", ... } }

I am trying to use jq expressions to check whether the JSON has("metric") or has("report") and then derive values from the blob. For example:

if has("metric") then "metric" else "report" end

This appears to be valid syntax in the online JQ Playground (ref). However, it fails when it is defined as a dynamic partitioning key in Firehose. I want to read different keys from the JSON to define the output of the stream. For example, both keys k1 and k2 need to be read, and reading them depends on first determining whether the parent key is metric or report.

In this stream, metric/report forms the first-level partition key, followed by the k1 value. I am using the jq syntax below to derive k1:

if has("metric") then .metric.k1 else .report.k1 end

Error message -

JQ Subprocess failed due to: jq: error: syntax error, unexpected if (Unix shell quoting issues?) at <top-level>, line 2: {dataType:if has("metric") then "metric" else "report" end} jq: 1 compile error

Error code: DynamicPartitioning.MetadataExtractionFailed

What is wrong with the jq expression here? It looks normal to me, but I cannot find an expression that checks for the metric or report key.

asked a year ago, 822 views
1 Answer

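The error output shows what Firehose actually compiles: it places the key expression inside an object as a value, {dataType:if has("metric") then "metric" else "report" end}. The expression is valid on its own, which is why it passes in the JQ Playground, but jq 1.6 (the engine Firehose uses for dynamic partitioning) does not accept a bare if ... end as an object value; it has to be wrapped in parentheses. Let's break down the solution so that it satisfies both the jq syntax and Firehose's requirements.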

Here's how you can structure your jq expression correctly to derive the first level partition key (metric or report) and the k1 value.

Determine the parent key (metric or report):

jq

if has("metric") then "metric" else "report" end

Extract the value of k1 depending on the parent key:

jq

if has("metric") then .metric.k1 else .report.k1 end
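To check what these two expressions should produce for each record shape, their logic can be mirrored in a few lines of Python. This is only an illustrative sketch: Firehose runs jq, not Python, and the function name partition_keys and the sample records are hypothetical.

```python
import json


def partition_keys(record: dict) -> dict:
    """Mirror the two jq expressions for one parsed JSON record."""
    # jq: if has("metric") then "metric" else "report" end
    data_type = "metric" if "metric" in record else "report"
    # jq: if has("metric") then .metric.k1 else .report.k1 end
    # In jq, a missing parent or missing k1 yields null; here it yields None.
    parent = record.get(data_type) or {}
    return {"dataType": data_type, "k1": parent.get("k1")}


# Sample records shaped like the two streams in the question.
metric_record = json.loads('{"metric": {"k1": "v1", "k2": "v2"}}')
report_record = json.loads('{"report": {"k1": "v1", "k2": "v2"}}')

print(partition_keys(metric_record))
print(partition_keys(report_record))
```

A record with neither key falls through to "report" with an empty k1, exactly as the jq else branch would, so it may be worth deciding how such records should be partitioned.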

To create a dynamic partition key, the entire expression must still be valid once Firehose places it inside an object. jq 1.6 rejects a bare if ... end as an object value (this is the "unexpected if" in the error), so wrap each conditional in parentheses. If you enter the keys one at a time, put the parentheses around each expression there as well. Combined into a single jq expression that outputs both fields:

jq

{
  "dataType": (if has("metric") then "metric" else "report" end),
  "k1": (if has("metric") then .metric.k1 else .report.k1 end)
}

Ensure that this jq expression is correctly quoted and escaped when you define it in the Firehose configuration. For example, as the MetadataExtractionQuery value for dynamic partitioning:

json

{
  "MetadataExtractionQuery": "{\"dataType\":(if has(\"metric\") then \"metric\" else \"report\" end),\"k1\":(if has(\"metric\") then .metric.k1 else .report.k1 end)}"
}

This JSON ensures the jq expression is correctly embedded within the Firehose configuration. Note the escape characters for quotes inside the jq expression.
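Escaping the quotes by hand is easy to get wrong, so one option is to let a JSON serializer do it. The sketch below uses Python's standard json module to embed the query; JQ_QUERY and build_config are hypothetical names, and the single-key object is the simplified shape used in this answer rather than a complete Firehose configuration.

```python
import json

# The jq query as you would type it unescaped. Each if ... end is wrapped in
# parentheses because jq 1.6 rejects a bare if as an object value.
JQ_QUERY = (
    '{"dataType":(if has("metric") then "metric" else "report" end),'
    '"k1":(if has("metric") then .metric.k1 else .report.k1 end)}'
)


def build_config(query: str) -> str:
    """Serialize the query as a JSON string value, escaping inner quotes."""
    return json.dumps({"MetadataExtractionQuery": query}, indent=2)


config_text = build_config(JQ_QUERY)
print(config_text)

# Parsing the JSON back returns the original, unescaped jq query.
assert json.loads(config_text)["MetadataExtractionQuery"] == JQ_QUERY
```

The round-trip check at the end is a quick way to confirm that the escaped string still decodes to the jq program you intended.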

Summary

Verify that your jq expression syntax is correct, and ensure the expression is properly escaped and formatted within the Firehose configuration. Updating your Firehose configuration with the MetadataExtractionQuery above should resolve the issue you are encountering.

answered a year ago
EXPERT
reviewed a year ago
