Querying array of nested objects with nested array of objects in Redshift

0

Let's say I have the following JSON

{
  "id": 1,
  "sets": [
    {
      "values": [
        {
          "value": 1
        },
        {
          "value": 2
        }
      ]
    },
    {
      "values": [
        {
          "value": 5
        },
        {
          "value": 6
        }
      ]
    }
  ]
}

If the table name is X I expect the query

 SELECT x.id, v.value 
 FROM X as x,
   x.sets as sets,
   sets.values as v

to give me

id, value
1, 1
1, 2
2, 5
2, 6

and it does work if both sets and values has one object each. When there's more the query fails with column 'id' had 0 remaining values but expected 2. Seems to me I'm not iterating over "sets" properly?

So my question is: what's the proper way to query data structured like my example above in Redshift (using PartiQL)?

posta 2 anni fa357 visualizzazioni
2 Risposte
0

Hi rakibansary,

Unless I'm wrong, I cannot see the name of the SUPER datatype column in your query. In the following example, my SUPER datatype column, where the JSON record is stored, is called json_text.

 SELECT x.json_text.id, v.value 
 FROM X as x,
   x.json_text.sets as sets,
   sets.values as v

Thanks,

AWS
ESPERTO
Ziad
con risposta 2 anni fa
  • Some more context - I actually have data stored as Parquet and not JSON - I guess SUPER doesn't apply to parquet.

    x.json_text.sets fails for me with

    Hint: Split into more than one FROM clause elements or use a JOIN operation to extract ARRAY elements

    The problem here seems to be, json_text is an array.

0

Hi rakibansary,

The json_text column is just an example to show you how usually PartiQL is used on SUPER datatype columns.

SUPER datatype is used for semi-structured data such as JSON. The following link contains several examples on querying semi-structured data https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#navigation.

If you are having a Parquet file with several columns and a json column, then you can use the SERIEALIZETOJSON option while copying the data to Redshift. You can find an example on the following link (the Copying data from columnar-format Parquet and ORC section) https://docs.aws.amazon.com/redshift/latest/dg/ingest-super.html#copy_json.

Thanks,

AWS
ESPERTO
Ziad
con risposta 2 anni fa

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande