How do I use the SUPER data type in Amazon Redshift to handle and query JSON data?
I want to use the SUPER data type in Amazon Redshift to handle and query JSON data.
Short description
Use the SUPER data type to store semistructured data in your Amazon Redshift data warehouses. You can query the SUPER data type with the PartiQL language.
You can also use Amazon Redshift Spectrum for semistructured data support.
Resolution
You can use either the COPY command or the INSERT command with the JSON_PARSE function to load JSON data into SUPER columns. For large datasets, it's a best practice to use the COPY command.
Use the COPY command
Use the COPY command to load JSON from data files in Amazon Simple Storage Service (Amazon S3). For more information, see Loading semistructured data into Amazon Redshift.
To load JSON into a single SUPER data column, use the noshred option.
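A minimal sketch of a noshred load, assuming a hypothetical single-column table named raw_json (the table name, S3 path, and IAM role are placeholders):

```sql
-- Hypothetical table with one SUPER column to hold each JSON object
CREATE TABLE raw_json (r super);

-- With noshred, COPY loads each JSON object as a single SUPER value
-- instead of splitting its attributes across columns
COPY raw_json
FROM 's3://<S3 path to the JSON data file>'
IAM_ROLE '<IAM role>'
FORMAT JSON 'noshred';
```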
To load JSON into multiple columns, use the auto option or specify the jsonpaths file. When you use the auto option, COPY matches the top-level JSON attributes with column names and lets you load nested values as SUPER values. The following examples demonstrate how to use the auto option.
Example JSON data file:
{"r_regionkey":0,"r_name":"AFRICA","r_nations":[{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]}
{"r_regionkey":1,"r_name":"AMERICA","r_nations":[{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]}
{"r_regionkey":2,"r_name":"ASIA","r_nations":[{"n_nationkey":8,"n_name":"INDIA"}]}
Example COPY command:
# CREATE TABLE region_nations (
    r_regionkey smallint,
    r_name varchar,
    r_nations super
);

# COPY region_nations
  FROM 's3://<S3 path to the JSON data file>'
  IAM_ROLE '<IAM role>'
  FORMAT JSON 'auto';

# SELECT * FROM region_nations;

 r_regionkey | r_name  |                                                     r_nations
-------------+---------+-------------------------------------------------------------------------------------------------------------------
           0 | AFRICA  | [{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]
           1 | AMERICA | [{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]
           2 | ASIA    | [{"n_nationkey":8,"n_name":"INDIA"}]
(3 rows)
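If the column names don't match the top-level JSON attributes, you can map them explicitly with a jsonpaths file instead of the auto option. A sketch, assuming the jsonpaths file is stored at a placeholder S3 path:

```sql
-- Contents of the jsonpaths file (a JSON document stored in Amazon S3):
-- {"jsonpaths": ["$.r_regionkey", "$.r_name", "$.r_nations"]}
-- Each expression maps, in order, to a column of the target table.

COPY region_nations
FROM 's3://<S3 path to the JSON data file>'
IAM_ROLE '<IAM role>'
FORMAT JSON 's3://<S3 path to the jsonpaths file>';
```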
You can also load JSON data from a text or .csv file into a SUPER column. Make sure that the JSON is in a valid format. Amazon Redshift uses standard escaping rules for .csv files.
Example .csv file:
r_regionkey,r_name,r_nations
0,AFRICA,"[{""n_nationkey"":11,""n_name"":""ALGERIA""},{""n_nationkey"":5,""n_name"":""ETHIOPIA""},{""n_nationkey"":14,""n_name"":""KENYA""}]"
1,AMERICA,"[{""n_nationkey"":1,""n_name"":""ARGENTINA""},{""n_nationkey"":2,""n_name"":""BRAZIL""}]"
2,ASIA,"[{""n_nationkey"":8,""n_name"":""INDIA""}]"
Example COPY command:
# CREATE TABLE region_nations (
    r_regionkey smallint,
    r_name varchar,
    r_nations super
);

# COPY region_nations
  FROM 's3://<S3 path to the CSV file>'
  IAM_ROLE '<IAM role>'
  FORMAT CSV
  IGNOREHEADER 1;

# SELECT * FROM region_nations;

 r_regionkey | r_name  |                                                     r_nations
-------------+---------+-------------------------------------------------------------------------------------------------------------------
           0 | AFRICA  | [{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]
           1 | AMERICA | [{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]
           2 | ASIA    | [{"n_nationkey":8,"n_name":"INDIA"}]
(3 rows)
Use the INSERT command and JSON_PARSE function
The JSON_PARSE function parses data in JSON format and converts it into the SUPER data type that you can use with the INSERT command.
Example command:
# CREATE TABLE region_nations (
    r_regionkey smallint,
    r_name varchar,
    r_nations super
);

# INSERT INTO region_nations VALUES (0, 'AFRICA',
    json_parse('[{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]'));

# INSERT INTO region_nations VALUES (1, 'AMERICA',
    json_parse('[{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]'));

# INSERT INTO region_nations VALUES (2, 'ASIA',
    json_parse('[{"n_nationkey":8,"n_name":"INDIA"}]'));

# SELECT * FROM region_nations;

 r_regionkey | r_name  |                                                     r_nations
-------------+---------+-------------------------------------------------------------------------------------------------------------------
           0 | AFRICA  | [{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]
           1 | AMERICA | [{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]
           2 | ASIA    | [{"n_nationkey":8,"n_name":"INDIA"}]
(3 rows)
Query SUPER data with PartiQL
Amazon Redshift uses the PartiQL language for SQL-compatible access to relational, semistructured, and nested data. For more information, see Querying semistructured data.
Example query:
# SELECT * FROM region_nations;

 r_regionkey | r_name  |                                                     r_nations
-------------+---------+-------------------------------------------------------------------------------------------------------------------
           0 | AFRICA  | [{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]
           1 | AMERICA | [{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]
           2 | ASIA    | [{"n_nationkey":8,"n_name":"INDIA"}]
(3 rows)

# SELECT region_nations.r_nations[1].n_name FROM region_nations;

   n_name
------------
 "ETHIOPIA"
 "BRAZIL"

(3 rows)

# SELECT rn.r_regionkey, rn.r_name, n FROM region_nations rn, rn.r_nations n;

 r_regionkey | r_name  |                   n
-------------+---------+----------------------------------------
           0 | AFRICA  | {"n_nationkey":11,"n_name":"ALGERIA"}
           0 | AFRICA  | {"n_nationkey":5,"n_name":"ETHIOPIA"}
           0 | AFRICA  | {"n_nationkey":14,"n_name":"KENYA"}
           1 | AMERICA | {"n_nationkey":1,"n_name":"ARGENTINA"}
           1 | AMERICA | {"n_nationkey":2,"n_name":"BRAZIL"}
           2 | ASIA    | {"n_nationkey":8,"n_name":"INDIA"}
(6 rows)
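You can also filter on attributes inside the unnested SUPER values in the WHERE clause. A sketch against the same table (the predicate value is an example):

```sql
-- Find the region that contains a given nation key;
-- PartiQL navigation lets the predicate reference the unnested element n
SELECT rn.r_name, n.n_name
FROM region_nations rn, rn.r_nations n
WHERE n.n_nationkey = 5;
```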
Unload SUPER data to a JSON file
You can use the UNLOAD command to extract data from SUPER columns and store it in Amazon S3 as JSON documents.
Example command:
# SELECT * FROM region_nations;

 r_regionkey | r_name  |                                                     r_nations
-------------+---------+-------------------------------------------------------------------------------------------------------------------
           0 | AFRICA  | [{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]
           1 | AMERICA | [{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]
           2 | ASIA    | [{"n_nationkey":8,"n_name":"INDIA"}]
(3 rows)

# UNLOAD ('SELECT * FROM region_nations')
  TO 's3://<S3 path>'
  IAM_ROLE '<IAM role>'
  FORMAT JSON;
Example JSON document:
{"r_regionkey":0,"r_name":"AFRICA","r_nations":[{"n_nationkey":11,"n_name":"ALGERIA"},{"n_nationkey":5,"n_name":"ETHIOPIA"},{"n_nationkey":14,"n_name":"KENYA"}]}
{"r_regionkey":1,"r_name":"AMERICA","r_nations":[{"n_nationkey":1,"n_name":"ARGENTINA"},{"n_nationkey":2,"n_name":"BRAZIL"}]}
{"r_regionkey":2,"r_name":"ASIA","r_nations":[{"n_nationkey":8,"n_name":"INDIA"}]}