Hi,
I have created an external table in redshift as below:
CREATE EXTERNAL TABLE "someschema"."avro_simple_nested" (
"actor_id" varchar(200),
"first_name" varchar(200),
"last_name" varchar(200),
"last_update" bigint,
"version" int,
"debut_film" struct<
"name":varchar(200),
"score":int
>
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES (
'avro.schema.literal'='{"namespace": "example.avro",
"type": "record",
"name": "actor",
"fields": [
{"name": "actor_id", "type": "string"},
{"name": "first_name", "type": "string"},
{"name": "last_name", "type": "string"},
{"default": 0, "name": "last_update", "type": "long"},
{"name": "version", "type": "int"},
{"name": "debut_film", "type": {"type":"record", "name":"debut_film_name",
"fields": [
{"name": "name", "type": "string"},
{"name": "score", "type": "int"}
]}}
]
}'
)
STORED AS AVRO
LOCATION 's3://....../..../avro-files-simple-nested/'
If I try and query the table by using SELECT * from Table I get:
ERROR: Cannot expand query to include unsupported column type for column "debut_film".
If I try and query the non struct columns, I am able to see the data.
If I check the same data in Athena, it shows fine. The table is shown correctly and the data is as expected.
If I query the SVV_EXTERNAL_COLUMNS table, you can see the table definition is correct.
debut_film struct<name:varchar(200),score:int>
Has anyone else used an external table over avro data and managed to view nested data in redshift?
Thanks!