Athena engine v3, errors in VIEW with UNNEST

1

It seems that recent version of Athena engine (v3) has severe inconsistency with UNNEST ARRAY of structs. Moreover, it was not documented among breaking changes at https://docs.aws.amazon.com/athena/latest/ug/engine-versions-reference-0003.html. To reproduce:

WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(s);
-- OK
CREATE OR REPLACE VIEW v_test AS 
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(s);
-- line 9:36: Column alias list has 1 entries but 't' has 2 columns available
CREATE OR REPLACE VIEW v_test AS 
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(name,age);
-- OK
select * from v_test
-- INVALID_VIEW: line 1:15: Failed analyzing stored view: line 9:12: Column alias list has 2 entries but 't' has 1 columns available

As opposite to that, in Athena engine v2 , the "single column syntax" t(s) works consistently in both CREATE VIEW, SELECT FROM VIEW and immediate SELECT. In current Trino (v. 402), the "multiple column syntax" t(name,age) works consistently in both CREATE VIEW, SELECT FROM VIEW and immediate SELECT.

These errors prevent us to migrate to Athena engine v3. How could it be solved? Thank you in advance

gefragt vor einem Jahr591 Aufrufe
1 Antwort
0

Hello, This is known issue in the Athena Engine version 3 with the creation of the views only, as explained in your example. For now the workaround is to Use Engine version 2.

"Multiple column syntaxes" with unnest() do work with Athena also but in your example you are only using unnest() on a single column "users". See below example which shows correct way of opening multiple columns with unnest():

WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users ,
ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users1
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users,users1) t(s,t)
AWS
SUPPORT-TECHNIKER
Shubh
beantwortet vor einem Jahr
  • Hello Do you have any plans on improving this behavior? Cause my team have the very same situation - we can't upgrade until there is a way to use unnest over an array of structs

    Thanks!

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen