Athena engine v3, errors in VIEW with UNNEST

1

It seems that recent version of Athena engine (v3) has severe inconsistency with UNNEST ARRAY of structs. Moreover, it was not documented among breaking changes at https://docs.aws.amazon.com/athena/latest/ug/engine-versions-reference-0003.html. To reproduce:

WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(s);
-- OK
CREATE OR REPLACE VIEW v_test AS 
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(s);
-- line 9:36: Column alias list has 1 entries but 't' has 2 columns available
CREATE OR REPLACE VIEW v_test AS 
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(name,age);
-- OK
select * from v_test
-- INVALID_VIEW: line 1:15: Failed analyzing stored view: line 9:12: Column alias list has 2 entries but 't' has 1 columns available

As opposite to that, in Athena engine v2 , the "single column syntax" t(s) works consistently in both CREATE VIEW, SELECT FROM VIEW and immediate SELECT. In current Trino (v. 402), the "multiple column syntax" t(name,age) works consistently in both CREATE VIEW, SELECT FROM VIEW and immediate SELECT.

These errors prevent us to migrate to Athena engine v3. How could it be solved? Thank you in advance

已提問 1 年前檢視次數 593 次
1 個回答
0

Hello, This is known issue in the Athena Engine version 3 with the creation of the views only, as explained in your example. For now the workaround is to Use Engine version 2.

"Multiple column syntaxes" with unnest() do work with Athena also but in your example you are only using unnest() on a single column "users". See below example which shows correct way of opening multiple columns with unnest():

WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users ,
ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users1
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users,users1) t(s,t)
AWS
支援工程師
Shubh
已回答 1 年前
  • Hello Do you have any plans on improving this behavior? Cause my team have the very same situation - we can't upgrade until there is a way to use unnest over an array of structs

    Thanks!

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南