Athena engine v3, errors in VIEW with UNNEST

1

It seems that recent version of Athena engine (v3) has severe inconsistency with UNNEST ARRAY of structs. Moreover, it was not documented among breaking changes at https://docs.aws.amazon.com/athena/latest/ug/engine-versions-reference-0003.html. To reproduce:

WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(s);
-- OK
CREATE OR REPLACE VIEW v_test AS 
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(s);
-- line 9:36: Column alias list has 1 entries but 't' has 2 columns available
CREATE OR REPLACE VIEW v_test AS 
WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users) t(name,age);
-- OK
select * from v_test
-- INVALID_VIEW: line 1:15: Failed analyzing stored view: line 9:12: Column alias list has 2 entries but 't' has 1 columns available

As opposite to that, in Athena engine v2 , the "single column syntax" t(s) works consistently in both CREATE VIEW, SELECT FROM VIEW and immediate SELECT. In current Trino (v. 402), the "multiple column syntax" t(name,age) works consistently in both CREATE VIEW, SELECT FROM VIEW and immediate SELECT.

These errors prevent us to migrate to Athena engine v3. How could it be solved? Thank you in advance

已提问 1 年前591 查看次数
1 回答
0

Hello, This is known issue in the Athena Engine version 3 with the creation of the views only, as explained in your example. For now the workaround is to Use Engine version 2.

"Multiple column syntaxes" with unnest() do work with Athena also but in your example you are only using unnest() on a single column "users". See below example which shows correct way of opening multiple columns with unnest():

WITH dataset AS (
  SELECT ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users ,
ARRAY[
    CAST(ROW('Bob', 38) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Alice', 35) AS ROW(name VARCHAR, age INTEGER)),
    CAST(ROW('Jane', 27) AS ROW(name VARCHAR, age INTEGER))
  ] AS users1
)
SELECT t.* FROM dataset CROSS JOIN UNNEST(users,users1) t(s,t)
AWS
支持工程师
Shubh
已回答 1 年前
  • Hello Do you have any plans on improving this behavior? Cause my team have the very same situation - we can't upgrade until there is a way to use unnest over an array of structs

    Thanks!

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则