CREATE TABLE AS SELECT statement fails because function returns timestamp with timezone, but Athena Engine version 3 doesn't allow casting from TimestampTZ to Timestamp either

0

I just tried out Athena Engine version 3 with my existing workload (that works under engine version 2), and I get a number of errors like this one:

NOT_SUPPORTED: Casting a Timestamp with Time Zone to Timestamp is not supported. You may need to manually clean the data at location 's3://[REDACTED]/tables/[REDACTED]' before retrying. Athena will not delete data in your account.

Looking at the Athena Engine version 3 documentation, there is this section documenting this limitation:

Casting from TimestampTZ to Timestamp is not supported

Error Message: Casting a Timestamp with Time Zone to Timestamp is not supported.

Suggested Solution: Any explicit or implicit cast from TimestampTZ to Timestamp throws the exception. If possible, remove the cast and use a different data type.

And indeed the queries where I get this error are doing such a cast:

    CAST(from_iso8601_timestamp(
		regexp_extract("$path", '(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z'))
		AS TIMESTAMP)

But my problem is that the suggestion to remove the CAST function doesn't work for me either, because my queries are CREATE TABLE AS SELECT statements, so when run something like this:

CREATE TABLE table_name AS 
SELECT from_iso8601_timestamp(
		regexp_extract("$path", '(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z'))
FROM source_table_name

I get this error:

NOT_SUPPORTED: Unsupported Hive type: timestamp(3) with time zone. You may need to manually clean the data at location 's3://[REDACTED]/tables/[REDACTED]' before retrying. Athena will not delete data in your account.

And indeed, the latter error is the reason I ever had the CAST function in these statements.

Doing just the SELECT query (instead of `CREATE TABLE AS SELECT) works in the latter case... but I really do need to create the table.

This becomes an even bigger problem when we look at the Trino documentation for date/time functions and notice that it has numerous function that return timestamp with time zone, and that it's very easy to run into this problem, like this CTAS statement illustrates:

CREATE TABLE default.test WITH (
    format = 'Parquet'
) AS
SELECT 
    -- All of these functions return TIMESTAMP WITH TIME ZONE:
    current_timestamp AS a,
    current_timestamp(6) AS b,
    from_iso8601_timestamp('2021-01-02T03:05:05') AS c,
    from_iso8601_timestamp_nanos('2021-01-02T03:05:05.123456') AS d,
    from_unixtime(0, 'UTC') AS e,
    now() AS f,
    parse_datetime('2022-04-05 15:55', 'yyyy-MM-DD HH:mm') AS g
;

Unless Athena Engine version 3 is patched to permit such casts, an easy, robust workaround needs to be documented.

질문됨 2년 전293회 조회
답변 없음

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인