CREATE TABLE AS SELECT statement fails because function returns timestamp with timezone, but Athena Engine version 3 doesn't allow casting from TimestampTZ to Timestamp either

0

I just tried out Athena Engine version 3 with my existing workload (that works under engine version 2), and I get a number of errors like this one:

NOT_SUPPORTED: Casting a Timestamp with Time Zone to Timestamp is not supported. You may need to manually clean the data at location 's3://[REDACTED]/tables/[REDACTED]' before retrying. Athena will not delete data in your account.

Looking at the Athena Engine version 3 documentation, there is this section documenting this limitation:

Casting from TimestampTZ to Timestamp is not supported

Error Message: Casting a Timestamp with Time Zone to Timestamp is not supported.

Suggested Solution: Any explicit or implicit cast from TimestampTZ to Timestamp throws the exception. If possible, remove the cast and use a different data type.

And indeed the queries where I get this error are doing such a cast:

    CAST(from_iso8601_timestamp(
		regexp_extract("$path", '(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z'))
		AS TIMESTAMP)

But my problem is that the suggestion to remove the CAST function doesn't work for me either, because my queries are CREATE TABLE AS SELECT statements, so when run something like this:

CREATE TABLE table_name AS 
SELECT from_iso8601_timestamp(
		regexp_extract("$path", '(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z'))
FROM source_table_name

I get this error:

NOT_SUPPORTED: Unsupported Hive type: timestamp(3) with time zone. You may need to manually clean the data at location 's3://[REDACTED]/tables/[REDACTED]' before retrying. Athena will not delete data in your account.

And indeed, the latter error is the reason I ever had the CAST function in these statements.

Doing just the SELECT query (instead of `CREATE TABLE AS SELECT) works in the latter case... but I really do need to create the table.

This becomes an even bigger problem when we look at the Trino documentation for date/time functions and notice that it has numerous function that return timestamp with time zone, and that it's very easy to run into this problem, like this CTAS statement illustrates:

CREATE TABLE default.test WITH (
    format = 'Parquet'
) AS
SELECT 
    -- All of these functions return TIMESTAMP WITH TIME ZONE:
    current_timestamp AS a,
    current_timestamp(6) AS b,
    from_iso8601_timestamp('2021-01-02T03:05:05') AS c,
    from_iso8601_timestamp_nanos('2021-01-02T03:05:05.123456') AS d,
    from_unixtime(0, 'UTC') AS e,
    now() AS f,
    parse_datetime('2022-04-05 15:55', 'yyyy-MM-DD HH:mm') AS g
;

Unless Athena Engine version 3 is patched to permit such casts, an easy, robust workaround needs to be documented.

asked 2 years ago289 views
No Answers

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions