Does Athena Support Numeric Data type?

0

Using SQL Alchemy ORM to query an AWS athena database. First off, here is the raw SQL query, which is working perfectly

WITH zi AS (SELECT
       trip.start_dt,
       address.zone_id
FROM view_midtable_esc_trip_details_cleaned AS trip
INNER JOIN view_midtable_esc_address_expanded AS address
ON trip.start_address_ref = address.address_ref),

all_trips AS (SELECT
date_parse(
        concat(
            date_format(zi.start_dt, '%Y-%m-%d %H:'),
-- The following line causing issues when converting to ORM
            CAST(FLOOR(extract(minute from zi.start_dt)/15)*15 AS VARCHAR),
            ':00'),
        '%Y-%m-%d %H:%i:%s'
    ) AS modified_timestamp,
    zi.zone_id
FROM zi)

SELECT COUNT(*) as demand,
modified_timestamp,
zone_id FROM all_trips GROUP BY modified_timestamp, zone_id;

Now, here is the relevant part of the code where I am doing the minutes extraction from the timestamp, and replacing it with the following logic

floor(original_minutes/15)*15

minutes_interval:int=15
quantised_minute:BinaryExpression=func.FLOOR(func.extract('minute', 
zi.columns[original_timestamp])/minutes_interval)*minutes_interval
modified_timestamp = func.date_format(func.concat(
    func.date_format(zi.columns[original_timestamp], '%Y-%m-%d %H:'),
func.CAST(quantised_minute, VARCHAR)), 
                                      text("'%Y-%m-%d %H:%i'")).label(name=original_timestamp)    
query:Query=session.query(modified_timestamp, 
                          zi.columns[zone_col], 
                          func.count().label(name=LABEL)).group_by(modified_timestamp, zi.columns[zone_col])

When I print the raw query that is sent to the engine, this is getting translated by ORM as (comments and formatting mine)

SELECT date_format(concat(date_format(anon_1.start_dt, '%Y-%m-%d %H:'), 
-- What is this extra casting?
CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), 

'%Y-%m-%d %H:%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 
FROM ...

and the python error message is

OperationalError: (pyathena.error.OperationalError) TYPE_MISMATCH: line 1:123: Unknown type: NUMERIC
[SQL: SELECT date_format(concat(date_format(anon_1.start_dt, '%%Y-%%m-%%d %%H:'), CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), '%%Y-%%m-%%d %%H:%%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 

So, to verify it, I put this in Athena

SELECT CAST(15 AS NUMERIC);

and, sure enough, it is throwing the exact same error. So, the question is how to divide by 15 in SQL Alchemy in a way that is compatible with athena?

If it is important, I am using pyathena for engine.

質問済み 4ヶ月前453ビュー
1回答
0

Since AWS do not own neither “pyathena” nor SQL Alchemy, AWS can not comment on how these engines are converting your query.

Secondly, as mentioned in error there is no datatype called NUMERIC in Athena. Kindly refer to [1] for the supported datatypes.

[1] https://docs.aws.amazon.com/athena/latest/ug/data-types.html

回答済み 4ヶ月前
profile picture
エキスパート
レビュー済み 1ヶ月前

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ