Does Athena Support Numeric Data type?

0

Using SQL Alchemy ORM to query an AWS athena database. First off, here is the raw SQL query, which is working perfectly

WITH zi AS (SELECT
       trip.start_dt,
       address.zone_id
FROM view_midtable_esc_trip_details_cleaned AS trip
INNER JOIN view_midtable_esc_address_expanded AS address
ON trip.start_address_ref = address.address_ref),

all_trips AS (SELECT
date_parse(
        concat(
            date_format(zi.start_dt, '%Y-%m-%d %H:'),
-- The following line causing issues when converting to ORM
            CAST(FLOOR(extract(minute from zi.start_dt)/15)*15 AS VARCHAR),
            ':00'),
        '%Y-%m-%d %H:%i:%s'
    ) AS modified_timestamp,
    zi.zone_id
FROM zi)

SELECT COUNT(*) as demand,
modified_timestamp,
zone_id FROM all_trips GROUP BY modified_timestamp, zone_id;

Now, here is the relevant part of the code where I am doing the minutes extraction from the timestamp, and replacing it with the following logic

floor(original_minutes/15)*15

minutes_interval:int=15
quantised_minute:BinaryExpression=func.FLOOR(func.extract('minute', 
zi.columns[original_timestamp])/minutes_interval)*minutes_interval
modified_timestamp = func.date_format(func.concat(
    func.date_format(zi.columns[original_timestamp], '%Y-%m-%d %H:'),
func.CAST(quantised_minute, VARCHAR)), 
                                      text("'%Y-%m-%d %H:%i'")).label(name=original_timestamp)    
query:Query=session.query(modified_timestamp, 
                          zi.columns[zone_col], 
                          func.count().label(name=LABEL)).group_by(modified_timestamp, zi.columns[zone_col])

When I print the raw query that is sent to the engine, this is getting translated by ORM as (comments and formatting mine)

SELECT date_format(concat(date_format(anon_1.start_dt, '%Y-%m-%d %H:'), 
-- What is this extra casting?
CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), 

'%Y-%m-%d %H:%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 
FROM ...

and the python error message is

OperationalError: (pyathena.error.OperationalError) TYPE_MISMATCH: line 1:123: Unknown type: NUMERIC
[SQL: SELECT date_format(concat(date_format(anon_1.start_dt, '%%Y-%%m-%%d %%H:'), CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), '%%Y-%%m-%%d %%H:%%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 

So, to verify it, I put this in Athena

SELECT CAST(15 AS NUMERIC);

and, sure enough, it is throwing the exact same error. So, the question is how to divide by 15 in SQL Alchemy in a way that is compatible with athena?

If it is important, I am using pyathena for engine.

已提問 4 個月前檢視次數 453 次
1 個回答
0

Since AWS do not own neither “pyathena” nor SQL Alchemy, AWS can not comment on how these engines are converting your query.

Secondly, as mentioned in error there is no datatype called NUMERIC in Athena. Kindly refer to [1] for the supported datatypes.

[1] https://docs.aws.amazon.com/athena/latest/ug/data-types.html

已回答 4 個月前
profile picture
專家
已審閱 1 個月前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南