Does Athena Support Numeric Data type?

0

Using SQL Alchemy ORM to query an AWS athena database. First off, here is the raw SQL query, which is working perfectly

WITH zi AS (SELECT
       trip.start_dt,
       address.zone_id
FROM view_midtable_esc_trip_details_cleaned AS trip
INNER JOIN view_midtable_esc_address_expanded AS address
ON trip.start_address_ref = address.address_ref),

all_trips AS (SELECT
date_parse(
        concat(
            date_format(zi.start_dt, '%Y-%m-%d %H:'),
-- The following line causing issues when converting to ORM
            CAST(FLOOR(extract(minute from zi.start_dt)/15)*15 AS VARCHAR),
            ':00'),
        '%Y-%m-%d %H:%i:%s'
    ) AS modified_timestamp,
    zi.zone_id
FROM zi)

SELECT COUNT(*) as demand,
modified_timestamp,
zone_id FROM all_trips GROUP BY modified_timestamp, zone_id;

Now, here is the relevant part of the code where I am doing the minutes extraction from the timestamp, and replacing it with the following logic

floor(original_minutes/15)*15

minutes_interval:int=15
quantised_minute:BinaryExpression=func.FLOOR(func.extract('minute', 
zi.columns[original_timestamp])/minutes_interval)*minutes_interval
modified_timestamp = func.date_format(func.concat(
    func.date_format(zi.columns[original_timestamp], '%Y-%m-%d %H:'),
func.CAST(quantised_minute, VARCHAR)), 
                                      text("'%Y-%m-%d %H:%i'")).label(name=original_timestamp)    
query:Query=session.query(modified_timestamp, 
                          zi.columns[zone_col], 
                          func.count().label(name=LABEL)).group_by(modified_timestamp, zi.columns[zone_col])

When I print the raw query that is sent to the engine, this is getting translated by ORM as (comments and formatting mine)

SELECT date_format(concat(date_format(anon_1.start_dt, '%Y-%m-%d %H:'), 
-- What is this extra casting?
CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), 

'%Y-%m-%d %H:%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 
FROM ...

and the python error message is

OperationalError: (pyathena.error.OperationalError) TYPE_MISMATCH: line 1:123: Unknown type: NUMERIC
[SQL: SELECT date_format(concat(date_format(anon_1.start_dt, '%%Y-%%m-%%d %%H:'), CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), '%%Y-%%m-%%d %%H:%%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 

So, to verify it, I put this in Athena

SELECT CAST(15 AS NUMERIC);

and, sure enough, it is throwing the exact same error. So, the question is how to divide by 15 in SQL Alchemy in a way that is compatible with athena?

If it is important, I am using pyathena for engine.

질문됨 4달 전453회 조회
1개 답변
0

Since AWS do not own neither “pyathena” nor SQL Alchemy, AWS can not comment on how these engines are converting your query.

Secondly, as mentioned in error there is no datatype called NUMERIC in Athena. Kindly refer to [1] for the supported datatypes.

[1] https://docs.aws.amazon.com/athena/latest/ug/data-types.html

답변함 4달 전
profile picture
전문가
검토됨 한 달 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인