Does Athena Support Numeric Data type?

0

Using SQL Alchemy ORM to query an AWS athena database. First off, here is the raw SQL query, which is working perfectly

WITH zi AS (SELECT
       trip.start_dt,
       address.zone_id
FROM view_midtable_esc_trip_details_cleaned AS trip
INNER JOIN view_midtable_esc_address_expanded AS address
ON trip.start_address_ref = address.address_ref),

all_trips AS (SELECT
date_parse(
        concat(
            date_format(zi.start_dt, '%Y-%m-%d %H:'),
-- The following line causing issues when converting to ORM
            CAST(FLOOR(extract(minute from zi.start_dt)/15)*15 AS VARCHAR),
            ':00'),
        '%Y-%m-%d %H:%i:%s'
    ) AS modified_timestamp,
    zi.zone_id
FROM zi)

SELECT COUNT(*) as demand,
modified_timestamp,
zone_id FROM all_trips GROUP BY modified_timestamp, zone_id;

Now, here is the relevant part of the code where I am doing the minutes extraction from the timestamp, and replacing it with the following logic

floor(original_minutes/15)*15

minutes_interval:int=15
quantised_minute:BinaryExpression=func.FLOOR(func.extract('minute', 
zi.columns[original_timestamp])/minutes_interval)*minutes_interval
modified_timestamp = func.date_format(func.concat(
    func.date_format(zi.columns[original_timestamp], '%Y-%m-%d %H:'),
func.CAST(quantised_minute, VARCHAR)), 
                                      text("'%Y-%m-%d %H:%i'")).label(name=original_timestamp)    
query:Query=session.query(modified_timestamp, 
                          zi.columns[zone_col], 
                          func.count().label(name=LABEL)).group_by(modified_timestamp, zi.columns[zone_col])

When I print the raw query that is sent to the engine, this is getting translated by ORM as (comments and formatting mine)

SELECT date_format(concat(date_format(anon_1.start_dt, '%Y-%m-%d %H:'), 
-- What is this extra casting?
CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), 

'%Y-%m-%d %H:%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 
FROM ...

and the python error message is

OperationalError: (pyathena.error.OperationalError) TYPE_MISMATCH: line 1:123: Unknown type: NUMERIC
[SQL: SELECT date_format(concat(date_format(anon_1.start_dt, '%%Y-%%m-%%d %%H:'), CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), '%%Y-%%m-%%d %%H:%%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 

So, to verify it, I put this in Athena

SELECT CAST(15 AS NUMERIC);

and, sure enough, it is throwing the exact same error. So, the question is how to divide by 15 in SQL Alchemy in a way that is compatible with athena?

If it is important, I am using pyathena for engine.

asked 4 months ago436 views
1 Answer
0

Since AWS do not own neither “pyathena” nor SQL Alchemy, AWS can not comment on how these engines are converting your query.

Secondly, as mentioned in error there is no datatype called NUMERIC in Athena. Kindly refer to [1] for the supported datatypes.

[1] https://docs.aws.amazon.com/athena/latest/ug/data-types.html

answered 4 months ago
profile picture
EXPERT
reviewed 23 days ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions