Does Athena Support Numeric Data type?

0

Using SQL Alchemy ORM to query an AWS athena database. First off, here is the raw SQL query, which is working perfectly

WITH zi AS (SELECT
       trip.start_dt,
       address.zone_id
FROM view_midtable_esc_trip_details_cleaned AS trip
INNER JOIN view_midtable_esc_address_expanded AS address
ON trip.start_address_ref = address.address_ref),

all_trips AS (SELECT
date_parse(
        concat(
            date_format(zi.start_dt, '%Y-%m-%d %H:'),
-- The following line causing issues when converting to ORM
            CAST(FLOOR(extract(minute from zi.start_dt)/15)*15 AS VARCHAR),
            ':00'),
        '%Y-%m-%d %H:%i:%s'
    ) AS modified_timestamp,
    zi.zone_id
FROM zi)

SELECT COUNT(*) as demand,
modified_timestamp,
zone_id FROM all_trips GROUP BY modified_timestamp, zone_id;

Now, here is the relevant part of the code where I am doing the minutes extraction from the timestamp, and replacing it with the following logic

floor(original_minutes/15)*15

minutes_interval:int=15
quantised_minute:BinaryExpression=func.FLOOR(func.extract('minute', 
zi.columns[original_timestamp])/minutes_interval)*minutes_interval
modified_timestamp = func.date_format(func.concat(
    func.date_format(zi.columns[original_timestamp], '%Y-%m-%d %H:'),
func.CAST(quantised_minute, VARCHAR)), 
                                      text("'%Y-%m-%d %H:%i'")).label(name=original_timestamp)    
query:Query=session.query(modified_timestamp, 
                          zi.columns[zone_col], 
                          func.count().label(name=LABEL)).group_by(modified_timestamp, zi.columns[zone_col])

When I print the raw query that is sent to the engine, this is getting translated by ORM as (comments and formatting mine)

SELECT date_format(concat(date_format(anon_1.start_dt, '%Y-%m-%d %H:'), 
-- What is this extra casting?
CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), 

'%Y-%m-%d %H:%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 
FROM ...

and the python error message is

OperationalError: (pyathena.error.OperationalError) TYPE_MISMATCH: line 1:123: Unknown type: NUMERIC
[SQL: SELECT date_format(concat(date_format(anon_1.start_dt, '%%Y-%%m-%%d %%H:'), CAST(FLOOR(EXTRACT(minute FROM anon_1.start_dt) / CAST(15 AS NUMERIC)) * 15 AS VARCHAR)), '%%Y-%%m-%%d %%H:%%i') AS start_dt, anon_1.zone_id, count(*) AS total_unique_demand 

So, to verify it, I put this in Athena

SELECT CAST(15 AS NUMERIC);

and, sure enough, it is throwing the exact same error. So, the question is how to divide by 15 in SQL Alchemy in a way that is compatible with athena?

If it is important, I am using pyathena for engine.

demandé il y a 4 mois453 vues
1 réponse
0

Since AWS do not own neither “pyathena” nor SQL Alchemy, AWS can not comment on how these engines are converting your query.

Secondly, as mentioned in error there is no datatype called NUMERIC in Athena. Kindly refer to [1] for the supported datatypes.

[1] https://docs.aws.amazon.com/athena/latest/ug/data-types.html

répondu il y a 4 mois
profile picture
EXPERT
vérifié il y a un mois

Vous n'êtes pas connecté. Se connecter pour publier une réponse.

Une bonne réponse répond clairement à la question, contient des commentaires constructifs et encourage le développement professionnel de la personne qui pose la question.

Instructions pour répondre aux questions