I'm attempting to run an Athena query using AWS Data Wrangler which is failing with an error indicating a column cannot be resolved. However when I view the failing query in the Athena console and rerun it there, it succeeds. I have a feeling it has something to do with the formatting with table or column names but I've been unsuccessful finding a work around. When I attempt the same query using vanilla boto3, it also fails with the same error. Here is an example of the code which fails followed by the same successful query ran in the console. (with some names anonymized for security) Adding to the strangeness further, if I remove the joins, it will error indicating the first column in the select cannot be resolved but the same query runs successfully in the console.
Python Code
import awswrangler as wr
query = """
SELECT
t1.description,
t1.model_number as model,
t1.p_id as p_id,
t1.created_at as createdAt,
t1.updated_at as updatedAt,
t3.description as platform,
t2.name as module
FROM
database_with_underscores.first_table t1
LEFT JOIN
database_with_underscores.other_table t2
ON
t2.id = t1.module_id
LEFT JOIN
database_with_underscores.lasttable t3
ON
t1.platform_id = t3.id
"""
df = wr.athena.read_sql_query(query, workgroup="primary", database="database_with_underscores")
Python Result
awswrangler.exceptions.QueryFailed: SYNTAX_ERROR: line 19:5: Column 't2.id' cannot be resolved
AWS Athena Console
SELECT
t1.description,
t1.model_number as model,
t1.p_id as p_id,
t1.created_at as createdAt,
t1.updated_at as updatedAt,
t3.description as platform,
t2.name as module
FROM
database_with_underscores.first_table t1
LEFT JOIN
database_with_underscores.other_table t2
ON
t2.id = t1.module_id
LEFT JOIN
database_with_underscores.lasttable t3
ON
t1.platform_id = t3.id
Console Result
Successful
Link to a diff of the execution where you can see the Query is identical but one succeeds and the other fails https://imgur.com/a/V9QSbIn
Ah yes, that was a bad copy/paste. In the source code prior to me anonymizing it, they're both identical yet the same issues persists. The both are in fact
t2.id = t1.module_id
Also for further clarification, when the query fails via Python, I go the console, click the failed query to load it and hit run without edits and it succeeds. I'm not actually copying/pasting anything between the two systems.