Skip to content

How to query Athena Table that has a dash character in a column name?

0

I have a file in S3 with the following contents:

{"foo-bar": {"name":"Mercury","distanceFromSun":0.39,"orbitalPeriod":0.24,"dayLength":58.65}}{"foo-bar": {"name":"Venus","distanceFromSun":0.72,"orbitalPeriod":0.62,"dayLength":243.02}}

The Glue table definition is:

CREATE EXTERNAL TABLE `planets_ion_2`(
  `foo-bar` struct<name:string,distancefromsun:decimal(38,18),orbitalperiod:decimal(38,18),daylength:decimal(38,18)> COMMENT '')
ROW FORMAT SERDE 
  'com.amazon.ionhiveserde.IonHiveSerDe' 
STORED AS INPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonInputFormat' 
OUTPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonOutputFormat'
LOCATION
  's3://<some_bucket>/AthenaDataStore/Planets2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1740775321')

I get bank values for the "name" key when I query Athena like so:

select "foo-bar".name from planets_ion_2

Enter image description here

If I remove the dash from the top level key in the S3 json, from "foo-bar" to "foobar", it works fine:

{"foobar": {"name":"Mercury","distanceFromSun":0.39,"orbitalPeriod":0.24,"dayLength":58.65}}{"foobar": {"name":"Venus","distanceFromSun":0.72,"orbitalPeriod":0.62,"dayLength":243.02}}

CREATE EXTERNAL TABLE `planets_ion_2`(
  `foobar` struct<name:string,distancefromsun:decimal(38,18),orbitalperiod:decimal(38,18),daylength:decimal(38,18)> COMMENT '')
ROW FORMAT SERDE 
  'com.amazon.ionhiveserde.IonHiveSerDe' 
STORED AS INPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonInputFormat' 
OUTPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonOutputFormat'
LOCATION
  's3://<some_bucket>/AthenaDataStore/Planets2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1740775321')

Enter image description here

So I can't seem to escape the dash. I've tried a lot of things without success, including backticks: select `foo-bar`.name from planets_ion_2

Any time I try to escape with backtick, I get the error:

Queries of this type are not supported

Enter image description here

Thanks in advance for any help.

asked 8 months ago437 views
1 Answer
0

When dealing with column names that contain special characters like dashes in Amazon Athena, you need to use a specific syntax to query them correctly. In your case, with the column name "foo-bar", you should use double quotes to enclose the column name, not backticks. Here's how you can modify your query:

SELECT "foo-bar".name FROM planets_ion_2

This syntax should allow you to access the nested "name" field within the "foo-bar" struct.

It's important to note that while Athena follows SQL standards for the most part, it has some specific rules when it comes to handling special characters in column names. Double quotes are used for column names with special characters, while backticks are typically used in the table creation statements or when working with the Athena console query editor for table names with special characters.

If you continue to experience issues, you might want to consider renaming the column in your data source to avoid the dash, as you've noticed it works fine without it. Alternatively, you could create a view that aliases the column name to something without special characters, which might make querying easier in the long run.

Remember, when creating tables in Athena, it's generally a good practice to avoid using special characters in column names to prevent these kinds of issues and make querying more straightforward.
Sources
Query W3C extended log file format - Amazon Athena
Create databases and tables - Amazon Athena

answered 8 months ago
  • The AI's suggestion of putting the column name in double quotes is part of the example I already tried.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.