スキップしてコンテンツを表示

How to query Athena Table that has a dash character in a column name?

0

I have a file in S3 with the following contents:

{"foo-bar": {"name":"Mercury","distanceFromSun":0.39,"orbitalPeriod":0.24,"dayLength":58.65}}{"foo-bar": {"name":"Venus","distanceFromSun":0.72,"orbitalPeriod":0.62,"dayLength":243.02}}

The Glue table definition is:

CREATE EXTERNAL TABLE `planets_ion_2`(
  `foo-bar` struct<name:string,distancefromsun:decimal(38,18),orbitalperiod:decimal(38,18),daylength:decimal(38,18)> COMMENT '')
ROW FORMAT SERDE 
  'com.amazon.ionhiveserde.IonHiveSerDe' 
STORED AS INPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonInputFormat' 
OUTPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonOutputFormat'
LOCATION
  's3://<some_bucket>/AthenaDataStore/Planets2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1740775321')

I get bank values for the "name" key when I query Athena like so:

select "foo-bar".name from planets_ion_2

Enter image description here

If I remove the dash from the top level key in the S3 json, from "foo-bar" to "foobar", it works fine:

{"foobar": {"name":"Mercury","distanceFromSun":0.39,"orbitalPeriod":0.24,"dayLength":58.65}}{"foobar": {"name":"Venus","distanceFromSun":0.72,"orbitalPeriod":0.62,"dayLength":243.02}}

CREATE EXTERNAL TABLE `planets_ion_2`(
  `foobar` struct<name:string,distancefromsun:decimal(38,18),orbitalperiod:decimal(38,18),daylength:decimal(38,18)> COMMENT '')
ROW FORMAT SERDE 
  'com.amazon.ionhiveserde.IonHiveSerDe' 
STORED AS INPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonInputFormat' 
OUTPUTFORMAT 
  'com.amazon.ionhiveserde.formats.IonOutputFormat'
LOCATION
  's3://<some_bucket>/AthenaDataStore/Planets2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1740775321')

Enter image description here

So I can't seem to escape the dash. I've tried a lot of things without success, including backticks: select `foo-bar`.name from planets_ion_2

Any time I try to escape with backtick, I get the error:

Queries of this type are not supported

Enter image description here

Thanks in advance for any help.

質問済み 9ヶ月前489ビュー
1回答
0

When dealing with column names that contain special characters like dashes in Amazon Athena, you need to use a specific syntax to query them correctly. In your case, with the column name "foo-bar", you should use double quotes to enclose the column name, not backticks. Here's how you can modify your query:

SELECT "foo-bar".name FROM planets_ion_2

This syntax should allow you to access the nested "name" field within the "foo-bar" struct.

It's important to note that while Athena follows SQL standards for the most part, it has some specific rules when it comes to handling special characters in column names. Double quotes are used for column names with special characters, while backticks are typically used in the table creation statements or when working with the Athena console query editor for table names with special characters.

If you continue to experience issues, you might want to consider renaming the column in your data source to avoid the dash, as you've noticed it works fine without it. Alternatively, you could create a view that aliases the column name to something without special characters, which might make querying easier in the long run.

Remember, when creating tables in Athena, it's generally a good practice to avoid using special characters in column names to prevent these kinds of issues and make querying more straightforward.
Sources
Query W3C extended log file format - Amazon Athena
Create databases and tables - Amazon Athena

回答済み 9ヶ月前
  • The AI's suggestion of putting the column name in double quotes is part of the example I already tried.

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

関連するコンテンツ