Athena query does not see any columns

0

I created a table in the Glue Catalog via a Glue job. The job is very simple: it reads data from a crawled MySQL table and writes it to GZIP parquet files. When I try to use the new table in another Glue Job all seems fine and the data preview also works.

But when I try to query the table using Athena, I get this error: COLUMN_NOT_FOUND: line 1:8: SELECT * not allowed from relation that has no columns

Using the 'generate DDL' feature in Athena give me this output: CREATE EXTERNAL TABLE countries_np_gzip_v2( iso_country_code string, name string, id int, phone_code string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3://[snip]/countries_np_gzip_v2/' TBLPROPERTIES ( 'CreatedByJob'='test_countries_no_partition', 'CreatedByJobRun'='jr_82b4f7e64c96e38abc76820159e4bbf35de22a38fa952074689bf839860b7885', 'classification'='parquet', 'useGlueParquetWriter'='true')

I have tried version 2 and version 3 of Glue and got the same result. What does seem strange is that the TBLPROPERTIES does not contain any information about the compression (I used GZIP).

  • The compression should be inside the parquet files, so the reader figures out how the row groups are compressed. It's not relevant to the table since each file (actually each row group inside) could use a different compression. Is Athena able to list the columns in the table list on the left?

  • @Gonzalo. Yes, the columns are shown correctly in the table list on the left-hand side. If I bypass the Glue Catalog -- i.e. add a table from Athena via S3 directly -- it works as expected.

Willem
asked 10 months ago2459 views
2 Answers
0

Hello there,

It's intriguing that Athena is returning the COLUMN_NOT_FOUND error even if your Glue jobs and data previews are functioning normally.

In order to find the problem's cause and fix it, let's try the following steps:

  1. To access the databases and tables, AWS Glue and Amazon Athena must be in the same AWS region. Verify that the region you selected for both services is the same.

  2. Check one of the Parquet files in the S3 bucket manually, since S3 is the store you're using. To check if the columns are constructed properly, you can download a file and examine it with a Parquet viewer.

  3. This may sound redundant, but occasionally a straightforward restart of the Glue crawler will assist in detecting the new schemas.

  4. When using Athena with Glue, it's crucial to have compatible versions.

If it doesn't work, no worries, we'll sort it out together! Stay cool and keep clouding!

AWS
e_mora
answered 10 months ago
  • Thank you for your answer; but I am afraid I am still stuck. I have tried using the same version and double-checked the region (points 1 and 4). Point 2: I am sure the Parquet files are OK because when I use Athena straight via S3 (I bypass the catalog) then the query works. And as for point 3: I have not used a Glue crawler in my setup.

0

I have had this issue. It seems like an acess issue. Are you using Lake Formation? If so, please check if SELECT access has been granted on the table level and 'Describe' has been granted at the database level.

profile picture
answered a month ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions