Skip to content

Athena - Metadata Query Issue

0

I am trying to get the metadata of a database by running a query in Athena:- SELECT 'DEV' DF_ENVIRONMENT, 'Source Layer' DATA_LAYER, CAST(TABLE_CATALOG AS VARCHAR) DATABASE_NAME, CAST(TABLE_SCHEMA AS VARCHAR) TABLE_SCHEMA, CAST(TABLE_NAME AS VARCHAR) TABLE_NAME, CAST(COLUMN_NAME AS VARCHAR) COLUMN_NAME, CAST(DATA_TYPE AS VARCHAR) DATA_TYPE, CAST(ORDINAL_POSITION AS INTEGER) ORDINAL_POSITION, CASE WHEN IS_NULLABLE = 'YES' THEN TRUE ELSE FALSE END AS IS_NULLABLE , CURRENT_DATE M_CREATED_ON_DTTM, CURRENT_DATE M_CHANGED_ON_DTTM FROM information_schema.columns where TABLE_SCHEMA = 'example_db'

Getting the BELOW ERROR:- GENERIC_USER_ERROR: Error listing table columns for catalog awsdatacatalog: Detected Iceberg type table without metadata location. Please make sure an Iceberg-enabled compute engine such as Athena or EMR Spark is used to create the table, or the table is created by using the Iceberg open source AWS library iceberg-aws. Setting table_type parameter in Glue metastore to create an Iceberg table is not supported. This query ran against the "example" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 28f33b2d-3f43-4fb8-b7f6-f2b89cce2eef

Need a solution for this, Or how to filter out the tables which has the metadata.

2 Answers
0

The error you're encountering suggests that there's an issue with one or more Iceberg tables in your database that don't have proper metadata locations. This is likely because these tables were not created using an Iceberg-enabled compute engine or the supported AWS Iceberg library.

To resolve this issue and retrieve metadata for the tables that do have proper metadata, you can modify your query to exclude the problematic Iceberg tables. Here's an approach you can try:

  1. First, identify the tables that are causing the issue by querying the Glue Data Catalog. You can do this with a query like:

    SELECT table_name FROM information_schema.tables WHERE table_schema = 'example_db' AND table_type = 'EXTERNAL_TABLE'

  2. Once you have the list of tables, you can modify your original query to exclude these problematic tables. Here's an example:

    SELECT 'DEV' AS DF_ENVIRONMENT, 'Source Layer' AS DATA_LAYER, CAST(TABLE_CATALOG AS VARCHAR) AS DATABASE_NAME, CAST(TABLE_SCHEMA AS VARCHAR) AS TABLE_SCHEMA, CAST(TABLE_NAME AS VARCHAR) AS TABLE_NAME, CAST(COLUMN_NAME AS VARCHAR) AS COLUMN_NAME, CAST(DATA_TYPE AS VARCHAR) AS DATA_TYPE, CAST(ORDINAL_POSITION AS INTEGER) AS ORDINAL_POSITION, CASE WHEN IS_NULLABLE = 'YES' THEN TRUE ELSE FALSE END AS IS_NULLABLE, CURRENT_DATE AS M_CREATED_ON_DTTM, CURRENT_DATE AS M_CHANGED_ON_DTTM FROM information_schema.columns WHERE TABLE_SCHEMA = 'example_db' AND TABLE_NAME NOT IN ( SELECT table_name FROM information_schema.tables WHERE table_schema = 'example_db' AND table_type = 'EXTERNAL_TABLE' )

This query should allow you to retrieve metadata for the tables that don't have issues with their Iceberg metadata.

For a long-term solution, you should review how the Iceberg tables are being created in your environment. Ensure that they are created using an Iceberg-enabled compute engine like Athena or EMR Spark, or by using the Iceberg open source AWS library iceberg-aws. Simply setting the table_type parameter in the Glue metastore is not sufficient for creating a properly configured Iceberg table.

If you need to work with the Iceberg tables specifically, you may need to recreate them using the proper tools or update their metadata locations. Consider reaching out to AWS support for guidance on properly setting up and managing Iceberg tables in your environment.
Sources
Troubleshoot issues in Athena - Amazon Athena
TABLE_NOT_FOUND when Running from Lambda but query runs successfully when running from editor | AWS re:Post
Querying metadata tables with AWS analytics services - Amazon Simple Storage Service

answered 7 months ago
0

Hello,

The error Detected Iceberg type table without metadata location occurs if your database example_db contains one or more Iceberg tables without the table property metadata_location. This property is required for Athena to retrieve column information of iceberg tables by inspecting the iceberg metadata.

  1. I would recommend you to run a cli command similar to the one like below to fetch list of the iceberg tables without the metadata_location. Note: I am using a third party library called jq for parsing output json.
aws glue get-tables --database-name example_db | jq -r 'select(.TableList[].Parameters.table_type != "ICEBERG") | .TableList[].Name,.TableList[].Parameters.metadata_location'
  1. Then, you can exclude those table from your query like below to get the metadata information of the remaining tables
SELECT 'DEV' DF_ENVIRONMENT, 'Source Layer' DATA_LAYER, CAST(TABLE_CATALOG AS VARCHAR) DATABASE_NAME, CAST(TABLE_SCHEMA AS VARCHAR) TABLE_SCHEMA, CAST(TABLE_NAME AS VARCHAR) TABLE_NAME, CAST(COLUMN_NAME AS VARCHAR) COLUMN_NAME, CAST(DATA_TYPE AS VARCHAR) DATA_TYPE, CAST(ORDINAL_POSITION AS INTEGER) ORDINAL_POSITION, CASE WHEN IS_NULLABLE = 'YES' THEN TRUE ELSE FALSE END AS IS_NULLABLE , CURRENT_DATE M_CREATED_ON_DTTM, CURRENT_DATE M_CHANGED_ON_DTTM FROM information_schema.columns where TABLE_SCHEMA = 'example_db' and TABLE_NAME not in (....)
AWS
SUPPORT ENGINEER
answered 7 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.