S3 table bucket created by Apache Iceberg and PySpark cannot be queried in the Athena CLI


I have an S3 table bucket and I am using PySpark and Apache Iceberg to write data to the bucket.

df.writeTo("s3tablesbucket.stg.sgsbi_dbo_splusorders2") \
  .partitionedBy("report_date") \
  .createOrReplace()

I can read the table contents using PySpark.

spark.sql("""select * from s3tablesbucket.stg.sgsbi_dbo_splusorders2 limit 1""").show()

+-------------+-----------+-----------+
|code_customer|surena_code|report_date|
+-------------+-----------+-----------+
|       181498|        498| 1404-02-09|
+-------------+-----------+-----------+

However, I can't do it using the CLI or the Athena query editor.

SELECT * FROM "sgsbi_dbo_splusorders2" limit 1;

GENERIC_INTERNAL_ERROR: Client initialization failed on credential refresh This query ran against the "stg" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 8f976aa8-cdcf-42ff-b944-a11a8072df8c

Not sure what is wrong. I am writing the data to the S3 table bucket as the root account, using the account's access key ID and secret.

asked 19 days ago · 46 views
1 Answer

The error you're encountering can stem from a few different issues when querying Iceberg tables through Athena. Here are some things to check:

  1. First, ensure proper catalog configuration (note: the DDL below is illustrative; depending on your Athena setup, catalog registration may instead happen through the console or the Glue Data Catalog integration):
-- Register an Iceberg catalog for Athena (syntax may vary by engine version)
CREATE CATALOG iceberg_catalog
WITH (
  catalog_type = 'iceberg',
  s3_bucket = 's3tablesbucket',
  warehouse_dir = 's3://s3tablesbucket/'
);
  2. Update your query syntax to use a fully qualified table name:
-- Use the proper catalog and schema reference
SELECT * 
FROM iceberg_catalog.stg.sgsbi_dbo_splusorders2 
LIMIT 1;
  3. Check IAM permissions: Ensure your IAM role/user has these permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "athena:*",
                "glue:*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::s3tablesbucket",
                "arn:aws:s3:::s3tablesbucket/*"
            ]
        }
    ]
}
  4. Verify Glue Catalog integration:
# When writing with PySpark, ensure the table's catalog is backed by the
# Glue/Hive metastore. Note: in Iceberg, catalog-impl is a Spark catalog
# config (spark.sql.catalog.<name>.catalog-impl), not a per-write option.
df.writeTo("s3tablesbucket.stg.sgsbi_dbo_splusorders2") \
  .partitionedBy("report_date") \
  .createOrReplace()
  5. Check table properties in Glue:
-- Verify table format is properly set
SHOW TBLPROPERTIES sgsbi_dbo_splusorders2;
  6. Configure Athena workgroup settings:
  • Ensure output location is set
  • Verify workgroup has necessary permissions
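Before digging further into Athena itself, it can help to sanity-check that the policy document grants the S3 read actions at all. Below is a minimal sketch in plain Python (no AWS calls); the required-action set mirrors the example policy above and is an assumption for illustration, not an official list:

```python
import json

# Assumed minimal S3 read actions for Athena queries (mirrors the policy above).
REQUIRED_S3_ACTIONS = {"s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket"}

def missing_s3_actions(policy_json: str) -> set:
    """Return the actions from REQUIRED_S3_ACTIONS that no Allow statement grants.

    This is a simplified check: it ignores Resource scoping, Deny statements,
    and condition keys, and only expands the 's3:*' and '*' wildcards.
    """
    policy = json.loads(policy_json)
    granted = set()
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            if action in ("s3:*", "*"):
                granted |= REQUIRED_S3_ACTIONS
            else:
                granted.add(action)
    return REQUIRED_S3_ACTIONS - granted
```

If this returns a non-empty set for the policy attached to your Athena principal, fix the policy before troubleshooting anything else.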

If still experiencing issues:

  1. Try using the AWS CLI to verify access:
aws s3 ls s3://s3tablesbucket/stg/sgsbi_dbo_splusorders2/
  2. Check the table location:
SHOW CREATE TABLE iceberg_catalog.stg.sgsbi_dbo_splusorders2;
  3. Verify the metadata files:
aws s3 ls s3://s3tablesbucket/stg/sgsbi_dbo_splusorders2/metadata/
  4. Update table properties if needed:
ALTER TABLE iceberg_catalog.stg.sgsbi_dbo_splusorders2 
SET TBLPROPERTIES ('table_type' = 'ICEBERG');
  5. Check for proper file formats:
# When writing, specify format explicitly
df.writeTo("s3tablesbucket.stg.sgsbi_dbo_splusorders2") \
  .partitionedBy("report_date") \
  .option("write-format", "parquet") \
  .createOrReplace()
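To confirm that the metadata listing from step 3 actually contains a current Iceberg metadata file, a small helper can pick the newest one. The `<sequence>-<uuid>.metadata.json` naming used here is the common Iceberg convention, but verify it against your own bucket:

```python
import re

def latest_metadata_file(keys):
    """Return the S3 key with the highest Iceberg metadata sequence number.

    Assumes the common '<seq>-<uuid>.metadata.json' naming convention;
    returns None if no key matches.
    """
    pattern = re.compile(r"(\d+)-[0-9a-f\-]+\.metadata\.json$")
    best_key, best_seq = None, -1
    for key in keys:
        match = pattern.search(key)
        if match and int(match.group(1)) > best_seq:
            best_seq, best_key = int(match.group(1)), key
    return best_key
```

If this returns None for the keys under `metadata/`, the table's Iceberg metadata was never written where Athena expects it, which would explain the query failure.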

If the issue persists:

  1. Verify that your Athena engine version supports Iceberg (engine version 3 is recommended)
  2. Check if you need to update table metadata in Glue
  3. Ensure all required dependencies are properly configured
  4. Review CloudWatch logs for detailed error messages

Final troubleshooting steps:

-- Refresh table metadata (procedure names and availability vary by engine)
CALL system.sync_table_properties('iceberg_catalog.stg.sgsbi_dbo_splusorders2');

-- Check table format (metadata table names also vary by engine)
SELECT *
FROM iceberg_catalog.system.tables
WHERE table_name = 'sgsbi_dbo_splusorders2';

-- Verify partitioning
SHOW CREATE TABLE iceberg_catalog.stg.sgsbi_dbo_splusorders2;
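Finally, note the hint in the original error message: "This query ran against the 'stg' database, unless qualified by the query." Unqualified names fall back to defaults, so an unqualified `SELECT` may resolve against a different catalog than the one PySpark wrote to. A tiny helper illustrating how a partial name expands (the default catalog name `awsdatacatalog` is Athena's standard default; the `stg` default database here is an assumption matching the question):

```python
def qualify(table_ref, default_catalog="awsdatacatalog", default_database="stg"):
    """Expand a possibly partial dotted table reference into
    (catalog, database, table), filling missing parts from the defaults.

    Simplified sketch: does not handle quoted identifiers containing dots.
    """
    parts = [p.strip('"') for p in table_ref.split(".")]
    if len(parts) == 1:
        return (default_catalog, default_database, parts[0])
    if len(parts) == 2:
        return (default_catalog, parts[0], parts[1])
    return tuple(parts[:3])
```

This makes the mismatch visible: `qualify('"sgsbi_dbo_splusorders2"')` resolves against the default catalog, not the catalog the table was created in, which is why the fully qualified form in step 2 is safer.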

Remember to:

  • Use the latest Athena engine version
  • Keep table statistics updated
  • Monitor query performance
  • Maintain proper access controls
answered 19 days ago by an AWS EXPERT
reviewed 15 days ago
