This error comes from a configuration issue when using dbt with AWS Glue against Iceberg tables: the dbt configuration isn't set up to handle Iceberg tables.
When working with Iceberg tables in AWS Glue through dbt, you need specific Spark configurations to enable proper catalog integration. Here's how to fix this:
- Make sure you have the correct Spark session configuration in your dbt profile:
    --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
    --conf spark.sql.catalog.spark_catalog.warehouse=s3://your-bucket-path
    --conf spark.sql.catalog.spark_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog
    --conf spark.sql.catalog.spark_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO
    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
- Don't use the `datalake_formats` configuration, as it might cause conflicts.
- Ensure you're using Glue version 5.0 or later and include the necessary Iceberg JAR files. The runtime JAR has to match the job's Spark version: Glue 5.0 runs Spark 3.5, so use the `3.5_2.12` build rather than `3.3_2.12`, and keep both JARs on the same Iceberg release:
  - iceberg-aws-bundle-1.7.1.jar
  - iceberg-spark-runtime-3.5_2.12-1.7.1.jar
- In your model SQL, make sure you're using the correct table format. Since you're working with Iceberg tables, specify `USING ICEBERG` instead of `USING PARQUET` in your CREATE TABLE statement.
- Check that your IAM role has sufficient permissions for both AWS Glue catalog operations and S3 access to your data lake location.
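To avoid typos when copying the Spark settings from the first step, they can be kept in one place and rendered as `--conf` flags. The following is a minimal sketch, not part of dbt or AWS Glue: the function names `iceberg_glue_conf` and `build_conf_flags` are hypothetical, and the warehouse path is the same placeholder used above.

```python
def iceberg_glue_conf(warehouse: str) -> dict:
    """Return the Iceberg/Glue catalog settings listed in this answer.

    `warehouse` is a placeholder S3 path such as "s3://your-bucket-path".
    """
    return {
        "spark.sql.catalog.spark_catalog": "org.apache.iceberg.spark.SparkSessionCatalog",
        "spark.sql.catalog.spark_catalog.warehouse": warehouse,
        "spark.sql.catalog.spark_catalog.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
        "spark.sql.catalog.spark_catalog.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    }


def build_conf_flags(settings: dict) -> str:
    """Render each setting as a `--conf key=value` flag, in insertion order."""
    return " ".join(f"--conf {key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical usage: print the flags for a placeholder bucket.
    print(build_conf_flags(iceberg_glue_conf("s3://your-bucket-path")))
```

Check how your adapter version expects these settings to be passed (for example, whether the first `--conf` prefix should be omitted) before pasting the output into your profile.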
With these changes in place, you should be able to run your dbt models with AWS Glue on Iceberg tables.
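For the IAM check, it can help to see what a role policy covering both the Glue catalog and the S3 warehouse might look like. The sketch below is illustrative only: the helper name `iceberg_role_policy`, the bucket name, and the choice of actions are assumptions, the list is not guaranteed to be complete for your workload, and the Glue resource should be narrowed to your own catalog, database, and table ARNs.

```python
import json


def iceberg_role_policy(bucket: str, prefix: str = "") -> dict:
    """Build an example IAM policy document for Glue catalog and S3 access.

    `bucket` and `prefix` are placeholders for your data lake location.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Catalog operations used when reading and writing table metadata.
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:CreateTable",
                    "glue:UpdateTable",
                    "glue:DeleteTable",
                ],
                # Assumption: wildcard for brevity; scope this down in practice.
                "Resource": "*",
            },
            {
                # Object and listing access to the Iceberg warehouse location.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/{prefix}*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name; replace with your own.
    print(json.dumps(iceberg_role_policy("your-bucket-path"), indent=2))
```

Comparing the printed document with the policies attached to the role your Glue session runs under is a quick way to spot a missing permission.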
Sources
AWS Glue Interactive Sessions Query Iceberg and Non Iceberg Tables | AWS re:Post
Glue script job error spark_catalog requires a single-part namespace, but got [glue_catalog, foo] | AWS re:Post
