I can see a few issues with your current approaches to giving AWS Glue jobs cross-account read/write access to Iceberg tables in another account's Glue Data Catalog.
Your CSV table access works while your Iceberg table access fails because Iceberg tables are handled differently in cross-account scenarios. Let me explain the issue and provide a solution:
Main Issue
When using GlueContext.create_data_frame.from_catalog() with a catalog_id parameter, it works for standard Glue tables (like your CSV table), but for Iceberg tables, you need to use Spark SQL with the properly configured catalog.
Solution
- Resource Policy Configuration:
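- In Account B, ensure your Glue Data Catalog resource policy includes the necessary permissions for Account A's role. The actions below cover reads only; writing to Iceberg tables also needs write actions such as glue:UpdateTable: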
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": ["arn:aws:iam::ACCOUNT-A-ID:role/GlueRoleA"]
      },
      "Action": [
        "glue:GetDatabase",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition"
      ],
      "Resource": [
        "arn:aws:glue:us-east-1:ACCOUNT-B-ID:catalog",
        "arn:aws:glue:us-east-1:ACCOUNT-B-ID:database/welcome_db",
        "arn:aws:glue:us-east-1:ACCOUNT-B-ID:table/welcome_db/*"
      ]
    }
  ]
}
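As a rough illustration, the same policy can be generated from the two account IDs so the ARNs stay consistent. This is only a sketch: the function name `build_catalog_policy` and the account IDs `444455556666` (Account A) and `111122223333` (Account B) are placeholders invented for the example, not values from your setup.

```python
import json

# Read-only Glue actions, copied from the policy above.
READ_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions",
    "glue:BatchGetPartition",
]


def build_catalog_policy(account_a_id, account_b_id, role_name="GlueRoleA",
                         region="us-east-1", database="welcome_db"):
    """Build Account B's Data Catalog resource policy as a dict (hypothetical helper)."""
    prefix = f"arn:aws:glue:{region}:{account_b_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{account_a_id}:role/{role_name}"]
                },
                "Action": list(READ_ACTIONS),
                "Resource": [
                    f"{prefix}:catalog",
                    f"{prefix}:database/{database}",
                    f"{prefix}:table/{database}/*",
                ],
            }
        ],
    }


# Placeholder account IDs, assumed for illustration only.
policy = build_catalog_policy("444455556666", "111122223333")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON string is what you would paste into the Data Catalog settings in Account B, or pass as `PolicyInJson` to the boto3 Glue client's `put_resource_policy` call.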
- Lake Formation Configuration:
- If using Lake Formation, add a statement to the Data Catalog resource policy to permit cross-account grants:
{
  "Effect": "Allow",
  "Principal": {
    "Service": "ram.amazonaws.com"
  },
  "Action": "glue:ShareResource",
  "Resource": [
    "arn:aws:glue:us-east-1:ACCOUNT-B-ID:table/*/*",
    "arn:aws:glue:us-east-1:ACCOUNT-B-ID:database/*",
    "arn:aws:glue:us-east-1:ACCOUNT-B-ID:catalog"
  ]
}
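Since the Data Catalog has a single resource policy, this statement has to be merged into whatever statements are already there rather than saved on its own. A minimal sketch of that merge, assuming the existing policy is already loaded as a dict; `add_ram_share_statement` and the account ID `111122223333` are hypothetical names chosen for the example.

```python
import copy


def add_ram_share_statement(policy, account_b_id, region="us-east-1"):
    """Return a copy of `policy` with the RAM share statement added once."""
    prefix = f"arn:aws:glue:{region}:{account_b_id}"
    statement = {
        "Effect": "Allow",
        "Principal": {"Service": "ram.amazonaws.com"},
        "Action": "glue:ShareResource",
        "Resource": [
            f"{prefix}:table/*/*",
            f"{prefix}:database/*",
            f"{prefix}:catalog",
        ],
    }
    merged = copy.deepcopy(policy)  # leave the caller's dict untouched
    statements = merged.setdefault("Statement", [])
    if statement not in statements:  # avoid adding a duplicate on re-runs
        statements.append(statement)
    return merged


# Placeholder starting policy and account ID, assumed for illustration.
existing = {"Version": "2012-10-17", "Statement": []}
merged = add_ram_share_statement(existing, "111122223333")
merged_again = add_ram_share_statement(merged, "111122223333")
```

Running the helper twice leaves the policy unchanged the second time, which makes it safer to re-apply from a deployment script.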
- Code Modification:
For Iceberg tables, instead of using create_data_frame.from_catalog(), use Spark SQL directly:
# For Iceberg tables, use Spark SQL with the properly configured catalog
spark.sql("USE glue_catalog.welcome_db")
df_iceberg = spark.sql("SELECT * FROM sample_iceberg_tbl")
df_iceberg.show(10)
Or alternatively:
df_iceberg = spark.sql("SELECT * FROM glue_catalog.welcome_db.sample_iceberg_tbl")
df_iceberg.show(10)
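Since the question covers writes as well as reads, here is a small sketch of both paths going through the same fully qualified table name. The helper names are made up for illustration; `spark.table()` and `DataFrame.writeTo(...).append()` are standard PySpark calls (the latter in Spark 3.1+), and the sketch assumes `spark` is a session already configured with the `glue_catalog` Iceberg catalog.

```python
def table_identifier(database, table, catalog="glue_catalog"):
    """Build the catalog.database.table name used by Spark SQL."""
    return f"{catalog}.{database}.{table}"


def read_iceberg(spark, database, table, catalog="glue_catalog"):
    """Read an Iceberg table through the configured Spark catalog."""
    return spark.table(table_identifier(database, table, catalog))


def append_iceberg(df, database, table, catalog="glue_catalog"):
    """Append a DataFrame to an existing Iceberg table (DataFrameWriterV2)."""
    df.writeTo(table_identifier(database, table, catalog)).append()


# Usage inside a Glue job, assuming `spark` is already configured:
# df = read_iceberg(spark, "welcome_db", "sample_iceberg_tbl")
# append_iceberg(df.limit(10), "welcome_db", "sample_iceberg_tbl")
```

Going through the catalog name on both paths means the cross-account catalog setting applies to writes too, instead of only to the SELECT.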
- Spark Configuration Adjustments: Make sure your Spark configuration includes these settings. The glue.id property is what points Iceberg's GlueCatalog at another account's Data Catalog, so set it to Account B's 12-digit account ID:
.config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
.config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
.config("spark.sql.catalog.glue_catalog.warehouse", "s3://<bucket_name>/iceberg_warehouse/")
.config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
.config("spark.sql.catalog.glue_catalog.glue.id", "111122223333")
.config("spark.sql.defaultCatalog", "glue_catalog")
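One way to keep these settings in one place is to build them as a dict and then apply them either to a session builder or to the Glue job's `--conf` parameter. This is a sketch under assumptions: the helper names are invented for the example, it uses the Iceberg `glue.id` catalog property to select Account B's catalog, and `111122223333` stands in for Account B's account ID.

```python
def iceberg_catalog_conf(warehouse, catalog_owner_account_id, catalog="glue_catalog"):
    """Return the Spark settings for an Iceberg catalog backed by another account's Glue catalog."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        f"{prefix}.warehouse": warehouse,
        # Account ID that owns the Glue Data Catalog (Account B).
        f"{prefix}.glue.id": catalog_owner_account_id,
        "spark.sql.defaultCatalog": catalog,
    }


def apply_conf(builder, conf):
    """Apply every setting to a SparkSession.builder-like object."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


def as_glue_conf_parameter(conf):
    """Render the settings as the value of a Glue job's --conf parameter."""
    return " --conf ".join(f"{key}={value}" for key, value in conf.items())


conf = iceberg_catalog_conf("s3://<bucket_name>/iceberg_warehouse/", "111122223333")
print(as_glue_conf_parameter(conf))

# Inside the job script, assuming pyspark is available:
# from pyspark.sql import SparkSession
# spark = apply_conf(SparkSession.builder, conf).getOrCreate()
```

Catalog settings generally need to be in place before the Spark session is created, which is why passing them through the job's `--conf` parameter is often the more reliable of the two routes.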
If you're using Lake Formation for permissions management, ensure you've granted the appropriate Lake Formation permissions to the role in Account A, and consider using hybrid access mode if you're transitioning from AWS Glue permissions to Lake Formation permissions.
Remember that both the resource policy in Account B and the IAM permissions in Account A need to be properly configured for cross-account access to work correctly.
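For the Account A side, the role's identity-based policy needs to allow the same Glue actions on Account B's resources. A minimal sketch of such a policy, mirroring the read actions from the resource policy; `build_role_policy` and the account ID `111122223333` are placeholders made up for the example.

```python
import json

# Same read-only Glue actions that Account B's resource policy grants.
GLUE_READ_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions",
    "glue:BatchGetPartition",
]


def build_role_policy(account_b_id, region="us-east-1", database="welcome_db"):
    """Build an identity policy for Account A's Glue role (hypothetical helper)."""
    prefix = f"arn:aws:glue:{region}:{account_b_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(GLUE_READ_ACTIONS),
                # The resources live in Account B, so the ARNs carry its ID.
                "Resource": [
                    f"{prefix}:catalog",
                    f"{prefix}:database/{database}",
                    f"{prefix}:table/{database}/*",
                ],
            }
        ],
    }


role_policy = build_role_policy("111122223333")
print(json.dumps(role_policy, indent=2))
```

The role would also need S3 access to the bucket holding the Iceberg data and metadata files, which this sketch leaves out.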

I have tried this method and it is not working. I am still unable to read an Iceberg table from a different AWS account via a Glue script.