Based on the information provided and the current capabilities of Amazon Kinesis Data Firehose, it appears that there are some limitations when working with Apache Iceberg tables via table-level resource links. Let's address your questions:
- Amazon Kinesis Data Firehose does not currently support writing to Apache Iceberg tables through a table-level resource link alone, without a resource-linked database. This limitation is likely the root cause of the error you're experiencing.
- Yes, a resource-linked database appears to be mandatory for Firehose to correctly resolve the underlying Iceberg table. Firehose expects to work with a complete database structure, which includes both the database and the table information.
- To get Firehose to recognize the resource-linked table as an Iceberg table, you'll need to make some changes to your setup:
  - Instead of creating just a table-level resource link, create a resource link for the entire database that contains your Iceberg table.
  - The resource link should point to the namespace (database) in your table bucket.
  - When configuring Firehose, use the name of this resource-linked database as the database name, and the original table name as the table name.
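As a sketch, the database-level resource link can be created with boto3's Glue `create_database` call, since a resource link is just a Glue database whose `TargetDatabase` points at the namespace in the S3 Tables catalog. The link name `rl_marketing_data` and the account ID below are placeholder assumptions:

```python
# Sketch: create a database-level resource link via the Glue API.
# ACCOUNT_ID and the link name "rl_marketing_data" are placeholders.
import json

ACCOUNT_ID = "111122223333"  # placeholder AWS account ID

database_input = {
    "Name": "rl_marketing_data",  # the name you will give Firehose
    "TargetDatabase": {
        # The S3 Tables catalog that owns the namespace.
        "CatalogId": f"{ACCOUNT_ID}:s3tablescatalog/project-data-bucket",
        # The namespace (database) inside that catalog.
        "DatabaseName": "marketing_data",
    },
}

# To actually create the link (requires credentials and boto3):
#   import boto3
#   boto3.client("glue").create_database(DatabaseInput=database_input)
print(json.dumps(database_input, indent=2))
```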
To resolve your issue, follow these steps:
- Create a database-level resource link in your AWS Glue Data Catalog that points to the `marketing_data` database in the `s3tablescatalog/project-data-bucket` catalog.
- Grant the necessary Lake Formation permissions on both the resource link and the target (linked) namespace or table to your Firehose service role.
- In your Firehose configuration, use the name of the resource-linked database as the database name and `ad_campaign_metrics` as the table name.
- Ensure that the column names and data types in your input JSON exactly match those defined in your Iceberg table schema. Firehose is strict about this matching, including capitalization.
- If you have nested JSON structures, be aware that Firehose only processes the first level of nodes in a multi-level nested JSON. Ensure your `event_timestamp` field is at the top level of your JSON structure.
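The Firehose side of the steps above can be sketched as boto3 `create_delivery_stream` parameters for an Iceberg destination. Every ARN, the role name, the backup bucket, and the sample record's extra fields (`campaign_id`, `clicks`) are placeholder assumptions, not values from your environment:

```python
# Sketch: Iceberg destination configuration for a Firehose delivery
# stream. ARNs, role name, and bucket name are placeholders.
ACCOUNT_ID = "111122223333"
REGION = "us-east-1"
FIREHOSE_ROLE = f"arn:aws:iam::{ACCOUNT_ID}:role/firehose-iceberg-role"

iceberg_destination = {
    "RoleARN": FIREHOSE_ROLE,
    # The Glue Data Catalog in this account/region.
    "CatalogConfiguration": {
        "CatalogARN": f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:catalog",
    },
    "DestinationTableConfigurationList": [
        {
            # The resource-linked DATABASE name, not the original one.
            "DestinationDatabaseName": "rl_marketing_data",
            # The original table name inside the linked namespace.
            "DestinationTableName": "ad_campaign_metrics",
        }
    ],
    # Backup location for records Firehose fails to deliver.
    "S3Configuration": {
        "RoleARN": FIREHOSE_ROLE,
        "BucketARN": "arn:aws:s3:::firehose-error-backup-bucket",
    },
}

# A record shaped for delivery: field names match the table schema
# and event_timestamp sits at the top level, not nested.
record = {
    "event_timestamp": "2025-01-01T00:00:00Z",
    "campaign_id": "cmp-001",  # placeholder schema columns
    "clicks": 42,
}

# To create the stream (requires credentials and boto3):
#   import boto3
#   boto3.client("firehose").create_delivery_stream(
#       DeliveryStreamName="iceberg-stream",
#       DeliveryStreamType="DirectPut",
#       IcebergDestinationConfiguration=iceberg_destination,
#   )
```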
By making these adjustments, you should be able to successfully stream data from Firehose to your shared Iceberg table. If you continue to experience issues, double-check your IAM permissions and ensure that the Firehose role has all necessary permissions to access the resource link, the underlying table, and write to the S3 bucket where the Iceberg data is stored.
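For the Lake Formation side of that permission check, here is a minimal sketch of the two grants the Firehose role typically needs, expressed as `grant_permissions` parameters. The role ARN, account ID, and resource-link name are placeholder assumptions:

```python
# Sketch: Lake Formation grants for the Firehose role -- one on the
# resource link, one on the target table. All names are placeholders.
ACCOUNT_ID = "111122223333"
FIREHOSE_ROLE = f"arn:aws:iam::{ACCOUNT_ID}:role/firehose-iceberg-role"

# Grant 1: DESCRIBE on the resource link itself, which lives in the
# account's default Glue Data Catalog.
link_grant = {
    "Principal": {"DataLakePrincipalIdentifier": FIREHOSE_ROLE},
    "Resource": {
        "Database": {"CatalogId": ACCOUNT_ID, "Name": "rl_marketing_data"}
    },
    "Permissions": ["DESCRIBE"],
}

# Grant 2: read/write on the underlying table in the S3 Tables catalog.
table_grant = {
    "Principal": {"DataLakePrincipalIdentifier": FIREHOSE_ROLE},
    "Resource": {
        "Table": {
            "CatalogId": f"{ACCOUNT_ID}:s3tablescatalog/project-data-bucket",
            "DatabaseName": "marketing_data",
            "Name": "ad_campaign_metrics",
        }
    },
    "Permissions": ["SELECT", "INSERT", "ALTER", "DESCRIBE"],
}

# To apply the grants (requires credentials and boto3):
#   import boto3
#   lf = boto3.client("lakeformation")
#   lf.grant_permissions(**link_grant)
#   lf.grant_permissions(**table_grant)
```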
Sources
- Streaming data to tables with Amazon Data Firehose - Amazon Simple Storage Service
- Considerations and limitations - Amazon Data Firehose
- How to send data from Firehose to S3 tables? | AWS re:Post