Bug: Firehose to Athena Iceberg Table Write does not always work

I have been testing the direct write to Iceberg feature in Firehose and have found that it does not always work. Let me elaborate:

I tested this feature with a buffer size of 1 MiB and a buffer interval of 0 seconds, so essentially no buffering of records. It worked well: I tested INSERT, UPDATE, and DELETE, and all were fine.

Later, in the evening, I decided to do some more testing with the same AWS CLI commands. Firehose is able to write the data to the correct bucket and the correct subdirectory in the bucket, but it does not update the metadata. As a result, Athena does not detect the new Parquet file(s).

I thought I had probably done something wrong with the table definitions or IAM policies, so I deleted everything and repeated all the steps. I faced the same issue: it works initially and then stops updating the metadata files. Has anybody else faced this?

1 Answer

Hello,

Thank you very much for your question. Note that Firehose currently supports Apache Iceberg Tables as a destination only in some AWS Regions, such as US East (N. Virginia), US West (Oregon), Europe (Ireland), Asia Pacific (Tokyo), Canada (Central), and Asia Pacific (Sydney).

Based on the information given, it could be a permission issue. First, double-check that the IAM role used by Firehose has the necessary permissions both to write data to the S3 bucket and to update the Iceberg table metadata.

Furthermore, you could try increasing the buffer size and buffer interval for the Firehose stream. This could help mitigate potential issues caused by concurrent writes or transient failures.

It's also worth noting that the direct write to Iceberg feature in Firehose is relatively new, and there might be some initial issues or limitations that still need to be addressed. Monitoring the AWS release notes and documentation updates for any known issues or workarounds would also be helpful. You can check the AWS documentation for further information.
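
As a rough illustration of the buffering suggestion above, here is a minimal Python sketch that raises the buffer size and interval on an existing stream using boto3's Firehose `describe_delivery_stream` and `update_destination` calls. The helper names, the stream name, and the default values of 64 MB and 300 seconds are assumptions chosen for the example, not recommendations, and the sketch assumes the stream's first destination is the Iceberg one. Please verify the `IcebergDestinationUpdate` fields against the current boto3 documentation before relying on it.

```python
from typing import Any, Dict


def build_buffering_update(
    stream_name: str,
    version_id: str,
    destination_id: str,
    size_mb: int = 64,            # illustrative value, not a recommendation
    interval_seconds: int = 300,  # illustrative value, not a recommendation
) -> Dict[str, Any]:
    """Build the keyword arguments for Firehose update_destination.

    Hypothetical helper: it only assembles the request, so it can be
    inspected or tested without AWS credentials.
    """
    if size_mb < 1:
        raise ValueError("size_mb must be at least 1")
    if interval_seconds < 0:
        raise ValueError("interval_seconds must not be negative")
    return {
        "DeliveryStreamName": stream_name,
        "CurrentDeliveryStreamVersionId": version_id,
        "DestinationId": destination_id,
        "IcebergDestinationUpdate": {
            "BufferingHints": {
                "SizeInMBs": size_mb,
                "IntervalInSeconds": interval_seconds,
            }
        },
    }


def apply_buffering_update(stream_name: str, size_mb: int, interval_seconds: int) -> None:
    """Look up the stream's current version and destination, then update it."""
    import boto3  # imported here so the builder above works without boto3 installed

    firehose = boto3.client("firehose")
    description = firehose.describe_delivery_stream(
        DeliveryStreamName=stream_name
    )["DeliveryStreamDescription"]
    request = build_buffering_update(
        stream_name,
        description["VersionId"],
        # Assumption: the first destination is the Iceberg destination.
        description["Destinations"][0]["DestinationId"],
        size_mb,
        interval_seconds,
    )
    firehose.update_destination(**request)
```

Calling `apply_buffering_update("my-iceberg-stream", 64, 300)` (with a hypothetical stream name) would then apply the new hints. Whether larger buffers actually resolve the missing metadata commits is not confirmed here; it is only one of the things worth trying.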

AWS
answered a year ago
