
Iceberg table compaction error


I have an Iceberg table that was created as a custom resource in CDK. Every hour I run a MERGE update against the table to load new rows and update existing ones. To avoid unbounded growth of the table's data volume, I use the newly announced Glue Iceberg Table Optimization feature. According to the CloudWatch logs of the compaction history, approximately 1 in 10 attempts fails with the following error:

Compaction optimizer failed. Error: partial-progress.enabled is true but no rewrite commit succeeded. Check the logs to determine why the individual commits failed. If this is persistent it may help to increase partial-progress.max-commits which will break the rewrite operation into smaller commits.. Root cause: java.lang.RuntimeException: partial-progress.enabled is true but no rewrite commit succeeded. Check the logs to determine why the individual commits failed. If this is persistent it may help to increase partial-progress.max-commits which will break the rewrite operation into smaller commits.

How can I troubleshoot this error, considering that Athena (and the Glue catalog’s Iceberg integration) doesn’t recognize the property key partial-progress.max-commits? Moreover, I have no control over compaction timing, so I can't prevent it from overlapping with my updates. The compaction code itself (presumably some kind of Spark job) isn't accessible either.

  • What is also interesting is the following. The docs state:

    In the Data Catalog, the compaction process starts when a table or any of its partitions have more than 100 files. Each file must be smaller than 75% of the target file size.

    Meanwhile, the compaction history shows no more than 10 files compacted per run, definitely not 100, and it runs roughly 3 times every hour.

asked a year ago · 880 views
1 Answer

The error you're encountering with your Iceberg table compaction is related to the partial progress feature of the compaction process. This feature is designed to allow compaction to proceed even if some parts of the operation fail, but in your case, it seems that no part of the compaction is succeeding.

To troubleshoot this issue, consider the following steps:

  1. Check the CloudWatch logs for more detailed information about why individual commits are failing. The error message suggests that there might be more specific reasons for the failures in the logs.
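Besides CloudWatch, the run history behind the console's Compaction history view is also reachable through the Glue API. A minimal sketch, assuming boto3 credentials are configured elsewhere and the client is passed in (the database, table, and catalog-ID values are placeholders, and the exact response keys may vary by SDK version):

```python
def recent_compaction_errors(glue_client, database, table, catalog_id, max_runs=10):
    """Return (startTimestamp, error) pairs for recent failed compaction runs.

    glue_client is expected to be boto3.client("glue"); the
    list_table_optimizer_runs call backs the console's Compaction history.
    """
    resp = glue_client.list_table_optimizer_runs(
        CatalogId=catalog_id,
        DatabaseName=database,
        TableName=table,
        Type="compaction",
    )
    runs = resp.get("TableOptimizerRuns", [])[:max_runs]
    # Keep only runs that carry an error payload, i.e. the failed attempts.
    return [
        (run.get("startTimestamp"), run.get("error"))
        for run in runs
        if run.get("error")
    ]
```

The error strings returned here are often more specific than the top-level "no rewrite commit succeeded" message.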

  2. Ensure that you're not running multiple update operations in parallel. Concurrent modifications to the same set of files can cause conflicts and lead to compaction failures. Try to schedule your merge updates and compaction operations sequentially to avoid potential conflicts.

  3. Verify that your table properties are correctly set and compatible with AWS Glue and Athena. Some Iceberg properties may not be directly supported by these services, which could be causing issues with the compaction process.

  4. Consider implementing a retry mechanism with exponential backoff for your update operations. This can help mitigate transient issues that might be causing the compaction to fail.
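A retry wrapper along these lines can shield the hourly MERGE job, so a commit that happens to collide with a compaction run is retried instead of failing the pipeline. This is a generic sketch; the function you pass in stands for whatever executes your MERGE statement:

```python
import random
import time

def with_backoff(fn, attempts=5, base_delay=2.0, max_delay=60.0):
    """Call fn(); on failure sleep base_delay * 2**i (plus jitter), then retry."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of retries, surface the original error
            delay = min(max_delay, base_delay * 2 ** i)
            # Random jitter avoids retrying in lockstep with the optimizer.
            time.sleep(delay + random.uniform(0, delay / 2))
```

The jitter matters here precisely because you cannot control when Glue schedules compaction: retrying at fixed intervals could collide with it repeatedly.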

  5. Review your table's partitioning strategy. If you have a large number of small partitions, it could be contributing to the compaction issues. You might need to adjust your partitioning scheme to create fewer, larger partitions.

  6. Check if your table has a large number of small files. AWS Glue Data Catalog's compaction process is designed to combine small files into larger ones when a table or partition has more than 100 files smaller than 75% of the target file size (currently 512MB). Ensuring your data aligns with these thresholds might improve compaction success.
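You can measure this directly from Athena via Iceberg's `$files` metadata table. A sketch that builds such a query (database and table names are placeholders; run the returned SQL in Athena or submit it with `start_query_execution`):

```python
def small_file_count_query(database, table, target_file_size_bytes=512 * 1024 * 1024):
    """Build an Athena query counting data files below 75% of the target size.

    With the 512 MB default, the threshold works out to 384 MB, matching the
    compaction trigger described in the Glue docs.
    """
    threshold = int(target_file_size_bytes * 0.75)
    return (
        f'SELECT "partition", COUNT(*) AS small_files '
        f'FROM "{database}"."{table}$files" '
        f'WHERE file_size_in_bytes < {threshold} '
        f'GROUP BY "partition" '
        f'ORDER BY small_files DESC'
    )
```

Partitions whose small_files count hovers just above 100 would explain runs that compact only a handful of files at a time.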

  7. If the issue persists, you may need to reach out to AWS support for further assistance, as they might be able to provide more insight into the specific behavior of the Glue Iceberg Table Optimization feature and any known issues or workarounds.

Remember that while Athena and the Glue catalog's Iceberg integration might not directly recognize the partial-progress.max-commits property, it could be an internal property used by the compaction process. Focus on optimizing your table structure and update processes rather than trying to modify this specific property.
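As a fallback, Athena's OPTIMIZE statement lets you trigger bin-pack compaction yourself on a schedule you control, for example right after the hourly MERGE, instead of relying on the Glue optimizer's timing. A sketch that builds the statement (identifiers and the WHERE predicate are placeholders):

```python
def optimize_statement(database, table, where=None):
    """Build an Athena OPTIMIZE ... REWRITE DATA USING BIN_PACK statement.

    An optional WHERE predicate limits the rewrite to recent partitions,
    which keeps the operation cheap when only new data is fragmented.
    """
    stmt = f'OPTIMIZE "{database}"."{table}" REWRITE DATA USING BIN_PACK'
    if where:
        stmt += f" WHERE {where}"
    return stmt
```

Running this sequentially after each MERGE also sidesteps the concurrency conflicts described in step 2.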
Sources
Compaction optimization - AWS Glue
Troubleshoot Athena Apache Iceberg table errors | AWS re:Post
Optimizing Iceberg tables - AWS Lake Formation

answered a year ago
