Skip to content

Iceberg Performance Tuning for Amazon Redshift

8 minute read
Content level: Intermediate
0

Apache Iceberg is an open-source table format designed to help manage and organize large-scale data stored in data lakes. it integrates seamlessly with Amazon Redshift. With this integration, Redshift users can directly query Apache Iceberg tables stored in Amazon S3 or Amazon S3tables without the need to move or copy data.

This article covers the recommended best practices for optimizing Apache Iceberg format tables while using it from Amazon Redshift Spectrum and Redshift Serverless.

1. Table Creation Best Practices

a. Always use CTAS to create Iceberg tables

CTAS writes data with special column IDs embedded in each file, making it easier and faster for the query engine to find the right columns — even if they are renamed or reordered later

b. Avoid using the add_files procedure for production tables

add_files registers existing Parquet files as-is without rewriting them. While this is faster since no data is moved, the files will lack column IDs. This means the query engine has to match columns by name, which is slower and may return incorrect results if columns are renamed

c. comparative Analysis CTAS Vs add_files
CategoryCTASadd_files
Best Used ForProduction and frequently queried tablesTemporary or read-only archival tables with a fixed schema
Long-Term PerformanceReliable and consistentCan degrade over time
File LayoutOptimized file layout for better performanceUses existing file layout, which may not be optimal
Schema ChangesHandles column renames and reordering without issuesMay return incorrect results if columns are renamed
Column IDsReliable and consistentCan degrade over time

2. File Size Optimization

Default File Size is 512 MB. You can adjust the file size with following table properties

TBLPROPERTIES ( 
  ‘write.target-file-size-bytes'=’your file size’,
  ‘write.parquet.row-group-size-bytes'=‘your group size'

File size Recommendations

  • Small tables (few GB) = 128 MB and groups of 8 - 16 MB
  • Medium tables (hundreds of GB) = Default is a good parameter. Adjust the group to 16 - 64 MB
  • Very Large tables (TBs) = Increase to 1024 MB and increase group
a. Target 128–512 MB per file after compression

This range provides the ideal balance between the number of files the engine needs to open and the amount of data each task processes. Files within this range ensure fast query performance without putting too much pressure on memory.

b. Avoid files that are too small (under ~32 MB)

A large number of small files adds excessive overhead — the engine ends up spending more time handling files than processing actual data. This is referred to as the 'small-file problem' and is a very common reason behind slow-running queries."

c. Avoid files that are too large (over ~1024 MB)

Very large files reduce parallelism because a single task must read the entire file. This can also cause memory pressure on individual executors and slow down scans.

d. Use file compaction to fix small files

Run the rewrite_data_files procedure regularly to merge many small files into optimally sized ones. For example, compacting 15 million tiny files into ~10,000 properly sized files can dramatically improve query speed.

3. Collecting Table Statistics for Better Query Performance

a. Why statistics matter

Redshift use table and column statistics (like row counts, min/max values, null counts and distinct values) to make smarter decisions about how to execute queries. Without statistics, the engine may choose inefficient query plans — scanning more data than necessary.

Row counts, min/max values, and null counts are all collected out of the box with Iceberg — no user action is required However, one important statistic is not included by default: Number of Distinct Values (NDV). Collecting NDV stats requires a separate user action through Glue and the results are stored in a Puffin file, which is an extension of the Iceberg format. Please refer following for additional details.

How to collect statistics using AWS Glue
b. When to collect statistics

Collect or refresh statistics after significant data loads, compaction jobs, or schema changes. For tables with frequent writes, schedule periodic statistics collection daily or after each major ETL run.

4. Partitioning Strategy

a. Use Iceberg's hidden partitioning instead of Hive-style partitioning

Hidden partitioning applies transforms (like days(), months(), hours(), bucket()) on column values automatically. You don't need to create extra partition columns or manage directory structures. The query engine automatically prunes irrelevant partitions even when you filter on the original column.

Bucket Partitioning and Truncate Partitioning in Apache Iceberg, both of which are forms of "Hidden Partitioning" designed for high cardinality columns

Bucket Partitioning Bucket partitioning uses a hash function to distribute data evenly across a fixed number of buckets based on a column's value. Use bucket partition When you need uniform partition sizes regardless of data distribution, For example,

PARTITIONED BY ( bucket(50, customerNumber) );

Truncate Partitioning Truncate partitioning groups data by truncating a column's value to a specified length, creating partitions based on prefixes. It Creates logical groupings based on natural data patterns. This is useful when queries filter by prefixes or ranges of string/numeric values.

For example,

PARTITIONED BY ( truncate(1, lastName) )
b.comparative Analysis of Hidden partitions**
CategoryTime-Based PartitioningBucket PartitioningTruncate Partitioning
When to UseWhen queries primarily filter by time ranges; most common partitioning strategy for time-series dataWhen you need uniform partition sizes and query by exact values on high cardinality columnsWhen data has natural groupings by prefix and queries filter by prefix/range
How It WorksExtracts time components (year, month, day, hour) from a timestamp column to create partitionsUses a hash function to distribute data evenly across a fixed number of bucketsTruncates a column value to a specified length, grouping data by prefix
Partition CountDepends on time range and granularity (e.g., 12 partitions/year for monthly)Fixed — you define the number of buckets (e.g., 50)Depends on the number of distinct truncated values
Best Column TypesTimestamp, date columnsHigh cardinality columns: customer IDs, phone numbers, account numbersString or numeric columns with natural prefixes: last names, product codes
Skewness RiskModerate — depends on data volume per time periodLow — hash ensures even distributionHigher — some partitions may be heavily loaded
c. Reduce Partition Overhead: Choose Coarser Partitioning Like Days Over Hours

Hourly partitioning creates ~8,760 partitions per year. Over 3 years, that's 26,000+ partitions — each adding metadata overhead and slowing down query planning. Daily partitioning creates only ~365 partitions per year, which is sufficient for most query patterns.

For additional details please refer Iceberg partitioning

5. Compression: Key Points to Consider

a. Use Zstandard (ZSTD) compression instead of Snappy

ZSTD produces significantly smaller files than Snappy, which means lower storage costs and less data to scan during queries. Despite better compression, ZSTD decompression speed is comparable to Snappy, so read performance stays fast.

ZSTD compression level guidance

Level 3 is a good general-purpose setting. Level 1 is best for write-heavy or streaming workloads where you want fast writes. Levels 9–19 are for archival tables where storage savings are the top priority.

b. When Snappy might still be used

Snappy has been the traditional default and offers very fast compression/decompression. However, its compression ratio is lower, resulting in larger files and higher storage costs. It's mainly relevant for legacy compatibility.

c. When GZIP might be appropriate

GZIP offers the highest compression ratio (smallest files) but has the slowest decompression. Use it only for archival or cold data that is rarely queried, where minimizing storage cost matters more than read speed.

6. Choosing the right update-delete strategy

TBLPROPERTIES ( 
  ‘write.delete.mode'=‘merge-on-read',
  'write.update.mode'=‘copy-on-write’,
  'write.merge.mode'=‘copy-on-write

Iceberg supports two delete modes: Copy on Write and Merge on Read

Copy on Write rewrites data files at deletion time to physically remove deleted rows. This makes writes slower but reads faster, since there are no delete files to reconcile at query time.

Merge on Read leaves data files unchanged and instead records deletions in separate delete files. This makes writes faster but adds read overhead, as the engine must reconcile delete files during queries.

For tables with frequent deletes, the accumulation of Merge-on-Read delete files can degrade query performance over time. It's recommended to run compaction periodically to clean up these delete files and maintain optimal read performance.

Redshift compatibility: Redshift supports reading Iceberg tables in either mode but only supports writing deletes using Merge on Read.

7. Maintenance

a. Snapshot Management

Expire snapshots regularly to delete data files no longer needed and keep metadata size manageable.

b. Orphan File Cleanup

Remove orphan files after snapshot expiration. Failed or interrupted jobs leave behind unreferenced data files that consume storage.

8. Leverage Materialize views in Redshift

Amazon Redshift supports creating materialized views on Apache Iceberg tables that reside in your Amazon S3 data lake. This feature allows you to pre-compute and store complex query results from Iceberg tables, significantly accelerating dashboard and analytical query performance.

Please refer Materialize views for additional details

Additional Resources

Redshift Iceberg Write Support

Iceberg Best practices