How to create an Iceberg table using base-2 file layout in Athena?


I'm trying to create an Iceberg table which uses the base-2 layout. This can be achieved by setting the write.object-storage.enabled=true property on the table.

When creating an Iceberg table with Glue, I'm able to set this property.

I'm also able to set this property when creating the table with SparkSQL.

However, with Athena I get this error message: Unsupported table property key: write.object-storage.enabled

Example query:

CREATE TABLE my_test_table(
  batch_id string, dim_date_id bigint
) PARTITIONED BY (batch_id)
LOCATION 's3://my_location/'
TBLPROPERTIES (
   'write.object-storage.enabled'='true',
   'table_type'='iceberg'
)

According to the Athena docs, this property is not supported. When will support for it be added? In the meantime, is there an alternative solution?

asked 13 days ago, 29 views
1 Answer

You are correct that Athena currently does not support the 'write.object-storage.enabled' property for Iceberg tables. This limitation is indeed mentioned in the Athena documentation, which only allows a predefined list of key-value pairs in the table properties for creating or altering Iceberg tables.

As for when support for this parameter will be added, there is no specific information available about future feature releases. AWS regularly updates its services, but timelines for specific features are not typically announced in advance.

In terms of alternatives, while you can't directly set the base-2 file layout in Athena, you can still create and use Iceberg tables with other optimizations. Here are a few suggestions:

  1. Use AWS Glue or Spark SQL to create the table with the desired properties, then query it using Athena. Since you mentioned you're able to set this property when creating the table with Glue or SparkSQL, this could be a viable workaround.

  2. Focus on other optimization techniques available in Athena for Iceberg tables. For example, you can use partitioning (including bucket partition transforms) and compression to improve query performance. The statement below sets the partition column, file format, and compression codec in the CREATE TABLE statement:

    CREATE TABLE my_test_table (
      batch_id string,
      dim_date_id bigint
    )
    PARTITIONED BY (batch_id)
    LOCATION 's3://my_location/'
    TBLPROPERTIES (
      'table_type' = 'ICEBERG',
      'format' = 'PARQUET',
      'write_compression' = 'SNAPPY'
    )

  3. If you need specific Iceberg features or configurations not supported by Athena, consider using other AWS services that have fuller Iceberg support, such as EMR with Spark or AWS Glue, for those particular operations.

  4. Keep an eye on Athena release notes and documentation for updates. AWS continuously improves its services, and support for additional Iceberg properties may be added in the future.
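
As a rough illustration of the first suggestion, the table could be created from Spark SQL with the property set and then queried from Athena. This is only a sketch: the catalog name `glue_catalog` and the database name `my_db` are assumptions that depend on how your Spark session is configured for the Glue Data Catalog, and the columns and location are copied from the question.

```sql
-- Run in Spark SQL (for example on EMR or in a Glue job), not in Athena.
-- Assumption: `glue_catalog` is an Iceberg catalog backed by AWS Glue,
-- and `my_db` is an existing database in it (both names are hypothetical).
CREATE TABLE glue_catalog.my_db.my_test_table (
  batch_id    string,
  dim_date_id bigint
)
USING iceberg
PARTITIONED BY (batch_id)
LOCATION 's3://my_location/'
TBLPROPERTIES (
  -- The property that Athena's CREATE TABLE rejects
  'write.object-storage.enabled' = 'true'
);
```

Once the table is registered in the Glue Data Catalog, it should be readable from Athena with an ordinary SELECT. Whether writes issued from Athena follow the object-storage layout is not covered by the sources above, so it is worth checking the resulting S3 paths after a test INSERT.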

If this feature is crucial for your use case, you may want to provide feedback to the Athena team. They welcome feature requests and feedback at athena-feedback@amazon.com.
Sources
Create Iceberg tables - Amazon Athena
Use Amazon Athena with Spark SQL for your open-source transactional table formats | AWS Big Data Blog
Getting started with Apache Iceberg tables in Amazon Athena SQL - AWS Prescriptive Guidance

answered 13 days ago
