
CDK: S3 table bucket and Apache Iceberg table creation with partitioning and other advanced configuration


The aws_cdk.aws_s3tables_alpha module now supports creating an S3 table bucket, a namespace, and even an Apache Iceberg table. However, I couldn't find any documented way to add a partitioning configuration. Is that supported? If not, what is the recommended way to do it via CDK? Perhaps I will have to implement a custom CDK resource. The goal is to avoid a manual script outside CDK for Iceberg table creation.

Another question: if we create an Iceberg table via CDK, will CDK also support schema evolution for it?

Example Iceberg table creation query via Athena:

CREATE TABLE some_table (
  user_id STRING,
  resource_id STRING,
  tenant_id STRING,
  project_id STRING,
  action STRING,
  timestamp TIMESTAMP,
  context MAP<STRING, STRING>
)
PARTITIONED BY (
  `tenant_id`,
  day(`timestamp`),
  `project_id`
)
TBLPROPERTIES ('table_type' = 'ICEBERG');

References:
https://docs.aws.amazon.com/cdk/api/v2/docs/aws-s3tables-alpha-readme.html
https://github.com/aws/aws-cdk/issues/33054

asked 2 months ago, 117 views
1 Answer

Based on the available information, AWS CloudFormation and CDK support for S3 Tables was announced in August 2025, but there does not appear to be any documented way to configure partitioning when creating Apache Iceberg tables through the aws_cdk.aws_s3tables_alpha module.

In SQL DDL, partitioning for an Iceberg table is declared with the PARTITIONED BY clause, as in your example. The clause accepts plain columns as well as partition transforms such as day(`timestamp`), and it can list several fields.
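To make the partition spec in the question concrete, here is a minimal Python sketch of how rows would be grouped under (tenant_id, day(timestamp), project_id). It relies on the Iceberg definition of the day transform as the number of days since 1970-01-01. The function names are hypothetical, and treating naive timestamps as UTC is an assumption of this sketch.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def day_transform(ts: datetime) -> int:
    """Return whole days since 1970-01-01, flooring for pre-epoch values."""
    if ts.tzinfo is None:
        # Assumption of this sketch: naive timestamps are interpreted as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    # timedelta.days floors toward negative infinity.
    return (ts - EPOCH).days


def partition_tuple(row: dict) -> tuple:
    """Partition key for PARTITIONED BY (tenant_id, day(timestamp), project_id)."""
    return (row["tenant_id"], day_transform(row["timestamp"]), row["project_id"])
```

Two rows with the same tenant, project, and calendar day map to the same tuple and therefore to the same partition, regardless of the time of day.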

Since the CDK module doesn't appear to have explicit support for partitioning configuration, you have two options:

  1. Implement a custom CDK resource that uses AWS SDK calls or executes the necessary SQL statements to create the table with your desired partitioning configuration.

  2. Use CDK to create only the table bucket and namespace, and then use a separate mechanism (such as a Lambda function invoked through a CloudFormation custom resource) to execute the SQL statements that create the partitioned table.
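As a rough illustration of either option, the following Python sketch shows a Lambda handler for a custom resource that builds the CREATE TABLE statement from resource properties and submits it to Athena. The handler shape follows the on_event convention of the CDK custom resource provider framework. The property names (Catalog, Namespace, TableName, Columns, PartitionBy, WorkGroup) are hypothetical, the catalog value for your table bucket is assumed to be supplied by the caller, and the sketch does not wait for the query to finish or handle updates and deletes.

```python
def build_create_table_ddl(table, columns, partition_by):
    """Render an Athena CREATE TABLE statement for an Iceberg table."""
    cols = ",\n  ".join(f"`{name}` {col_type}" for name, col_type in columns)
    parts = ",\n  ".join(partition_by)
    return (
        f"CREATE TABLE `{table}` (\n  {cols}\n)\n"
        f"PARTITIONED BY (\n  {parts}\n)\n"
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


def on_event(event, context, athena=None):
    """Custom resource handler; property names are assumptions of this sketch."""
    if event["RequestType"] != "Create":
        # Updates and deletes are intentionally left unhandled here.
        return {"PhysicalResourceId": event.get("PhysicalResourceId", "iceberg-table")}

    if athena is None:
        import boto3  # available in the Lambda Python runtime
        athena = boto3.client("athena")

    props = event["ResourceProperties"]
    ddl = build_create_table_ddl(
        props["TableName"], props["Columns"], props["PartitionBy"]
    )
    # start_query_execution is asynchronous: it returns an ID immediately.
    # A real handler would poll get_query_execution until the query ends.
    response = athena.start_query_execution(
        QueryString=ddl,
        QueryExecutionContext={
            "Catalog": props["Catalog"],
            "Database": props["Namespace"],
        },
        WorkGroup=props["WorkGroup"],
    )
    return {
        "PhysicalResourceId": f"{props['Namespace']}.{props['TableName']}",
        "Data": {"QueryExecutionId": response["QueryExecutionId"]},
    }
```

Passing the Athena client as an optional argument keeps the DDL logic testable without AWS credentials. In a stack, the handler would be wired to a custom resource whose properties carry the same columns and partition fields as the Athena query in the question.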

Regarding schema evolution: Iceberg tables support schema evolution as part of the Apache Iceberg specification. Managing schema changes through CDK, however, would likely require you to define the new schema in your CDK code and implement a deployment mechanism that applies the change to the existing table. Since the CDK module appears to focus on resource creation rather than ongoing schema management, you may need custom logic to handle schema evolution.

For the most accurate and up-to-date information, I'd recommend checking the latest CDK documentation or reaching out to AWS Support, as the capabilities of the aws_cdk.aws_s3tables_alpha module may have changed since this answer was written.
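One possible shape for such custom logic is to diff the schema declared in CDK against the schema currently deployed and emit ALTER TABLE ADD COLUMNS statements for Athena to run. The Python sketch below covers only additive changes and rejects type changes. The function name and the (name, type) pair representation are hypothetical, and how the current schema is obtained is left open.

```python
def add_column_statements(table, current, desired):
    """Return ALTER TABLE statements that add columns missing from `current`.

    `current` and `desired` are sequences of (column_name, column_type) pairs.
    """
    current_types = dict(current)
    statements = []
    for name, col_type in desired:
        if name not in current_types:
            statements.append(
                f"ALTER TABLE `{table}` ADD COLUMNS (`{name}` {col_type})"
            )
        elif current_types[name] != col_type:
            # Type changes need case-by-case handling, so this sketch refuses.
            raise ValueError(f"type change not handled for column {name!r}")
    return statements
```

Each returned statement could be submitted by the same kind of custom resource handler on an Update event. Dropped or renamed columns are deliberately ignored here and would need their own handling.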
Sources
What's New at AWS - Cloud Innovation & News
Create Iceberg tables - Amazon Athena
Register S3 table bucket catalogs and query Tables from Athena - Amazon Athena

answered 2 months ago
