Create a Glue table of type Delta using AWS CDK


Is there a way to create a Glue table like the one below using the AWS CDK for Python? I have been trying with no success so far.

CREATE EXTERNAL TABLE [db_name.]table_name LOCATION 's3://DOC-EXAMPLE-BUCKET/your-folder/' TBLPROPERTIES ('table_type' = 'DELTA')

I tried these two options:
https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_glue_alpha/Table.html
https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_glue/CfnTable.html

I ran into two issues:

  • aws_cdk.aws_glue_alpha.Table: there is no glue.DataFormat.DELTA option, so I cannot specify the Delta format.
  • aws_cdk.aws_glue.CfnTable: with the snippet below the resource gets created, but the table returns no rows when queried from Athena.
    aws_glue.CfnTable(
        self,
        "test_table",
        catalog_id=Aws.ACCOUNT_ID,
        database_name=database_name,
        table_input=aws_glue.CfnTable.TableInputProperty(
            description="test",
            table_type="EXTERNAL_TABLE",
            name="test_table",
            parameters={"classification": "delta"},
            partition_keys=[
                aws_glue.CfnTable.ColumnProperty(name="title", type="int"),
                aws_glue.CfnTable.ColumnProperty(name="year", type="int"),
                aws_glue.CfnTable.ColumnProperty(name="month", type="int"),
                aws_glue.CfnTable.ColumnProperty(name="day", type="int"),
            ],
            storage_descriptor=aws_glue.CfnTable.StorageDescriptorProperty(
                input_format="org.apache.hadoop.mapred.SequenceFileInputFormat",
                location=f"s3://{environment}-test-data/",
                output_format="org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat",
                serde_info=aws_glue.CfnTable.SerdeInfoProperty(
                    serialization_library="org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
                ),
            ),
        ),
    )
1 Answer

You can create tables using either the L1 construct (CfnTable) or the L2 construct (Table), whichever you prefer.
If you are familiar with CloudFormation, the L1 construct may be easier to work with.
https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_glue_alpha/Table.html
https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_glue/CfnTable.html
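
For the L1 route, one thing that may be worth trying is to carry the table property from the DDL in the question ('table_type' = 'DELTA') over into the Glue table parameters, instead of "classification": "delta". The sketch below is untested and rests on assumptions: that Athena stores TBLPROPERTIES as Glue table parameters, that "EXTERNAL": "TRUE" is wanted alongside it, and that no columns need to be declared (the DDL in the question declares none). The helper names delta_table_parameters and add_delta_table are made up for illustration; the input/output format and SerDe settings from the question's snippet could be added back to the storage descriptor if needed.

```python
from typing import Any, Dict


def delta_table_parameters() -> Dict[str, str]:
    """Glue table parameters meant to mirror TBLPROPERTIES ('table_type' = 'DELTA')."""
    return {
        "table_type": "DELTA",  # taken from the DDL in the question
        "EXTERNAL": "TRUE",     # assumption: marker commonly set on external tables
    }


def add_delta_table(scope: Any, database_name: str, table_name: str, location: str) -> Any:
    """Hypothetical helper: define the table with the L1 CfnTable construct."""
    # Imported lazily so the parameter helper above can be used without the CDK installed.
    from aws_cdk import Aws, aws_glue

    return aws_glue.CfnTable(
        scope,
        table_name,
        catalog_id=Aws.ACCOUNT_ID,
        database_name=database_name,
        table_input=aws_glue.CfnTable.TableInputProperty(
            name=table_name,
            table_type="EXTERNAL_TABLE",
            parameters=delta_table_parameters(),
            storage_descriptor=aws_glue.CfnTable.StorageDescriptorProperty(
                location=location,  # S3 prefix that holds the Delta table
            ),
        ),
    )
```

Inside a stack this would be called as add_delta_table(self, database_name, "test_table", f"s3://{environment}-test-data/"), reusing the values from the question.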

EXPERT
answered 9 months ago
  • Hi Riku, thank you for the answer. I had actually tried both before posting my question; I have edited the question to make that clearer.

  • I don't quite understand what you mean by the Athena table being empty. Are the S3 bucket and folder correct? Also, are you using a crawler to retrieve data from S3 in addition to creating the Glue table? https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html

  • I tried the code above. The Glue table shows the right location, but when I query it from Athena it returns no results, whereas when I create the table with the CREATE EXTERNAL TABLE command everything works perfectly. No, I am not using a crawler.

  • First, make sure that the table you created in Glue contains data. If the data is correctly in the Glue table, it seems to me that there is a problem with the configuration on the Athena side. Have you selected the correct database on the Athena side?

  • Yes, everything is right. It actually seems there is an issue with my CDK code, because the table is there and the location is right, but no data shows up when querying. Have you tried running my dummy CDK code yourself?
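
Since the table created with CREATE EXTERNAL TABLE works and the CDK one does not, one way to narrow this down might be to compare the two Glue table definitions side by side. The following is a minimal sketch, not a verified fix: diff_table_definitions and fetch_table are hypothetical helper names, the table names in the usage note are placeholders, and fetch_table assumes boto3 is installed with credentials that can read the Glue Data Catalog.

```python
from typing import Any, Dict, Tuple


def diff_table_definitions(
    working: Dict[str, Any], broken: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return the top-level fields that differ, as (working, broken) pairs."""
    keys = ("TableType", "Parameters", "StorageDescriptor", "PartitionKeys")
    return {
        key: (working.get(key), broken.get(key))
        for key in keys
        if working.get(key) != broken.get(key)
    }


def fetch_table(database: str, name: str) -> Dict[str, Any]:
    """Read one table definition from the Glue Data Catalog."""
    # Imported lazily so the diff helper can be used without boto3 installed.
    import boto3

    return boto3.client("glue").get_table(DatabaseName=database, Name=name)["Table"]
```

For example, diff_table_definitions(fetch_table(db, "table_from_ddl"), fetch_table(db, "test_table")) would list the fields where the CDK-created table departs from the one Athena created.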
