Is there a way to create a Glue table like the following using the AWS Python CDK? I have been trying with no success so far.
CREATE EXTERNAL TABLE [db_name.]table_name LOCATION 's3://DOC-EXAMPLE-BUCKET/your-folder/' TBLPROPERTIES ('table_type' = 'DELTA')
I tried these two options:
https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_glue_alpha/Table.html
https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_glue/CfnTable.html
I ran into two issues:
- aws_cdk.aws_glue_alpha.Table: there is no glue.DataFormat.DELTA option, so I cannot specify the Delta format.
- CfnTable: I tried the snippet below. The resource gets created, but the Athena table is empty.
    aws_glue.CfnTable(
        self,
        "test_table",
        catalog_id=Aws.ACCOUNT_ID,
        database_name=database_name,
        table_input=aws_glue.CfnTable.TableInputProperty(
            description="test",
            table_type="EXTERNAL_TABLE",
            name="test_table",
            parameters={"classification": "delta"},
            partition_keys=[
                aws_glue.CfnTable.ColumnProperty(name="title", type="int"),
                aws_glue.CfnTable.ColumnProperty(name="year", type="int"),
                aws_glue.CfnTable.ColumnProperty(name="month", type="int"),
                aws_glue.CfnTable.ColumnProperty(name="day", type="int"),
            ],
            storage_descriptor=aws_glue.CfnTable.StorageDescriptorProperty(
                input_format="org.apache.hadoop.mapred.SequenceFileInputFormat",
                location=f"s3://{environment}-test-data/",
                output_format="org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat",
                serde_info=aws_glue.CfnTable.SerdeInfoProperty(
                    serialization_library="org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
                ),
            ),
        ),
    )
Hi Riku, thank you for your answer. I had actually tried both before posting my question; I have now edited the question to make that clear.
I don't quite understand what you mean by the Athena table being empty. Are the S3 bucket and folder paths correct? Also, are you using a crawler to retrieve data from S3 in addition to creating the Glue table? https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html
I tried the code above. The Glue table shows the right location, but when I query it from Athena it returns no results, whereas when I created the table with the CREATE EXTERNAL TABLE command everything worked perfectly. No, I am not using a crawler.
First, make sure that the table you created in Glue contains data. If the data is correctly in the Glue table, it seems to me that there is a problem with the configuration on the Athena side. Have you selected the correct database on the Athena side?
Yes, everything is right. It actually seems there is an issue with my CDK code, because the table is there and the location is right, but no data shows up when querying. Have you tried running my dummy CDK code on your own?
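One difference worth checking between the two approaches: the working DDL sets TBLPROPERTIES ('table_type' = 'DELTA'), while the CfnTable snippet only sets parameters={"classification": "delta"}. The sketch below is not a confirmed fix. It only shows one way to compare and merge the parameters so the CDK table carries the same property as the DDL. The helper names (with_delta_parameters, missing_delta_parameters) are hypothetical, and it is an assumption that Glue table parameters correspond to the DDL's TBLPROPERTIES.

```python
from typing import Dict

# Mirrors TBLPROPERTIES ('table_type' = 'DELTA') from the working DDL.
# Assumption: Glue table parameters play the role of TBLPROPERTIES.
DELTA_TABLE_PARAMETERS: Dict[str, str] = {"table_type": "DELTA"}


def with_delta_parameters(parameters: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `parameters` with the Delta table_type entry added."""
    merged = dict(parameters)  # copy so the caller's dict is not mutated
    merged.update(DELTA_TABLE_PARAMETERS)
    return merged


def missing_delta_parameters(parameters: Dict[str, str]) -> Dict[str, str]:
    """Return the expected entries that `parameters` lacks or sets differently."""
    return {
        key: value
        for key, value in DELTA_TABLE_PARAMETERS.items()
        if parameters.get(key) != value
    }


if __name__ == "__main__":
    current = {"classification": "delta"}  # parameters from the CfnTable snippet
    print(missing_delta_parameters(current))
    print(with_delta_parameters(current))
```

In the CfnTable snippet this would be passed as parameters=with_delta_parameters({"classification": "delta"}). It may also be worth testing without partition_keys, since the DDL declares no columns or partitions; whether that is what causes the empty result here is untested.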