Is it possible to partition Glue table by a field of a struct column?


I have messages arriving in S3 from Firehose. The messages are JSON with two structs, "header" and "body", each of which contains simple data types. So far the Glue table has only the partitions generated by Firehose: year, month, day and hour.

I know that Firehose can create partitions from incoming events with "dynamic partitioning" (and that it can do so based on a field of a struct/object), although I do not yet know the exact configuration options I would need to apply. However, I am interested in whether a Glue table can be partitioned on a simple-typed field of a struct column directly, i.e. without applying "dynamic partitioning" at Firehose to extract the field as its own "column".

hRed
asked 7 months ago, 350 views
1 Answer
Accepted Answer

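No. In traditional Glue tables the partition columns are not stored inside the data files; their values come from the file path, so a table cannot be partitioned directly on a field nested in a struct column.
You need the ingest tool (here, Firehose with dynamic partitioning) to extract that value from each incoming record and divide the data into directories accordingly.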
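
To illustrate what "extract the value and divide the data into directories" means, here is a minimal sketch in plain Python. It is not Firehose configuration, only the idea behind it. The field name "source" and the helper partition_prefix are hypothetical, chosen for the example; the "header" struct is the one described in the question.

```python
import json


def partition_prefix(record_json: str, field: str = "source") -> str:
    """Build a Hive-style directory prefix from a field inside the "header" struct.

    `field` is a hypothetical header field name used only for illustration.
    """
    record = json.loads(record_json)
    value = record["header"][field]  # simple-typed value nested in the struct
    return f"{field}={value}/"


# Hypothetical message with the "header" and "body" structs from the question.
example = '{"header": {"source": "web", "id": 7}, "body": {"amount": 3}}'
print(partition_prefix(example))  # source=web/
```

Files written under such a prefix carry the partition value in their path, which is where a Glue table reads it from; the value does not need to stay addressable as a separate top-level column in the data itself.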

AWS, Expert
answered 6 months ago
Expert
reviewed 1 month ago
