Is it possible to partition Glue table by a field of a struct column?


I have messages arriving in S3 from Firehose. The messages are JSON and contain two structs, "header" and "body", each of which holds only simple data types. So far the Glue table has the partitions generated by Firehose: year, month, day and hour.

I know that Firehose can create partitions from incoming events with "dynamic partitioning" (and that it can do so based on a field of a struct/object), although I do not yet know the exact configuration options I would need. However, I am interested in whether a Glue table can be partitioned directly on a simple-typed field of a struct column, i.e. without applying "dynamic partitioning" in Firehose to extract the field as its own "column".

hRed
asked 6 months ago · 332 views
1 Answer
Accepted Answer

No. In traditional Glue tables the partition column is not part of the data; its value comes from the file path.
To partition on that field, the ingest tool has to extract the value from each incoming record and write the data into directories accordingly.
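
To illustrate what "extract that value and divide the data into directories" means, here is a minimal Python sketch of how an ingest step might derive a Hive-style S3 prefix from a nested field. The field name `event_type`, the function `partition_prefix` and the `key=value` path layout are assumptions made for this example; they are not taken from the question or from Firehose itself.

```python
from datetime import datetime, timezone


def partition_prefix(record: dict, arrival: datetime) -> str:
    """Build a Hive-style S3 prefix from a struct field plus the arrival time."""
    # "event_type" is a hypothetical simple-typed field inside the "header" struct.
    event_type = record["header"]["event_type"]
    # The extracted value becomes a directory level, which is what lets
    # a Glue table expose it as a partition column.
    return (
        f"event_type={event_type}/"
        f"year={arrival:%Y}/month={arrival:%m}/day={arrival:%d}/hour={arrival:%H}/"
    )


# Example message with the two structs described in the question.
record = {"header": {"event_type": "order_created"}, "body": {"amount": 42}}
arrival = datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc)
prefix = partition_prefix(record, arrival)
print(prefix)
```

With this layout the value lives in the object key rather than only inside the struct, so the table can declare `event_type` as a partition column alongside year, month, day and hour.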

AWS EXPERT
answered 6 months ago
EXPERT
reviewed a month ago
