Is it possible to partition a Glue table by a field of a struct column?

I have messages arriving in S3 from Firehose. Each message is JSON with two structs, "header" and "body", each of which contains only simple data types. So far the Glue table has the partitions that are generated by Firehose: year, month, day and hour.

I know that Firehose can create partitions with "dynamic partitioning" based on incoming events (and that it can do so based on a field of a struct/object), although I do not yet know the exact configuration options I would need. However, I am interested in whether a Glue table can be partitioned on a simple-typed field of a struct column directly, i.e. without applying "dynamic partitioning" in Firehose to extract the field as its own "column".

hRed
asked 7 months ago · 350 views
1 Answer
Accepted Answer

No. In traditional Glue tables the partition column lives outside the data (in the file path), so a table cannot be partitioned on a value that exists only inside a struct column.
You need the ingest tool to dynamically extract that value from the input and use it to divide the data into directories accordingly.
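
To illustrate what "divide the data into directories" means, here is a minimal sketch of the path logic an ingest step would apply. It is not Firehose configuration. The field name `header.event_type`, the `events` base prefix and the `key=value` folder layout are assumptions made for the example, so substitute whatever your messages and bucket actually use.

```python
from datetime import datetime, timezone


def partition_prefix(record: dict, arrived_at: datetime, base: str = "events") -> str:
    """Build a key=value style S3 prefix from a nested field and the arrival time."""
    # Hypothetical field: pull the partition value out of the "header" struct.
    event_type = str(record["header"]["event_type"])
    # The partition value ends up in the path, not inside the data file.
    return (
        f"{base}/event_type={event_type}/"
        f"year={arrived_at:%Y}/month={arrived_at:%m}/"
        f"day={arrived_at:%d}/hour={arrived_at:%H}/"
    )


# Example message shaped like the one in the question (values are made up).
record = {"header": {"event_type": "order_created"}, "body": {"amount": 12.5}}
prefix = partition_prefix(record, datetime(2024, 3, 7, 9, 30, tzinfo=timezone.utc))
print(prefix)
```

Objects written under such a prefix can then be registered as partitions of the Glue table, with `event_type` as a partition column alongside year, month, day and hour. In a real pipeline the extracted value should also be checked for characters that are unsafe in an S3 key.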

AWS EXPERT
answered 6 months ago
EXPERT
reviewed a month ago
