Is it possible to partition Glue table by a field of a struct column?


I have messages arriving in S3 from Firehose. The messages are JSON and contain two structs, "header" and "body", each of which holds only simple data types. So far the Glue table has the partitions generated by Firehose: year, month, day and hour.

I know that Firehose can create a partition from incoming events with "dynamic partitioning" (and that it can do so based on a field of a struct/object), although I do not yet know the exact configuration options I would need. However, I am interested in whether a Glue table can be partitioned by a simple-typed field of a struct column directly, i.e. without applying "dynamic partitioning" in Firehose to extract the field as its own "column".

hRed
asked 7 months ago, 350 views
1 Answer
Accepted Answer

No. In traditional Glue tables the partition column is not part of the data; its value comes from the file path, so the table cannot partition on a field that exists only inside a struct.
You need the ingest tool to extract that value from each incoming record and use it to divide the data into directories accordingly.
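
To illustrate what "extract the value and divide the data into directories" means, here is a minimal Python sketch of the idea, not the Firehose implementation. The function name `partition_prefix`, the field `header.source`, and the sample message are all hypothetical; the only assumption is a `column=value/` path layout alongside the existing year/month/day/hour partitions.

```python
import json
from datetime import datetime, timezone


def partition_prefix(raw_message: str, field_path: str, arrived_at: datetime) -> str:
    """Build a key prefix from a nested JSON field plus the arrival time.

    field_path is a dotted path such as "header.source" (hypothetical field).
    Raises KeyError if the field is missing from the message.
    """
    value = json.loads(raw_message)
    for key in field_path.split("."):
        value = value[key]  # walk into the struct one level at a time

    # The partition column is named after the last path segment.
    column = field_path.split(".")[-1]
    return (
        f"{column}={value}/"
        f"year={arrived_at:%Y}/month={arrived_at:%m}/"
        f"day={arrived_at:%d}/hour={arrived_at:%H}/"
    )


# Hypothetical message with the "header" and "body" structs from the question.
message = '{"header": {"source": "billing"}, "body": {"amount": 12}}'
prefix = partition_prefix(
    message, "header.source", datetime(2024, 3, 5, 14, tzinfo=timezone.utc)
)
print(prefix)  # source=billing/year=2024/month=03/day=05/hour=14/
```

Once objects are written under prefixes like this, the extracted value is available to the table as a regular partition column, while the original field still sits inside the struct in the data.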

AWS EXPERT
answered 6 months ago
EXPERT
reviewed a month ago
