flatten deeply nested JSON with Crawler

0

My customer wants to flatten deeply nested JSON object. They used Glue Crawler Classifier with $[*] (lift the array elements up one level, so that each JSON record gets loaded into its own row). When they ran Crawler and view results he saw some array type instead of struct.

I saw a previous response to similar but need to understand in more details how to fix that

AWS
asked 4 years ago2458 views
1 Answer
0
Accepted Answer

They are seeing Arrays and Structs based on the schema of the JSON document.

{
   "event_params":[ {"key":"Value"}, {"Key","value"}],
   "geolocation": { "lat": 56.333333, "lng": 57.44333 }
}

If this is my JSON document, when I run the crawler it will result in this schema

event_params: Array
geolocation: Struct

The customer can use Custom Classifiers in Glue to extract the data further. Or, they can keep it as is, and run a glue job to format the data into a new table in columnar format.

answered 4 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions