Glue Crawler cannot classify SNAPPY compressed JSON files


I have a KFH application that writes Snappy-compressed JSON files to an S3 bucket. I also have a Glue crawler that builds a schema from that bucket. However, when I enable Snappy compression, the crawler classifies the table as UNKNOWN; it cannot detect that the files are JSON. According to the doc below, Glue crawlers support Snappy compression for JSON files, but I wasn't able to get it working.

I also thought it might be related to the file extension and tried the names below, but none of them worked:








1 Answer

If the Glue crawler is unable to read the files, you could create a custom JSON classifier. After creating it, attach the custom classifier to the crawler; this should enable the crawler to read the files correctly, changing the classification from UNKNOWN to the name of your custom classifier.

Example below (JSON fragment):
      "type": "constituency",
      "id": "ocd-division\/country:us\/state:ak",
      "name": "Alaska"
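The two steps described above (create a custom JSON classifier, then attach it to the crawler) could be scripted roughly as follows. This is only a sketch: the crawler name, classifier name, and the JSON path `$[*]` are placeholder assumptions, not values from the question, and it has not been verified that a custom classifier resolves the Snappy case specifically. The `glue` argument is expected to be a boto3 Glue client, whose `create_classifier` and `update_crawler` operations are used here.

```python
def build_classifier_request(classifier_name, json_path):
    """Build the keyword arguments for the Glue CreateClassifier call."""
    return {"JsonClassifier": {"Name": classifier_name, "JsonPath": json_path}}


def build_crawler_update(crawler_name, classifier_names):
    """Build the keyword arguments for the Glue UpdateCrawler call."""
    return {"Name": crawler_name, "Classifiers": list(classifier_names)}


def attach_json_classifier(glue, crawler_name, classifier_name, json_path):
    """Create a custom JSON classifier and attach it to an existing crawler.

    `glue` is assumed to be a boto3 Glue client (or any object exposing
    create_classifier and update_crawler with the same keyword arguments).
    Note that UpdateCrawler replaces the crawler's classifier list, so
    include any classifiers that are already attached.
    """
    glue.create_classifier(**build_classifier_request(classifier_name, json_path))
    glue.update_crawler(**build_crawler_update(crawler_name, [classifier_name]))


# Hypothetical usage (all names below are placeholders):
#   import boto3
#   glue = boto3.client("glue")
#   attach_json_classifier(glue, "my-crawler", "my-json-classifier", "$[*]")
```

After the update, the crawler has to be run again before the table classification changes.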
For details, see the AWS Glue documentation on adding a custom classifier.
answered 9 months ago
