Glue Crawler and Classifiers - Supported file encodings: is UTF16 supported?
Hi, AWS Glue crawlers with CSV and XML classifiers work well with files encoded in UTF-8, but not with files encoded in UTF-16.
The public documentation does not clarify this point:
- Do Glue crawlers and classifiers support UTF-16?
- Is there any documentation available on the encodings supported by Glue crawlers and classifiers?
Best regards,
Nicolas.
asked 2 years ago · 77 views
1 Answer
Accepted Answer
At the moment, Glue supports UTF-8 encoded files only [1]. If UTF-16 files are passed in, you may encounter the "Internal Service Exception" error message. The most feasible approach is to programmatically convert the UTF-16 files to UTF-8 before passing them to the Glue crawler.
[1] - https://docs.aws.amazon.com/glue/latest/dg/components-key-concepts.html
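As a rough illustration of the conversion step suggested above, here is a minimal Python sketch that re-encodes a local UTF-16 file as UTF-8 using only the standard library. The function name `convert_utf16_to_utf8` and the file paths are hypothetical, and the sketch assumes the source file starts with a byte order mark (BOM) so Python can detect the byte order; it is not an official AWS method.

```python
def convert_utf16_to_utf8(src_path, dst_path, chunk_chars=1024 * 1024):
    """Re-encode a UTF-16 text file as UTF-8 (hypothetical helper).

    Assumption: the source file begins with a BOM. The "utf-16" codec
    uses it to detect byte order and does not copy it to the output.
    """
    # newline="" disables newline translation, so the original line
    # endings (e.g. \r\n in CSV exports) are written out unchanged.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        # Copy in chunks so large files are not loaded into memory at once.
        while True:
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            dst.write(chunk)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    convert_utf16_to_utf8("input_utf16.csv", "output_utf8.csv")
```

The converted file would then be placed in the location the crawler reads from. If the UTF-16 files have no BOM, the explicit codecs `utf-16-le` or `utf-16-be` would need to be used instead of `utf-16`.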