Hi,
I am facing the same problem. Were you able to find a fix for this? I really do not want to switch to the Spark DataFrame API at this point, after spending so much time making the Glue Data Catalog perfect.
Specify the encoding at the top of your script (note: this only works on Python 2 Glue jobs; sys.setdefaultencoding was removed in Python 3):
import sys
reload(sys)  # re-import sys so setdefaultencoding is available (Python 2 only)
sys.setdefaultencoding("utf-8")
I used this in my job and it resolved the error.
I have discussed this with AWS technical support and there is no solution using DynamicFrames - you need to rewrite I'm afraid...
I have the same problem. Is this still in the works? Anybody found a working solution?
Currently, Glue DynamicFrame supports a custom encoding for XML, but not for other formats such as JSON or CSV.
If your data includes non-UTF-8 characters, you can use a Spark DataFrame to read the data and write it back to S3 as UTF-8.
You can refer to the samples in the repository below:
https://github.com/aws-samples/aws-glue-samples/blob/master/examples/converting_char_encoding.md
My Glue job is failing with an "unable to parse" error when trying to process an ANSI-encoded file. Any solution?