Has this problem been resolved? Alternatively, what is the solution? Is it possible to rectify the data type using AWS Glue Spark? How do we manage situations where there are varying data types across multiple files, particularly in parquet format?
That means the schema of your files is inconsistent. The column was generalized to string, which is problematic in itself.
Assuming you can't fix the Parquet files to be consistent (or the table is partitioned and the files are consistent within each partition), you still might be able to work around it.
Looking at the error, I would say you are reading the data as a DataFrame rather than a DynamicFrame, which is more flexible in this respect.
Can you share the reading part of the code and the full stack trace?
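To illustrate why a DynamicFrame helps here: it can record a column that has conflicting types as a choice type, and `resolveChoice` then lets you decide how to settle each one. Below is a minimal sketch, not a definitive fix. The helper `build_resolve_choice_specs`, the column names, and the per-file schemas are all hypothetical. It only builds the `specs` list in plain Python, and the Glue call itself appears as a comment because it needs a Glue job environment.

```python
from collections import defaultdict


def build_resolve_choice_specs(file_schemas, target_types=None, default_type="string"):
    """Build (column, action) pairs for columns whose type differs across files.

    file_schemas: list of dicts mapping column name -> type name, one per file.
    target_types: optional dict of column -> type to cast to (hypothetical input).
    default_type: type used for conflicting columns without an explicit target.
    """
    target_types = target_types or {}

    # Collect every type observed for each column across all files.
    seen = defaultdict(set)
    for schema in file_schemas:
        for column, dtype in schema.items():
            seen[column].add(dtype)

    # Only columns with more than one observed type need resolving.
    specs = []
    for column in sorted(seen):
        if len(seen[column]) > 1:
            specs.append((column, "cast:" + target_types.get(column, default_type)))
    return specs


# Hypothetical schemas, as if collected from three Parquet files.
file_schemas = [
    {"id": "long", "amount": "double", "created": "string"},
    {"id": "long", "amount": "string", "created": "string"},
    {"id": "int", "amount": "double", "created": "string"},
]

specs = build_resolve_choice_specs(file_schemas, target_types={"id": "long"})
print(specs)

# Inside a Glue job (assumes an existing DynamicFrame named dynamic_frame):
# resolved = dynamic_frame.resolveChoice(specs=specs)
# df = resolved.toDF()  # convert only after the choice types are resolved
```

Casting explicitly per column keeps the decision visible in the job, instead of relying on whatever type the reader happens to infer. Whether `string` or a numeric type is the right target depends on your data.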
Any solution to the above problem? I am facing the same issue. Would specifying the data type for all the columns as "string" when writing the parquet files help resolve the issue?

I am facing the same issue in one of my Glue jobs. I have a custom node in Glue Studio that reads data from a DynamicFrame and converts it to a Spark DataFrame to perform some checks. I get the error when trying to save the data to another table in the node logic. I can provide more details.