“Parquet column cannot be converted in file, Pyspark Expected string Found: INT32.”

Question

I encountered the following error, “Parquet column cannot be converted in file, Pyspark Expected string Found: INT32.”
I tried to convert the column to INT32 by applying withColumn(), but the error persisted.
I also tried adding the statement spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false"), but that did not help either.
I would very much appreciate your insights.
Thanks

Answer

That means the schema Spark is using doesn't match the file. It can be caused by reading through a catalog table whose schema doesn't match the data, or by having inconsistent Parquet files in the same directory.
If you do have mixed files, I would try reading with the "mergeSchema" option set to true, but I'm not sure it will solve it; you might need to tell the files apart and read them separately.
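
To illustrate the "tell the files apart and read them separately" route, here is a minimal sketch rather than a definitive fix. It assumes you already have a SparkSession and a list of the individual Parquet file paths; the function names (column_type, group_paths_by_type, read_as_string) are made up for this example. The idea is to check which type each file stores for the problem column, read each type-consistent group on its own, cast the column to string, and union the results. As far as I know, "mergeSchema" reconciles added or missing columns but usually fails on conflicting types for the same column, and a withColumn() cast only runs after the scan, which would explain why it did not help.

```python
from collections import defaultdict
from functools import reduce


def column_type(spark, path, column):
    """Return the type Spark infers for `column` from a single Parquet file."""
    return spark.read.parquet(path).schema[column].dataType.simpleString()


def group_paths_by_type(types_by_path):
    """Group file paths by the column type found in each file."""
    groups = defaultdict(list)
    for path, type_name in sorted(types_by_path.items()):
        groups[type_name].append(path)
    return dict(groups)


def read_as_string(spark, paths, column):
    """Read each type-consistent group separately, cast `column` to string, and union.

    Assumes the files otherwise share the same set of columns, since
    unionByName requires matching column names.
    """
    types_by_path = {p: column_type(spark, p, column) for p in paths}
    frames = []
    for group in group_paths_by_type(types_by_path).values():
        # Every file in this group stores `column` with the same type,
        # so the scan itself no longer hits a conversion conflict.
        df = spark.read.parquet(*group)
        frames.append(df.withColumn(column, df[column].cast("string")))
    return reduce(lambda left, right: left.unionByName(right), frames)
```

You would call it as something like read_as_string(spark, paths, "my_col"), where paths and "my_col" are placeholders for your own files and column. If you are reading through a catalog table instead, comparing spark.table(...).schema with the schema of spark.read.parquet(...) on the underlying location should show whether the table definition is the part that is out of date.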
