“Parquet column cannot be converted in file, Pyspark Expected string Found: INT32.”

I encountered the following error: “Parquet column cannot be converted in file, Pyspark Expected string Found: INT32.” I tried to convert the column to INT32 (by applying withColumn()), but the error persisted. I also tried adding the statement spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false"), but that did not help either. I would very much appreciate your insights. Thanks.

Asked 3 months ago, 727 views
1 Answer

That means the schema Spark expects doesn't match the schema stored in the file. This can happen when you read through a catalog table whose definition doesn't match the data, or when the same directory contains Parquet files with inconsistent schemas. If you do have mixed files, I would try reading with the "mergeSchema" option set to true, but I'm not sure that will solve it; you might need to tell the files apart and read them separately.
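To illustrate the "tell the files apart and read them separately" suggestion, here is a minimal sketch, not a definitive fix. It assumes you can list the individual Parquet file paths yourself and that casting the problem column to one common type (string here) is acceptable for your data. The function names `group_files_by_column_type` and `read_with_consistent_type`, and the file names in the usage note, are hypothetical. Calling withColumn() after a failing read cannot help, because the error is raised while the file is being read; this sketch instead reads each file with its own schema and casts afterwards.

```python
from collections import defaultdict


def group_files_by_column_type(file_types):
    """Group file paths by the type recorded for one column.

    file_types maps a path to a type name, e.g. {"a.parquet": "int"}.
    """
    groups = defaultdict(list)
    for path, type_name in sorted(file_types.items()):
        groups[type_name].append(path)
    return dict(groups)


def read_with_consistent_type(spark, paths, column, target_type="string"):
    """Read each Parquet file on its own, cast `column`, then union the results.

    Hypothetical helper: `paths` must be individual file paths that you
    have listed yourself (assumption).
    """
    from functools import reduce
    from pyspark.sql import functions as F  # imported lazily; requires PySpark

    file_types = {}
    frames = []
    for path in paths:
        # Reading one file at a time lets Spark take the schema from that file.
        df = spark.read.parquet(path)
        file_types[path] = df.schema[column].dataType.simpleString()
        frames.append(df.withColumn(column, F.col(column).cast(target_type)))

    # Report which files disagree on the column type.
    print(group_files_by_column_type(file_types))
    return reduce(lambda a, b: a.unionByName(b), frames)
```

With an existing SparkSession named `spark`, a call might look like `read_with_consistent_type(spark, ["part-0.parquet", "part-1.parquet"], "my_column")`, where the paths and column name are placeholders. If the read goes through a catalog table instead, the same comparison applies: check the table's declared column type against what the files actually contain.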

AWS
Expert
Answered 3 months ago
