ETL job failing with weird error


My ETL job is failing with the error below, which I found in the log. I am not sure what is causing it and would highly appreciate any advice.

Language: Python 3
Glue: 3

An error occurred while calling o93.parquet. java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainBinaryDictionary

Mark
Asked 7 months ago · 232 views
1 answer
Accepted answer

Hello,

It looks like you are hitting an UnsupportedOperationException while reading the Parquet data. As far as I am aware, there are two likely causes: either one or more of the underlying Parquet files are corrupted, or the schema/data type of a column is being interpreted incorrectly.

If your S3 data source is partitioned, try reading the partitions separately and check whether the error occurs only when reading one particular partition. If the files are not corrupted, check whether any column has a different data type in some files than in others; a mismatch like that can also throw this kind of exception. See this Jira for reference: https://issues.apache.org/jira/browse/SPARK-24828
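To make the two checks above concrete, here is a minimal sketch in plain Python. The helper names (`find_failing_partitions`, `find_type_conflicts`, `spark_reader`) are hypothetical and not part of Glue or Spark. The Spark-specific part is kept in `spark_reader`, which assumes an existing `SparkSession` and uses the `noop` write format (Spark 3.0+) to force a full scan of each partition, so a bad file actually gets decoded.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Schema = Dict[str, str]  # column name -> type name


def find_failing_partitions(
    paths: Iterable[str],
    read_partition: Callable[[str], Schema],
) -> Tuple[Dict[str, Schema], Dict[str, str]]:
    """Read every partition on its own; return (schemas, failures)."""
    schemas: Dict[str, Schema] = {}
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            schemas[path] = read_partition(path)
        except Exception as exc:  # broad on purpose: we only want to isolate the path
            failures[path] = repr(exc)
    return schemas, failures


def find_type_conflicts(schemas: Dict[str, Schema]) -> Dict[str, Dict[str, List[str]]]:
    """Return columns that appear with more than one type across partitions."""
    seen: Dict[str, Dict[str, List[str]]] = {}
    for path, schema in schemas.items():
        for column, dtype in schema.items():
            seen.setdefault(column, {}).setdefault(dtype, []).append(path)
    return {col: types for col, types in seen.items() if len(types) > 1}


def spark_reader(spark) -> Callable[[str], Schema]:
    """Build a reader for find_failing_partitions from a SparkSession (assumed to exist)."""
    def read(path: str) -> Schema:
        df = spark.read.parquet(path)
        # Force a full scan without writing anything anywhere.
        df.write.format("noop").mode("overwrite").save()
        return dict(df.dtypes)
    return read
```

In a Glue job this could be called as `schemas, failures = find_failing_partitions(partition_paths, spark_reader(spark))`, where `partition_paths` is your own list of S3 partition prefixes. Anything in `failures` points to a partition that cannot be read by itself, and `find_type_conflicts(schemas)` lists columns whose type differs between partitions.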

AWS
Support Engineer
Answered 7 months ago
  • Thank you! I got the error when querying one particular partition. I am not sure what was wrong with it, but I recreated that partition and the issue is resolved.
