Glue ETL PySpark job fails after upgrade from Glue version 2.0 to 3.0: "An error occurred while calling pyWriteDynamicFrame" / "EOFException occurred while reading the port number from pyspark.daemon's stdout"


A PySpark Glue ETL job fails after upgrading from Glue Version 2.0 to 3.0.

The job fails while writing a DynamicFrame to S3 in Parquet format.

Error snippet from the log: py4j.protocol.Py4JJavaError: An error occurred while calling o433.pyWriteDynamicFrame. : org.apache.spark.SparkException: Job aborted

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 30.2 failed 4 times, most recent failure: Lost task 1.3 in stage 30.2 (TID 2007594) (10.78.9.145 executor 8): org.apache.spark.SparkException: EOFException occurred while reading the port number from pyspark.daemon's stdout
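For context, the failing write is the standard DynamicFrame-to-S3 Parquet sink; a minimal sketch is below (the bucket path and the `dyf` DynamicFrame are placeholders, and this fragment only runs inside an AWS Glue job environment):

```python
# Sketch of the write step that raises pyWriteDynamicFrame errors.
# Only runnable inside a Glue job; dyf and the S3 path are placeholders.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# dyf stands in for the DynamicFrame produced by the job's transforms.
glue_context.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/output/"},
    format="parquet",
)
```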

mrjimi
Asked 2 years ago · 2,842 views
2 Answers

I changed the worker_type from G.1X to G.2X and the job completed successfully, albeit in 38 hours. I then tuned the Spark code so that the three DataFrames are all partitioned on the same key with .repartition("attribute_name"), and also doubled the number of workers from 5 to 10. The job then completed successfully in 1 hour 20 minutes. Partitioning on the join key helped the JOIN that produces the final dataset written to S3.

mrjimi
Answered 2 years ago
  • Thank you for the feedback. How long was it taking with Glue 2.0? Were you using the same number of nodes?


Hi,

Have you looked at the documentation about migrating Glue from version 2.0 to 3.0, and from Spark 2 to Spark 3?
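One behavior change called out in the Spark 2-to-3 migration notes that commonly surfaces in Parquet writes is the legacy datetime rebase handling. A hedged sketch of the relevant session settings is below (whether these apply to this particular job is an assumption; the keys are from the Spark 3 migration guide):

```python
# Sketch only: legacy datetime rebase settings from the Spark 3
# migration guide, sometimes needed after a Spark 2 -> 3 upgrade
# when writing Parquet containing old timestamps/dates.
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "LEGACY")
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY")
```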

Do you use external libraries?

If you cannot find any indication in the documentation above, contacting AWS Support might be the fastest way to resolve your issue; without seeing the job itself, it is difficult to provide more prescriptive guidance.

Hope this helps.

AWS
EXPERT
Answered 2 years ago
