Glue Error: java.lang.IllegalStateException: Trying to set metadata after initializing


I have a Glue job that transforms data from a Glue table, and I encounter the following error. It does not occur on every run of the job.

I have looked at some documentation, and the error seems to come from how Spark interacts with the underlying S3 file system. Any insights into why this occurs?
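The job follows the usual Glue catalog-read pattern, roughly like the sketch below (the database and table names are placeholders, not the actual job); the o1196.count call in the trace corresponds to a DataFrame count like the one at the end:

    import sys
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glueContext = GlueContext(sc)
    spark = glueContext.spark_session
    job = Job(glueContext)
    job.init(args["JOB_NAME"], args)

    # Read the source table from the Glue Data Catalog (placeholder names)
    dyf = glueContext.create_dynamic_frame.from_catalog(
        database="my_database",
        table_name="my_table",
    )

    # Convert to a Spark DataFrame and count the rows; this is the kind of
    # action that fails intermittently with the error below
    df = dyf.toDF()
    print("row count:", df.count())

    job.commit()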

An error occurred while calling o1196.count.
: java.lang.IllegalStateException: Trying to set metadata after initializing.
    at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
    at org.apache.hadoop.fs.FileStatus.initializeMetadata(FileStatus.java:444)
    at org.apache.hadoop.fs.FileStatus.ensureMetadataInitialized(FileStatus.java:429)
    at org.apache.hadoop.fs.FileStatus.getMetadata(FileStatus.java:409)
    at org.apache.parquet.hadoop.util.DefaultETagExtractor.getETag(DefaultETagExtractor.java:27)
    at org.apache.spark.sql.execution.PartitionedFileUtil$.$anonfun$splitFiles$1(PartitionedFileUtil.scala:50)
    at org.apache.spark.sql.execution.PartitionedFileUtil$.$anonfun$splitFiles$1$adapted(PartitionedFileUtil.scala:46)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
    at scala.collection.immutable.NumericRange.foreach(NumericRange.scala:70)
    at scala.collection.TraversableLike.map(TraversableLike.scala:233)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.apache.spark.sql.execution.PartitionedFileUtil$.splitFiles(PartitionedFileUtil.scala:46)
    at org.apache.spark.sql.execution.FileSourceScanExec.$anonfun$createNonBucketedReadRDD$2(DataSourceScanExec.scala:708)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
    at scala.collection.immutable.List.foreach(List.scala:388)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:240)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:237)
    at scala.collection.immutable.List.flatMap(List.scala:351)
    at org.apache.spark.sql.execution.FileSourceScanExec.$anonfun$createNonBucketedReadRDD$1(DataSourceScanExec.scala:697)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:32)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:29)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:194)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:240)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:237)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:194)
    at org.apache.spark.sql.execution.FileSourceScanExec.createNonBucketedReadRDD(DataSourceScanExec.scala:696)
    at org.apache.spark.sql.execution.FileSourceScanExec.inputRDD$lzycompute(DataSourceScanExec.scala:469)
    at org.apache.spark.sql.execution.FileSourceScanExec.inputRDD(DataSourceScanExec.scala:408)
    at org.apache.spark.sql.execution.FileSourceScanExec.doExecuteColumnar(DataSourceScanExec.scala:600)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeColumnar$1(SparkPlan.scala:212)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
    at org.apache.spark.sql.execution.SparkPlan.executeColumnar(SparkPlan.scala:208)
    at org.apache.spark.sql.execution.InputAdapter.doExecuteColumnar(WholeStageCodegenExec.scala:519)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeColumnar$1(SparkPlan.scala:212)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
    at org.apache.spark.sql.execution.SparkPlan.executeColumnar(SparkPlan.scala:208)
    at org.apache.spark.sql.execution.ColumnarToRowExec.inputRDDs(Columnar.scala:202)
    at org.apache.spark.sql.execution.FilterExec.inputRDDs(basicPhysicalOperators.scala:149)
    at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:746)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181)
    at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:326)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectIterator(SparkPlan.scala:402)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.org$apache$spark$sql$execution$exchange$BroadcastExchangeExec$$doComputeRelation(BroadcastExchangeExec.scala:178)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anon$1.doCompute(BroadcastExchangeExec.scala:171)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anon$1.doCompute(BroadcastExchangeExec.scala:167)
    at org.apache.spark.sql.execution.AsyncDriverOperation.$anonfun$compute$1(AsyncDriverOperation.scala:73)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withExecutionId$1(SQLExecution.scala:207)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
    at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:204)
    at org.apache.spark.sql.execution.AsyncDriverOperation.compute(AsyncDriverOperation.scala:67)
    at org.apache.spark.sql.execution.AsyncDriverOperation.$anonfun$computeFuture$1(AsyncDriverOperation.scala:53)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withThreadLocalCaptured$1(SQLExecution.scala:275)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
    at org.apache.spark.sql.execution.adaptive.AdaptiveExecutor.checkNoFailures(AdaptiveExecutor.scala:147)
    at org.apache.spark.sql.execution.adaptive.AdaptiveExecutor.doRun(AdaptiveExecutor.scala:88)
    at org.apache.spark.sql.execution.adaptive.AdaptiveExecutor.tryRunningAndGetFuture(AdaptiveExecutor.scala:66)
    at org.apache.spark.sql.execution.adaptive.AdaptiveExecutor.execute(AdaptiveExecutor.scala:57)
    at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$1(AdaptiveSparkPlanExec.scala:184)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
    at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.getFinalPhysicalPlan(AdaptiveSparkPlanExec.scala:183)
    at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:404)
    at org.apache.spark.sql.Dataset.$anonfun$count$1(Dataset.scala:3046)
    at org.apache.spark.sql.Dataset.$anonfun$count$1$adapted(Dataset.scala:3045)
    at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3724)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
    at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3722)
    at org.apache.spark.sql.Dataset.count(Dataset.scala:3045)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)

YFL
asked 2 months ago · 337 views
1 Answer

That looks like a race condition caused by concurrent access to the same FileStatus object, which is not thread safe.
Assuming you are not doing anything odd, it seems you have hit an obscure bug, and you will have to accept that the job needs to be retried (it should fail early, so it should not cause much delay).
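If an in-job retry is acceptable, a minimal sketch along these lines (assuming the failing action is the DataFrame count from the trace; the function name and retry count are illustrative) catches the intermittent error and re-runs the action; otherwise, the job's "Number of retries" setting in Glue re-attempts the whole run:

    from py4j.protocol import Py4JJavaError

    def count_with_retry(df, max_attempts=3):
        """Retry a DataFrame count if the intermittent FileStatus metadata race appears."""
        for attempt in range(1, max_attempts + 1):
            try:
                return df.count()
            except Py4JJavaError as e:
                msg = str(e.java_exception)
                # Only retry the specific race; re-raise anything else or the last attempt
                if "Trying to set metadata after initializing" in msg and attempt < max_attempts:
                    print(f"Hit metadata race on attempt {attempt}, retrying...")
                    continue
                raise

    row_count = count_with_retry(df)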

AWS
EXPERT
answered 2 months ago
