Receiving Warning: spark.executor.instances less than spark.dynamicAllocation.minExecutors


We are receiving a warning while running our Glue job:

    23/04/10 06:39:21 WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
    23/04/10 06:39:22 WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
    23/04/10 06:39:22 WARN ExecutorAllocationManager: Dynamic allocation without a shuffle service is an experimental feature.

We are using Glue 3.0 with G.2X workers, and have: number_of_workers = 20, max_capacity = 20.

And --enable-auto-scaling is 'true'.
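(For reference, the job is defined roughly like this boto3 sketch; the job name, role, and script location are placeholders, and max_capacity is omitted because the Glue API does not accept MaxCapacity together with WorkerType and NumberOfWorkers:)

    import boto3

    glue = boto3.client("glue")

    # Hypothetical job definition mirroring the settings described above.
    glue.create_job(
        Name="example-etl-job",  # placeholder name
        Role="arn:aws:iam::123456789012:role/GlueJobRole",  # placeholder role
        Command={"Name": "glueetl", "ScriptLocation": "s3://my-bucket/script.py"},
        GlueVersion="3.0",
        WorkerType="G.2X",
        NumberOfWorkers=20,
        DefaultArguments={"--enable-auto-scaling": "true"},
    )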

The job is not throwing an error, but we would like to understand this better and clear the warning if possible. Thank you.

Asked 1 year ago · 837 views
1 answer

Hello,

Thanks for reaching out.

These two warnings from Apache Spark are merely informational messages:

  1. WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.

The spark.executor.instances and spark.dynamicAllocation.minExecutors parameters are related to the allocation of executors for Spark applications:

  • If dynamic allocation (Auto Scaling) is enabled in a Glue job, then spark.executor.instances is left unset (None).
  • spark.dynamicAllocation.minExecutors is the lower bound on the number of executors. If Auto Scaling is enabled in a Glue job, it defaults to 1.

The check that logs this warning can be found in the Spark source (Utils.scala) [1]: the message is emitted when spark.dynamicAllocation.minExecutors is greater than spark.executor.instances. Spark reads the configured value of spark.executor.instances and falls back to 0 when it is unset.

Therefore, when the job runs with Auto Scaling enabled, spark.executor.instances is unset and defaults to 0, and the warning is logged because spark.dynamicAllocation.minExecutors (defaulted to 1) is greater than 0.
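As a rough illustration, here is a Python paraphrase of that check (the function name mirrors the one in the Spark source [1]; this sketch is not the actual implementation):

    def get_dynamic_allocation_initial_executors(conf):
        # Paraphrase of Utils.getDynamicAllocationInitialExecutors [1].
        min_executors = int(conf.get("spark.dynamicAllocation.minExecutors", "0"))
        # Spark falls back to 0 when spark.executor.instances is unset,
        # as it is when Glue Auto Scaling is enabled.
        target = int(conf.get("spark.executor.instances", "0"))
        if target < min_executors:
            print("WARN Utils: spark.executor.instances less than "
                  "spark.dynamicAllocation.minExecutors is invalid, "
                  "ignoring its setting, please update your configs.")
        return max(min_executors, target)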

To clear this warning, you can consider setting the log level to 'ERROR' by following the instructions provided in [2].
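For example, a Glue PySpark script can set the log level right after creating the SparkContext (a minimal sketch; only the setLogLevel call matters here):

    from pyspark.context import SparkContext
    from awsglue.context import GlueContext

    sc = SparkContext.getOrCreate()
    sc.setLogLevel("ERROR")  # suppress INFO/WARN log messages; errors still appear
    glueContext = GlueContext(sc)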

  2. WARN ExecutorAllocationManager: Dynamic allocation without a shuffle service is an experimental feature.

This warning is logged from the code snippet in [3] when spark.shuffle.service.enabled is set to 'false'. To clear it, you can set spark.shuffle.service.enabled to 'true', or follow the instructions in [2] to change the log level.
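As a sketch, that property can be set before the SparkContext is created, at the top of the job script (in Glue, Spark configurations can also be passed through the --conf job parameter):

    from pyspark import SparkConf
    from pyspark.context import SparkContext

    # Enable the shuffle service flag before the context is created.
    conf = SparkConf().set("spark.shuffle.service.enabled", "true")
    sc = SparkContext.getOrCreate(conf=conf)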

Hope this information was helpful.

References:

[1] https://github.com/apache/spark/blob/v3.1.1/core/src/main/scala/org/apache/spark/util/Utils.scala#L2592-L2595

[2] https://repost.aws/knowledge-center/glue-reduce-cloudwatch-logs#:~:text=Set%20the%20logging%20level%20using%20Spark%20context%20method%20setLogLevel

[3] https://github.com/apache/spark/blob/v3.1.1/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L205-L216

AWS
Support Engineer
Answered 1 year ago
