Receiving Warning: spark.executor.instances less than spark.dynamicAllocation.minExecutors


We are receiving a warning while running our Glue job:

23/04/10 06:39:21 WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
23/04/10 06:39:22 WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
23/04/10 06:39:22 WARN ExecutorAllocationManager: Dynamic allocation without a shuffle service is an experimental feature.

We are using Glue 3.0 with the G.2X worker type and have: number_of_workers = 20, max_capacity = 20.

And --enable-auto-scaling is 'true'.
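For reference, a hypothetical boto3 sketch of a job configured this way (the job name, role, and script location are placeholders, not our actual values):

```python
import boto3

glue = boto3.client("glue")
glue.create_job(
    Name="example-job",          # placeholder
    Role="example-glue-role",    # placeholder IAM role
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://example-bucket/scripts/job.py",
    },
    GlueVersion="3.0",
    WorkerType="G.2X",
    NumberOfWorkers=20,
    DefaultArguments={"--enable-auto-scaling": "true"},
)
```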

The job is not throwing an error, but we would like to understand this better and clear the warning if possible. Thank you.

Asked 1 year ago · Viewed 837 times
1 Answer

Hello,

Thanks for reaching out.

These two Apache Spark warnings are merely informational messages:

  1. WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.

The spark.executor.instances and spark.dynamicAllocation.minExecutors parameters are related to the allocation of executors for Spark applications:

  • If dynamic allocation (Auto-Scaling) is enabled in a Glue job, then spark.executor.instances is left unset ('None').
  • spark.dynamicAllocation.minExecutors is the lower bound on the number of executors. If Auto-Scaling is enabled in a Glue job, it defaults to 1. (The sketch after this list shows how to inspect the effective values at runtime.)
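If you want to confirm what the job is actually running with, here is a minimal sketch using the standard PySpark configuration API (nothing Glue-specific is assumed; it works in any Spark 3.x session):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Print each setting, falling back to "unset" when no value is configured.
for key in (
    "spark.dynamicAllocation.enabled",
    "spark.dynamicAllocation.minExecutors",
    "spark.executor.instances",
):
    print(key, "=", spark.conf.get(key, "unset"))
```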

The function that logs the warning lives in Spark's Utils.scala source [1]. It logs the message when spark.dynamicAllocation.minExecutors is greater than spark.executor.instances; it tries to read the configured value of spark.executor.instances and falls back to 0 when it is unset.
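To make that check concrete, here is a short Python paraphrase of the logic in [1] (the real implementation is Scala; this sketch only mirrors the comparison, the fallback to 0, and the value Spark ultimately uses):

```python
def get_dynamic_allocation_initial_executors(conf: dict) -> int:
    """Paraphrase of Utils.getDynamicAllocationInitialExecutors in [1]."""
    min_executors = int(conf.get("spark.dynamicAllocation.minExecutors", 0))
    initial = int(conf.get("spark.dynamicAllocation.initialExecutors", 0))
    # spark.executor.instances falls back to 0 when unset -- the Auto-Scaling case.
    instances = int(conf.get("spark.executor.instances", 0))

    if instances < min_executors:
        print("WARN: spark.executor.instances less than "
              "spark.dynamicAllocation.minExecutors is invalid, "
              "ignoring its setting, please update your configs.")

    # Spark starts with the largest of the three values.
    return max(min_executors, initial, instances)

# Glue with Auto-Scaling: minExecutors is 1, executor.instances is unset.
print(get_dynamic_allocation_initial_executors(
    {"spark.dynamicAllocation.minExecutors": "1"}))  # warns, returns 1
```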

Therefore, when the job runs with Auto-Scaling enabled, spark.executor.instances falls back to its default of 0, and the warning is logged because spark.dynamicAllocation.minExecutors (which defaults to 1 in this case) is greater than 0.

To clear this warning, you can consider setting the Log Level to ‘ERROR’ by following the instructions provided in [2].
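For example, here is a minimal sketch of raising the log level at the top of a Glue 3.0 PySpark script; the setLogLevel call is standard Spark, and the rest is the usual Glue boilerplate:

```python
import sys

from awsglue.context import GlueContext
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext.getOrCreate()
sc.setLogLevel("ERROR")  # suppress INFO/WARN output; keep ERROR and above
glueContext = GlueContext(sc)
spark = glueContext.spark_session
```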

  2. WARN ExecutorAllocationManager: Dynamic allocation without a shuffle service is an experimental feature.

This warning is logged from the code snippet provided in [3] when spark.shuffle.service.enabled is set to 'false'.
To clear this warning, you can set spark.shuffle.service.enabled to 'true', or you can follow the instructions in [2] to change the log level.
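If you take the config route, one option is to pass the setting through the job's --conf default argument. This is a sketch only: the job name is a placeholder, it assumes your deployment lets you edit default arguments with boto3, and whether Glue honors this particular Spark override depends on your Glue version:

```python
import boto3

glue = boto3.client("glue")

# UpdateJob replaces the definition, so echo back the existing Role/Command
# and merge the new --conf value into the current default arguments.
job = glue.get_job(JobName="example-job")["Job"]  # placeholder job name
glue.update_job(
    JobName="example-job",
    JobUpdate={
        "Role": job["Role"],
        "Command": job["Command"],
        "DefaultArguments": {
            **job.get("DefaultArguments", {}),
            "--conf": "spark.shuffle.service.enabled=true",
        },
    },
)
```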

Hope this information was helpful.

References:

[1] https://github.com/apache/spark/blob/v3.1.1/core/src/main/scala/org/apache/spark/util/Utils.scala#L2592-L2595

[2] https://repost.aws/knowledge-center/glue-reduce-cloudwatch-logs#:~:text=Set%20the%20logging%20level%20using%20Spark%20context%20method%20setLogLevel

[3] https://github.com/apache/spark/blob/v3.1.1/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L205-L216

AWS
Support Engineer
Answered 1 year ago
