Receiving Warning: spark.executor.instances less than spark.dynamicAllocation.minExecutors


We are receiving a warning while running our Glue job:

23/04/10 06:39:21 WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
23/04/10 06:39:22 WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
23/04/10 06:39:22 WARN ExecutorAllocationManager: Dynamic allocation without a shuffle service is an experimental feature.

We are using Glue 3.0 with G.2X workers, and have number_of_workers = 20 and max_capacity = 20.

And --enable-auto-scaling is 'true'.

The job is not throwing an error, but we would like to understand this better and clear the warning if possible. Thank you.

posted a year ago · 834 views

1 Answer

Hello,

Thanks for reaching out.

These two warnings about Apache Spark are merely informational messages:

  1. WARN Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.

The spark.executor.instances and spark.dynamicAllocation.minExecutors parameters are related to the allocation of executors for Spark applications:

  • If dynamic allocation (Auto Scaling) is enabled in a Glue job, then spark.executor.instances is left unset ('None').
  • spark.dynamicAllocation.minExecutors is the lower bound on the number of executors. If Auto Scaling is enabled in a Glue job, it defaults to 1.

The check that logs this warning lives in Spark's Utils.scala [1]. It reads the configured value of spark.executor.instances, falling back to 0 when it is unset, and logs the warning whenever spark.dynamicAllocation.minExecutors is greater than that value.

Therefore, when the job is run with Auto Scaling enabled, spark.executor.instances falls back to its default of 0, and the warning is logged because the default value of spark.dynamicAllocation.minExecutors (1) is greater than 0.
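To make the behavior concrete, here is a small sketch in plain Python of the check described above (the real code is Scala, in Utils.scala [1]; the function and variable names here are illustrative, not Spark's):

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("Utils")

def initial_executors(conf):
    """Mimic how Spark picks the initial executor count under dynamic allocation.

    `conf` is a plain dict standing in for the Spark configuration.
    """
    min_executors = conf.get("spark.dynamicAllocation.minExecutors", 0)
    initial = conf.get("spark.dynamicAllocation.initialExecutors", min_executors)
    # spark.executor.instances falls back to 0 when unset,
    # which is the case when Glue Auto Scaling is enabled.
    instances = conf.get("spark.executor.instances", 0)
    if instances < min_executors:
        log.warning(
            "spark.executor.instances less than spark.dynamicAllocation.minExecutors "
            "is invalid, ignoring its setting, please update your configs."
        )
    # The invalid setting is ignored: the largest of the three values wins.
    return max(min_executors, initial, instances)

# With Auto Scaling: instances unset (0), minExecutors defaults to 1,
# so the warning fires and 1 executor is used initially.
print(initial_executors({"spark.dynamicAllocation.minExecutors": 1}))  # 1
```

This shows why the warning is harmless: the out-of-range value is simply discarded and the larger bound is used.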

To clear this warning, you can consider setting the Log Level to ‘ERROR’ by following the instructions provided in [2].

  2. WARN ExecutorAllocationManager: Dynamic allocation without a shuffle service is an experimental feature.

This warning is logged from the code snippet in [3] when spark.shuffle.service.enabled is set to 'false'.
To clear this warning, you can set spark.shuffle.service.enabled to 'true', or follow the instructions in [2] to change the log level.
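As a sketch of the two remediations in a Glue PySpark script (a fragment, not a complete job; adapt it to your own script's setup code):

```python
# Suppress WARN-level messages for this job, per [2]:
sc.setLogLevel("ERROR")
```

The shuffle-service setting can be passed as a Glue job parameter instead, e.g. key --conf with value spark.shuffle.service.enabled=true.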

Hope this information was helpful.

References:

[1] https://github.com/apache/spark/blob/v3.1.1/core/src/main/scala/org/apache/spark/util/Utils.scala#L2592-L2595

[2] https://repost.aws/knowledge-center/glue-reduce-cloudwatch-logs#:~:text=Set%20the%20logging%20level%20using%20Spark%20context%20method%20setLogLevel

[3] https://github.com/apache/spark/blob/v3.1.1/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L205-L216

AWS
SUPPORT ENGINEER
answered a year ago
