Hello Team,
We are facing an issue with a PySpark application on EMR 6.10.0. The details are as follows:
- spark-submit fails in cluster mode for the PySpark application.
- We have set the environment variables, and the application works fine in client mode and local mode when we run it from the master node.
- The command is: spark-submit --master yarn --deploy-mode cluster main.py
- We tried multiple options at runtime, but our PySpark code still fails in cluster mode with the two errors below.
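
One thing worth checking for the cluster-mode command above: in cluster mode the driver runs inside the YARN Application Master container, not in the shell on the master node, so variables exported in that shell are not visible to it. Spark can pass them through `spark.yarn.appMasterEnv.<NAME>` and `spark.executorEnv.<NAME>`. Below is a minimal sketch that assembles such a command. The helper name `build_submit_command`, the variable `PYSPARK_PYTHON`, and the path `/usr/bin/python3` are assumptions for illustration, since the post does not say which variables were set.

```python
import shlex


def build_submit_command(script, python_path):
    """Build a spark-submit argv for YARN cluster mode.

    python_path is a hypothetical interpreter path; replace it with
    whatever the environment variables on the master node point to.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # Environment for the YARN Application Master (hosts the driver in cluster mode)
        "--conf", f"spark.yarn.appMasterEnv.PYSPARK_PYTHON={python_path}",
        # Environment for the executors
        "--conf", f"spark.executorEnv.PYSPARK_PYTHON={python_path}",
        script,
    ]


cmd = build_submit_command("main.py", "/usr/bin/python3")
print(shlex.join(cmd))
```

Whether this applies depends on which variables the job actually needs; it is only a starting point for ruling out an environment difference between client and cluster mode.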
**Error 1: When we run the spark-submit command from the master node, we get the error message below:**
INFO Client: Application report for application_1693454247420_0022 (state: FAILED)
23/08/31 11:06:17 INFO Client:
client token: N/A
diagnostics: Application application_1693454247420_0022 failed 2 times due to AM Container for appattempt_1693454247420_0022_000002 exited with exitCode: 13
Failing this attempt.Diagnostics: [2023-08-31 11:06:16.928]Exception from container-launch.
Container id: container_1693454247420_0022_02_000001
Exit code: 13
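
The client output above only reports that the AM container exited. To my knowledge, exit code 13 from the Spark Application Master usually means the driver failed before its SparkContext was initialized, so the underlying exception should be in the container logs, which `yarn logs -applicationId <id>` retrieves. Here is a small sketch that pulls the application id and exit code out of the diagnostics text and builds that command. The helper name `yarn_logs_command` is made up for illustration.

```python
import re


def yarn_logs_command(diagnostics):
    """Return (application_id, exit_code, argv) parsed from client diagnostics."""
    app = re.search(r"application_\d+_\d+", diagnostics)
    code = re.search(r"exitCode: (\d+)", diagnostics)
    if app is None:
        raise ValueError("no YARN application id found")
    exit_code = int(code.group(1)) if code else None
    # argv for fetching the aggregated container logs of this application
    return app.group(0), exit_code, ["yarn", "logs", "-applicationId", app.group(0)]


sample = (
    "diagnostics: Application application_1693454247420_0022 failed 2 times "
    "due to AM Container for appattempt_1693454247420_0022_000002 "
    "exited with exitCode: 13"
)
app_id, exit_code, argv = yarn_logs_command(sample)
print(app_id, exit_code, " ".join(argv))
```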
**Error 2: Below are the errors on the DataNode side, i.e. from the Node Manager logs:**
Node manager logs:
2023-08-31 11:37:40,223 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 11:39:40,250 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 11:41:40,282 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 11:43:40,305 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 11:45:40,327 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 11:46:38,095 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService (NM ContainerManager dispatcher): Cache Size Before Clean: 8129896432, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
2023-08-31 11:46:38,157 INFO org.apache.hadoop.yarn.server.nodemanager.logaggregation.tracker.NMLogAggregationStatusTracker (Timer-0): Rolling over the cached log aggregation status.
2023-08-31 11:47:40,353 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 12:01:42,512 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /10.0.100.107:8033
addToClusterNodeLabels: java.io.IOException: Node-label-based scheduling is disabled. Please check yarn.node-labels.enabled
    at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.logAndWrapException(AdminService.java:936)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.addToClusterNodeLabels(AdminService.java:821)
    at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.addToClusterNodeLabels(ResourceManagerAdministrationProtocolPBServiceImpl.java:270)
    at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:311)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1175)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1099)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3316)
Caused by: java.io.IOException: Node-label-based scheduling is disabled. Please check yarn.node-labels.enabled
    at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.addToCluserNodeLabels(CommonNodeLabelsManager.java:309)
    at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager.addToCluserNodeLabels(RMNodeLabelsManager.java:141)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.addToClusterNodeLabels(AdminService.java:816)
    ... 12 more
2023-08-31 12:07:40,677 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 12:09:40,695 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 12:11:40,724 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
2023-08-31 12:13:40,744 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl (Node Status Updater): NM node labels {CORE:exclusivity=true} were not accepted by RM and message from RM : null
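
The Node Manager output above is dominated by a single message repeating every two minutes. For logs like this, a short sketch can collapse the entries into distinct messages with counts. It assumes every entry starts with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp and tolerates entries that were pasted without line breaks. The function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Each entry is assumed to start with a "YYYY-MM-DD HH:MM:SS,mmm" timestamp.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def summarize(raw):
    """Split a (possibly fused) log into entries and count distinct messages."""
    starts = [m.start() for m in TIMESTAMP.finditer(raw)]
    counts = Counter()
    for begin, end in zip(starts, starts[1:] + [len(raw)]):
        entry = raw[begin:end].strip()
        # Drop the leading timestamp so identical messages group together
        message = TIMESTAMP.sub("", entry, count=1).strip()
        counts[message] += 1
    return counts


err = (
    "ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl "
    "(Node Status Updater): NM node labels {CORE:exclusivity=true} were not "
    "accepted by RM and message from RM : null"
)
info = (
    "INFO org.apache.hadoop.yarn.server.nodemanager.logaggregation.tracker."
    "NMLogAggregationStatusTracker (Timer-0): Rolling over the cached log "
    "aggregation status."
)
sample = (
    "2023-08-31 11:37:40,223 " + err
    + "2023-08-31 11:39:40,250 " + err
    + "2023-08-31 11:46:38,157 " + info
)
counts = summarize(sample)
for message, n in counts.most_common():
    print(n, message[:80])
```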
Team, please suggest any inputs, workarounds, or resolutions for the blockers above.
Thank you
In my case, the problem was a dependency incompatibility, which prevented me from submitting the job to the cluster. Thanks