EMR 7.0.0 on EC2: shell script steps do not start and stay pending

Hi,

After EMR 7.0.0 was released last week, we wanted to start using it.

Problem

We have shell script EMR steps that are executed when the cluster starts. After the cluster has finished bootstrapping, these steps are never started and stay "Pending", although the cluster state is "Running". The same happens if we start the cluster without any steps and add them only after it has bootstrapped. An example can be seen here:

[Screenshot: step is pending and not started]

Running the same script in the same way works with EMR 6.15.0. The only thing that changed is the EMR version. PySpark EMR steps also still work.

Is there a known bug or something that needs to be changed on our side? What can we do to run the shell scripts as previously done?

If any information is missing, please let us know. Thank you in advance!

EMR setup

Amazon EMR version: emr-7.0.0

Installed applications:

  • Hadoop 3.3.6
  • JupyterEnterpriseGateway 2.6.0
  • Livy 0.7.1
  • Spark 3.5.0

Instances: 1 primary instance (m5.2xlarge) with four 32 GB EBS volumes

EGeist
asked 4 months ago, 309 views
2 Answers

Hello,

I could not find any issue with executing a shell script through a step in EMR 7.0.0. I tried both running the step as part of cluster provisioning and adding it afterwards through the Add Step API, via the console and the CLI. Both methods worked as expected.

I presume that in your case something specific to the shell script or to the configuration is blocking the step. I recommend logging in to the primary node and running the script manually to test whether it works. You may find the step logs in /mnt/var/log/hadoop/steps.
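
To check quickly whether a step has written any logs on the primary node, something like the following sketch could be used. It assumes the usual layout of one subdirectory per step under /mnt/var/log/hadoop/steps; the function name `list_step_logs` is only illustrative.

```python
from pathlib import Path


def list_step_logs(root="/mnt/var/log/hadoop/steps"):
    """Return {step directory name: [log file names]} for each step under root.

    Assumption: each step gets its own subdirectory below root.
    An empty result means no step has written logs yet (or root is missing).
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    logs = {}
    for step_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        logs[step_dir.name] = sorted(
            f.name for f in step_dir.iterdir() if f.is_file()
        )
    return logs


if __name__ == "__main__":
    # Run on the primary node; prints one line per step directory found.
    for step, files in list_step_logs().items():
        print(step, files)
```

If this prints nothing for a step that the console shows as "Pending", the step most likely never reached the point of writing logs on the node.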

Alternatively, you can add the step through the CLI, or through the console as follows:

  1. Choose Add Step and set the type to Custom JAR.
  2. Provide the step name and set the JAR location to s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar. Replace the region in this path with your cluster's region.
  3. In the Arguments field, enter the location of your shell script, for example s3://<Your bucket>/scripts/test.sh
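
The console steps above can also be expressed programmatically. The sketch below builds the same step definition in the shape that boto3's `add_job_flow_steps` expects; the helper names, the step name, and the `CONTINUE` failure action are assumptions for illustration, and the script path keeps the placeholder from the list above.

```python
def script_runner_jar(region):
    """Return the regional script-runner.jar location, following the pattern above."""
    return f"s3://{region}.elasticmapreduce/libs/script-runner/script-runner.jar"


def build_script_step(name, region, script_s3_uri, action_on_failure="CONTINUE"):
    """Build one step definition that runs a shell script via script-runner.jar."""
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {
            "Jar": script_runner_jar(region),
            "Args": [script_s3_uri],  # the script location is the only argument
        },
    }


def submit_step(cluster_id, region, step):
    """Submit the step to a running cluster and return its step ID.

    Requires boto3 (third-party) and valid AWS credentials.
    """
    import boto3  # imported lazily so the builders above work without it

    emr = boto3.client("emr", region_name=region)
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


if __name__ == "__main__":
    step = build_script_step(
        "Run test.sh", "eu-central-1", "s3://<Your bucket>/scripts/test.sh"
    )
    print(step)
```

Calling `submit_step` with your own cluster ID would add the step to the cluster; the builder functions alone make no AWS calls.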

If these methods still do not help you find the issue, please feel free to contact AWS Premium Support for further assistance.

AWS
SUPPORT ENGINEER
answered 4 months ago
  • Hello,

    I added the step via the console exactly as you described, with the corresponding region in the JAR location, and the step doesn't start at all, even though nothing else is running on the cluster. As mentioned before, the cluster goes into state "Running", but the step itself stays "Pending", is never started, and therefore doesn't write any logs. The script is probably not even downloaded. I added a screenshot to the initial post.

    I will try reaching out to AWS support - if there are any other ideas - please let me know!

  • I tested in eu-central-1 as well, but unfortunately could not reproduce your issue there. I suspect there could also be a network-level issue. It would be worth contacting AWS Support to troubleshoot this further.

Accepted Answer

The problem was the script that we ran.

The script first removed openssl-devel and then installed a more up-to-date version of it via yum (this was needed in EMR version 6.15.0 and below to compile newer Python versions).

Removing openssl-devel led to a failure of "hadoop-state-pusher", which is apparently responsible for communicating the state of an EMR step back to AWS. Because it failed, the cluster always looked as if the EMR step had not run, although internally it had probably already finished.

Since the openssl-devel version is newer on EMR 7.0.0 and above anyway, the replacement is probably no longer needed. We were able to run our script by NOT removing openssl-devel.
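
A script that has to run on both old and new EMR releases could make this decision depend on the release label. The following is a minimal sketch, assuming the label is passed in as a string such as "emr-7.0.0"; both function names are hypothetical, and the cut-off at major version 7 simply mirrors what we observed above.

```python
import re


def emr_major_version(release_label):
    """Extract the major version from a label such as 'emr-6.15.0'."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", release_label.strip())
    if match is None:
        raise ValueError(f"unexpected EMR release label: {release_label!r}")
    return int(match.group(1))


def should_replace_openssl_devel(release_label):
    """Return True only for releases below 7.0.0.

    Assumption taken from this thread: on EMR 7.0.0 and above, removing
    openssl-devel breaks step state reporting and is no longer needed.
    """
    return emr_major_version(release_label) < 7


if __name__ == "__main__":
    for label in ("emr-6.15.0", "emr-7.0.0"):
        print(label, should_replace_openssl_devel(label))
```

With such a check, the removal and reinstall of openssl-devel would only happen on 6.x clusters, and 7.x clusters would keep the package that ships with the release.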

EGeist
answered 3 months ago
AWS
SUPPORT ENGINEER
reviewed a month ago
