EMR main instance is not reachable. Please check your instance status.

Intermittently, the EMR parent node becomes unreachable with the error below. This blocks scaling of the worker/task nodes and ultimately leads to batch job failures after a few days (even the EMR CloudWatch monitoring metrics stop being logged during this time).

"WaitingResize for group (ig-********) is pending because parent instance is not reachable. Please check your parent instance status."

We tried running the command below on the parent node. It solves the problem temporarily, but the node ends up in the same state again after a couple of days.

"/etc/alternatives/jre/bin/java -Xmx1024m -XX:MinHeapFreeRatio=10 -server -cp /usr/share/aws/emr/instance-controller/lib/*:/home/hadoop/conf -Dlog4j.defaultInitOverride aws157.instancecontroller.Main &"
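Before relaunching it by hand, it may help to confirm whether the instance-controller process has actually exited at the time the error appears. The following is only a minimal sketch: `controller_state` is a hypothetical helper name, not an EMR tool, and it assumes the process can be recognised by the main class name from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of EMR): reads a process listing on stdin
# and reports whether the instance-controller main class appears in it.
controller_state() {
    # The escaped dots match literal dots, which also keeps the pattern
    # from matching the grep command's own entry in a ps listing.
    if grep -q 'aws157\.instancecontroller\.Main'; then
        echo "running"
    else
        echo "not running"
    fi
}

# Example call on the node (assumes a procps-style ps):
#   ps -eo args | controller_state
```

If this reports "not running" whenever the resize error shows up, that would at least narrow the question down to why the process exits, which this check alone cannot answer.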

What would be the root cause?

amand
asked 2 years ago, 93 views
No Answers
