EMR main instance is not reachable. Please check your instance status.


Intermittently, the EMR parent node becomes unreachable with the error below. This blocks scaling of the worker/task nodes and, after a few days, leads to batch job failures. The EMR CloudWatch monitoring metrics also stop being logged during this time.

"WaitingResize for group (ig-********) is pending because parent instance is not reachable. Please check your parent instance status."

We tried running the command below on the parent node. It resolves the problem temporarily, but the node ends up in the same state again after a couple of days.

"/etc/alternatives/jre/bin/java -Xmx1024m -XX:MinHeapFreeRatio=10 -server -cp /usr/share/aws/emr/instance-controller/lib/*:/home/hadoop/conf -Dlog4j.defaultInitOverride aws157.instancecontroller.Main &"

What would be the root cause?
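
Since the workaround relaunches the instance-controller process by hand, it may help to record when that process disappears, so the time can be compared against the node's logs. The following is only a minimal sketch, assuming the process can be identified by the main class name from the command above; the function names and messages are hypothetical and not part of any EMR tooling.

```shell
#!/usr/bin/env bash
# Sketch: report whether the instance-controller JVM shows up in a process listing.
# Assumption: the process is identified by the main class named in the command
# quoted in the question. Function names and messages are illustrative only.

# Reads a process listing on stdin; succeeds if any line names the main class.
controller_in_listing() {
  grep -q 'aws157\.instancecontroller\.Main'
}

# Feeds the live process table to the check and prints a timestamped status line.
check_controller() {
  if ps -eo args | controller_in_listing; then
    echo "$(date -u +%FT%TZ) instance-controller: running"
  else
    echo "$(date -u +%FT%TZ) instance-controller: NOT running"
  fi
}
```

Calling `check_controller` periodically on the parent node (for example from cron, appending to a file) would give a rough timestamp for when the process stopped. It does not explain why it stopped.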

amand
Asked 2 years ago, viewed 93 times
No answers
