EMR main instance is not reachable. Please check your instance status.

Intermittently, the EMR parent node becomes unreachable with the error below. This blocks scaling of the worker/task nodes and, after a few days, leads to batch job failures. The EMR CloudWatch (CW) monitoring metrics also stop being logged during this time.

"WaitingResize for group (ig-********) is pending because parent instance is not reachable. Please check your parent instance status."

We tried running the command below on the parent node. It solves the problem temporarily, but the node ends up in the same state again after a couple of days.

"/etc/alternatives/jre/bin/java -Xmx1024m -XX:MinHeapFreeRatio=10 -server -cp /usr/share/aws/emr/instance-controller/lib/*:/home/hadoop/conf -Dlog4j.defaultInitOverride aws157.instancecontroller.Main &"
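
Before restarting it by hand, it may help to confirm that the instance-controller process has actually died, rather than the node being unreachable for some other reason. Below is a minimal sketch of such a check. It assumes the process can be recognised by the main class shown in the command above. The function names `ic_in_process_list` and `ic_status` are made up for illustration, and the check does not explain the root cause.

```shell
#!/usr/bin/env bash
# Assumption: the instance controller appears in the process list under this
# main class, as in the command quoted in the question.
IC_MAIN_CLASS="aws157.instancecontroller.Main"

# Hypothetical helper: reads process command lines on stdin (one per line)
# and succeeds if any of them mentions the instance-controller main class.
ic_in_process_list() {
  grep -F -q -- "$IC_MAIN_CLASS"
}

# Hypothetical helper: prints "running" or "not running" for the live system.
# The process list is captured first so that grep never sees its own entry.
ic_status() {
  local procs
  procs="$(ps -eo args 2>/dev/null || true)"
  if ic_in_process_list <<< "$procs"; then
    echo "running"
  else
    echo "not running"
  fi
}
```

Calling `ic_status` periodically on the parent node (for example from cron, with a timestamp appended to a log file) could show whether the "not reachable" message coincides with the process disappearing.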

What could be the root cause?

amand
asked 2 years ago, 93 views
No answers
