EKS cluster node went down

We are using EKS, and one of our dedicated worker nodes for TimescaleDB (TSDB) went down. The node was soon marked with unreachable taints, and a new node was brought up in its place shortly afterward.

We want to investigate why the node went down. We suspect that the TSDB pod was using a lot of memory moments before the crash, but we would like to pinpoint the concrete cause so that we can guard against it in the future.

Can someone suggest a direction to take? We have already looked at the logs for the instance that went down, but nothing in them concretely points to a crash or to the node becoming unavailable.

Arisht
Asked 6 months ago, 155 views
No answers
