us-west-2 g5g.xlarge repeatedly and suddenly hangs

0

Hello. My G5g (Graviton2 - NVIDIA T4G) instance in the us-west-2 region suddenly hangs, over and over again. Has anybody else experienced this?

It started up fine. I could use it without problems, but only for a while.

But every time, about one hour after startup, it hung completely. It suddenly lost its network connection.

  • Rebooting from the management console did not restore connectivity.
  • After I stopped it from the management console, the instance stayed in the "Stopping" state for more than 5 minutes.
  • Only a force stop from the management console resolved the situation.
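
For reference, the forced stop in the last bullet can also be issued outside the console. This is a minimal sketch, assuming the AWS CLI is installed and configured with credentials; the instance ID is a placeholder, and the helper names `force_stop_command` and `force_stop` are made up for illustration.

```python
import subprocess


def force_stop_command(instance_id, region):
    """Build the AWS CLI argument list for a forced stop of one instance."""
    return [
        "aws", "ec2", "stop-instances",
        "--instance-ids", instance_id,
        "--region", region,
        "--force",  # same effect as "Force stop" in the console
    ]


def force_stop(instance_id, region):
    """Run the forced stop and return the CLI's JSON output as text."""
    result = subprocess.run(
        force_stop_command(instance_id, region),
        check=True,           # raise if the CLI exits with a non-zero status
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder instance ID: only print the command, do not execute it.
    print(" ".join(force_stop_command("i-0123456789abcdef0", "us-west-2")))
```

A forced stop skips the normal OS shutdown, so unsaved data on the instance may be lost.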

I only use VirtualGL and TigerVNC server. I didn't notice any memory shortage before the hang.
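
One way to double-check the memory observation after the fact is to log memory figures to disk at short intervals, so the last entries survive a hang. This is only a sketch, assuming a Linux guest with `/proc/meminfo`; the log path, the interval, and the function names `parse_meminfo` and `log_memory` are arbitrary choices for illustration.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def log_memory(path="/var/tmp/meminfo.log", interval_s=10, samples=None):
    """Append MemAvailable and SwapFree to `path` every `interval_s` seconds.

    Runs forever when `samples` is None; otherwise stops after `samples` records.
    """
    count = 0
    while samples is None or count < samples:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        with open(path, "a") as out:
            out.write(
                f"{stamp} MemAvailable={info.get('MemAvailable')}kB "
                f"SwapFree={info.get('SwapFree')}kB\n"
            )
            out.flush()
            os.fsync(out.fileno())  # push the record to disk before a possible hang
        count += 1
        time.sleep(interval_s)


if __name__ == "__main__":
    log_memory()
```

After a forced stop and restart, the tail of the log shows the memory state just before the instance became unreachable.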

Asked 2 years ago, 245 views
3 Answers
0

I could reproduce exactly the same problem in the ap-northeast-1 (Tokyo) region as well.

Answered 2 years ago
0

It occurs with both of the official NVIDIA drivers for Linux aarch64 (https://www.nvidia.com/en-us/drivers/unix/):

  • Latest Production Branch Version: 470.94
  • Latest New Feature Branch Version: 495.46

Answered 2 years ago
0

I made another trial with the AWS-provided AMI ami-0122dba335a03859e, Deep Learning AMI Graviton GPU CUDA 11.4.2 (Ubuntu 20.04) 20211119, without any updates, in us-west-2. It ran for more than 3 hours and looked fine. But after I started to use the GPU with VirtualGL + TigerVNC + Firefox to show threejs.org sample pages, it hung. The same symptoms appeared.

Answered 2 years ago
