As per the AL2023 release notes, G6f is not in the list of supported instance types. The DLAMI uses the Tesla driver.
Supported EC2 instances: G4dn, G5, G6, Gr6, G6e, P4d, P4de, P5, P5e, P5en, P6-B200, P6-B300
According to "Available drivers by instance type", you need the NVIDIA GRID driver. This article can help.
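To make the check above concrete, here is a minimal sketch that tests an instance type against the supported list quoted from the release notes. The names `SUPPORTED_FAMILIES`, `instance_family`, and `is_supported` are hypothetical helpers, not part of any AWS tool, and the `.xlarge` sizes are only illustrative.

```python
# Families copied from the "Supported EC2 instances" list quoted above.
SUPPORTED_FAMILIES = {
    "g4dn", "g5", "g6", "gr6", "g6e",
    "p4d", "p4de", "p5", "p5e", "p5en",
    "p6-b200", "p6-b300",
}


def instance_family(instance_type: str) -> str:
    """Return the family part of an instance type, e.g. 'g6f' for 'g6f.xlarge'."""
    return instance_type.strip().lower().split(".", 1)[0]


def is_supported(instance_type: str) -> bool:
    """True if the instance family appears in the quoted supported list."""
    return instance_family(instance_type) in SUPPORTED_FAMILIES


if __name__ == "__main__":
    # Illustrative sizes; only the family part matters for the lookup.
    for itype in ("g6.xlarge", "g6f.xlarge"):
        print(itype, "supported" if is_supported(itype) else "not in the list")
```

The lookup is an exact match on the family name, so `g6f` does not pass just because `g6` is listed.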
It appears you're experiencing an issue with the NVIDIA driver on your G6f instance using the Deep Learning AMI. The error message indicates that the NVIDIA driver is not communicating properly with the GPU.
Looking at your logs, the key lines are "nvidia: probe of 0000:31:00.0 failed with error -1" and the message about "NVIDIA Virtual GPU (vGPU)". Together they suggest a compatibility issue between the installed driver and the GPU exposed to your G6f instance.
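If you want to scan saved kernel log output for the same symptoms, a rough sketch could look like the following. It assumes the log text has already been captured into a string. The helper names `find_probe_failures` and `mentions_vgpu` are hypothetical, and the sample text contains only the two fragments quoted above, not a full log.

```python
import re

# Matches driver probe failures of the form quoted above:
#   nvidia: probe of 0000:31:00.0 failed with error -1
PROBE_FAILURE = re.compile(
    r"nvidia: probe of (?P<pci>[0-9a-fA-F:.]+) failed with error (?P<code>-?\d+)"
)


def find_probe_failures(log_text: str) -> list[tuple[str, int]]:
    """Return (PCI address, error code) for each probe failure found in the text."""
    return [
        (m.group("pci"), int(m.group("code")))
        for m in PROBE_FAILURE.finditer(log_text)
    ]


def mentions_vgpu(log_text: str) -> bool:
    """True if the log text contains the vGPU message quoted above."""
    return "NVIDIA Virtual GPU (vGPU)" in log_text


if __name__ == "__main__":
    # Only the fragments quoted in this thread; a real log has more context.
    sample = (
        "nvidia: probe of 0000:31:00.0 failed with error -1\n"
        "NVIDIA Virtual GPU (vGPU)\n"
    )
    print(find_probe_failures(sample))
    print(mentions_vgpu(sample))
```

Seeing both a probe failure and the vGPU message is a hint, not proof, that the installed driver type does not match what the instance expects.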
G6f instances use fractional NVIDIA L4 GPUs, which require specific driver support. Here are some steps to troubleshoot:
- First, verify that the G6f instance type is fully supported by the specific DLAMI version you're using. While the MOTD mentions G6 support, there might be specific requirements for the G6f variant.
- Consider using the AWS Deep Learning OSS Nvidia Driver AMI instead of the Base AMI with Single CUDA. The OSS Nvidia Driver AMI typically has better compatibility with various GPU instance types.
- If you continue to have issues, you could try installing the NVIDIA driver manually, ensuring it's compatible with the L4 GPU in your G6f instance.
- Another approach would be to use a different DLAMI version that might have better compatibility with G6f instances.
It's worth noting that sometimes newly released instance types or GPU variants might require updated drivers or specific configurations that aren't yet included in all DLAMI versions. In such cases, using the latest available DLAMI or manually updating the drivers might be necessary.
Sources
Resolve NVIDIA GPU and GPU driver issues in Amazon EC2 | AWS re:Post
Issues gdscheck -p - receiving: CUDA_ERROR_SYSTEM_NOT_READY | AWS re:Post
Using the Deep Learning AMI with Conda - Deep Learning AMI
