g4dn EC2 refusing to use T4

0

I set up the instance, installed the NVIDIA drivers, and installed DCV. On connect, the session uses an Amazon display adapter ("AWS Indirect Display Device" according to dxdiag) instead of the NVIDIA one. The OS is Windows Server 2025. nvidia-smi output:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 573.07                 Driver Version: 573.07         CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla T4                       TCC |   00000000:00:1E.0 Off |                    0 |
| N/A   33C    P0             24W /   70W |     163MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

That seems fine, but how can I make the T4 the default display adapter?

Asked 5 months ago · 153 views
2 Answers
2

From your nvidia-smi output, your driver is running in TCC mode, not WDDM mode. In TCC mode the GPU is not used for graphics acceleration; for example, your process list is empty.
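
The TCC-versus-WDDM check can also be scripted instead of read off the table. Below is a minimal sketch, assuming Python is available on the instance and nvidia-smi is on PATH. The function names are illustrative, not part of any NVIDIA or AWS tool. The `driver_model.current` and `driver_model.pending` query fields are the ones listed by `nvidia-smi --help-query-gpu` and are only populated on Windows.

```python
import csv
import io
import subprocess

# Query fields as listed by `nvidia-smi --help-query-gpu`; the driver_model.*
# fields are only meaningful on Windows.
QUERY = "name,driver_model.current,driver_model.pending"


def parse_driver_models(csv_text):
    """Parse `--format=csv,noheader` output into a list of dicts, one per GPU."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, current, pending = (f.strip() for f in fields)
        rows.append({"name": name, "current": current, "pending": pending})
    return rows


def gpus_not_in_wddm(rows):
    """Return the names of GPUs whose current driver model is not WDDM."""
    return [r["name"] for r in rows if r["current"].upper() != "WDDM"]


def query_driver_models(nvidia_smi="nvidia-smi"):
    """Run nvidia-smi and parse its output (assumes nvidia-smi is on PATH)."""
    out = subprocess.run(
        [nvidia_smi, f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_driver_models(out)
```

On the instance, `gpus_not_in_wddm(query_driver_models())` should list every GPU that is still in TCC mode. An empty list would mean all GPUs report WDDM.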

Uninstall the current driver and install the NVIDIA GRID (aka NVIDIA RTX Virtual Workstation) driver as per the documentation. Alternatively, you can refer to this blog for installation using CloudFormation. An example screenshot of nvidia-smi and nvidia-smi -q | findstr "Product License" is below.

[Screenshot: output of nvidia-smi and nvidia-smi -q | findstr "Product License"]
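
The `findstr "Product License"` filter treats the quoted string as two alternative search terms, so it keeps any line containing "Product" or "License". A rough Python equivalent is sketched below for cases where the check needs to be scripted. The helper names are hypothetical, and the exact lines that `nvidia-smi -q` prints depend on the installed driver.

```python
import subprocess


def findstr_any(text, terms):
    """Return the lines that contain any of the terms (case-sensitive).

    This mirrors how findstr handles a quoted, space-separated string:
    as a list of alternative search terms, not as one phrase.
    """
    return [line.strip() for line in text.splitlines()
            if any(term in line for term in terms)]


def license_lines(nvidia_smi="nvidia-smi"):
    """Run `nvidia-smi -q` and keep only the product and license lines."""
    out = subprocess.run([nvidia_smi, "-q"], capture_output=True,
                         text=True, check=True).stdout
    return findstr_any(out, ["Product", "License"])
```

Printing each entry of `license_lines()` after the driver reinstall gives the same view as the command in the answer.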

AWS
Expert
Answered 5 months ago
-1

The issue you're experiencing is that your g4dn instance has the NVIDIA T4 GPU properly installed (as shown by your nvidia-smi output), but it's not being used for display rendering in your DCV session.

This is expected behavior with the currently installed driver. The T4 is running in TCC (Tesla Compute Cluster) mode, as shown in your nvidia-smi output, which dedicates it to compute workloads rather than graphics display. The AWS indirect display device handles display rendering while the T4 remains available for compute tasks.

To use the GPU for graphics acceleration in a DCV session on a Linux server (these steps rely on the X server and do not apply to Windows), you need to:

  1. Make sure you have the DCV-GL package installed (nice-dcv-gl)
  2. Ensure the NVIDIA driver is properly configured for DCV
  3. Run the following commands to enable GPU acceleration for DCV:
     • Stop the X server
     • Run dcvgladmin disable
     • Run dcvgladmin enable
     • Restart the X server

You can verify if OpenGL acceleration is working by running the dcvgltest tool from the nice-dcv-gltest package.

If you're using a DLAMI (Deep Learning AMI), make sure you're using the correct one for your g4dn instance. AWS provides specific DLAMIs that use the NVIDIA OSS driver which supports G4dn instances.
Sources
Issues gdscheck -p - receiving: CUDA_ERROR_SYSTEM_NOT_READY | AWS re:Post
DCV cannot access the 3D X Server [:0.0 :0.1]. | AWS re:Post
Specifications for Amazon EC2 accelerated computing instances - Amazon EC2
Important NVIDIA driver changes to DLAMIs - AWS Deep Learning AMIs

Answered 5 months ago
