nvidia-smi not working on Amazon's Deep Learning AMI


Greetings,

I have set up an instance of "Deep Learning AMI (Ubuntu)" (https://aws.amazon.com/marketplace/pp/B077GCH38C)
expecting it to have functioning NVIDIA GPUs.

However, when I run nvidia-smi on the command line, I get:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

And when I try to run Python code that uses PyTorch 0.4 (I installed that version as required), I get:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1532579245307/work/aten/src/THC/THCGeneral.cpp line=74 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
  File "main.py", line 105, in <module>
    train_data = batchify(corpus.train, args.batch_size, args)
  File "/home/ubuntu/awd-lstm-lm/utils.py", line 21, in batchify
    data = data.cuda()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1532579245307/work/aten/src/THC/THCGeneral.cpp:74
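
For reference, a minimal sanity check in Python (assuming the same PyTorch 0.4 install) that shows whether PyTorch sees a CUDA device at all:

    import torch

    # Both calls come back False / 0 when no CUDA-capable device is visible,
    # which matches the runtime error above.
    print(torch.cuda.is_available())
    print(torch.cuda.device_count())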

Has anyone experienced something similar? Any way to fix this?

AndreaL
Asked 5 years ago · Viewed 1,250 times
1 answer

Answered: I did not choose a GPU instance type when setting up the machine, so there was no GPU for the driver to talk to.
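
For anyone who hits the same problem: one quick way to confirm this from inside the instance is to ask the EC2 instance metadata service which instance type you are running on (a minimal sketch, assuming the classic IMDSv1 endpoint is reachable, which was the default at the time). Only GPU instance types such as p2, p3, or g3 come with a usable NVIDIA GPU.

    import urllib.request

    # Ask the EC2 instance metadata service for the instance type.
    # A non-GPU type (e.g. t2.* or m5.*) explains why nvidia-smi finds no driver.
    url = "http://169.254.169.254/latest/meta-data/instance-type"
    print(urllib.request.urlopen(url).read().decode())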

AndreaL
Answered 5 years ago
