Instead of installing the NVIDIA driver manually, you can use an AWS Deep Learning AMI, which comes with the NVIDIA driver, CUDA toolkit, Docker, and other software pre-installed, and optionally PyTorch. Search for "Deep Learning OSS Nvidia Driver AMI GPU PyTorch" when launching an EC2 instance.
Release notes at https://docs.aws.amazon.com/dlami/latest/devguide/aws-deep-learning-ami-gpu-pytorch-2.7-ubuntu-22-04.html
If you need a graphical desktop, you can refer to "Deep Learning graphical desktop on Ubuntu Linux with AWS Deep Learning AMI (DLAMI)" or "Deep Learning AMI".
If you want to install the NVIDIA driver manually, the article "Install NVIDIA GPU driver, CUDA Toolkit, NVIDIA Container Toolkit on Amazon EC2 instances running Ubuntu Linux" can help.
It sounds like you're experiencing an issue with CUDA not being available on your G5 instance. The G5 instances do come with NVIDIA A10G Tensor Core GPUs, but you need to install the necessary drivers and CUDA toolkit to use them.
For your Ubuntu-based G5 instance, you'll need to:
- Install the NVIDIA GPU driver
- Install the CUDA Toolkit
- Install any necessary libraries for PyTorch to detect the GPU
Here's how to install the NVIDIA driver and CUDA Toolkit on your Ubuntu instance:
First, update your system:
sudo apt-get update
sudo apt-get upgrade -y
Then install the NVIDIA driver:
sudo apt-get install -y nvidia-driver-latest
For the CUDA Toolkit, ensure you have at least 20 GB of free storage space, then install it:
cd /tmp
wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda_12.4.1_550.54.15_linux.run
chmod +x ./cuda_*.run
sudo ./cuda_*.run --toolkit --silent
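The 20 GB free-space requirement mentioned above can be checked before downloading the installer. This is a minimal sketch rather than part of the original answer: the helper name free_gb is hypothetical, and checking /tmp (where the installer is downloaded) and /usr/local (where the toolkit typically lands) is an assumption about where the space is needed.

```shell
#!/bin/sh
# free_gb (hypothetical helper): print the whole GiB available on the
# filesystem that holds the given path, using POSIX df output.
free_gb() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

required_gb=20   # figure taken from the answer above

# Assumption: the installer is downloaded to /tmp and installs under /usr/local.
for dir in /tmp /usr/local; do
  avail=$(free_gb "$dir" 2>/dev/null)
  if [ "${avail:-0}" -lt "$required_gb" ]; then
    echo "WARNING: only ${avail:-0} GiB free on $dir (want ${required_gb} GiB)"
  else
    echo "OK: ${avail} GiB free on $dir"
  fi
done
```

If either line prints a warning, growing the EBS volume before running the installer is likely the safer route.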
After installation, you'll need to restart your instance:
sudo reboot
Once your instance is back up, you should be able to run nvidia-smi to verify the driver installation. You may also need to set up environment variables for CUDA:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
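After the reboot and the environment variables are in place, the setup can be verified roughly as follows. This is a sketch under assumptions: check_tool is a hypothetical helper, and the PyTorch check assumes PyTorch is installed for the python3 found on PATH.

```shell
#!/bin/sh
# check_tool (hypothetical helper): report whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Driver check: nvidia-smi should list the GPU if the driver loaded.
check_tool nvidia-smi && nvidia-smi

# Toolkit check: nvcc is found only if /usr/local/cuda/bin is on PATH.
check_tool nvcc && nvcc --version

# Framework check: run only when torch can be imported by python3.
if command -v python3 >/dev/null 2>&1 && python3 -c 'import torch' >/dev/null 2>&1; then
  python3 -c 'import torch; print("CUDA available:", torch.cuda.is_available())'
else
  echo "PyTorch not importable with python3; skipping the torch.cuda check"
fi
```

If nvidia-smi works but nvcc is reported missing, the PATH export above has probably not been applied in the current shell; adding both export lines to ~/.bashrc makes them persist across logins.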
Alternatively, you could use the AWS Deep Learning Base GPU AMI, which comes pre-configured with CUDA (including version 12.6) and has added support specifically for G5 EC2 instances. This would save you the trouble of manual installation.
Sources
AWS Deep Learning Base GPU AMI (Amazon Linux 2023) - AWS Deep Learning AMIs
Install NVIDIA GPU driver, CUDA Toolkit, NVIDIA Container Toolkit on Amazon EC2 instances running Ubuntu Linux | AWS re:Post
Amazon EC2 G5 Instances | Amazon Web Services
How do I install NVIDIA GPU driver, CUDA Toolkit, NVIDIA Container Toolkit on Amazon EC2 instances running Amazon Linux 2 (AL2)? | AWS re:Post
The command:
sudo apt-get install -y nvidia-driver-latest
gives this response:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package nvidia-driver-latest
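One way to work around the missing package is to look up which versioned driver packages the configured repositories actually offer. The sketch below assumes the packages follow the nvidia-driver-NNN naming used in the Ubuntu archives; latest_driver_pkg is a hypothetical helper, and picking the highest version number is only a heuristic, not necessarily the recommended driver for a given instance.

```shell
#!/bin/sh
# latest_driver_pkg (hypothetical helper): read package names on stdin and
# print the highest-numbered plain nvidia-driver-NNN entry.
latest_driver_pkg() {
  grep -E '^nvidia-driver-[0-9]+$' | sort -t- -k3,3n | tail -n 1
}

if command -v apt-cache >/dev/null 2>&1; then
  # apt-cache search prints "name - description"; keep only the name column.
  pkg=$(apt-cache search --names-only '^nvidia-driver-[0-9]' \
        | awk '{print $1}' | latest_driver_pkg)
  echo "Candidate driver package: ${pkg:-none found}"
  # To install the candidate (left commented out on purpose):
  # sudo apt-get install -y "$pkg"
else
  echo "apt-cache not available on this system"
fi
```

Alternatively, the ubuntu-drivers tool from the ubuntu-drivers-common package can report a recommended driver with `ubuntu-drivers devices` and install it with `sudo ubuntu-drivers install`.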
