Overview
This article describes how to install the NVIDIA GPU driver, CUDA Toolkit, and NVIDIA Container Toolkit on NVIDIA GPU EC2 instances running AL2023 (Amazon Linux 2023).
Note that by using this method, you agree to the NVIDIA Driver License Agreement, End User License Agreement, and other related license agreements. If you are doing development, you may want to register for the NVIDIA Developer Program.
Pre-built AMIs
If you need AMIs preconfigured with NVIDIA GPU driver, CUDA, other NVIDIA software, and optionally PyTorch or TensorFlow framework, consider AWS Deep Learning AMIs.
Refer to Release notes for DLAMIs for currently supported options, and Deep Learning graphical desktop on Amazon Linux 2023 (AL2023) with AWS Deep Learning AMI (DLAMI) for graphical desktop setup guidance.
For container workloads, consider Amazon ECS-optimized Linux AMIs and Amazon EKS optimized AMIs.
Note: instructions in this article are not applicable to pre-built AMIs.
Custom ECS GPU-optimized AMI
If you wish to build your own custom Amazon ECS GPU-optimized AMI, install the NVIDIA driver, Docker, and the NVIDIA Container Toolkit, then refer to How do I create and use custom AMIs in Amazon ECS?
About CUDA toolkit
The CUDA Toolkit is generally optional when a GPU instance is used to run applications (as opposed to develop them), because a CUDA application typically packages the CUDA runtime and libraries it needs by statically or dynamically linking against them.
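As a quick way to see whether a given application dynamically links against the CUDA runtime (and therefore needs runtime libraries present on the instance), you can inspect it with ldd. This is a minimal sketch; /bin/ls stands in for your application binary, which is an assumption for illustration.

```shell
# Check whether a binary dynamically links against the CUDA runtime (libcudart).
# If libcudart appears in the ldd output, the binary expects the CUDA runtime
# library to be available at run time; otherwise the runtime is statically
# linked or not used at all. Replace APP with your application's path.
APP=/bin/ls
if ldd "$APP" 2>/dev/null | grep -q libcudart; then
  echo "dynamically linked against CUDA runtime"
else
  echo "no dynamic CUDA runtime dependency"
fi
```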
Prepare Amazon Linux 2023
Launch a new NVIDIA GPU instance running Amazon Linux 2023, preferably with at least 20 GB of storage, and connect to the instance.
Kernel 6.12
If your AL2023 instance is running kernel 6.1, update to kernel 6.12 for improvements in scheduling, networking, security, and system tracing.
sudo dnf update -y
if (uname -r | grep -q ^6\\.1\\.); then
sudo dnf clean all
VER=$(dnf list kernel-headers --showduplicates | grep -E "^\s*kernel-headers" | awk '{print $2}' | sort -V | tail -1)
VER=$VER.$(arch)
sudo dnf install -y kernel-headers-$VER kernel-devel-$VER kernel6.12-modules-extra-$VER kernel-modules-extra-common-$VER kernel6.12-$VER
if [ -f /boot/vmlinuz-$VER ]; then
sudo grubby --set-default "/boot/vmlinuz-$VER"
sudo reboot
fi
fi
Refer to Updating the Linux kernel on AL2023 for details.
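The snippet above selects the newest available kernel-headers version by piping the candidate version strings through sort -V, which orders them numerically rather than lexically, so tail -1 yields the latest release. A minimal sketch of that selection step, using illustrative version strings rather than real repository output:

```shell
# sort -V performs version-aware ordering: 6.12.2 < 6.12.9 < 6.12.10,
# whereas plain lexical sort would put 6.12.10 before 6.12.2.
# The versions below are illustrative, not actual repository contents.
VER=$(printf '%s\n' \
  6.12.10-100.amzn2023 \
  6.12.2-99.amzn2023 \
  6.12.9-101.amzn2023 | sort -V | tail -1)
echo "$VER"   # -> 6.12.10-100.amzn2023
```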
Prepare AL2023
Install DKMS and kernel headers
sudo dnf clean all
sudo dnf install -y dkms
sudo systemctl enable --now dkms
if (uname -r | grep -q ^6\\.12\\.); then
sudo dnf install -y kernel-headers-$(uname -r) kernel-devel-$(uname -r) kernel6.12-modules-extra-$(uname -r) kernel-modules-extra-common-$(uname -r)
else
sudo dnf install -y kernel-headers-$(uname -r) kernel-devel-$(uname -r) kernel-modules-extra-$(uname -r) kernel-modules-extra-common-$(uname -r)
fi
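The install commands above branch on the running kernel series because the 6.12 kernel ships its packages under different names (kernel6.12-modules-extra) than the 6.1 kernel. A small sketch of extracting that "major.minor" series from a kernel release string; kernel_series is a hypothetical helper name and the sample release strings are illustrative:

```shell
# Extract the "major.minor" kernel series from a full kernel release string,
# mirroring the version check used to choose package names above.
kernel_series() { echo "$1" | cut -d. -f1-2; }

kernel_series "6.12.10-100.amzn2023.x86_64"     # -> 6.12
kernel_series "6.1.141-155.222.amzn2023.x86_64" # -> 6.1
```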
Install NVIDIA driver and CUDA toolkit
Method 1: Package Manager Installation
CUDA version 12.5 and higher supports Amazon Linux 2023 package manager installation on x86_64.
CUDA version 12.9 and NVIDIA driver 570.148.08 add arm64 support.
NVIDIA driver version 560 or higher from the NVIDIA repository supports compute-only/headless mode but not desktop mode. If you need NVIDIA graphical desktop drivers and libraries, refer to the GUI (graphical desktop) remote access section below.
Add repo
You can choose either the NVIDIA or the AL2023 repository.
Option 1: NVIDIA repo
if (arch | grep -q x86); then
ARCH=x86_64
else
ARCH=sbsa
fi
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/amzn2023/$ARCH/cuda-amzn2023.repo
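The branch above maps the machine architecture to the NVIDIA repository directory name: x86_64 maps to x86_64, while arm64 (aarch64) uses NVIDIA's "sbsa" (Server Base System Architecture) directory. A sketch of the same mapping as a reusable function; repo_arch is a hypothetical helper name:

```shell
# Map `arch` output to the NVIDIA CUDA repository directory name.
# arm64 machines report "aarch64", and NVIDIA publishes their packages
# under the "sbsa" directory.
repo_arch() {
  case "$1" in
    x86_64)        echo x86_64 ;;
    aarch64|arm64) echo sbsa ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

repo_arch x86_64   # -> x86_64
repo_arch aarch64  # -> sbsa
```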
Option 2: AL2023 repo (x86_64 only)
The nvidia-release package was added in the 2023.6.20241031 release and enables a yum repository with NVIDIA drivers.
sudo dnf install -y nvidia-release
Install NVIDIA driver
Option 1: NVIDIA repo
sudo dnf module install -y nvidia-driver:open-dkms
To install a specific version, e.g. 575
sudo dnf module install -y nvidia-driver:575-open
The above installs the NVIDIA open-source kernel module. Refer to the Driver Installation Guide for details about NVIDIA kernel modules and installation options.
Option 2: AL2023 repo (x86_64 only)
sudo dnf install -y nvidia-open
Install CUDA toolkit
sudo dnf install -y cuda-toolkit
To install a specific version, e.g. 12.9
sudo dnf install -y cuda-toolkit-12-9
Refer to CUDA documentation for installation options
Method 2: Runfile Installation
The runfile installer is not officially supported on AL2023 and may not work.
Ensure the EC2 instance has more than 10 GB of free disk space.
Install development libraries
sudo dnf install -y vulkan-devel libglvnd-devel elfutils-libelf-devel xorg-x11-server-Xorg
Option 1: NVIDIA driver only
To install NVIDIA driver version 570.148.08
cd /tmp
DRIVER_VERSION=570.148.08
curl -L -O https://us.download.nvidia.com/tesla/$DRIVER_VERSION/NVIDIA-Linux-$(arch)-$DRIVER_VERSION.run
chmod +x ./NVIDIA-Linux-$(arch)-$DRIVER_VERSION.run
sudo ./NVIDIA-Linux-$(arch)-$DRIVER_VERSION.run -s
To install a specific version, refer to the Driver Release Notes and modify the line above that sets the DRIVER_VERSION value.
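The download URL above is assembled from the driver version and the machine architecture. A sketch that builds and prints the URL without downloading anything, so you can confirm the version string substitutes correctly; ARCH is hard-coded here for illustration, whereas the commands above derive it from `arch`:

```shell
# Construct the Tesla driver runfile URL from a version and architecture,
# matching the curl command in the instructions. Echo only; no download.
DRIVER_VERSION=570.148.08
ARCH=x86_64   # on the target instance this would be $(arch)
echo "https://us.download.nvidia.com/tesla/$DRIVER_VERSION/NVIDIA-Linux-$ARCH-$DRIVER_VERSION.run"
```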
Option 2: NVIDIA driver and/or CUDA toolkit
You can go to the CUDA Toolkit download page to obtain the latest runfile (local) installer download URL for RHEL 9 on x86_64 and arm64 (sbsa).
cd /var/tmp
if (arch | grep -q x86); then
wget https://developer.download.nvidia.com/compute/cuda/12.9.0/local_installers/cuda_12.9.0_575.51.03_linux.run
else
wget https://developer.download.nvidia.com/compute/cuda/12.9.0/local_installers/cuda_12.9.0_575.51.03_linux_sbsa.run
fi
chmod +x ./cuda*.run
To install another version, refer to the CUDA Toolkit Archive for the runfile (local) download link.
Option 2a: NVIDIA driver and CUDA toolkit
sudo ./cuda_*.run --driver --toolkit --tmpdir=/var/tmp --silent
Option 2b: CUDA toolkit only
sudo ./cuda_*.run --toolkit --tmpdir=/var/tmp --silent
To troubleshoot compilation, view the contents of /var/log/nvidia-installer.log and /var/log/cuda-installer.log (if applicable).
Refer to CUDA documentation for installation options
Runfile Uninstallation
To uninstall CUDA Toolkit, run the uninstallation script provided in the bin directory of the toolkit. For version 12.9
sudo /usr/local/cuda-12.9/bin/cuda-uninstaller
To remove NVIDIA driver
sudo /usr/bin/nvidia-uninstall
Post installation
Restart your OS
sudo reboot
Verify NVIDIA driver
nvidia-smi
Output should be similar to below
Fri May 23 14:45:57 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03 Driver Version: 575.51.03 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA T4G Off | 00000000:00:1F.0 Off | 0 |
| N/A 70C P0 33W / 70W | 0MiB / 15360MiB | 8% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Verify CUDA toolkit
/usr/local/cuda/bin/nvcc -V
Output should be similar to below
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:26:18_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
Post-installation Actions
Refer to the NVIDIA CUDA Installation Guide for Linux for post-installation actions required before the CUDA Toolkit can be used. For example, you may want to modify your PATH environment variable to include /usr/local/cuda/bin. For runfile installations, also modify LD_LIBRARY_PATH to include /usr/local/cuda/lib64.
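A minimal sketch of those environment changes follows. Appending the export lines to ~/.bashrc would make them persistent; here they are only applied to the current shell and verified. The lib64 directory name is an assumption based on typical 64-bit CUDA layouts; adjust to match your installation.

```shell
# Prepend the CUDA binary directory to PATH and the CUDA library directory
# to LD_LIBRARY_PATH. The ${VAR:+:$VAR} expansion avoids a trailing colon
# when the variable was previously empty.
export PATH=/usr/local/cuda/bin${PATH:+:$PATH}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

# Confirm the PATH entry took effect.
case ":$PATH:" in
  *:/usr/local/cuda/bin:*) echo "PATH configured" ;;
  *) echo "PATH not configured" ;;
esac
```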
NVIDIA Container Toolkit
The NVIDIA Container Toolkit supports AL2023 on both x86_64 and arm64. For arm64, use a g5g.2xlarge or larger instance size, as g5g.xlarge may cause failures due to limited system memory.
if (! dnf search nvidia | grep -q nvidia-container-toolkit); then
sudo dnf config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
fi
sudo dnf install -y nvidia-container-toolkit
Refer to the NVIDIA Container Toolkit documentation for supported platforms, prerequisites, and installation options.
Verify Container Toolkit
nvidia-container-cli -V
Output should be similar to below
cli-version: 1.17.7
lib-version: 1.17.7
build date: 2025-05-16T13:28+0000
build revision: d26524ab5db96a55ae86033f53de50d3794fb547
build compiler: gcc 4.8.5 20150623 (Red Hat 4.8.5-44)
build platform: aarch64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
Container engine configuration
Refer to NVIDIA Container Toolkit site for container engine configuration instructions.
Docker
To install and configure docker
sudo dnf install -y docker
sudo systemctl enable docker
sudo usermod -aG docker ec2-user
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
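The nvidia-ctk command above registers the NVIDIA runtime in Docker's /etc/docker/daemon.json. A sketch of checking for that registration follows; the heredoc content is an illustrative approximation of what nvidia-ctk writes, not its exact output, and a temporary file stands in for the real config so the check can run anywhere.

```shell
# Verify that an NVIDIA runtime entry is present in a Docker daemon config.
# On a real instance, point CONF at /etc/docker/daemon.json instead of the
# illustrative temporary file created here.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF

if grep -q '"nvidia"' "$CONF"; then
  echo "nvidia runtime configured"
else
  echo "nvidia runtime missing"
fi
rm -f "$CONF"
```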
Verify Docker engine configuration
To verify docker configuration
sudo docker run --rm --runtime=nvidia --gpus all public.ecr.aws/amazonlinux/amazonlinux:2023 nvidia-smi
Output should be similar to below
Unable to find image 'public.ecr.aws/amazonlinux/amazonlinux:2023' locally
2023: Pulling from amazonlinux/amazonlinux
b9b2e8e61af6: Pull complete
Digest: sha256:ff1fad724e2ef77b8851124cbc35204d1defe63128f077021a2b3e459fcd866f
Status: Downloaded newer image for public.ecr.aws/amazonlinux/amazonlinux:2023
Fri May 23 14:46:11 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03 Driver Version: 575.51.03 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA T4G Off | 00000000:00:1F.0 Off | 0 |
| N/A 68C P0 32W / 70W | 0MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Install on EC2 instance at launch
To install the NVIDIA driver and NVIDIA Container Toolkit (including Docker) using Method 1 when launching a new AL2023 GPU instance, preferably with kernel 6.12 and at least 20 GB of storage, you can use the following user data script. Uncomment the line ending with cuda-toolkit to install the CUDA toolkit.
#!/bin/bash
sudo dnf clean all
sudo dnf install -y dkms
sudo systemctl enable dkms
if (uname -r | grep -q ^6\\.12\\.); then
sudo dnf install -y kernel-headers-$(uname -r) kernel-devel-$(uname -r) kernel6.12-modules-extra-$(uname -r) kernel-modules-extra-common-$(uname -r)
else
sudo dnf install -y kernel-headers-$(uname -r) kernel-devel-$(uname -r) kernel-modules-extra-$(uname -r) kernel-modules-extra-common-$(uname -r)
fi
cd /tmp
if (arch | grep -q x86); then
ARCH=x86_64
else
ARCH=sbsa
fi
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/amzn2023/$ARCH/cuda-amzn2023.repo
sudo dnf module install -y nvidia-driver:open-dkms
# sudo dnf install -y cuda-toolkit
if (! dnf search nvidia | grep -q nvidia-container-toolkit); then
sudo dnf config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
fi
sudo dnf install -y nvidia-container-toolkit
sudo dnf install -y docker
sudo systemctl enable docker
sudo usermod -aG docker ec2-user
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
sudo reboot
Verify
Connect to your EC2 instance
nvidia-smi
/usr/local/cuda/bin/nvcc -V
nvidia-container-cli -V
sudo docker run --rm --runtime=nvidia --gpus all public.ecr.aws/amazonlinux/amazonlinux:2023 nvidia-smi
View /var/log/cloud-init-output.log to troubleshoot any installation issues.
Perform post-installation actions in order to use the CUDA toolkit. To verify the integrity of the installation, you can download, compile, and run CUDA samples such as deviceQuery.

If Docker and the NVIDIA Container Toolkit (but not the CUDA toolkit) are installed and configured, you can use the CUDA samples container image to validate the CUDA driver.
sudo docker run --rm --runtime=nvidia --gpus all nvcr.io/nvidia/k8s/cuda-sample:devicequery

GUI (graphical desktop) remote access
If you need remote graphical desktop access, refer to How do I install GUI (graphical desktop) on Amazon EC2 instances running Amazon Linux 2023 (AL2023)?
The instructions in this article install the NVIDIA Tesla driver (also known as the NVIDIA Data Center driver), which is intended primarily for GPU compute workloads. GRID drivers provide access to four 4K displays per GPU and are certified to provide optimal performance for professional visualization applications. Refer to GPU-accelerated graphical desktop on Amazon Linux 2023 (AL2023) with NVIDIA GRID and Amazon DCV for setup guidance.
Other Software
NVIDIA GPUDirect Storage
If you used Method 1 to install the NVIDIA driver only, you can install NVIDIA Magnum IO GPUDirect® Storage (GDS) and libcufile
sudo dnf install -y nvidia-gds
To install GDS only
sudo dnf install -y nvidia-fs
Reboot
Reboot after installation is complete
sudo reboot
Verify
To verify installation
lsmod | grep nvidia_fs
Output should be similar to below
nvidia_fs 262144 0
nvidia 11481088 3 nvidia_uvm,nvidia_fs,nvidia_modeset
If the nvidia-gds meta-package is installed
/usr/local/cuda/gds/tools/gdscheck -p
Output should be similar to below
GDS release version: 1.14.1.1
nvidia_fs version: 2.25 libcufile version: 2.12
Platform: x86_64
...
...
==============
PLATFORM INFO:
==============
IOMMU: disabled
Nvidia Driver Info Status: Supported(Nvidia Open Driver Installed)
Cuda Driver Version Installed: 12090
Platform: g4dn.xlarge, Arch: x86_64(Linux 6.1.141-155.222.amzn2023.x86_64)
Platform verification succeeded
Refer to the GDS documentation and the Driver Installation Guide for more information.