Upgrade the NVIDIA driver in the Amazon EKS AMI with NVIDIA GPU support


The current EKS optimized Amazon Linux AMI ships NVIDIA driver version 470. Unfortunately, our software requires version 510. Is there an official AMI with that version? If not, how can I upgrade the NVIDIA driver in the AMI, perhaps by using an overridden bootstrap command?

Asked 2 years ago · 378 views
2 Answers

You can automate the installation of the NVIDIA drivers through user data in the launch template for your cluster's nodes. The documentation for managed node group launch templates might help: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html
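As a rough illustration of the user data approach, the sketch below builds a MIME multi-part document (the format that managed node groups expect when the launch template does not specify a custom AMI) wrapping a shell script that downloads and runs a driver installer. The installer URL is a hypothetical placeholder, the function names are made up for this example, and the script assumes the node already has whatever the installer needs to build the kernel module and that the preinstalled 470 driver does not conflict. Treat it as a starting point, not a tested procedure.

```python
import base64
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical placeholder: replace with the installer you actually intend to use.
INSTALLER_URL = "https://example.com/NVIDIA-Linux-x86_64-510.run"


def build_user_data(installer_url: str) -> str:
    """Return MIME multi-part user data containing one shell script part."""
    script = f"""#!/bin/bash
set -euo pipefail
# Assumption: the node can reach the URL and has the kernel headers and
# compiler that the installer needs. Removing the old driver may also be
# required; that step is not shown here.
curl -fsSL -o /tmp/nvidia-driver.run "{installer_url}"
sh /tmp/nvidia-driver.run --silent
"""
    message = MIMEMultipart("mixed")
    # The text/x-shellscript subtype tells cloud-init to execute this part.
    message.attach(MIMEText(script, "x-shellscript"))
    return message.as_string()


def encode_for_launch_template(user_data: str) -> str:
    """Base64-encode user data, as the launch template API expects."""
    return base64.b64encode(user_data.encode("utf-8")).decode("ascii")


def list_part_types(user_data: str) -> list[str]:
    """Return the content type of each part, as a quick sanity check."""
    return [p.get_content_type() for p in message_from_string(user_data).get_payload()]
```

The encoded string returned by `encode_for_launch_template(build_user_data(INSTALLER_URL))` is what would go into the launch template's user data field. Because the script runs on every node boot, it lengthens node startup, which is the trade-off compared with baking the driver into a custom AMI.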

AWS
Ray K
Answered 2 years ago
  • I don't think installing something at worker node boot is an efficient method. It could be done as a last resort, but unfortunately the Amazon repo for NVIDIA packages (which is used for the GPU-supported AMI) doesn't have any newer NVIDIA or CUDA related packages. Could it be updated to include nvidia-510 packages as well? If so, where should such a request be filed?


We have also encountered this issue. Is there a more recent solution? This is a breaking issue with torch 2.

It seems the recommended approach here is to create a new custom AMI. Deep Learning AMI GPU PyTorch 1.11.0 (Ubuntu 20.04) 20220912 does have 5xx drivers (but my understanding is that it has no K8s support), while our EKS AMI has the old drivers. Perhaps we will be able to get a new AMI working properly, but this seems like something AWS should offer.

Samson
Answered 10 months ago
