Upgrade the NVIDIA driver in the Amazon EKS AMI with NVIDIA GPU support


The current EKS-optimized Amazon Linux AMI ships NVIDIA driver version 470, but our software requires version 510. Is there an official AMI with that version? If not, how can I upgrade the NVIDIA driver in the AMI, perhaps with an overridden bootstrap command?

asked 2 years ago, 378 views
2 Answers

You can automate the installation of the NVIDIA drivers via user data in the launch template for your cluster's node groups. The documentation for managed node group launch templates may help: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html
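As a rough illustration of the user data approach, here is a minimal Python sketch that wraps shell commands in a MIME multi-part document, which (as I understand the linked documentation) is the format expected for user data in launch templates used with managed node groups on Amazon Linux. The helper names `build_user_data` and `encode_for_launch_template` are made up for this example, and the install command is only a placeholder: the actual steps to install driver 510 depend on where you get the packages, which this thread does not settle.

```python
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_user_data(install_commands):
    """Wrap shell commands in a MIME multi-part user data document.

    `install_commands` is a list of shell command strings. They are
    placeholders supplied by the caller, not a tested driver install recipe.
    """
    script = "#!/bin/bash\nset -euo pipefail\n" + "\n".join(install_commands) + "\n"
    archive = MIMEMultipart("mixed")
    # text/x-shellscript parts are run by cloud-init at first boot.
    archive.attach(MIMEText(script, "x-shellscript", "us-ascii"))
    return archive.as_string()


def encode_for_launch_template(user_data):
    """Base64-encode the document, as the EC2 launch template API expects."""
    return base64.b64encode(user_data.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Hypothetical commands: replace the echo with your real install steps.
    commands = [
        'echo "install the required NVIDIA driver here"',
        "nvidia-smi",  # fails the script if the driver did not load
    ]
    print(build_user_data(commands))
```

The encoded string would go into the `UserData` field of the launch template. Keeping the install steps as a parameter makes it easy to swap them out once a package source for the newer driver is available.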

AWS
Ray K
answered 2 years ago
  • I don't think installing software at worker node boot is an efficient method. It could be done as a last resort, but unfortunately the Amazon repo for NVIDIA packages (which the GPU-enabled AMI uses) doesn't have any newer NVIDIA or CUDA packages. Could it be updated to include the nvidia-510 packages as well? If so, where can I file such a request?


We have also encountered this issue. Is there a more recent solution? This is a breaking issue with torch 2.

It seems like the recommended approach here is to create a new custom AMI. Deep Learning AMI GPU PyTorch 1.11.0 (Ubuntu 20.04) 20220912 does have 5xx drivers (but my understanding is it has no K8s support), while our EKS AMI has the old drivers. Perhaps we will be able to get a new AMI working properly, but this seems like something that AWS should offer.
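If you do go the custom AMI route, a small check like the following could help confirm that a candidate image actually carries a new enough driver before you roll it out. This is only a sketch: the function names are made up for illustration, the threshold of 510 comes from the original question, and it assumes `nvidia-smi` is on the PATH of the node.

```python
import subprocess

REQUIRED_MAJOR = 510  # minimum driver branch mentioned in the question


def parse_major(version):
    """Return the major component of a driver version such as '510.47.03'."""
    return int(version.strip().split(".")[0])


def driver_is_new_enough(version, required_major=REQUIRED_MAJOR):
    """True if the driver's major version meets the required minimum."""
    return parse_major(version) >= required_major


def installed_driver_version():
    """Ask nvidia-smi for the driver version (one line is printed per GPU)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()[0].strip()


if __name__ == "__main__":
    version = installed_driver_version()
    status = "OK" if driver_is_new_enough(version) else "too old"
    print(f"NVIDIA driver {version}: {status}")
```

Running this on a node launched from the candidate AMI gives a quick pass or fail signal without having to start the full workload.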

Samson
answered 10 months ago
