Upgrade nvidia-driver in Amazon EKS AMI with NVIDIA GPU support

The current EKS optimized Amazon Linux AMI ships nvidia-driver version 470. Unfortunately, our software requires version 510. Is there an official AMI that includes that version? If not, how can I upgrade nvidia-driver in the AMI, perhaps by using an overridden bootstrap command?

asked 2 years ago · 378 views
2 Answers
You can automate the installation of the NVIDIA drivers via user data in the launch template for your clusters. The documentation for managed node group launch templates might help: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html
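
As a rough illustration of the user data approach, here is a minimal Python sketch that wraps a shell script in the MIME multi-part format that managed node group launch templates expect for Amazon Linux AMIs, then base64-encodes it. The function name `build_user_data`, the `installer_url` parameter, and the install commands inside the script are assumptions for illustration only. A real install would likely also need matching kernel headers and handling of the driver already on the AMI, which this sketch does not cover.

```python
import base64
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_user_data(installer_url: str) -> str:
    """Return base64-encoded MIME multi-part user data wrapping a shell script.

    installer_url is a hypothetical location of an NVIDIA .run installer;
    it is not taken from the thread.
    """
    script = "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",
        "# Assumed install steps; adjust for your actual driver source.",
        f'curl -fsSL -o /tmp/nvidia-driver.run "{installer_url}"',
        "sh /tmp/nvidia-driver.run --silent",
        "",
    ])
    # One multipart/mixed archive containing a single shell-script part.
    archive = MIMEMultipart(boundary="==BOUNDARY==")
    archive.attach(MIMEText(script, "x-shellscript", "us-ascii"))
    # The EC2 launch template API takes user data as a base64 string.
    return base64.b64encode(archive.as_bytes()).decode("ascii")
```

The returned string could then be supplied as the `UserData` field of the launch template data, for example through the console, the CLI, or an SDK.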

AWS
Ray K
answered 2 years ago
  • I don't think installing something when the worker node boots is an efficient method. It could be done as a last resort, but unfortunately the Amazon repo for nvidia packages (which the GPU-supported AMI uses) doesn't have any newer nvidia or cuda related packages. Could it be updated to include nvidia-510 packages as well? If so, where should such a request be filed?

We have also encountered this issue. Is there a more recent solution? This is a breaking issue with torch 2.

It seems like the recommended approach here is to create a new custom AMI. Deep Learning AMI GPU PyTorch 1.11.0 (Ubuntu 20.04) 20220912 does have 5xx drivers (but my understanding is it has no K8s support), while our EKS AMI has the old drivers. Perhaps we will be able to get a new AMI working properly, but this seems like something that AWS should offer.

Samson
answered 10 months ago
