AMI and Instance Types to run Ollama - Llama3.1 LLM in EKS

Hi,

I am planning to run Llama 3.1 (8B parameters) using Ollama in an EKS 1.33 managed node group. I have already created the Docker image with the necessary dependencies, but I am not able to decide which AMI and instance type I should use.

I believe the AMI should be an NVIDIA one with GPU support. I am thinking of using the AL2023_x86_64_NVIDIA AMI with g5.2xlarge.

Is that sufficient to run the foundation model, given that it has only 1 GPU? Also, are there any guidelines for choosing the right AMI and instance types to run various LLMs?

Thanks, Suvendu

1 Answer

For best performance, Ollama needs to load the entire model into GPU memory. The GPU in g5.2xlarge has about 24 GB of GPU memory, while the Llama 3.1 8B model is about 4.9 GB in size, so the model fits on that single GPU with room to spare. For comparison, a g4dn instance has about 16 GB of GPU memory.

You can find the list of EC2 instance types and their GPU memory sizes on the EC2 instance types page, under Accelerated Computing.
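
The sizing comparison above can be sketched as a quick check in Python. This is only a rough illustration, not an official sizing method: the GPU memory and model size figures are the approximate ones quoted in this answer, and `headroom_gb` is a hypothetical allowance for context cache and runtime overhead that you would need to tune for your own workload.

```python
# Approximate GPU memory per instance (GB), as quoted in the answer above.
GPU_MEMORY_GB = {"g5.2xlarge": 24.0, "g4dn": 16.0}

# Approximate size of the Llama 3.1 8B model (GB), as quoted in the answer above.
MODEL_SIZE_GB = 4.9


def fits_in_gpu(model_gb, gpu_gb, headroom_gb=2.0):
    """Return True if the model plus headroom fits in GPU memory.

    headroom_gb is an assumed, illustrative allowance for context cache
    and runtime overhead; it is not a measured value.
    """
    return model_gb + headroom_gb <= gpu_gb


def report(model_gb=MODEL_SIZE_GB, headroom_gb=2.0):
    """Map each listed instance to whether the model is expected to fit."""
    return {
        name: fits_in_gpu(model_gb, gpu_gb, headroom_gb)
        for name, gpu_gb in GPU_MEMORY_GB.items()
    }


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'fits' if ok else 'does not fit'}")
```

To apply the same check to a different model, pass its size to `report(model_gb=...)` and add the instance types you are considering to `GPU_MEMORY_GB`, using the figures from the Accelerated Computing listing.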

AWS
EXPERT
answered 10 months ago
