Inferentia NeuronCore usage is only 1 when 4 cores are available


I'm using the following code to load a Neuron-compiled model for inference. However, on my inf1.2xlarge instance, neuron-top shows four cores (NC0 to NC3), and only NC0 gets used during inference. How do I increase throughput by using all cores?

from transformers import BertTokenizer, BertModel
import torch
import torch_neuron
import os

# Ask the Neuron runtime for all four NeuronCores.
os.environ['NEURON_RT_NUM_CORES'] = '4'
fname = 'modelneuron.pt'
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute and big", return_tensors="pt")
example_inputs = (inputs['input_ids'], inputs['attention_mask'])

if not os.path.isfile(fname):
    # First run: compile the model for Neuron and cache it on disk.
    model = BertModel.from_pretrained('bert-base-uncased', return_dict=False)
    neuron_model = torch_neuron.trace(model, example_inputs=example_inputs)
    neuron_model.save(fname)
    print('saved neuron model')
else:
    # Later runs: load the previously compiled model.
    neuron_model = torch.jit.load(fname)
    print('loaded neuron model')

# Run the same single-sentence input repeatedly, one call at a time.
for i in range(10000):
    outputs = neuron_model(*example_inputs)

print(outputs)
asked a year ago · 285 views
1 answer

Hi,

To run inference in parallel on an inf1 instance and use all available NeuronCores, you can use torch.neuron.DataParallel.

torch.neuron.DataParallel implements data parallelism at the module level by duplicating the Neuron model on all available NeuronCores and distributing data across the different cores for parallelized inference.
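As a rough sketch of how this could apply to the question's model: the block below wraps the saved `modelneuron.pt` in `torch.neuron.DataParallel` and also illustrates why a single-sentence input keeps only one core busy. The helper names `shard_sizes` and `run_on_all_cores` are hypothetical, and `shard_sizes` assumes an even, `torch.chunk`-style split along dimension 0; it is only an illustration, not the library's actual scheduling logic. The Neuron imports are deferred into the function because they only work on an instance with the Neuron SDK installed.

```python
import math


def shard_sizes(batch_size, num_cores):
    """Illustrative only: sizes of the per-core shards for a chunk-style
    split along dim 0 (an assumption, not the library's exact algorithm)."""
    if batch_size <= 0 or num_cores <= 0:
        raise ValueError("batch_size and num_cores must be positive")
    chunk = math.ceil(batch_size / num_cores)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


def run_on_all_cores(model_path, input_ids, attention_mask):
    """Hypothetical helper: load a Neuron-compiled model and run one
    batched call through torch.neuron.DataParallel."""
    # Deferred imports: these require an Inf1 instance with torch-neuron.
    import torch
    import torch_neuron  # noqa: F401  (registers the torch.neuron namespace)

    model = torch.jit.load(model_path)
    # Replicates the model on the available NeuronCores and splits
    # each input batch along dim 0.
    model_parallel = torch.neuron.DataParallel(model)
    return model_parallel(input_ids, attention_mask)


if __name__ == "__main__":
    # A batch of one sentence yields a single shard, so one core does the work.
    print(shard_sizes(1, 4))
    # A batch of eight sentences can be spread over all four cores.
    print(shard_sizes(8, 4))
```

With the question's tensors, a call might look like `run_on_all_cores('modelneuron.pt', inputs['input_ids'].repeat(8, 1), inputs['attention_mask'].repeat(8, 1))`. This relies on DataParallel's dynamic batching accepting a batch larger than the one the model was traced with, which should be checked against the documentation for your SDK version.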

You can read more about running Inference using torch.neuron.DataParallel here: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuron/api-torch-neuron-dataparallel-api.html#torch-neuron-dataparallel-api

In addition, here is an example of using DataParallel: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/resnet50.html#Run-Inference-using-torch.neuron.DataParallel

AWS
Chris_T
answered a year ago
