inferentia neuron core usage is only 1 when 4 cores are available

0

Im using the following code to load a neuron compiled model for inference. However on my inf1.2xlarge instance, neuron-top shows for cores (NC0 to NC3). Only NC0 gets used in inference. How do I increase throughput by using all cores???

from transformers import BertTokenizer, BertModel
import torch
import torch_neuron
import os.path
import os

os.environ['NEURON_RT_NUM_CORES']=str(4)
fname = 'modelneuron.pt'
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute and big", return_tensors="pt")
if not os.path.isfile(fname):

    model = BertModel.from_pretrained('bert-base-uncased', return_dict=False)

    neuron_model = torch_neuron.trace(model,
                                    example_inputs = (inputs['input_ids'],inputs['attention_mask']))
    neuron_model.save("modelneuron.pt")
    print('saved neuron model')
else:
    neuron_model = torch.jit.load('modelneuron.pt')
    print('loaded neuron model')

for i in range(10000):
    outputs = neuron_model(*(inputs['input_ids'],inputs['attention_mask']))

print(outputs)
1 Risposta
1

Hi,

For running inference in parallel using Neuron on a inf1 instance to utilize all available NeuronCores we can use torch.neuron.DataParallel.

torch.neuron.DataParallel implements data parallelism at the module level by duplicating the Neuron model on all available NeuronCores and distributing data across the different cores for parallelized inference.

You can read more about running Inference using torch.neuron.DataParallel here: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuron/api-torch-neuron-dataparallel-api.html#torch-neuron-dataparallel-api

In addition, here is an example of using DataParallel https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/resnet50.html#Run-Inference-using-torch.neuron.DataParallel

AWS
Chris_T
con risposta un anno fa

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande