Questions tagged with AWS Inferentia

Browse through the questions and answers listed below.

I'm using the following code to load a Neuron-compiled model for inference. However, on my inf1.2xlarge instance, neuron-top shows four cores (NC0 to NC3), and only NC0 gets used in inference. How do I...
1 answer · 0 votes · 114 views · asked 7 months ago
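
By default a torch-neuron model loads onto a single NeuronCore; the usual fix is torch.neuron.DataParallel, which replicates the model across all visible cores and shards the batch across them. A minimal sketch (the file name and input shape are hypothetical):

```python
import torch
import torch_neuron  # registers the torch.neuron namespace and Neuron ops

# Load the compiled artifact; on its own it runs on one NeuronCore (NC0).
model = torch.jit.load('model_neuron.pt')  # hypothetical file name

# Replicate across all visible NeuronCores (NC0-NC3 on an inf1.2xlarge)
# and shard the batch dimension across the replicas.
model_parallel = torch.neuron.DataParallel(model)

# A batch size that is a multiple of the core count keeps every core busy.
batch = torch.zeros([4, 3, 224, 224], dtype=torch.float32)  # hypothetical shape
with torch.no_grad():
    output = model_parallel(batch)
```
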
Hi, I want to Neuron-compile a BERT-large model (PatentBERT from Google) which has a sequence length of 512. How do I do this? I also want to call the model as before, or to know what I should change...
1 answer · 0 votes · 181 views · asked 7 months ago
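
Neuron compiles against fixed tensor shapes, so the sequence length is set by the example inputs passed at trace time. A sketch with torch-neuron and a Hugging Face BERT-large stand-in (PatentBERT ships as a TF1 checkpoint, so this assumes it has been converted to PyTorch weights):

```python
import torch
import torch_neuron
from transformers import BertModel, BertTokenizer

SEQ_LEN = 512
tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')  # stand-in weights
model = BertModel.from_pretrained('bert-large-uncased', torchscript=True).eval()

# Pad the example inputs to the full 512-token length; the compiled model
# will then expect exactly these shapes at inference time.
enc = tokenizer('example text', padding='max_length', max_length=SEQ_LEN,
                truncation=True, return_tensors='pt')
example = (enc['input_ids'], enc['attention_mask'], enc['token_type_ids'])

model_neuron = torch.neuron.trace(model, example_inputs=example)
model_neuron.save('bert_large_512_neuron.pt')
```

The compiled model is then called with the same positional tensors (input_ids, attention_mask, token_type_ids) rather than keyword arguments.
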
I am trying to load a Neuron-compiled model generated as described in https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html. I am still a...
2 answers · 0 votes · 175 views · asked 7 months ago
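
For models produced by that tutorial (tensorflow-neuron 2.x, tfn.trace followed by save), reloading is a plain SavedModel load, provided tensorflow_neuron is imported first so the Neuron op kernels are registered. A sketch (the path, stand-in tokenizer, and input structure are assumptions):

```python
import tensorflow as tf
import tensorflow_neuron as tfn  # must be imported so NeuronOp resolves at load
from transformers import AutoTokenizer

model_neuron = tf.keras.models.load_model('./bert_neuron')  # hypothetical path

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
enc = tokenizer('hello world', padding='max_length', max_length=128,
                truncation=True, return_tensors='tf')
# Call with the same input structure that was used when the model was traced.
outputs = model_neuron([enc['input_ids'], enc['attention_mask']])
```
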
Hi, this link https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/tensorflow-neuron/tutorials/bert_demo/bert_demo.html describes how to compile using TensorFlow 1. Can anyone...
1 answer · 0 votes · 150 views · asked 7 months ago
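
For TensorFlow 1.x, tensorflow-neuron exposes a one-call SavedModel compiler, which is what that tutorial builds on; a sketch (directory names hypothetical):

```python
import tensorflow.neuron as tfn

# Compile an existing TF1 SavedModel for Inferentia; the result is another
# SavedModel that can be served exactly like the original.
tfn.saved_model.compile(
    'bert_saved_model',         # hypothetical: input TF1 SavedModel directory
    'bert_saved_model_neuron',  # output directory for the compiled model
)
```

For TensorFlow 2.x the equivalent entry point is tfn.trace on a Keras model, as in the Hugging Face BERT tutorial linked in the question above.
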
I followed the user guide on updating torch-neuron and then started compiling the model for Neuron, but got an error from which I can't tell what's wrong. In the Neuron SDK you claim that it should...
1 answer · 1 vote · 321 views · asked 8 months ago
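
When torch-neuron compilation fails with an opaque error, the SDK's analyze_model call usually narrows it down by listing which operators have no Neuron mapping. A sketch (the ResNet stand-in and input shape are placeholders for the actual model):

```python
import torch
import torch_neuron
import torchvision

# Stand-in; substitute the model that fails to compile.
model = torchvision.models.resnet50(pretrained=False).eval()
example = torch.zeros([1, 3, 224, 224], dtype=torch.float32)

# Prints which operators compile to Neuron and which would fall back to CPU;
# an unsupported op is the usual cause of a failed neuron-cc run.
torch.neuron.analyze_model(model, example_inputs=[example])
```
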
I'm following some guides, and from my understanding this should be possible, but I've been trying for hours to compile a YOLOv5 model into a Neuron model with no success. Is it even possible to do...
1 answer · 2 votes · 210 views · asked 8 months ago
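
It is generally possible; the common pitfalls are tracing the AutoShape inference wrapper (which is not traceable) and unsupported operators in the Detect head. A sketch, assuming the ultralytics torch.hub entry point and a 640x640 input:

```python
import torch
import torch_neuron

# Load the raw network rather than the AutoShape inference wrapper.
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', autoshape=False).eval()
example = torch.zeros([1, 3, 640, 640], dtype=torch.float32)

# torch.neuron.trace partitions the graph automatically: supported operators
# run on NeuronCores, and anything unsupported (often the Detect head) falls
# back to CPU instead of failing the whole compilation.
model_neuron = torch.neuron.trace(model, example_inputs=[example])
model_neuron.save('yolov5s_neuron.pt')
```
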
Hi team, I wanted to compile a BERT model and run it on Inferentia. I trained my model using PyTorch and tried to convert it by following the same steps in this...
1 answer · 0 votes · 111 views · asked 9 months ago
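
Once compiled, a torch-neuron BERT artifact loads with torch.jit.load and is called with the same fixed-shape tensors used at trace time. A sketch (the artifact name and max_length are hypothetical):

```python
import torch
import torch_neuron  # required at load time so Neuron ops resolve
from transformers import AutoTokenizer

model_neuron = torch.jit.load('bert_neuron.pt')  # hypothetical artifact
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Shapes must match the example inputs used when the model was traced.
enc = tokenizer('example sentence', padding='max_length', max_length=128,
                truncation=True, return_tensors='pt')
with torch.no_grad():
    outputs = model_neuron(enc['input_ids'], enc['attention_mask'])
```
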
I am trying to test a model compiled for Inferentia on an `inf1.2xlarge`, but when loading the model I receive the following error messages: ``` 2022-Sep-15 22:10:01.0152 3802:3802 ERROR ...
1 answer · 0 votes · 139 views · ntw-au · asked 9 months ago
I have compiled my model to run on Inferentia, and I can load multiple models from one process, such as a single Jupyter notebook. I am trying to host the models via a server and am using gunicorn as...
2 answers · 0 votes · 150 views · asked 10 months ago
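
The usual catch with gunicorn is that every forked worker tries to claim the same NeuronCores, and --preload loads the model before the fork. A sketch of a gunicorn config that pins each worker to its own core, assuming Neuron runtime 2.x (NEURON_RT_VISIBLE_CORES) and four cores per chip:

```python
# gunicorn_conf.py - run as: gunicorn -w 4 -c gunicorn_conf.py app:app
import os

NUM_NEURON_CORES = 4  # NC0-NC3 on an inf1.xlarge/2xlarge

def post_fork(server, worker):
    # worker.age is a monotonically increasing spawn counter, so each worker
    # lands on a distinct core. The model must be loaded after the fork
    # (i.e. inside the worker, without gunicorn's --preload) for the
    # environment variable to take effect.
    core = (worker.age - 1) % NUM_NEURON_CORES
    os.environ['NEURON_RT_VISIBLE_CORES'] = str(core)
```
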
Hello, I tried using an [Inf1 EC2 instance](https://aws.amazon.com/ec2/instance-types/inf1/) for deploying my ML model. I need to monitor the GPU usage of the ML model. I could find the CPU usage in...
1 answer · 1 vote · 557 views · asked a year ago
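
Inf1 instances have NeuronCores rather than GPUs, so the accelerator counterpart to CPU metrics comes from the Neuron tools: neuron-top interactively, or neuron-monitor for machine-readable output. A sketch that streams neuron-monitor's JSON reports, assuming the aws-neuron-tools package is installed and its default configuration is used:

```python
import json
import subprocess

# neuron-monitor emits one JSON report per line on stdout.
proc = subprocess.Popen(['neuron-monitor'], stdout=subprocess.PIPE, text=True)
for line in proc.stdout:
    report = json.loads(line)
    # Reports include per-NeuronCore utilization among other metric groups;
    # inspect the keys to find the fields of interest.
    print(sorted(report.keys()))
```
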
I am trying to deploy a PyTorch model on an ml.inf1.xlarge instance. Image: 301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-neo-pytorch:1.5.1-inf-py3. My Python code uses some OpenCV functions, and...
1 answer · 0 votes · 265 views · synapse · asked a year ago
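
The standard SageMaker PyTorch serving containers pip-install a code/requirements.txt bundled inside model.tar.gz at startup, which is the usual way to pull in OpenCV; whether the Neo container honors it the same way is worth verifying. A sketch (bucket, role, and file names are hypothetical):

```python
# Expected model.tar.gz layout:
#   model_neuron.pt
#   code/
#     inference.py        # handler that imports cv2
#     requirements.txt    # contains: opencv-python-headless
#
# opencv-python-headless avoids the libGL.so dependency that plain
# opencv-python needs, which is what commonly breaks in slim containers.
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data='s3://my-bucket/model.tar.gz',  # hypothetical
    role='my-sagemaker-execution-role',        # hypothetical
    entry_point='inference.py',
    framework_version='1.5.1',
    py_version='py3',
)
predictor = model.deploy(initial_instance_count=1, instance_type='ml.inf1.xlarge')
```
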
We have been trying, without luck, to deploy our multiple models to a multi-model endpoint that uses Inferentia instances (ml.inf1.xlarge). ClientError: An error occurred (ValidationException) when calling...
1 answer · 0 votes · 208 views · asked a year ago
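
The ValidationException is consistent with multi-model endpoints not supporting Inferentia (ml.inf1.*) instance types at the time these questions were asked; the common workaround is one single-model endpoint per compiled model. A sketch (names, paths, and role are hypothetical):

```python
from sagemaker.pytorch import PyTorchModel

# One endpoint per compiled model, since MME rejects ml.inf1.* instances.
for name, artifact in [('model-a', 's3://my-bucket/model-a.tar.gz'),
                       ('model-b', 's3://my-bucket/model-b.tar.gz')]:
    model = PyTorchModel(model_data=artifact,
                         role='my-sagemaker-execution-role',  # hypothetical
                         entry_point='inference.py',
                         framework_version='1.5.1', py_version='py3')
    model.deploy(initial_instance_count=1,
                 instance_type='ml.inf1.xlarge',
                 endpoint_name=f'{name}-endpoint')
```
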