Questions tagged with AWS Inferentia
I'm using the following code to load a Neuron-compiled model for inference. However, on my inf1.2xlarge instance, neuron-top shows four cores (NC0 to NC3), and only NC0 gets used during inference. How do I...
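A common fix (a sketch based on the Neuron runtime documentation, not a verified answer for this poster's exact code) is to make all four NeuronCores visible to the process; `NEURON_RT_VISIBLE_CORES` is the documented runtime variable, while `infer.py` is a placeholder script name:

```shell
# Minimal sketch: expose all four NeuronCores (NC0-NC3) of an inf1.2xlarge
# to a single process. NEURON_RT_VISIBLE_CORES is the Neuron runtime
# environment variable; infer.py is a hypothetical inference script.
export NEURON_RT_VISIBLE_CORES=0-3
python infer.py
```

Inside the script, wrapping the traced model with `torch.neuron.DataParallel` (from the torch-neuron package) is the documented way to replicate it across the visible cores and spread batches over them.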
Hi,
I want to Neuron-compile a BERT large model (PatentBERT from Google) which has a sequence length of 512. How do I do this?
Also, I want to call the model as before, or I need to know what I should change...
I am trying to load a Neuron-compiled model generated as described in https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html. I am still a...
Hi,
This link https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/tensorflow-neuron/tutorials/bert_demo/bert_demo.html describes how to compile using TensorFlow 1. Can anyone...
I followed the user guide on updating torch-neuron and then started compiling the model to Neuron.
But I got an error, and I don't understand what's wrong.
In the Neuron SDK you claim that it should...
I'm following some guides, and from my understanding this should be possible. But I've been trying for hours to compile a YOLOv5 model into a Neuron model with no success. Is it even possible to do...
Hi Team,
I wanted to compile a BERT model and run it on Inferentia. I trained my model using PyTorch and tried to convert it by following the same steps in this...
I am trying to test a model compiled for Inferentia on an `inf1.2xlarge`, but when loading the model I receive the following error messages:
```
2022-Sep-15 22:10:01.0152 3802:3802 ERROR ...
I have compiled my model to run on Inferentia, and I can load multiple models from one process, such as a single Jupyter notebook.
I am trying to host the models via a server and am using gunicorn as...
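One pattern that often resolves this kind of multi-process serving issue (sketched here on the assumption that each gunicorn worker is a separate process; the app module, ports, and core numbers are placeholders) is to pin each worker process to its own NeuronCore so they do not all contend for the same core:

```shell
# Hypothetical sketch: run one single-worker gunicorn process per
# NeuronCore. NEURON_RT_VISIBLE_CORES is the Neuron runtime variable;
# app:server, the ports, and the core indices are placeholders.
NEURON_RT_VISIBLE_CORES=0 gunicorn --workers 1 --bind :8001 app:server &
NEURON_RT_VISIBLE_CORES=1 gunicorn --workers 1 --bind :8002 app:server &
```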
Hello,
I tried using an [Inf1 EC2 instance](https://aws.amazon.com/ec2/instance-types/inf1/) to deploy my ML model. I need to monitor the accelerator (Inferentia) usage of the ML model. I could find the CPU usage in...
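For context, Inf1 instances have no GPU; the Inferentia chips are monitored with the Neuron SDK's own system tools. A minimal sketch, assuming those tools are installed on the instance:

```shell
# neuron-top: live per-NeuronCore utilization view (roughly the
# Inferentia analogue of nvidia-smi).
neuron-top
# neuron-monitor: emits runtime metrics as JSON, suitable for scraping.
neuron-monitor
```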
I am trying to deploy a PyTorch model on an ml.inf1.xlarge instance.
Image: 301217895009.dkr.ecr.us-west-2.amazonaws.com/sagemaker-neo-pytorch:1.5.1-inf-py3
My Python code uses some OpenCV functions, and...
We have been trying, without luck, to deploy multiple models to a multi-model endpoint backed by Inferentia instances (inf1.xlarge).
ClientError: An error occurred (ValidationException) when calling...