Questions tagged with AWS Inferentia
As the title says, can we host LLMs and Stable Diffusion models from JumpStart directly on SageMaker Inf1 or Inf2 chips?
> I tried doing that with the Stable Diffusion 2 model (i.e., from Studio...
1 answer, 0 votes, 98 views, asked a month ago
We are facing issues while using this model on the aforementioned machine. We were able to run the same experiment successfully on a G5 instance, but we are observing that the same code is not working on...
0 answers, 0 votes, 48 views, asked 2 months ago
**Can someone help me load my model to create an endpoint?**
I have provided an explanation of the steps I followed, the error logs, and the code used to create everything. Thank you in advance.
I'm trying very hard to...
2 answers, 0 votes, 210 views, asked 3 months ago
It seems to be available according to every online source I see.
2 answers, 0 votes, 158 views, asked 3 months ago
I am currently facing an issue with the AWS Neuron SDK when trying to run the PyTorch example provided in the AWS Neuron GitHub repository on a Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 20.04)...
1 answer, 0 votes, 109 views, asked 4 months ago
I am currently using Amazon SageMaker for running my machine learning models, but it is becoming costly. To reduce costs, I am considering two options: AWS Elastic Inference and AWS Inferentia.
I...
1 answer, 0 votes, 248 views, asked 4 months ago
Hi All,
I have to compute gradients on a BERT model on Inferentia. For this, I guess I also need access to the hidden layers. I'm currently unable to proceed because I can't find literature on the net...
1 answer, 0 votes, 55 views, asked 4 months ago
I'm trying to make a public-facing web app that allows for inference, with probably ten or so models available to my users. My initial thought was that I would have a basic front-end webpage that...
1 answer, 0 votes, 82 views, asked 4 months ago
Hi,
I am trying to deploy the Databricks open-source LLM, i.e., Dolly, on an inf2 instance. The instance type is `inf2.24xlarge`, and I used the AMI `Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 20.04) 2023051`.
I am...
2 answers, 0 votes, 367 views, asked 5 months ago
Hi,
I have some code which generates a shape of torch.Size([1, 512, 1024]) when calling BERT on inf1.
I have compiled the model for inf2.
However the same code on inf2 produces a shape of...
1 answer, 0 votes, 96 views, asked 5 months ago
Hi, I'm trying to run the gptj_demo on Inf2 with the AMI Deep Learning AMI Neuron PyTorch 1.13.0 (Ubuntu 20.04) 20230405, and I installed PyTorch Neuron as...
1 answer, 0 votes, 207 views, asked 5 months ago
I have an ML model from Hugging Face, which essentially looks as follows:
```
import torch
from transformers import BloomTokenizerFast, BloomForCausalLM
device = torch.device('cuda' if...
```
0 answers, 0 votes, 52 views, asked 5 months ago