Questions tagged with AWS Inferentia
We are using tensorflow.neuron to compile a TensorFlow 1.x SavedModel to run on AWS Inferentia machines on EC2. We do this by calling:
tensorflow.neuron.saved_model.compile(model_dir,...
3 answers · 0 votes · 128 views · asked a month ago
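The compile call mentioned in the question above can be sketched roughly as follows. This is a minimal sketch, assuming the TF 1.x `tensorflow-neuron` pip package; the paths, the `_neuron` suffix convention, and the batch size are all hypothetical:

```python
# Sketch (hypothetical paths): compiling a TF 1.x SavedModel for Inferentia
# with the tensorflow-neuron package, as in the question above.

def compiled_model_dir(model_dir: str) -> str:
    # Naming convention used here (an assumption, not a Neuron requirement):
    # write the compiled artifacts next to the original SavedModel with a
    # "_neuron" suffix.
    return model_dir.rstrip("/") + "_neuron"

def compile_for_inferentia(model_dir: str, batch_size: int = 1) -> str:
    import tensorflow.neuron as tfn  # requires the tensorflow-neuron 1.x pip package
    out_dir = compiled_model_dir(model_dir)
    # tfn.saved_model.compile reads the SavedModel, moves supported ops onto
    # NeuronCores, and writes a new SavedModel to out_dir.
    tfn.saved_model.compile(model_dir, out_dir, batch_size=batch_size)
    return out_dir
```

The compiled SavedModel can then be served as usual (e.g. with `tensorflow_model_server`) on an Inf1 instance.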
Currently, I host my model with `tensorflow_model_server`. Here is how I export my model:
```
import tensorflow as tf

model = tf.keras.models.load_model("model.hdf5")

def __decode_images(images, nch):
    o = ...
```
1 answer · 0 votes · 174 views · asked 5 months ago
I am new to the AWS Neuron SDK, and the documentation seems confusing to me.
There is no direct guide on how to install the SDK and use it to compile models. The examples are outdated and the installation...
1 answer · 0 votes · 281 views · asked 5 months ago
Currently, we are using Elastic Inference for inference on AWS ECS. We use `inference_accelerators` in `ecs.Ec2TaskDefinition` to set up Elastic Inference. For scaling, we are monitoring...
1 answer · 0 votes · 237 views · asked 6 months ago
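The `inference_accelerators` setup mentioned in the question above can be sketched with the AWS CDK (v2, Python). The construct id, device name, and accelerator size below are hypothetical placeholders:

```python
# Sketch (hypothetical names): attaching an Elastic Inference accelerator
# to an ECS task definition with the AWS CDK, as in the question above.

ACCELERATOR = {"device_name": "device_1", "device_type": "eia2.medium"}

def make_task_definition(scope):
    from aws_cdk import aws_ecs as ecs  # requires aws-cdk-lib

    # inference_accelerators on Ec2TaskDefinition declares the Elastic
    # Inference device; containers then reference it by device_name.
    return ecs.Ec2TaskDefinition(
        scope, "InferenceTask",
        inference_accelerators=[
            ecs.InferenceAccelerator(
                device_name=ACCELERATOR["device_name"],
                device_type=ACCELERATOR["device_type"],
            )
        ],
    )
```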
I have a project where I would like to send inference requests. For this I need an API, such as AWS Lambda or a SageMaker endpoint, so that the customer can send their requests there.
The inference performed...
1 answer · 0 votes · 391 views · asked 6 months ago
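The Lambda-to-SageMaker pattern in the question above can be sketched with boto3. The endpoint name and the JSON payload shape are assumptions about the serving container, not a fixed SageMaker requirement:

```python
import json

# Sketch (hypothetical endpoint name and payload shape): forwarding an
# inference request to a SageMaker endpoint, e.g. from a Lambda handler.

def build_payload(instances):
    # SageMaker passes the request body to the model container as-is;
    # this {"instances": ...} shape is an assumption about the container.
    return json.dumps({"instances": instances})

def invoke(endpoint_name, instances):
    import boto3  # available by default in the Lambda Python runtime
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(instances),
    )
    # response["Body"] is a streaming body; read and decode the JSON reply.
    return json.loads(response["Body"].read())
```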
Hello, I'm using the **pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker** AMI and following this guide...
1 answer · 0 votes · 217 views · asked 6 months ago
Hello,
I'm using the **pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker** image to run the service, and I'd like to switch to an INF2 instance.
~~I think I can try to use...
1 answer · 0 votes · 256 views · asked 7 months ago
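Moving from the GPU image to Inf2, as the question above asks, generally means ahead-of-time compiling the model with `torch-neuronx` instead of relying on CUDA. A minimal sketch, assuming the Neuron SDK's `torch-neuronx` package; the model and example input are placeholders:

```python
# Sketch: tracing a PyTorch model for Inf2 with torch-neuronx, replacing
# the CUDA path used by the pytorch-inference GPU image. Model and example
# input here are hypothetical.

def trace_for_inf2(model, example_input):
    import torch_neuronx  # from the AWS Neuron SDK (Inf2/Trn1 instances)

    model.eval()
    # torch_neuronx.trace compiles the model ahead of time for NeuronCores
    # and returns a TorchScript-like module runnable on an inf2 instance.
    return torch_neuronx.trace(model, example_input)
```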
TensorFlow instance
I'm considering launching an instance to work on one of my TensorFlow models, since my current PC doesn't perform efficiently. My PC has 32GB of RAM, a 20-core i7 processor, and an RTX 3050Ti 20GB GPU. I...
1 answer · 0 votes · 245 views · asked 7 months ago
Hello,
I am using an Auto Scaling group with Inferentia chips, but I encounter some problems during deployment. There are three Availability Zones in my ASG, which means that those zones must contain...
0 answers · 0 votes · 129 views · asked 7 months ago
As the title says, can we host LLMs and Stable Diffusion models from JumpStart directly on SageMaker Inf1 or Inf2 chips?
> I tried doing that with the Stable Diffusion 2 model (i.e. from Studio...
1 answer · 0 votes · 432 views · asked 9 months ago
We are facing issues while using this model on the aforementioned machine. We were able to run the same experiment on a G5 instance successfully, but we are observing that the same code is not working on...
1 answer · 0 votes · 271 views · asked 10 months ago
**Can someone help me load my model to create an endpoint?**
I have provided an explanation of the steps followed, the error logs, and the code used to create everything... thank you in advance.
I'm trying very hard to...
2 answers · 0 votes · 508 views · asked a year ago