All Content tagged with AWS Inferentia
AWS Inferentia is designed to provide high-performance inference in the cloud, to drive down the total cost of inference, and to make it easy for developers to integrate machine learning into their business applications.
published 18 days ago · 1 vote · 73 views
Hello AWS team!
I am trying to run a suite of inference recommendation jobs leveraging NVIDIA Triton Inference Server on a set of GPU instances (ml.g5.12xlarge, ml.g5.8xlarge, ml.g5.16xlarge) as well...
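For context, comparing instance types like this is what SageMaker Inference Recommender automates. A rough boto3 sketch, not the asker's setup: the job name, role ARN, and model package ARN are hypothetical placeholders, and the exact fields required vary with the job type.

```
import boto3

sm = boto3.client("sagemaker")

# Ask Inference Recommender to benchmark the model package on several
# candidate GPU instance types.
sm.create_inference_recommendations_job(
    JobName="triton-g5-recommendation",  # hypothetical name
    JobType="Advanced",                  # "Default" runs a quicker, broader pass
    RoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    InputConfig={
        "ModelPackageVersionArn": "arn:aws:sagemaker:us-east-1:123456789012:model-package/my-model/1",  # placeholder
        "EndpointConfigurations": [
            {"InstanceType": "ml.g5.8xlarge"},
            {"InstanceType": "ml.g5.12xlarge"},
            {"InstanceType": "ml.g5.16xlarge"},
        ],
    },
)
```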
published 3 months ago · 3 votes · 1074 views
Hi,
Is there more documentation/examples for *TensorFlow* on Trn1/Trn1n instances?
Documentation at:
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/index.html ...
We are using tensorflow.neuron to compile a tensorflow 1.x SavedModel to run on AWS Inferentia machines on EC2. We do this by calling:
tensorflow.neuron.saved_model.compile(model_dir,...
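For reference, a minimal sketch of that TF 1.x flow; the paths are placeholders. `tfn.saved_model.compile` writes a new SavedModel in which supported operators run on Inferentia and the rest fall back to CPU.

```
import tensorflow.neuron as tfn

model_dir = "saved_model/"                   # placeholder: original TF 1.x SavedModel
compiled_model_dir = "saved_model_neuron/"   # placeholder: output directory

# Partition and compile the graph for Neuron, then save the result.
tfn.saved_model.compile(model_dir, compiled_model_dir)
```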
Currently, I host my model with `tensorflow_model_server`. Here is how I export my model:
```
model = tf.keras.models.load_model("model.hdf5")
def __decode_images(images, nch):
    o = ...
```
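For context, a common way to export a Keras model for `tensorflow_model_server` is to save it with an explicit serving signature. A minimal sketch assuming TF 2.x; the decode/resize logic and the 224x224 input size are illustrative stand-ins for the truncated `__decode_images` helper.

```
import tensorflow as tf

model = tf.keras.models.load_model("model.hdf5")

def _decode_one(raw_bytes):
    img = tf.io.decode_image(raw_bytes, channels=3, expand_animations=False)
    img.set_shape([None, None, 3])                    # give resize a known rank
    img = tf.image.convert_image_dtype(img, tf.float32)
    return tf.image.resize(img, [224, 224])           # hypothetical input size

@tf.function(input_signature=[tf.TensorSpec([None], tf.string, name="images")])
def serve(images):
    # Decode a batch of raw image bytes and run the model on it.
    decoded = tf.map_fn(_decode_one, images, fn_output_signature=tf.float32)
    return {"predictions": model(decoded)}

# tensorflow_model_server expects a numeric version subdirectory, e.g. export/1/
tf.saved_model.save(model, "export/1", signatures={"serving_default": serve})
```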
I am new to AWS Neuron SDK and the documentation seems confusing to me.
There is no direct guide on how to install the SDK and use it to compile models. The examples are outdated and the installation...
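For what it's worth, the Neuron packages install from AWS's own pip repository, and compilation is a single trace call. A minimal sketch assuming PyTorch on an Inf2/Trn1 instance with `torch-neuronx`; the model and input shape are placeholders.

```
# Install first (shell):
#   pip install torch-neuronx neuronx-cc --extra-index-url=https://pip.repos.neuron.amazonaws.com
import torch
import torch_neuronx
import torchvision.models as models

model = models.resnet50(weights=None).eval()   # placeholder model
example = torch.rand(1, 3, 224, 224)           # placeholder input shape

# Compile the model for Neuron and save the traced artifact.
neuron_model = torch_neuronx.trace(model, example)
torch.jit.save(neuron_model, "resnet50_neuron.pt")
```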
Currently, we are using Elastic Inference for inferencing on AWS ECS. We use `inference_accelerators` in `ecs.Ec2TaskDefinition` to set up elastic inference. For scaling, we are monitoring...
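For context, a rough CDK (Python) sketch of that setup, assuming aws-cdk-lib v2; the container image and device names are placeholders, and `inference_accelerator_resources` as the container-level mapping is an assumption to verify against the CDK docs.

```
from aws_cdk import Stack, aws_ecs as ecs
from constructs import Construct

class InferenceStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Declare the Elastic Inference accelerator on the task definition.
        task_def = ecs.Ec2TaskDefinition(
            self, "EiTaskDef",
            inference_accelerators=[
                ecs.InferenceAccelerator(device_name="device_1", device_type="eia2.medium"),
            ],
        )

        task_def.add_container(
            "inference",
            image=ecs.ContainerImage.from_registry("my-inference-image"),  # placeholder
            memory_limit_mib=4096,
            # Attach the declared accelerator to this container by device name
            # (assumed property; verify before relying on it).
            inference_accelerator_resources=["device_1"],
        )
```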
I have a project where I would like to send inference requests. For this I need an API such as AWS Lambda or a SageMaker endpoint so that the customer can send their requests there.
The inference performed...
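For context, once a SageMaker endpoint exists, a client (or a Lambda function fronting it) can call it with boto3. A minimal sketch; the endpoint name and payload format are placeholders.

```
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Send one inference request to a deployed endpoint.
response = runtime.invoke_endpoint(
    EndpointName="my-inference-endpoint",          # placeholder
    ContentType="application/json",
    Body=json.dumps({"inputs": [1.0, 2.0, 3.0]}),  # placeholder payload
)
print(json.loads(response["Body"].read()))
```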
Hello, I'm using the **pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker** AMI and following this guide...