
Questions tagged with AWS Inferentia

AWS Inferentia is designed to provide high-performance inference in the cloud, to drive down the total cost of inference, and to make it easy for developers to integrate machine learning into their business applications.


41 results
[AWS Neuron Documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-setup/multiframework/multi-framework-ubuntu22-neuron-dlami.html#setup-ubuntu22-multi-framework-d...
1 answer · 0 votes · 126 views · AWS · asked a year ago
Hello AWS team! I am trying to run a suite of inference recommendation jobs leveraging NVIDIA Triton Inference Server on a set of GPU instances (ml.g5.12xlarge, ml.g5.8xlarge, ml.g5.16xlarge) as well...
1 answer · 0 votes · 889 views · asked 2 years ago
Hi, Is there more documentation/examples for *TensorFlow* on Trn1/Trn1n instances? Documentation at: [https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/index.html]() ha...
3 answers · 0 votes · 761 views · asked 2 years ago
We are using tensorflow.neuron to compile a tensorflow 1.x SavedModel to run on AWS Inferentia machines on EC2. We do this by calling: tensorflow.neuron.saved_model.compile(model_dir, compiled_model_d...
3 answers · 0 votes · 615 views · asked 2 years ago
Currently, I host my model with `tensorflow_model_server`. Here is how I export my model:
```
model = tf.keras.models.load_model("model.hdf5")

def __decode_images(images, nch):
    o = tf.vectorized...
```
1 answer · 0 votes · 808 views · asked 2 years ago
I am new to the AWS Neuron SDK and the documentation seems confusing to me. There is no direct guide on how to install the SDK and use it to compile models. The examples are outdated and the installation ...
1 answer · 0 votes · 1.1K views · asked 2 years ago
Currently, we are using Elastic Inference for inferencing on AWS ECS. We use `inference_accelerators` in `ecs.Ec2TaskDefinition` to set up elastic inference. For scaling, we are monitoring `Accelerato...
1 answer · 0 votes · 854 views · asked 2 years ago
I have a project where I would like to send inference requests. For this I need an API, such as AWS Lambda or a SageMaker endpoint, so that the customer can send their requests there. The inference performed ...
1 answer · 0 votes · 1K views · asked 2 years ago
Hello, I'm using the **pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker** AMI and following this guide: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-setup/pyto...
1 answer · 0 votes · 725 views · asked 2 years ago
Hello, I'm using the **pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker** image to run the service, and I'd like to switch to an Inf2 instance. ~~I think I can try to use **pytorch-inference-neuro...
1 answer · 0 votes · 882 views · asked 2 years ago
I'm considering launching an instance to work on one of my TensorFlow models since my current PC doesn't perform efficiently. My PC has 32GB of RAM, a 20CPU i7 processor, and an RTX 3050Ti 20GB GPU. I...
1 answer · 0 votes · 800 views · asked 2 years ago
Hello, I am using an Auto Scaling group with Inferentia chips, but I encounter some problems during deployment. There are three Availability Zones in my ASG, which means that those zones must contain av...
1 answer · 0 votes · 593 views · asked 2 years ago
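The Availability Zone issue in the question above comes up because Inferentia instance types are not offered in every zone of a Region; a common workaround is to restrict the Auto Scaling group's `VPCZoneIdentifier` to subnets in zones where the type is actually available. A minimal CloudFormation sketch under that assumption — the instance type, AMI ID, and subnet IDs below are all placeholders, not values from the question:

```yaml
Resources:
  InfLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        InstanceType: inf1.xlarge          # placeholder Inferentia type
        ImageId: ami-0123456789abcdef0     # placeholder Neuron DLAMI

  InfAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "1"
      MaxSize: "4"
      LaunchTemplate:
        LaunchTemplateId: !Ref InfLaunchTemplate
        Version: !GetAtt InfLaunchTemplate.LatestVersionNumber
      # List only subnets in zones that offer the instance type,
      # so the ASG never tries to launch into an unsupported zone.
      VPCZoneIdentifier:
        - subnet-aaaa1111                  # placeholder subnet
        - subnet-bbbb2222                  # placeholder subnet
```

Which zones offer a given type can be checked beforehand with `aws ec2 describe-instance-type-offerings --location-type availability-zone --filters Name=instance-type,Values=inf1.xlarge`.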