Have you taken a look at the tensorflow-neuron TF2 HuggingFace tutorial? It demonstrates how to compile and run TF2 HuggingFace models on Inf1. The tensorflow-neuron TF2 API documentation also has additional information about the compilation process.
You are correct that 0 operations placed on the Neuron runtime means the Inferentia NeuronCores (the main compute engines on Inf1) will not be used.
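For reference, a minimal compilation sketch along the lines of that tutorial might look like the following. This is only a sketch: it assumes tensorflow-neuron 2.x (`tfn.trace`) and the `transformers` package are installed, and uses `bert-base-uncased` as a stand-in model; adapt the model and inputs to your case.

```python
# Sketch: compiling a HuggingFace TF2 model with tensorflow-neuron.
# Assumptions: tensorflow-neuron 2.x and transformers are installed;
# bert-base-uncased is a placeholder for your model.
import tensorflow as tf
import tensorflow.neuron as tfn
from transformers import TFBertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertModel.from_pretrained("bert-base-uncased")

# Neuron compiles for fixed input shapes, so pad to a fixed length.
inputs = tokenizer("Hello, world!", padding="max_length",
                   max_length=128, return_tensors="tf")

# Wrap the model so it takes positional tensor arguments for tracing.
def wrapper(input_ids, attention_mask):
    return model(input_ids, attention_mask=attention_mask)

example_inputs = [inputs["input_ids"], inputs["attention_mask"]]

# During tracing, the compiler log reports how many operations were
# placed on the Neuron runtime -- if that count is 0, the NeuronCores
# will not be used at inference time.
model_neuron = tfn.trace(wrapper, example_inputs)
model_neuron.save("./bert_neuron")
```

When you run this on an Inf1 instance (or a compilation environment with the Neuron SDK), check the trace output for the number of operations placed on the Neuron runtime.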
Please let us know if you have any additional questions!