Hi,
This link https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/tensorflow-neuron/tutorials/bert_demo/bert_demo.html mentions how to compile using tensorflow 1. Can anyone let me know the steps to neuron compile a BERT large model for running inference on inferentia using tensorflow v2??
Thanks in advance Ajay
P.S This is what my log looks like while compiling on tf1 INFO:tensorflow:fusing subgraph {subgraph neuron_op_e76ab3d9bc74f09f with input tensors ["<tf.Tensor 'bert/encoder/ones0/_0:0' shape=(1, 512, 1) dtype=float32>", "<tf.Tensor 'bert/encoder/Cast0/_1:0' shape=(1, 1, 512) dtype=float32>", "<tf.Tensor 'bert/embeddings/LayerNorm/batchnorm/add_10/_2:0' shape=(1, 512, 1024) dtype=float32>"], output tensors ["<tf.Tensor 'bert/pooler/dense/Tanh:0' shape=(1, 1024) dtype=float32>", "<tf.Tensor 'bert/encoder/layer_23/output/LayerNorm/batchnorm/add_1:0' shape=(1, 512, 1024) dtype=float32>"]} with neuron-cc . Compiler status ERROR WARNING:tensorflow:11/03/2022 04:28:48 AM ERROR 9932 [neuron-cc]: Failed to parse model /tmp/tmpbyvnmr6h/neuron_op_e76ab3d9bc74f09f/graph_def.pb: The following operators are not implemented: {'Einsum'} (NotImplementedError)
INFO:tensorflow:Number of operations in TensorFlow session: 7427 INFO:tensorflow:Number of operations after tf.neuron optimizations: 2901 INFO:tensorflow:Number of operations placed on Neuron runtime: 0
WARNING:tensorflow:Converted /home/ubuntu/bert_repo/patent_model/ to ./bert-saved-model-neuron_tf1.15 but no operator will be running on AWS machine learning accelerators. This is probably not what you want. Please refer to https://github.com/aws/aws-neuron-sdk for current limitations of the AWS Neuron SDK. We are actively improving (and hiring)! {'OnNeuronRatio': 0.0}
---I assume the OnNeuronRatio being 0 means that I wont be able to make use of Inferentia hardware acceleration. Is that correct?
Hi Josh, thanks for your answer. However Im facing other issues related to the code on the link you posted. Please check my new question ...https://repost.aws/questions/QUEtWfa_1PRmqAf7NrFL8Zfw/issue-with-loading-neuron-model