Questions tagged with AWS Neuron


Neuron model loads when compiled for 1 core but fails to load when compiled for 4

Hello,

We are testing pipeline mode for Neuron/Inferentia but cannot get a model running on multiple cores. The single-core compiled model loads fine and runs inference on Inferentia without issue. However, after compiling a model for multi-core using `compiler-args=['--neuroncore-pipeline-cores', '4']` (which takes ~16hrs on a r6a.16xl), the model errors out while loading into memory on the Inferentia box. Here's the error message:

```
2022-Nov-22 22:29:25.0728 20764:22801 ERROR TDRV:dmem_alloc Failed to alloc DEVICE memory: 589824
2022-Nov-22 22:29:25.0728 20764:22801 ERROR TDRV:copy_and_stage_mr_one_channel Failed to allocate aligned (0) buffer in MLA DRAM for W10-t of size 589824 bytes, channel 0
2022-Nov-22 22:29:25.0728 20764:22801 ERROR TDRV:kbl_model_add copy_and_stage_mr() error
2022-Nov-22 22:29:26.0091 20764:22799 ERROR TDRV:dmem_alloc Failed to alloc DEVICE memory: 16777216
2022-Nov-22 22:29:26.0091 20764:22799 ERROR TDRV:dma_ring_alloc Failed to allocate RX ring
2022-Nov-22 22:29:26.0091 20764:22799 ERROR TDRV:drs_create_data_refill_rings Failed to allocate pring for data refill dma
2022-Nov-22 22:29:26.0091 20764:22799 ERROR TDRV:kbl_model_add create_data_refill_rings() error
2022-Nov-22 22:29:26.0116 20764:20764 ERROR TDRV:remove_model Unknown model: 1001
2022-Nov-22 22:29:26.0116 20764:20764 ERROR TDRV:kbl_model_remove Failed to find and remove model: 1001
2022-Nov-22 22:29:26.0117 20764:20764 ERROR TDRV:remove_model Unknown model: 1001
2022-Nov-22 22:29:26.0117 20764:20764 ERROR TDRV:kbl_model_remove Failed to find and remove model: 1001
2022-Nov-22 22:29:26.0117 20764:20764 ERROR NMGR:dlr_kelf_stage Failed to load subgraph
2022-Nov-22 22:29:26.0354 20764:20764 ERROR NMGR:stage_kelf_models Failed to stage graph: kelf-a.json to NeuronCore
2022-Nov-22 22:29:26.0364 20764:20764 ERROR NMGR:kmgr_load_nn_post_metrics Failed to load NN: 1.11.7.0+aec18907e-/tmp/tmpab7oth00, err: 4
Traceback (most recent call last):
  File "infer_test.py", line 34, in <module>
    model_neuron = torch.jit.load('model-4c.pt')
  File "/root/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/jit_load_wrapper.py", line 13, in wrapper
    script_module = jit_load(*args, **kwargs)
  File "/root/pytorch_venv/lib64/python3.7/site-packages/torch/jit/_serialization.py", line 162, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError: Could not load the model status=4 message=Allocation Failure
```

Any help would be appreciated.
1 answer, 0 votes, 29 views, asked 12 days ago

Issue with loading neuron model

I am trying to load a Neuron-compiled model generated as described in this tutorial: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html

I am still a newbie, so please excuse my mistakes. This is my code for loading a Neuron-compiled model. It is almost entirely based on the code on the page referred to above.

```
from transformers import pipeline
import tensorflow as tf
import tensorflow.neuron as tfn

class TFBertForSequenceClassificationDictIO(tf.keras.Model):
    def __init__(self, model_wrapped):
        super().__init__()
        self.model_wrapped = model_wrapped
        self.aws_neuron_function = model_wrapped.aws_neuron_function

    def call(self, inputs):
        input_ids = inputs['input_ids']
        attention_mask = inputs['attention_mask']
        logits = self.model_wrapped([input_ids, attention_mask])
        return [logits]

class TFBertForSequenceClassificationFlatIO(tf.keras.Model):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def call(self, inputs):
        input_ids, attention_mask = inputs
        output = self.model({'input_ids': input_ids, 'attention_mask': attention_mask})
        return output['logits']

string_inputs = [
    'I love to eat pizza!',
    'I am sorry. I really want to like it, but I just can not stand sushi.',
    'I really do not want to type out 128 strings to create batch 128 data.',
    'Ah! Multiplying this list by 32 would be a great solution!',
]
string_inputs = string_inputs * 32

model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
neuron_pipe = pipeline('sentiment-analysis', model=model_name, framework='tf')
example_inputs = neuron_pipe.tokenizer(string_inputs)

pipe = pipeline('sentiment-analysis', model=model_name, framework='tf')
reloaded_model = tf.keras.models.load_model('./distilbert_b128_2')
model_wrapped = TFBertForSequenceClassificationFlatIO(pipe.model)
example_inputs_list = [example_inputs['input_ids'], example_inputs['attention_mask']]
model_wrapped_traced = tfn.trace(model_wrapped, example_inputs_list)
rewrapped_model = TFBertForSequenceClassificationDictIO(model_wrapped_traced)
```

This is the stack trace:

```
2022-11-05 02:46:55.553817: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.
All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.
Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFDistilBertForSequenceClassification: ['dropout_19']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english and are newly initialized: ['dropout_39']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Traceback (most recent call last):
  File "inferencesmall.py", line 40, in <module>
    model_wrapped_traced = tfn.trace(model_wrapped, example_inputs_list)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow_neuron/python/_trace.py", line 167, in trace
    func = func.get_concrete_function(*example_inputs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1264, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1244, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 786, in _initialize
    *args, **kwds))
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2983, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3292, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3140, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1161, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 677, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
tensorflow.python.autograph.impl.api.StagingError: in user code:

    File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/engine/base_layer.py", line 987, in error_handler  *
        return fn(*args, **kwargs)

    StagingError: Exception encountered when calling layer "tf_bert_for_sequence_classification_flat_io" (type TFBertForSequenceClassificationFlatIO).

    in user code:

        File "inferencesmall.py", line 22, in call  *
            output = self.model({'input_ids': input_ids, 'attention_mask': attention_mask})
        File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
            raise e.with_traceback(filtered_tb) from None

        StagingError: Exception encountered when calling layer "tf_distil_bert_for_sequence_classification_1" (type TFDistilBertForSequenceClassification).

        in user code:

            File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/models/distilbert/modeling_tf_distilbert.py", line 798, in call  *
                distilbert_output = self.distilbert(
            File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
                raise e.with_traceback(filtered_tb) from None

            StagingError: Exception encountered when calling layer "distilbert" (type TFDistilBertMainLayer).

            in user code:

                File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/models/distilbert/modeling_tf_distilbert.py", line 423, in call  *
                    inputs = input_processing(
                File "/home/ubuntu/huggingface/venv/lib/python3.7/site-packages/transformers/modeling_tf_utils.py", line 372, in input_processing  *
                    output[parameter_names[i]] = input

                IndexError: list index out of range
```

Thanks in advance,
Ajay
1 answer, 0 votes, 73 views, asked a month ago

Neuron-compiling a BERT model for Inferentia on TF2

Hi,

The tutorial at https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/tensorflow-neuron/tutorials/bert_demo/bert_demo.html shows how to compile using TensorFlow 1. Can anyone let me know the steps to Neuron-compile a BERT-large model for running inference on Inferentia using TensorFlow v2?

Thanks in advance,
Ajay

P.S. This is what my log looks like while compiling on TF1:

```
INFO:tensorflow:fusing subgraph {subgraph neuron_op_e76ab3d9bc74f09f with input tensors ["<tf.Tensor 'bert/encoder/ones0/_0:0' shape=(1, 512, 1) dtype=float32>", "<tf.Tensor 'bert/encoder/Cast0/_1:0' shape=(1, 1, 512) dtype=float32>", "<tf.Tensor 'bert/embeddings/LayerNorm/batchnorm/add_10/_2:0' shape=(1, 512, 1024) dtype=float32>"], output tensors ["<tf.Tensor 'bert/pooler/dense/Tanh:0' shape=(1, 1024) dtype=float32>", "<tf.Tensor 'bert/encoder/layer_23/output/LayerNorm/batchnorm/add_1:0' shape=(1, 512, 1024) dtype=float32>"]} with neuron-cc
. Compiler status ERROR
WARNING:tensorflow:11/03/2022 04:28:48 AM ERROR 9932 [neuron-cc]: Failed to parse model /tmp/tmpbyvnmr6h/neuron_op_e76ab3d9bc74f09f/graph_def.pb: The following operators are not implemented: {'Einsum'} (NotImplementedError)
INFO:tensorflow:Number of operations in TensorFlow session: 7427
INFO:tensorflow:Number of operations after tf.neuron optimizations: 2901
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
WARNING:tensorflow:Converted /home/ubuntu/bert_repo/patent_model/ to ./bert-saved-model-neuron_tf1.15 but no operator will be running on AWS machine learning accelerators. This is probably not what you want. Please refer to https://github.com/aws/aws-neuron-sdk for current limitations of the AWS Neuron SDK. We are actively improving (and hiring)!
{'OnNeuronRatio': 0.0}
```

I assume an OnNeuronRatio of 0 means that I won't be able to make use of Inferentia hardware acceleration. Is that correct?
1 answer, 0 votes, 29 views, asked a month ago
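
For readers comparing their own compile logs with the one in this question, here is a minimal sketch of how the reported ratio relates to the operation counts. It assumes that `OnNeuronRatio` is the number of operations placed on the Neuron runtime divided by the number remaining after tf.neuron optimizations; the post does not define the ratio, so treat that as an assumption. The helper name `on_neuron_ratio` is hypothetical and not part of any Neuron package.

```python
import re

# Log lines copied from the question above.
LOG = """\
INFO:tensorflow:Number of operations in TensorFlow session: 7427
INFO:tensorflow:Number of operations after tf.neuron optimizations: 2901
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
"""


def on_neuron_ratio(log_text):
    """Return placed / remaining operations parsed from tf.neuron INFO lines.

    The definition of the ratio is an assumption, not taken from the SDK docs.
    """
    total = re.search(r"after tf\.neuron optimizations: (\d+)", log_text)
    placed = re.search(r"placed on Neuron runtime: (\d+)", log_text)
    if total is None or placed is None:
        raise ValueError("expected tf.neuron operation counts not found in log")
    total_ops = int(total.group(1))
    # Guard against an empty graph to avoid dividing by zero.
    return int(placed.group(1)) / total_ops if total_ops else 0.0


print(on_neuron_ratio(LOG))  # 0.0
```

With zero operations placed on the Neuron runtime, the ratio is 0.0 whichever count is used as the denominator, which matches the `{'OnNeuronRatio': 0.0}` line in the log.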

RuntimeError: The PyTorch Neuron Runtime could not be initialized

I just started using Neuron on Inf1 and I'm following the examples. I did the [resnet50](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/resnet50.html) example, no problems. Then I tried to follow the [BERT](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.html) example and I got the following error. I followed [these](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-runtime/nrt-troubleshoot.html) troubleshooting steps - Neuron is installed and I haven't seen any of the other errors. ``` --------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) /tmp/ipykernel_27071/3834338657.py in <module> 35 36 # Verify the TorchScript works on both example inputs ---> 37 paraphrase_classification_logits_neuron = model_neuron(*example_inputs_paraphrase) 38 not_paraphrase_classification_logits_neuron = model_neuron(*example_inputs_not_paraphrase) 39 ~/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs) 1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks 1109 or _global_forward_hooks or _global_forward_pre_hooks): -> 1110 return forward_call(*input, **kwargs) 1111 # Do not call functions when jit is used 1112 full_backward_hooks, non_full_backward_hooks = [], [] RuntimeError: The following operation failed in the TorchScript interpreter. 
Traceback of TorchScript (most recent call last): /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/decorators.py(372): forward /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py(1098): _slow_forward /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py(1110): _call_impl /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/graph.py(548): __call__ /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/graph.py(207): run_op /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/graph.py(196): __call__ /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/runtime.py(69): forward /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py(1098): _slow_forward /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py(1110): _call_impl /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch/jit/_trace.py(965): trace_module /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch/jit/_trace.py(750): trace /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/tensorboard.py(307): tb_parse /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/tensorboard.py(533): tb_graph /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/decorators.py(482): maybe_generate_tb_graph_def /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py(513): maybe_determine_names_from_tensorboard /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py(200): trace /tmp/ipykernel_27071/3834338657.py(34): <module> /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3553): run_code /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3473): run_ast_nodes 
/home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3258): run_cell_async /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/IPython/core/async_helpers.py(78): _pseudo_sync_runner /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3030): _run_cell /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(2976): run_cell /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/zmqshell.py(528): run_cell /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/ipkernel.py(387): do_execute /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(730): execute_request /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(406): dispatch_shell /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(499): process_one /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(510): dispatch_queue /usr/lib64/python3.7/asyncio/events.py(88): _run /usr/lib64/python3.7/asyncio/base_events.py(1786): _run_once /usr/lib64/python3.7/asyncio/base_events.py(541): run_forever /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/tornado/platform/asyncio.py(215): start /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelapp.py(712): start /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/traitlets/config/application.py(982): launch_instance /home/ec2-user/pytorch_venv/lib64/python3.7/site-packages/ipykernel_launcher.py(17): <module> /usr/lib64/python3.7/runpy.py(85): _run_code /usr/lib64/python3.7/runpy.py(193): _run_module_as_main RuntimeError: The PyTorch Neuron Runtime could not be initialized. Neuron Driver issues are logged to your system logs. 
See the Neuron Runtime's troubleshooting guide for help on this topic: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/
```

When I searched the system logs from **EC2 > Instances > ... > Get system log** for "neuron" I got these results:

```
[ 264.104973] neuron: loading out-of-tree module taints kernel.
[ 264.107388] neuron: module verification failed: signature and/or required key missing - tainting kernel
[ 264.113673] Neuron Driver Started with Version:2.3.26.0-67ad286904ed6cc43a8761d89c8477de0ba961e1
[ 264.119896] neuron:nr_reset_thread_fn: nd0: initiating reset
[ 264.139198] neuron:mpset_constructor: reserved 134217728 bytes of host memory
[ 271.253090] neuron:nr_reset_thread_fn: nd0: reset completed
[ 2010.629871] neuron:npid_attach: neuron:npid_attach: pid=25574, slot=0
[ 2069.531541] neuron:npid_detach: neuron:npid_detach: pid=25574, slot=0
[ 2093.038724] neuron:npid_attach: neuron:npid_attach: pid=25666, slot=0
[ 2266.011350] neuron:npid_detach: neuron:npid_detach: pid=25666, slot=0
[ 2271.008477] neuron:npid_attach: neuron:npid_attach: pid=25777, slot=0
[ 4139.513592] neuron:npid_detach: neuron:npid_detach: pid=25777, slot=0
```
1 answer, 0 votes, 27 views, nikolay asked a month ago

I can't save a Neuron model after compiling it into an AWS Neuron-optimized TorchScript

I can't save a Neuron model after compiling it into an AWS Neuron-optimized TorchScript. My code:

```
import tensorflow  # to work around a protobuf version conflict issue
import torch
import torch.neuron
import torch.nn.functional as F
import transformers
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('./data/tokenizer')
model = AutoModelForSequenceClassification.from_pretrained('data/model', return_dict=False, torchscript=True)

parsed_json_list = [
    {"premise": "I have a dog", "hypotheses": ["I love dogs", "I hate dogs"]}
]

model_inputs = tokenizer(
    [
        parsed_json["premise"]
        for parsed_json in parsed_json_list
        for _ in parsed_json["hypotheses"]
    ],
    [
        hypothesis
        for parsed_json in parsed_json_list
        for hypothesis in parsed_json["hypotheses"]
    ],
    return_tensors='pt',
    padding=True,
    truncation=True,
)
pred = model(**model_inputs)

example_inputs = model_inputs['input_ids'], model_inputs['attention_mask'], model_inputs['token_type_ids']
model_neuron = torch.neuron.trace(model, example_inputs, verbose=1)
```

Neuron log after torch.neuron.trace:

```
INFO:Neuron:Number of neuron graph operations 121 did not match traced graph 101 - using heuristic matching of hierarchical information
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 2131, compiled = 1992, percent compiled = 93.48%
INFO:Neuron:The neuron partitioner created 51 sub-graphs
INFO:Neuron:Neuron successfully compiled 50 sub-graphs, Total fused subgraphs = 51, Percent of model sub-graphs successfully compiled = 98.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 398
INFO:Neuron: => aten::ScalarImplicit: 2
INFO:Neuron: => aten::add: 82
INFO:Neuron: => aten::arange: 2
INFO:Neuron: => aten::bmm: 47
INFO:Neuron: => aten::clamp: 22
INFO:Neuron: => aten::contiguous: 71
INFO:Neuron: => aten::detach: 144
INFO:Neuron: => aten::div: 60
INFO:Neuron: => aten::expand: 22
INFO:Neuron: => aten::gelu: 13
INFO:Neuron: => aten::layer_norm: 26
INFO:Neuron: => aten::linear: 97
INFO:Neuron: => aten::mul: 38
INFO:Neuron: => aten::neg: 11
INFO:Neuron: => aten::permute: 71
INFO:Neuron: => aten::select: 1
INFO:Neuron: => aten::size: 460
INFO:Neuron: => aten::slice: 27
INFO:Neuron: => aten::sqrt: 36
INFO:Neuron: => aten::squeeze: 23
INFO:Neuron: => aten::sub: 1
INFO:Neuron: => aten::to: 96
INFO:Neuron: => aten::transpose: 47
INFO:Neuron: => aten::unsqueeze: 29
INFO:Neuron: => aten::view: 166
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 35 [supported]
INFO:Neuron: => aten::__and__: 1 [supported]
INFO:Neuron: => aten::abs: 1 [supported]
INFO:Neuron: => aten::add: 3 [supported]
INFO:Neuron: => aten::bmm: 1 [supported]
INFO:Neuron: => aten::ceil: 1 [supported]
INFO:Neuron: => aten::clamp: 2 [supported]
INFO:Neuron: => aten::contiguous: 1 [supported]
INFO:Neuron: => aten::detach: 2 [supported]
INFO:Neuron: => aten::div: 2 [supported]
INFO:Neuron: => aten::embedding: 1 [not supported]
INFO:Neuron: => aten::expand: 2 [supported]
INFO:Neuron: => aten::gather: 24 [not supported]
INFO:Neuron: => aten::gt: 1 [supported]
INFO:Neuron: => aten::le: 1 [supported]
INFO:Neuron: => aten::linear: 1 [supported]
INFO:Neuron: => aten::log: 2 [supported]
INFO:Neuron: => aten::lt: 1 [supported]
INFO:Neuron: => aten::mul: 2 [supported]
INFO:Neuron: => aten::neg: 1 [supported]
INFO:Neuron: => aten::permute: 1 [supported]
INFO:Neuron: => aten::repeat: 24 [not supported]
INFO:Neuron: => aten::sign: 1 [not supported]
INFO:Neuron: => aten::size: 10 [supported]
INFO:Neuron: => aten::slice: 2 [supported]
INFO:Neuron: => aten::squeeze: 2 [supported]
INFO:Neuron: => aten::to: 5 [supported]
INFO:Neuron: => aten::transpose: 1 [supported]
INFO:Neuron: => aten::type_as: 2 [supported]
INFO:Neuron: => aten::unsqueeze: 2 [supported]
INFO:Neuron: => aten::view: 2 [supported]
INFO:Neuron: => aten::where: 2 [not supported]
INFO:Neuron:skip_inference_context for tensorboard symbols at /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/tensorboard.py:305 tb_parse
INFO:Neuron:Number of neuron graph operations 468 did not match traced graph 599 - using heuristic matching of hierarchical information
CPU times: user 1min 50s, sys: 13.1 s, total: 2min 3s
Wall time: 7min 37s
```

Then I try to save my model:

```
model_neuron.save('test.pt')
```

Error log:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_17776/1444584018.py in <module>
----> 1 model_neuron.save('test.pt')

~/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch/jit/_script.py in save(self, f, **kwargs)
    691         See :func:`torch.jit.save <torch.jit.save>` for details.
    692         """
--> 693         return self._c.save(str(f), **kwargs)
    694
    695     def _save_for_lite_interpreter(self, *args, **kwargs):

RuntimeError: Could not export Python function call 'XSoftmax'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation?
If this is a nn.ModuleList, add it to __constants__: /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/native_ops/prim.py(46): PythonOp /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/graph.py(330): __call__ /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/graph.py(207): run_op /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/graph.py(196): __call__ /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/runtime.py(69): forward /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py(1098): _slow_forward /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch/nn/modules/module.py(1110): _call_impl /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch/jit/_trace.py(965): trace_module /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch/jit/_trace.py(750): trace /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/tensorboard.py(307): tb_parse /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/tensorboard.py(533): tb_graph /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/decorators.py(482): maybe_generate_tb_graph_def /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py(513): maybe_determine_names_from_tensorboard /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py(200): trace <timed exec>(1): <module> /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/magics/execution.py(1335): time 
/home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/magic.py(187): <lambda> /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/decorator.py(232): fun /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(2473): run_cell_magic /tmp/ipykernel_17776/2573155944.py(1): <module> /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3553): run_code /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3473): run_ast_nodes /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3258): run_cell_async /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/async_helpers.py(78): _pseudo_sync_runner /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(3030): _run_cell /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/IPython/core/interactiveshell.py(2976): run_cell /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/zmqshell.py(528): run_cell /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/ipkernel.py(387): do_execute /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(730): execute_request /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(406): dispatch_shell /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(499): process_one /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelbase.py(510): dispatch_queue /usr/lib64/python3.7/asyncio/events.py(88): _run /usr/lib64/python3.7/asyncio/base_events.py(1786): 
_run_once /usr/lib64/python3.7/asyncio/base_events.py(541): run_forever /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/tornado/platform/asyncio.py(215): start /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel/kernelapp.py(712): start /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/traitlets/config/application.py(982): launch_instance /home/ec2-user/test_aws_inf_neuron/pytorch_venv/lib64/python3.7/site-packages/ipykernel_launcher.py(17): <module> /usr/lib64/python3.7/runpy.py(85): _run_code /usr/lib64/python3.7/runpy.py(193): _run_module_as_main ``` Thanks in advance.
1 answer, 0 votes, 51 views, asked a month ago
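
When reading a long trace log like the one in this question, it can help to pull out only the operators the compiler flagged as `[not supported]`. The following is a minimal sketch that relies only on the line format visible in the log above; the helper name `unsupported_ops` is hypothetical and is not part of torch-neuron.

```python
import re

# A few lines copied from the "Not compiled operators" section of the log above.
LOG = """\
INFO:Neuron: => aten::embedding: 1 [not supported]
INFO:Neuron: => aten::expand: 2 [supported]
INFO:Neuron: => aten::gather: 24 [not supported]
INFO:Neuron: => aten::repeat: 24 [not supported]
INFO:Neuron: => aten::sign: 1 [not supported]
INFO:Neuron: => aten::where: 2 [not supported]
"""

# Matches "=> aten::<op>: <count> [not supported]" and skips "[supported]" lines.
PATTERN = re.compile(r"=> (aten::\w+): (\d+) \[not supported\]")


def unsupported_ops(log_text):
    """Map each operator marked [not supported] to its reported count."""
    return {name: int(count) for name, count in PATTERN.findall(log_text)}


print(unsupported_ops(LOG))
```

For the sample lines this lists `aten::embedding`, `aten::gather`, `aten::repeat`, `aten::sign`, and `aten::where`, the same five operators tagged `[not supported]` in the full log.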

AWS PyTorch Neuron Compilation Error

I followed the user guide on updating torch-neuron and then started compiling the model to Neuron, but I got an error and I don't understand what's wrong. In the Neuron SDK you claim that all operations should compile, and that unsupported ones should just run on the CPU. The error:

```
INFO:Neuron:All operators are compiled by neuron-cc (this does not guarantee that neuron-cc will successfully compile)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 3345, fused = 3345, percent fused = 100.0%
INFO:Neuron:Number of neuron graph operations 8175 did not match traced graph 9652 - using heuristic matching of hierarchical information
INFO:Neuron:Compiling function _NeuronGraph$3362 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ubuntu/alias/neuron/neuron_env/bin/neuron-cc compile /tmp/tmpmp8qvhtb/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpmp8qvhtb/graph_def.neff --io-config {"inputs": {"0:0": [[1, 3, 768, 768], "float32"]}, "outputs": ["aten_sigmoid/Sigmoid:0"]} --verbose 35'
..............................................................................INFO:Neuron:Compile command returned: -9
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$3362; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call:
/home/ubuntu/alias/neuron/neuron_env/bin/neuron-cc compile /tmp/tmpmp8qvhtb/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpmp8qvhtb/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 3, 768, 768], "float32"]}, "outputs": ["aten_sigmoid/Sigmoid:0"]}' --verbose 35
Traceback (most recent call last):
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/convert.py", line 382, in op_converter
    item, inputs, compiler_workdir=sg_workdir, **kwargs)
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/decorators.py", line 220, in trace
    'neuron-cc failed with the following command line call:\n{}'.format(command))
subprocess.SubprocessError: neuron-cc failed with the following command line call:
/home/ubuntu/alias/neuron/neuron_env/bin/neuron-cc compile /tmp/tmpmp8qvhtb/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpmp8qvhtb/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 3, 768, 768], "float32"]}, "outputs": ["aten_sigmoid/Sigmoid:0"]}' --verbose 35
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 3345, compiled = 0, percent compiled = 0.0%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 942 [supported]
INFO:Neuron: => aten::_convolution: 107 [supported]
INFO:Neuron: => aten::add: 104 [supported]
INFO:Neuron: => aten::batch_norm: 1 [supported]
INFO:Neuron: => aten::cat: 1 [supported]
INFO:Neuron: => aten::contiguous: 4 [supported]
INFO:Neuron: => aten::div: 104 [supported]
INFO:Neuron: => aten::dropout: 208 [supported]
INFO:Neuron: => aten::feature_dropout: 1 [supported]
INFO:Neuron: => aten::flatten: 60 [supported]
INFO:Neuron: => aten::gelu: 52 [supported]
INFO:Neuron: => aten::layer_norm: 161 [supported]
INFO:Neuron: => aten::linear: 264 [supported]
INFO:Neuron: => aten::matmul: 104 [supported]
INFO:Neuron: => aten::mul: 52 [supported]
INFO:Neuron: => aten::permute: 210 [supported]
INFO:Neuron: => aten::relu: 1 [supported]
INFO:Neuron: => aten::reshape: 262 [supported]
INFO:Neuron: => aten::select: 104 [supported]
INFO:Neuron: => aten::sigmoid: 1 [supported]
INFO:Neuron: => aten::size: 278 [supported]
INFO:Neuron: => aten::softmax: 52 [supported]
INFO:Neuron: => aten::transpose: 216 [supported]
INFO:Neuron: => aten::upsample_bilinear2d: 4 [supported]
INFO:Neuron: => aten::view: 52 [supported]
Traceback (most recent call last):
  File "to_neuron.py", line 14, in <module>
    model_neuron = torch.neuron.trace(model, example_inputs=[image.cuda()])
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/convert.py", line 184, in trace
    cu.stats_post_compiler(neuron_graph)
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/convert.py", line 493, in stats_post_compiler
    "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!")
RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
```
1
answers
1
votes
79
views
asked 2 months ago

Is it possible to compile a Neuron model on my local machine?

I'm following some guides and from my understanding this should be possible. But I've been trying for hours to compile a yolov5 model into a neuron model with no success. Is it even possible to do this in my local machine or do I have to be in an inferentia instance? This is what my environment looks like: ``` # packages in environment at /miniconda3/envs/neuron: # # Name Version Build Channel _libgcc_mutex 0.1 main _openmp_mutex 5.1 1_gnu absl-py 1.2.0 pypi_0 pypi astor 0.8.1 pypi_0 pypi attrs 22.1.0 pypi_0 pypi backcall 0.2.0 pyhd3eb1b0_0 ca-certificates 2022.07.19 h06a4308_0 cachetools 5.2.0 pypi_0 pypi certifi 2022.9.24 pypi_0 pypi charset-normalizer 2.1.1 pypi_0 pypi cycler 0.11.0 pypi_0 pypi debugpy 1.5.1 py37h295c915_0 decorator 5.1.1 pyhd3eb1b0_0 dmlc-nnvm 1.11.1.0+0 pypi_0 pypi dmlc-topi 1.11.1.0+0 pypi_0 pypi dmlc-tvm 1.11.1.0+0 pypi_0 pypi entrypoints 0.4 py37h06a4308_0 fonttools 4.37.3 pypi_0 pypi gast 0.2.2 pypi_0 pypi google-auth 2.12.0 pypi_0 pypi google-auth-oauthlib 0.4.6 pypi_0 pypi google-pasta 0.2.0 pypi_0 pypi gputil 1.4.0 pypi_0 pypi grpcio 1.49.1 pypi_0 pypi h5py 3.7.0 pypi_0 pypi idna 3.4 pypi_0 pypi importlib-metadata 4.12.0 pypi_0 pypi inferentia-hwm 1.11.0.0+0 pypi_0 pypi iniconfig 1.1.1 pypi_0 pypi ipykernel 6.15.2 py37h06a4308_0 ipython 7.34.0 pypi_0 pypi ipywidgets 8.0.2 pypi_0 pypi islpy 2021.1+aws2021.x.16.0.bld0 pypi_0 pypi jedi 0.18.1 py37h06a4308_1 jupyter_client 7.3.5 py37h06a4308_0 jupyter_core 4.10.0 py37h06a4308_0 jupyterlab-widgets 3.0.3 pypi_0 pypi keras-applications 1.0.8 pypi_0 pypi keras-preprocessing 1.1.2 pypi_0 pypi kiwisolver 1.4.4 pypi_0 pypi ld_impl_linux-64 2.38 h1181459_1 libffi 3.3 he6710b0_2 libgcc-ng 11.2.0 h1234567_1 libgomp 11.2.0 h1234567_1 libsodium 1.0.18 h7b6447c_0 libstdcxx-ng 11.2.0 h1234567_1 llvmlite 0.39.1 pypi_0 pypi markdown 3.4.1 pypi_0 pypi markupsafe 2.1.1 pypi_0 pypi matplotlib 3.5.3 pypi_0 pypi matplotlib-inline 0.1.6 py37h06a4308_0 ncurses 6.3 h5eee18b_3 nest-asyncio 1.5.5 py37h06a4308_0 
networkx 2.4 pypi_0 pypi neuron-cc 1.11.7.0+aec18907e pypi_0 pypi numba 0.56.2 pypi_0 pypi numpy 1.19.5 pypi_0 pypi oauthlib 3.2.1 pypi_0 pypi opencv-python 4.6.0.66 pypi_0 pypi openssl 1.1.1q h7f8727e_0 opt-einsum 3.3.0 pypi_0 pypi packaging 21.3 pyhd3eb1b0_0 pandas 1.3.5 pypi_0 pypi parso 0.8.3 pyhd3eb1b0_0 pexpect 4.8.0 pyhd3eb1b0_3 pickleshare 0.7.5 pyhd3eb1b0_1003 pillow 9.2.0 pypi_0 pypi pip 22.2.2 pypi_0 pypi pluggy 1.0.0 pypi_0 pypi prompt-toolkit 3.0.31 pypi_0 pypi protobuf 3.20.3 pypi_0 pypi psutil 5.9.2 pypi_0 pypi ptyprocess 0.7.0 pyhd3eb1b0_2 py 1.11.0 pypi_0 pypi pyasn1 0.4.8 pypi_0 pypi pyasn1-modules 0.2.8 pypi_0 pypi pygments 2.13.0 pypi_0 pypi pyparsing 3.0.9 py37h06a4308_0 pytest 7.1.3 pypi_0 pypi python 3.7.13 h12debd9_0 python-dateutil 2.8.2 pyhd3eb1b0_0 pytz 2022.2.1 pypi_0 pypi pyyaml 6.0 pypi_0 pypi pyzmq 23.2.0 py37h6a678d5_0 readline 8.1.2 h7f8727e_1 requests 2.28.1 pypi_0 pypi requests-oauthlib 1.3.1 pypi_0 pypi rsa 4.9 pypi_0 pypi scipy 1.4.1 pypi_0 pypi seaborn 0.12.0 pypi_0 pypi setuptools 59.8.0 pypi_0 pypi six 1.16.0 pyhd3eb1b0_1 sqlite 3.39.3 h5082296_0 tensorboard 1.15.0 pypi_0 pypi tensorboard-data-server 0.6.1 pypi_0 pypi tensorboard-plugin-wit 1.8.1 pypi_0 pypi tensorflow 1.15.0 pypi_0 pypi tensorflow-estimator 1.15.1 pypi_0 pypi termcolor 2.0.1 pypi_0 pypi thop 0.1.1-2209072238 pypi_0 pypi tk 8.6.12 h1ccaba5_0 tomli 2.0.1 pypi_0 pypi torch 1.11.0 pypi_0 pypi torch-neuron 1.11.0.2.3.0.0 pypi_0 pypi torchvision 0.12.0 pypi_0 pypi tornado 6.2 py37h5eee18b_0 tqdm 4.64.1 pypi_0 pypi traitlets 5.4.0 pypi_0 pypi typing-extensions 4.3.0 pypi_0 pypi urllib3 1.26.12 pypi_0 pypi wcwidth 0.2.5 pyhd3eb1b0_0 werkzeug 2.2.2 pypi_0 pypi wheel 0.37.1 pypi_0 pypi widgetsnbextension 4.0.3 pypi_0 pypi wrapt 1.14.1 pypi_0 pypi xz 5.2.6 h5eee18b_0 zeromq 4.3.4 h2531618_0 zipp 3.8.1 pypi_0 pypi zlib 1.2.12 h5eee18b_3 ```
1
answers
2
votes
54
views
asked 2 months ago

Not able to compile the BERT model from the Neuron tutorial to NEFF

Hi Team, I want to compile a BERT model and run it on Inferentia. I trained my model using PyTorch and tried to convert it by following the steps in this [tutorial](https://github.com/aws-neuron/aws-neuron-sdk/blob/master/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.ipynb) on my Amazon Linux machine, but compilation keeps failing with this error:

```
09/22/2022 06:13:56 PM ERROR 23737 [neuron-cc]: Failed to parse model /tmp/tmp64l9ygmj/graph_def.pb: The following operators are not implemented: {'SelectV2'} (NotImplementedError)
```

I followed the installation steps [here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-intro/pytorch-setup/pytorch-install.html) for pytorch-1.11.0 and tried to execute the code in the tutorial, but got the same error. We want to explore using Inferentia for our large BERT model but are blocked because conversion to NEFF format fails. I also tried following the steps using TF and ran into a different unsupported-ops issue. Could you please help?

Below are the setup commands I ran on my Amazon Linux desktop:

```
sudo yum install -y python3.7-venv gcc-c++
python3.7 -m venv pytorch_venv
source pytorch_venv/bin/activate
pip install -U pip

# Set Pip repository to point to the Neuron repository
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

#Install Neuron PyTorch
pip install torch-neuron neuron-cc[tensorflow] "protobuf<4" torchvision
pip install --upgrade "transformers==4.6.0"
pip install tensorflow==2.8.1
```

I then executed the script below (copied from the tutorial) on my Amazon Linux host:

```
import tensorflow # to workaround a protobuf version conflict issue
import torch
import torch.neuron
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
import transformers
import os
import warnings

# Setting up NeuronCore groups for inf1.6xlarge with 16 cores
num_cores = 16 # This value should be 4 on inf1.xlarge and inf1.2xlarge
nc_env = ','.join(['1'] * num_cores)
warnings.warn("NEURONCORE_GROUP_SIZES is being deprecated, if your application is using NEURONCORE_GROUP_SIZES please \
see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/deprecation.html#announcing-end-of-support-for-neuroncore-group-sizes \
for more details.", DeprecationWarning)
os.environ['NEURONCORE_GROUP_SIZES'] = nc_env

# Build tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased-finetuned-mrpc")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased-finetuned-mrpc", return_dict=False)

# Setup some example inputs
sequence_0 = "The company HuggingFace is based in New York City"
sequence_1 = "Apples are especially bad for your health"
sequence_2 = "HuggingFace's headquarters are situated in Manhattan"

max_length=128
paraphrase = tokenizer.encode_plus(sequence_0, sequence_2, max_length=max_length, padding='max_length', truncation=True, return_tensors="pt")
not_paraphrase = tokenizer.encode_plus(sequence_0, sequence_1, max_length=max_length, padding='max_length', truncation=True, return_tensors="pt")

# Run the original PyTorch model on compilation example
paraphrase_classification_logits = model(**paraphrase)[0]

# Convert example inputs to a format that is compatible with TorchScript tracing
example_inputs_paraphrase = paraphrase['input_ids'], paraphrase['attention_mask'], paraphrase['token_type_ids']
example_inputs_not_paraphrase = not_paraphrase['input_ids'], not_paraphrase['attention_mask'], not_paraphrase['token_type_ids']

# Run torch.neuron.trace to generate a TorchScript that is optimized by AWS Neuron
model_neuron = torch.neuron.trace(model, example_inputs_paraphrase)
```

This gave me the following error:

```
2022-09-22 18:13:12.145617: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-22 18:13:12.145649: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
sample_pytorch_model.py:14: DeprecationWarning: NEURONCORE_GROUP_SIZES is being deprecated, if your application is using NEURONCORE_GROUP_SIZES please see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/deprecation.html#announcing-end-of-support-for-neuroncore-group-sizes for more details.
for more details.", DeprecationWarning) Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 433/433 [00:00<00:00, 641kB/s] Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████| 213k/213k [00:00<00:00, 636kB/s] Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████| 436k/436k [00:00<00:00, 731kB/s] Downloading: 100%|███████████████████████████████████████████████████████████████████████████████████████| 29.0/29.0 [00:00<00:00, 35.2kB/s] Downloading: 100%|███████████████████████████████████████████████████████████████████████████████████████| 433M/433M [00:09<00:00, 45.7MB/s] /local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/transformers/modeling_utils.py:1968: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! 
input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors INFO:Neuron:There are 3 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-pytorch.md) INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 565, fused = 548, percent fused = 96.99% INFO:Neuron:Number of neuron graph operations 1601 did not match traced graph 1323 - using heuristic matching of hierarchical information INFO:Neuron:Compiling function _NeuronGraph$662 with neuron-cc INFO:Neuron:Compiling with command line: '/local/home/spareek/pytorch_venv/bin/neuron-cc compile /tmp/tmp64l9ygmj/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp64l9ygmj/graph_def.neff --io-config {"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]} --verbose 35' huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either: - Avoid using `tokenizers` before the fork if possible - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false) .2022-09-22 18:13:52.697717: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory 2022-09-22 18:13:52.697749: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine. 
09/22/2022 06:13:56 PM ERROR 23737 [neuron-cc]: Failed to parse model /tmp/tmp64l9ygmj/graph_def.pb: The following operators are not implemented: {'SelectV2'} (NotImplementedError) Compiler status ERROR INFO:Neuron:Compile command returned: 1 WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$662; falling back to native python function call ERROR:Neuron:neuron-cc failed with the following command line call: /local/home/spareek/pytorch_venv/bin/neuron-cc compile /tmp/tmp64l9ygmj/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp64l9ygmj/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35 Traceback (most recent call last): File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 382, in op_converter item, inputs, compiler_workdir=sg_workdir, **kwargs) File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/decorators.py", line 220, in trace 'neuron-cc failed with the following command line call:\n{}'.format(command)) subprocess.SubprocessError: neuron-cc failed with the following command line call: /local/home/spareek/pytorch_venv/bin/neuron-cc compile /tmp/tmp64l9ygmj/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp64l9ygmj/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35 INFO:Neuron:Number of arithmetic operators (post-compilation) before = 565, compiled = 0, percent compiled = 0.0% INFO:Neuron:The neuron partitioner created 1 sub-graphs INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0% INFO:Neuron:Compiled these operators (and operator counts) to Neuron: INFO:Neuron:Not compiled operators (and operator 
counts) to Neuron: INFO:Neuron: => aten::Int: 97 [supported] INFO:Neuron: => aten::add: 39 [supported] INFO:Neuron: => aten::contiguous: 12 [supported] INFO:Neuron: => aten::div: 12 [supported] INFO:Neuron: => aten::dropout: 38 [supported] INFO:Neuron: => aten::embedding: 3 [not supported] INFO:Neuron: => aten::gelu: 12 [supported] INFO:Neuron: => aten::layer_norm: 25 [supported] INFO:Neuron: => aten::linear: 74 [supported] INFO:Neuron: => aten::matmul: 24 [supported] INFO:Neuron: => aten::mul: 1 [supported] INFO:Neuron: => aten::permute: 48 [supported] INFO:Neuron: => aten::rsub: 1 [supported] INFO:Neuron: => aten::select: 1 [supported] INFO:Neuron: => aten::size: 97 [supported] INFO:Neuron: => aten::slice: 5 [supported] INFO:Neuron: => aten::softmax: 12 [supported] INFO:Neuron: => aten::tanh: 1 [supported] INFO:Neuron: => aten::to: 1 [supported] INFO:Neuron: => aten::transpose: 12 [supported] INFO:Neuron: => aten::unsqueeze: 2 [supported] INFO:Neuron: => aten::view: 48 [supported] Traceback (most recent call last): File "sample_pytorch_model.py", line 38, in <module> model_neuron = torch.neuron.trace(model, example_inputs_paraphrase) File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 184, in trace cu.stats_post_compiler(neuron_graph) File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 493, in stats_post_compiler "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!") RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace! ```
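The operator summary at the end of a log like the one above is hard to scan when it is pasted as a single run of text. As a minimal sketch, the Python below pulls the `=> aten::<op>: <count> [<status>]` entries out of such a log and groups them by the status label that torch-neuron printed. The function name `summarize_ops` and the `sample` string are hypothetical, introduced only for illustration; the regular expression assumes the line format shown in the log and nothing more. Note that a `[supported]` label does not mean the operator was compiled: in the log above every operator is listed under "Not compiled", and the reported compiler failure is the unimplemented `SelectV2` operator.

```python
import re
from collections import defaultdict

# Matches summary entries such as "INFO:Neuron: => aten::embedding: 3 [not supported]".
OP_LINE = re.compile(r"=>\s+(aten::\w+):\s+(\d+)\s+\[(supported|not supported)\]")


def summarize_ops(log_text):
    """Group operator counts from a pasted summary by their reported status label."""
    by_status = defaultdict(dict)
    for name, count, status in OP_LINE.findall(log_text):
        by_status[status][name] = int(count)
    return dict(by_status)


# Hypothetical sample built from three entries that appear in the log above.
sample = (
    "INFO:Neuron: => aten::dropout: 38 [supported] "
    "INFO:Neuron: => aten::embedding: 3 [not supported] "
    "INFO:Neuron: => aten::gelu: 12 [supported]"
)
summary = summarize_ops(sample)
print(summary["not supported"])
```

Because the pattern does not depend on line breaks, it works on logs that were flattened into one line when pasted.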
1
answers
0
votes
59
views
asked 2 months ago

Not able to convert Hugging Face fine-tuned BERT model into AWS Neuron

Hi Team, I have a fine-tuned BERT model that was trained using the following libraries:

torch == 1.8.1+cu111
transformers == 4.19.4

I am not able to convert this fine-tuned BERT model to AWS Neuron and get the compilation errors shown below. Could you please help me resolve this issue?

**Note:** I am trying to compile the BERT model on a SageMaker notebook instance with the "conda_python3" conda environment.

**Installation:**

#### Set Pip repository to point to the Neuron repository

!pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

#### Install Neuron PyTorch - Note: Tried both options below.

"#!pip install torch-neuron==1.8.1.* neuron-cc[tensorflow] "protobuf<4" torchvision sagemaker>=2.79.0 transformers==4.17.0 --upgrade"

!pip install --upgrade torch-neuron neuron-cc[tensorflow] "protobuf<4" torchvision

---------------------------------------------------------------------------------------------------------------------------------------------------

**Model compilation:**

```
import os
import tensorflow # to workaround a protobuf version conflict issue
import torch
import torch.neuron
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = 'model/' # Model artifacts are stored in 'model/' directory

# load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path, torchscript=True)

# create dummy input for max length 128
dummy_input = "dummy input which will be padded later"
max_length = 128
embeddings = tokenizer(dummy_input, max_length=max_length, padding="max_length", truncation=True, return_tensors="pt")
neuron_inputs = tuple(embeddings.values())

# compile model with torch.neuron.trace and update config
model_neuron = torch.neuron.trace(model, neuron_inputs)
model.config.update({"traced_sequence_length": max_length})

# save tokenizer, neuron model and config for later use
save_dir = "tmpd"
os.makedirs(save_dir, exist_ok=True)
model_neuron.save(os.path.join(save_dir, "neuron_model.pt"))
tokenizer.save_pretrained(save_dir)
model.config.save_pretrained(save_dir)
```

---------------------------------------------------------------------------------------------------------------------------------------------------

**Model artifacts:** We got these model artifacts from a multi-label topic classification model.

- config.json
- model.tar.gz
- pytorch_model.bin
- special_tokens_map.json
- tokenizer_config.json
- tokenizer.json

---------------------------------------------------------------------------------------------------------------------------------------------------

**Error logs:**

```
INFO:Neuron:There are 3 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-pytorch.md)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 565, fused = 548, percent fused = 96.99%
INFO:Neuron:Number of neuron graph operations 1601 did not match traced graph 1323 - using heuristic matching of hierarchical information
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/ops/aten.py:2022: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where INFO:Neuron:Compiling function _NeuronGraph$698 with neuron-cc INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmpv4gg13ze/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpv4gg13ze/graph_def.neff --io-config {"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]} --verbose 35' INFO:Neuron:Compile command returned: -9 WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$698; falling back to native python function call ERROR:Neuron:neuron-cc failed with the following command line call: /home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmpv4gg13ze/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpv4gg13ze/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35 Traceback (most recent call last): File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/convert.py", line 382, in op_converter item, inputs, compiler_workdir=sg_workdir, **kwargs) File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/decorators.py", line 220, in trace 'neuron-cc failed with the following command line call:\n{}'.format(command)) subprocess.SubprocessError: neuron-cc failed with the following command line call: /home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmpv4gg13ze/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpv4gg13ze/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35 INFO:Neuron:Number of arithmetic operators (post-compilation) before = 565, 
compiled = 0, percent compiled = 0.0% INFO:Neuron:The neuron partitioner created 1 sub-graphs INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0% INFO:Neuron:Compiled these operators (and operator counts) to Neuron: INFO:Neuron:Not compiled operators (and operator counts) to Neuron: INFO:Neuron: => aten::Int: 97 [supported] INFO:Neuron: => aten::add: 39 [supported] INFO:Neuron: => aten::contiguous: 12 [supported] INFO:Neuron: => aten::div: 12 [supported] INFO:Neuron: => aten::dropout: 38 [supported] INFO:Neuron: => aten::embedding: 3 [not supported] INFO:Neuron: => aten::gelu: 12 [supported] INFO:Neuron: => aten::layer_norm: 25 [supported] INFO:Neuron: => aten::linear: 74 [supported] INFO:Neuron: => aten::matmul: 24 [supported] INFO:Neuron: => aten::mul: 1 [supported] INFO:Neuron: => aten::permute: 48 [supported] INFO:Neuron: => aten::rsub: 1 [supported] INFO:Neuron: => aten::select: 1 [supported] INFO:Neuron: => aten::size: 97 [supported] INFO:Neuron: => aten::slice: 5 [supported] INFO:Neuron: => aten::softmax: 12 [supported] INFO:Neuron: => aten::tanh: 1 [supported] INFO:Neuron: => aten::to: 1 [supported] INFO:Neuron: => aten::transpose: 12 [supported] INFO:Neuron: => aten::unsqueeze: 2 [supported] INFO:Neuron: => aten::view: 48 [supported] --------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-1-97bba321d013> in <module> 18 19 # compile model with torch.neuron.trace and update config ---> 20 model_neuron = torch.neuron.trace(model, neuron_inputs) 21 model.config.update({"traced_sequence_length": max_length}) 22 ~/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/convert.py in trace(func, example_inputs, fallback, op_whitelist, minimum_segment_size, subgraph_builder_function, subgraph_inputs_pruning, skip_compiler, debug_must_trace, allow_no_ops_on_neuron, compiler_workdir, 
dynamic_batch_size, compiler_timeout, _neuron_trace, compiler_args, optimizations, verbose, **kwargs) 182 logger.debug("skip_inference_context - trace with fallback at {}".format(get_file_and_line())) 183 neuron_graph = cu.compile_fused_operators(neuron_graph, **compile_kwargs) --> 184 cu.stats_post_compiler(neuron_graph) 185 186 # Wrap the compiled version of the model in a script module. Note that this is ~/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/convert.py in stats_post_compiler(self, neuron_graph) 491 if succesful_compilations == 0 and not self.allow_no_ops_on_neuron: 492 raise RuntimeError( --> 493 "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!") 494 495 if percent_operations_compiled < 50.0: RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace! ``` --------------------------------------------------------------------------------------------------------------------------------------------------- Thanks a lot.
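The log above reports `Compile command returned: -9`, which reads differently from an ordinary non-zero exit status. Python's `subprocess` module documents a negative return code `-N` as meaning that the child process was terminated by signal `N` on POSIX systems, and signal 9 is `SIGKILL`. The sketch below only decodes such a value; the helper name `describe_returncode` is hypothetical and it assumes a POSIX platform. Why the compiler process was killed is not stated in the log. Memory exhaustion on the notebook instance is one possibility, but that is an assumption and not something the output confirms.

```python
import signal


def describe_returncode(returncode):
    """Explain a child-process return code as Python's subprocess module defines it."""
    if returncode < 0:
        # On POSIX, a negative value -N means the child was terminated by signal N.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status {}".format(returncode)


print(describe_returncode(-9))
print(describe_returncode(1))
```

A positive value, such as the `1` a compiler returns when it reports its own error, is a normal exit status and is passed through unchanged.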
1
answers
0
votes
115
views
asked 5 months ago