
Questions tagged with AWS Neuron


AWS PyTorch Neuron Compilation Error

I followed the user guide on updating torch-neuron and then started compiling the model to Neuron, but got an error that I don't understand. The Neuron SDK documentation claims that all operations should compile; unsupported ones are simply supposed to run on CPU. The error:

```
INFO:Neuron:All operators are compiled by neuron-cc (this does not guarantee that neuron-cc will successfully compile)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 3345, fused = 3345, percent fused = 100.0%
INFO:Neuron:Number of neuron graph operations 8175 did not match traced graph 9652 - using heuristic matching of hierarchical information
INFO:Neuron:Compiling function _NeuronGraph$3362 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ubuntu/alias/neuron/neuron_env/bin/neuron-cc compile /tmp/tmpmp8qvhtb/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpmp8qvhtb/graph_def.neff --io-config {"inputs": {"0:0": [[1, 3, 768, 768], "float32"]}, "outputs": ["aten_sigmoid/Sigmoid:0"]} --verbose 35'
..............................................................................
INFO:Neuron:Compile command returned: -9
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$3362; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call: /home/ubuntu/alias/neuron/neuron_env/bin/neuron-cc compile /tmp/tmpmp8qvhtb/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpmp8qvhtb/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 3, 768, 768], "float32"]}, "outputs": ["aten_sigmoid/Sigmoid:0"]}' --verbose 35
Traceback (most recent call last):
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/convert.py", line 382, in op_converter
    item, inputs, compiler_workdir=sg_workdir, **kwargs)
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/decorators.py", line 220, in trace
    'neuron-cc failed with the following command line call:\n{}'.format(command))
subprocess.SubprocessError: neuron-cc failed with the following command line call: /home/ubuntu/alias/neuron/neuron_env/bin/neuron-cc compile /tmp/tmpmp8qvhtb/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpmp8qvhtb/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 3, 768, 768], "float32"]}, "outputs": ["aten_sigmoid/Sigmoid:0"]}' --verbose 35
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 3345, compiled = 0, percent compiled = 0.0%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 942 [supported]
INFO:Neuron: => aten::_convolution: 107 [supported]
INFO:Neuron: => aten::add: 104 [supported]
INFO:Neuron: => aten::batch_norm: 1 [supported]
INFO:Neuron: => aten::cat: 1 [supported]
INFO:Neuron: => aten::contiguous: 4 [supported]
INFO:Neuron: => aten::div: 104 [supported]
INFO:Neuron: => aten::dropout: 208 [supported]
INFO:Neuron: => aten::feature_dropout: 1 [supported]
INFO:Neuron: => aten::flatten: 60 [supported]
INFO:Neuron: => aten::gelu: 52 [supported]
INFO:Neuron: => aten::layer_norm: 161 [supported]
INFO:Neuron: => aten::linear: 264 [supported]
INFO:Neuron: => aten::matmul: 104 [supported]
INFO:Neuron: => aten::mul: 52 [supported]
INFO:Neuron: => aten::permute: 210 [supported]
INFO:Neuron: => aten::relu: 1 [supported]
INFO:Neuron: => aten::reshape: 262 [supported]
INFO:Neuron: => aten::select: 104 [supported]
INFO:Neuron: => aten::sigmoid: 1 [supported]
INFO:Neuron: => aten::size: 278 [supported]
INFO:Neuron: => aten::softmax: 52 [supported]
INFO:Neuron: => aten::transpose: 216 [supported]
INFO:Neuron: => aten::upsample_bilinear2d: 4 [supported]
INFO:Neuron: => aten::view: 52 [supported]
Traceback (most recent call last):
  File "to_neuron.py", line 14, in <module>
    model_neuron = torch.neuron.trace(model, example_inputs=[image.cuda()])
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/convert.py", line 184, in trace
    cu.stats_post_compiler(neuron_graph)
  File "/home/ubuntu/alias/neuron/neuron_env/lib/python3.7/site-packages/torch_neuron/convert.py", line 493, in stats_post_compiler
    "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!")
RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
```
0 answers · 0 votes · 4 views · asked a day ago
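A note on the `Compile command returned: -9` line in logs like the one above: Python's `subprocess` reports a negative return code when the child process was killed by a signal, so -9 means the compiler was terminated by signal 9 (SIGKILL), which on Linux is most often the out-of-memory killer rather than a compilation bug. A minimal sketch of that mapping (the `describe_returncode` helper is illustrative, not part of the Neuron SDK):

```python
import signal
import subprocess
import sys

def describe_returncode(rc: int) -> str:
    """Interpret a subprocess return code the way it appears in compiler logs.

    Negative values mean the child was killed by a signal; -9 is SIGKILL,
    which on Linux frequently indicates the out-of-memory killer.
    """
    if rc < 0:
        name = signal.Signals(-rc).name
        return "killed by signal {} ({})".format(-rc, name)
    return "exited with status {}".format(rc)

# Spawn a child that kills itself with SIGKILL, mimicking an OOM-killed compiler.
proc = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)
print(describe_returncode(proc.returncode))  # killed by signal 9 (SIGKILL)
```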

Is it possible to compile a Neuron model on my local machine?

I'm following some guides, and from my understanding this should be possible, but I've been trying for hours to compile a YOLOv5 model into a Neuron model with no success. Is it even possible to do this on my local machine, or do I have to be on an Inferentia instance? This is what my environment looks like:

```
# packages in environment at /miniconda3/envs/neuron:
#
# Name                    Version                Build    Channel
_libgcc_mutex             0.1                     main
_openmp_mutex             5.1                    1_gnu
absl-py                   1.2.0                 pypi_0    pypi
astor                     0.8.1                 pypi_0    pypi
attrs                     22.1.0                pypi_0    pypi
backcall                  0.2.0            pyhd3eb1b0_0
ca-certificates           2022.07.19        h06a4308_0
cachetools                5.2.0                 pypi_0    pypi
certifi                   2022.9.24             pypi_0    pypi
charset-normalizer        2.1.1                 pypi_0    pypi
cycler                    0.11.0                pypi_0    pypi
debugpy                   1.5.1          py37h295c915_0
decorator                 5.1.1            pyhd3eb1b0_0
dmlc-nnvm                 1.11.1.0+0            pypi_0    pypi
dmlc-topi                 1.11.1.0+0            pypi_0    pypi
dmlc-tvm                  1.11.1.0+0            pypi_0    pypi
entrypoints               0.4            py37h06a4308_0
fonttools                 4.37.3                pypi_0    pypi
gast                      0.2.2                 pypi_0    pypi
google-auth               2.12.0                pypi_0    pypi
google-auth-oauthlib      0.4.6                 pypi_0    pypi
google-pasta              0.2.0                 pypi_0    pypi
gputil                    1.4.0                 pypi_0    pypi
grpcio                    1.49.1                pypi_0    pypi
h5py                      3.7.0                 pypi_0    pypi
idna                      3.4                   pypi_0    pypi
importlib-metadata        4.12.0                pypi_0    pypi
inferentia-hwm            1.11.0.0+0            pypi_0    pypi
iniconfig                 1.1.1                 pypi_0    pypi
ipykernel                 6.15.2         py37h06a4308_0
ipython                   7.34.0                pypi_0    pypi
ipywidgets                8.0.2                 pypi_0    pypi
islpy                     2021.1+aws2021.x.16.0.bld0 pypi_0 pypi
jedi                      0.18.1         py37h06a4308_1
jupyter_client            7.3.5          py37h06a4308_0
jupyter_core              4.10.0         py37h06a4308_0
jupyterlab-widgets        3.0.3                 pypi_0    pypi
keras-applications        1.0.8                 pypi_0    pypi
keras-preprocessing       1.1.2                 pypi_0    pypi
kiwisolver                1.4.4                 pypi_0    pypi
ld_impl_linux-64          2.38              h1181459_1
libffi                    3.3               he6710b0_2
libgcc-ng                 11.2.0            h1234567_1
libgomp                   11.2.0            h1234567_1
libsodium                 1.0.18            h7b6447c_0
libstdcxx-ng              11.2.0            h1234567_1
llvmlite                  0.39.1                pypi_0    pypi
markdown                  3.4.1                 pypi_0    pypi
markupsafe                2.1.1                 pypi_0    pypi
matplotlib                3.5.3                 pypi_0    pypi
matplotlib-inline         0.1.6          py37h06a4308_0
ncurses                   6.3               h5eee18b_3
nest-asyncio              1.5.5          py37h06a4308_0
networkx                  2.4                   pypi_0    pypi
neuron-cc                 1.11.7.0+aec18907e    pypi_0    pypi
numba                     0.56.2                pypi_0    pypi
numpy                     1.19.5                pypi_0    pypi
oauthlib                  3.2.1                 pypi_0    pypi
opencv-python             4.6.0.66              pypi_0    pypi
openssl                   1.1.1q            h7f8727e_0
opt-einsum                3.3.0                 pypi_0    pypi
packaging                 21.3             pyhd3eb1b0_0
pandas                    1.3.5                 pypi_0    pypi
parso                     0.8.3            pyhd3eb1b0_0
pexpect                   4.8.0            pyhd3eb1b0_3
pickleshare               0.7.5         pyhd3eb1b0_1003
pillow                    9.2.0                 pypi_0    pypi
pip                       22.2.2                pypi_0    pypi
pluggy                    1.0.0                 pypi_0    pypi
prompt-toolkit            3.0.31                pypi_0    pypi
protobuf                  3.20.3                pypi_0    pypi
psutil                    5.9.2                 pypi_0    pypi
ptyprocess                0.7.0            pyhd3eb1b0_2
py                        1.11.0                pypi_0    pypi
pyasn1                    0.4.8                 pypi_0    pypi
pyasn1-modules            0.2.8                 pypi_0    pypi
pygments                  2.13.0                pypi_0    pypi
pyparsing                 3.0.9          py37h06a4308_0
pytest                    7.1.3                 pypi_0    pypi
python                    3.7.13            h12debd9_0
python-dateutil           2.8.2            pyhd3eb1b0_0
pytz                      2022.2.1              pypi_0    pypi
pyyaml                    6.0                   pypi_0    pypi
pyzmq                     23.2.0         py37h6a678d5_0
readline                  8.1.2             h7f8727e_1
requests                  2.28.1                pypi_0    pypi
requests-oauthlib         1.3.1                 pypi_0    pypi
rsa                       4.9                   pypi_0    pypi
scipy                     1.4.1                 pypi_0    pypi
seaborn                   0.12.0                pypi_0    pypi
setuptools                59.8.0                pypi_0    pypi
six                       1.16.0           pyhd3eb1b0_1
sqlite                    3.39.3            h5082296_0
tensorboard               1.15.0                pypi_0    pypi
tensorboard-data-server   0.6.1                 pypi_0    pypi
tensorboard-plugin-wit    1.8.1                 pypi_0    pypi
tensorflow                1.15.0                pypi_0    pypi
tensorflow-estimator      1.15.1                pypi_0    pypi
termcolor                 2.0.1                 pypi_0    pypi
thop                      0.1.1-2209072238      pypi_0    pypi
tk                        8.6.12            h1ccaba5_0
tomli                     2.0.1                 pypi_0    pypi
torch                     1.11.0                pypi_0    pypi
torch-neuron              1.11.0.2.3.0.0        pypi_0    pypi
torchvision               0.12.0                pypi_0    pypi
tornado                   6.2            py37h5eee18b_0
tqdm                      4.64.1                pypi_0    pypi
traitlets                 5.4.0                 pypi_0    pypi
typing-extensions         4.3.0                 pypi_0    pypi
urllib3                   1.26.12               pypi_0    pypi
wcwidth                   0.2.5            pyhd3eb1b0_0
werkzeug                  2.2.2                 pypi_0    pypi
wheel                     0.37.1                pypi_0    pypi
widgetsnbextension        4.0.3                 pypi_0    pypi
wrapt                     1.14.1                pypi_0    pypi
xz                        5.2.6             h5eee18b_0
zeromq                    4.3.4             h2531618_0
zipp                      3.8.1                 pypi_0    pypi
zlib                      1.2.12            h5eee18b_3
```
1 answer · 1 vote · 24 views · asked 4 days ago
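For what it's worth, the neuron-cc compiler itself runs on an ordinary x86_64 Linux host; Inferentia hardware is only needed to *execute* the compiled NEFF. A common cause of local compile failures is a mismatch between the installed `torch` and `torch-neuron` versions, since torch-neuron releases are versioned as `<torch version>.<neuron build>` (e.g. torch 1.11.0 pairs with torch-neuron 1.11.0.2.3.0.0, as in the environment listing above). A small, hypothetical pre-flight check (the helper name is illustrative, not part of the SDK):

```python
def versions_aligned(torch_version: str, torch_neuron_version: str) -> bool:
    """Check that a torch-neuron version was built against this torch version.

    torch-neuron versions embed the torch release as their leading components,
    e.g. torch 1.11.0 <-> torch-neuron 1.11.0.2.3.0.0.
    """
    base = torch_version.split("+")[0].split(".")   # drop local tags like +cpu
    return torch_neuron_version.split(".")[:len(base)] == base

# Versions taken from the environment listing in the question:
print(versions_aligned("1.11.0", "1.11.0.2.3.0.0"))   # True
print(versions_aligned("1.10.2", "1.11.0.2.3.0.0"))   # False
```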

Not able to compile the BERT model from the Neuron tutorial to NEFF

Hi Team, I wanted to compile a BERT model and run it on Inferentia. I trained my model using PyTorch and tried to convert it by following the same steps as in this [tutorial](https://github.com/aws-neuron/aws-neuron-sdk/blob/master/src/examples/pytorch/bert_tutorial/tutorial_pretrained_bert.ipynb) on my Amazon Linux machine, but I keep getting a failure with this error:

```
09/22/2022 06:13:56 PM ERROR 23737 [neuron-cc]: Failed to parse model /tmp/tmp64l9ygmj/graph_def.pb: The following operators are not implemented: {'SelectV2'} (NotImplementedError)
```

I followed the installation steps [here](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-intro/pytorch-setup/pytorch-install.html) for pytorch-1.11.0 and tried to execute the code in the tutorial, but got the same error. We wanted to explore using Inferentia for our large BERT model but are blocked due to the failure in conversion to NEFF format. I also tried the same steps using TensorFlow and ran into a different unsupported-ops issue. Could you please help?
Below are the setup commands I ran on my Amazon Linux desktop:

```
sudo yum install -y python3.7-venv gcc-c++
python3.7 -m venv pytorch_venv
source pytorch_venv/bin/activate
pip install -U pip

# Set Pip repository to point to the Neuron repository
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

# Install Neuron PyTorch
pip install torch-neuron neuron-cc[tensorflow] "protobuf<4" torchvision
pip install --upgrade "transformers==4.6.0"
pip install tensorflow==2.8.1
```

I then executed the script below (copied from the tutorial) on my Amazon Linux host:

```
import tensorflow  # to workaround a protobuf version conflict issue
import torch
import torch.neuron
from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
import transformers
import os
import warnings

# Setting up NeuronCore groups for inf1.6xlarge with 16 cores
num_cores = 16  # This value should be 4 on inf1.xlarge and inf1.2xlarge
nc_env = ','.join(['1'] * num_cores)
warnings.warn("NEURONCORE_GROUP_SIZES is being deprecated, if your application is using NEURONCORE_GROUP_SIZES please \
see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/deprecation.html#announcing-end-of-support-for-neuroncore-group-sizes \
for more details.", DeprecationWarning)
os.environ['NEURONCORE_GROUP_SIZES'] = nc_env

# Build tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased-finetuned-mrpc")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased-finetuned-mrpc", return_dict=False)

# Setup some example inputs
sequence_0 = "The company HuggingFace is based in New York City"
sequence_1 = "Apples are especially bad for your health"
sequence_2 = "HuggingFace's headquarters are situated in Manhattan"

max_length = 128
paraphrase = tokenizer.encode_plus(sequence_0, sequence_2, max_length=max_length, padding='max_length', truncation=True, return_tensors="pt")
not_paraphrase = tokenizer.encode_plus(sequence_0, sequence_1, max_length=max_length, padding='max_length', truncation=True, return_tensors="pt")

# Run the original PyTorch model on the compilation example
paraphrase_classification_logits = model(**paraphrase)[0]

# Convert example inputs to a format that is compatible with TorchScript tracing
example_inputs_paraphrase = paraphrase['input_ids'], paraphrase['attention_mask'], paraphrase['token_type_ids']
example_inputs_not_paraphrase = not_paraphrase['input_ids'], not_paraphrase['attention_mask'], not_paraphrase['token_type_ids']

# Run torch.neuron.trace to generate a TorchScript that is optimized by AWS Neuron
model_neuron = torch.neuron.trace(model, example_inputs_paraphrase)
```

This gave me the following error:

```
2022-09-22 18:13:12.145617: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-22 18:13:12.145649: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
sample_pytorch_model.py:14: DeprecationWarning: NEURONCORE_GROUP_SIZES is being deprecated, if your application is using NEURONCORE_GROUP_SIZES please see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/deprecation.html#announcing-end-of-support-for-neuroncore-group-sizes for more details.
  for more details.", DeprecationWarning)
Downloading: 100%|██████████| 433/433 [00:00<00:00, 641kB/s]
Downloading: 100%|██████████| 213k/213k [00:00<00:00, 636kB/s]
Downloading: 100%|██████████| 436k/436k [00:00<00:00, 731kB/s]
Downloading: 100%|██████████| 29.0/29.0 [00:00<00:00, 35.2kB/s]
Downloading: 100%|██████████| 433M/433M [00:09<00:00, 45.7MB/s]
/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/transformers/modeling_utils.py:1968: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
INFO:Neuron:There are 3 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-pytorch.md)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 565, fused = 548, percent fused = 96.99%
INFO:Neuron:Number of neuron graph operations 1601 did not match traced graph 1323 - using heuristic matching of hierarchical information
INFO:Neuron:Compiling function _NeuronGraph$662 with neuron-cc
INFO:Neuron:Compiling with command line: '/local/home/spareek/pytorch_venv/bin/neuron-cc compile /tmp/tmp64l9ygmj/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp64l9ygmj/graph_def.neff --io-config {"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]} --verbose 35'
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
 - Avoid using `tokenizers` before the fork if possible
 - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
.2022-09-22 18:13:52.697717: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-09-22 18:13:52.697749: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
09/22/2022 06:13:56 PM ERROR 23737 [neuron-cc]: Failed to parse model /tmp/tmp64l9ygmj/graph_def.pb: The following operators are not implemented: {'SelectV2'} (NotImplementedError)
Compiler status ERROR
INFO:Neuron:Compile command returned: 1
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$662; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call: /local/home/spareek/pytorch_venv/bin/neuron-cc compile /tmp/tmp64l9ygmj/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp64l9ygmj/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35
Traceback (most recent call last):
  File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 382, in op_converter
    item, inputs, compiler_workdir=sg_workdir, **kwargs)
  File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/decorators.py", line 220, in trace
    'neuron-cc failed with the following command line call:\n{}'.format(command))
subprocess.SubprocessError: neuron-cc failed with the following command line call: /local/home/spareek/pytorch_venv/bin/neuron-cc compile /tmp/tmp64l9ygmj/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp64l9ygmj/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 565, compiled = 0, percent compiled = 0.0%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 97 [supported]
INFO:Neuron: => aten::add: 39 [supported]
INFO:Neuron: => aten::contiguous: 12 [supported]
INFO:Neuron: => aten::div: 12 [supported]
INFO:Neuron: => aten::dropout: 38 [supported]
INFO:Neuron: => aten::embedding: 3 [not supported]
INFO:Neuron: => aten::gelu: 12 [supported]
INFO:Neuron: => aten::layer_norm: 25 [supported]
INFO:Neuron: => aten::linear: 74 [supported]
INFO:Neuron: => aten::matmul: 24 [supported]
INFO:Neuron: => aten::mul: 1 [supported]
INFO:Neuron: => aten::permute: 48 [supported]
INFO:Neuron: => aten::rsub: 1 [supported]
INFO:Neuron: => aten::select: 1 [supported]
INFO:Neuron: => aten::size: 97 [supported]
INFO:Neuron: => aten::slice: 5 [supported]
INFO:Neuron: => aten::softmax: 12 [supported]
INFO:Neuron: => aten::tanh: 1 [supported]
INFO:Neuron: => aten::to: 1 [supported]
INFO:Neuron: => aten::transpose: 12 [supported]
INFO:Neuron: => aten::unsqueeze: 2 [supported]
INFO:Neuron: => aten::view: 48 [supported]
Traceback (most recent call last):
  File "sample_pytorch_model.py", line 38, in <module>
    model_neuron = torch.neuron.trace(model, example_inputs_paraphrase)
  File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 184, in trace
    cu.stats_post_compiler(neuron_graph)
  File "/local/home/spareek/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 493, in stats_post_compiler
    "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!")
RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
```
1 answer · 0 votes · 22 views · asked 12 days ago
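The decisive line in a log like the one above is the neuron-cc parse failure naming the unsupported TensorFlow operator (`SelectV2` here); everything after it is fallout from that single failure. When triaging long traces, it can help to extract just those operator names programmatically. A small sketch (the `unsupported_ops` helper is hypothetical, not an SDK function) that pulls the set out of such a message:

```python
import re

def unsupported_ops(log_line):
    """Extract operator names from a neuron-cc 'not implemented' message.

    Matches the brace-enclosed set in lines like:
    "The following operators are not implemented: {'SelectV2'} (NotImplementedError)"
    """
    m = re.search(r"operators are not implemented: \{(.*?)\}", log_line)
    if not m:
        return set()
    return {name.strip(" '\"") for name in m.group(1).split(",")}

# The exact error line from the question:
line = ("09/22/2022 06:13:56 PM ERROR 23737 [neuron-cc]: Failed to parse model "
        "/tmp/tmp64l9ygmj/graph_def.pb: The following operators are not "
        "implemented: {'SelectV2'} (NotImplementedError)")
print(unsupported_ops(line))  # {'SelectV2'}
```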

Not able to convert a Hugging Face fine-tuned BERT model into AWS Neuron

Hi Team, I have a fine-tuned BERT model which was trained using the following libraries:

- torch == 1.8.1+cu111
- transformers == 4.19.4

I am not able to convert that fine-tuned BERT model into AWS Neuron and am getting the following compilation errors. Could you please help me to resolve this issue?

**Note:** I am trying to compile the BERT model on a SageMaker notebook instance with the "conda_python3" conda environment.

**Installation:**

```
# Set Pip repository to point to the Neuron repository
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com

# Install Neuron PyTorch - Note: I tried both options below.
# pip install torch-neuron==1.8.1.* neuron-cc[tensorflow] "protobuf<4" torchvision sagemaker>=2.79.0 transformers==4.17.0 --upgrade
pip install --upgrade torch-neuron neuron-cc[tensorflow] "protobuf<4" torchvision
```

**Model compilation:**

```
import os
import tensorflow  # to workaround a protobuf version conflict issue
import torch
import torch.neuron
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = 'model/'  # Model artifacts are stored in 'model/' directory

# load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path, torchscript=True)

# create dummy input for max length 128
dummy_input = "dummy input which will be padded later"
max_length = 128
embeddings = tokenizer(dummy_input, max_length=max_length, padding="max_length", truncation=True, return_tensors="pt")
neuron_inputs = tuple(embeddings.values())

# compile model with torch.neuron.trace and update config
model_neuron = torch.neuron.trace(model, neuron_inputs)
model.config.update({"traced_sequence_length": max_length})

# save tokenizer, neuron model and config for later use
save_dir = "tmpd"
os.makedirs("tmpd", exist_ok=True)
model_neuron.save(os.path.join(save_dir, "neuron_model.pt"))
tokenizer.save_pretrained(save_dir)
model.config.save_pretrained(save_dir)
```

**Model artifacts:** We got these model artifacts from a multi-label topic classification model: config.json, model.tar.gz, pytorch_model.bin, special_tokens_map.json, tokenizer_config.json, tokenizer.json

**Error logs:**

```
INFO:Neuron:There are 3 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-pytorch.md)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 565, fused = 548, percent fused = 96.99%
INFO:Neuron:Number of neuron graph operations 1601 did not match traced graph 1323 - using heuristic matching of hierarchical information
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/ops/aten.py:2022: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:Neuron:Compiling function _NeuronGraph$698 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmpv4gg13ze/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpv4gg13ze/graph_def.neff --io-config {"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]} --verbose 35'
INFO:Neuron:Compile command returned: -9
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$698; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call: /home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmpv4gg13ze/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpv4gg13ze/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/convert.py", line 382, in op_converter
    item, inputs, compiler_workdir=sg_workdir, **kwargs)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/decorators.py", line 220, in trace
    'neuron-cc failed with the following command line call:\n{}'.format(command))
subprocess.SubprocessError: neuron-cc failed with the following command line call: /home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmpv4gg13ze/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpv4gg13ze/graph_def.neff --io-config '{"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]}' --verbose 35
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 565, compiled = 0, percent compiled = 0.0%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 97 [supported]
INFO:Neuron: => aten::add: 39 [supported]
INFO:Neuron: => aten::contiguous: 12 [supported]
INFO:Neuron: => aten::div: 12 [supported]
INFO:Neuron: => aten::dropout: 38 [supported]
INFO:Neuron: => aten::embedding: 3 [not supported]
INFO:Neuron: => aten::gelu: 12 [supported]
INFO:Neuron: => aten::layer_norm: 25 [supported]
INFO:Neuron: => aten::linear: 74 [supported]
INFO:Neuron: => aten::matmul: 24 [supported]
INFO:Neuron: => aten::mul: 1 [supported]
INFO:Neuron: => aten::permute: 48 [supported]
INFO:Neuron: => aten::rsub: 1 [supported]
INFO:Neuron: => aten::select: 1 [supported]
INFO:Neuron: => aten::size: 97 [supported]
INFO:Neuron: => aten::slice: 5 [supported]
INFO:Neuron: => aten::softmax: 12 [supported]
INFO:Neuron: => aten::tanh: 1 [supported]
INFO:Neuron: => aten::to: 1 [supported]
INFO:Neuron: => aten::transpose: 12 [supported]
INFO:Neuron: => aten::unsqueeze: 2 [supported]
INFO:Neuron: => aten::view: 48 [supported]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-97bba321d013> in <module>
     18
     19 # compile model with torch.neuron.trace and update config
---> 20 model_neuron = torch.neuron.trace(model, neuron_inputs)
     21 model.config.update({"traced_sequence_length": max_length})
     22
~/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/convert.py in trace(func, example_inputs, fallback, op_whitelist, minimum_segment_size, subgraph_builder_function, subgraph_inputs_pruning, skip_compiler, debug_must_trace, allow_no_ops_on_neuron, compiler_workdir, dynamic_batch_size, compiler_timeout, _neuron_trace, compiler_args, optimizations, verbose, **kwargs)
    182             logger.debug("skip_inference_context - trace with fallback at {}".format(get_file_and_line()))
    183             neuron_graph = cu.compile_fused_operators(neuron_graph, **compile_kwargs)
--> 184         cu.stats_post_compiler(neuron_graph)
    185
    186         # Wrap the compiled version of the model in a script module. Note that this is
~/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/convert.py in stats_post_compiler(self, neuron_graph)
    491         if succesful_compilations == 0 and not self.allow_no_ops_on_neuron:
    492             raise RuntimeError(
--> 493                 "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!")
    494
    495         if percent_operations_compiled < 50.0:
RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
```

Thanks a lot.
1 answer · 0 votes · 89 views · asked 3 months ago
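One thing worth double-checking in scripts like the one in this question: `tuple(embeddings.values())` passes the tensors positionally in the tokenizer's insertion order, which for a BERT tokenizer is typically (`input_ids`, `token_type_ids`, `attention_mask`), while the model's `forward` usually expects (`input_ids`, `attention_mask`, `token_type_ids`). A plain-dict sketch of the ordering behaviour (the keys and string values merely mimic a BERT tokenizer's output; this is an illustration, not the SDK):

```python
# A BERT tokenizer typically returns its tensors in this insertion order:
embeddings = {
    "input_ids": "ids-tensor",
    "token_type_ids": "segments-tensor",
    "attention_mask": "mask-tensor",
}

# tuple(values()) preserves insertion order, so positional arguments land as
# (input_ids, token_type_ids, attention_mask) - not the
# (input_ids, attention_mask, token_type_ids) order most forward() signatures use.
positional = tuple(embeddings.values())
print(positional)

# Safer: pick the tensors explicitly in the order the model expects.
ordered = (embeddings["input_ids"], embeddings["attention_mask"],
           embeddings["token_type_ids"])
print(ordered)
```

Passing the tensors in the wrong positional slots does not always raise an error immediately; it can simply trace the model with a mask where the segment IDs should be, so making the order explicit is the more defensive choice.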