Hi, in order to see more information about the error, you can enable debug output during tracing by passing the 'verbose' argument to the trace call, like this:
import torch
import torch.neuron

torch.neuron.trace(
    model,
    example_inputs=inp,
    verbose="debug",
    compiler_workdir="logs",  # dir where the debugging logs will be saved
)
You'll see the error messages in the console and they will also be saved to the "logs" dir.
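If you want to inspect those saved logs programmatically, a minimal sketch like the one below works. It only assumes the compiler_workdir="logs" value from the snippet above; the exact file layout neuron-cc produces inside that dir may vary between SDK versions.

from pathlib import Path

# List everything the compiler wrote under the work dir, newest first,
# so the most recent compilation attempt is easy to spot.
files = [p for p in Path("logs").rglob("*") if p.is_file()]
for p in sorted(files, key=lambda p: p.stat().st_mtime, reverse=True):
    print(p)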
It is always good to run the Neuron SDK analyzer first to make sure the model is: 1/ torch.jit traceable; 2/ supported by the compiler:
import torch
import torch.neuron
torch.neuron.analyze_model(model, example_inputs=inp)
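If analyze_model reports problems, it can help to check point 1/ in isolation: a plain torch.jit.trace call (no Neuron involved) tells you whether the model is traceable at all. A minimal sketch, assuming model and inp are the same objects as above:

import torch

# Sanity check for 1/: if this raises, the issue is with TorchScript
# tracing itself, not with the Neuron compiler.
traced = torch.jit.trace(model, inp)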
You can also see a sample that shows how to compile a U-Net PyTorch model (3rd-party implementation) for Inf1 instances here: https://github.com/samir-souza/laboratory/blob/master/05_Inferentia/03_UnetPytorch/03_UnetPytorch.ipynb
If everything else fails, look for lines like these in the logs:
INFO:Neuron:Compile command returned: -11
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$647; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call:
Then please paste those lines here. With the "Compile command returned:" code it is possible to identify the error. You suspect the issue is memory-related, maybe out of memory; normally when that is the case, you'll find the code -9 in this part of the error.
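For context, the value after "Compile command returned:" looks like a standard subprocess return code, where a negative number means the child process was killed by that signal (so -9 is SIGKILL, typically sent by the Linux OOM killer, and -11 is SIGSEGV). If that assumption holds, the codes can be decoded with the standard signal module:

import signal

def describe_return_code(code: int) -> str:
    """Decode a 'Compile command returned:' value from the Neuron logs."""
    if code >= 0:
        return f"exit status {code}"
    # Negative subprocess return codes are the signal number, negated
    return f"killed by {signal.Signals(-code).name}"

print(describe_return_code(-9))   # killed by SIGKILL (often the OOM killer)
print(describe_return_code(-11))  # killed by SIGSEGV (the compiler crashed)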
Following your answer we were able to check the log and got:
INFO:Neuron:Compile command returned: -9
which is apparently an out-of-memory error. Switching to a 6x instance solved the problem.
Thanks for the detailed answer, will try