I am running into a problem with the AWS Neuron SDK while trying to run the PyTorch example from the AWS Neuron GitHub repository on an inf1 instance launched from the Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 20.04).
I followed the steps outlined in the setup.sh script, but encountered an error when executing the following line of code:
cp -f $(find ./venv -name libtorchneuron.so | grep torch_neuronx) libtorch/lib/
The error message was: "cp: missing destination file operand after 'libtorch/lib/'". It seems that the 'find' pipeline is not returning any results, so 'cp' receives 'libtorch/lib/' as its only argument and has no source file to copy.
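For reference, here is a minimal sketch of how I think that step could be guarded so that it stops with a clearer message when the library is not found. The function name copy_neuron_lib and its two arguments are my own placeholders and are not part of setup.sh:

```shell
# Hypothetical helper (not from setup.sh): copy libtorchneuron.so from a
# virtualenv into a destination directory, failing clearly if it is missing.
copy_neuron_lib() {
    venv_dir="$1"
    dest_dir="$2"

    # Same search as the original line; 'head -n 1' keeps a single match and
    # '|| true' keeps an empty result from aborting under 'set -e'/'pipefail'.
    lib_path="$(find "$venv_dir" -name libtorchneuron.so | grep torch_neuronx | head -n 1 || true)"

    if [ -z "$lib_path" ]; then
        echo "libtorchneuron.so not found under $venv_dir (is torch_neuronx installed?)" >&2
        return 1
    fi

    cp -f "$lib_path" "$dest_dir"
}
```

It would be called as: copy_neuron_lib ./venv libtorch/lib/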
Additionally, I noticed that the 'python bert_neuronx/compile.py' command was killed, potentially due to a lack of system resources.
This issue is preventing me from successfully completing the setup and running the example. I am unsure if this is due to an error in the setup script, an issue with the instance type, or a problem with the installed packages. Any help you could provide would be greatly appreciated.
Please find the full error message below:
./setup.sh: line 92: 27454 Killed python bert_neuronx/compile.py
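To check whether the process was ended by the kernel's out-of-memory killer, I assume something like the sketch below could be used. The helper name find_oom_kills is my own placeholder; it only filters kernel log text passed on stdin for lines that usually accompany an OOM kill:

```shell
# Hypothetical helper: read kernel log text on stdin and print only the lines
# that typically indicate the OOM killer was involved.
find_oom_kills() {
    # '|| true' so that "no matches" is not treated as a failure.
    grep -iE 'out of memory|oom-killer|killed process' || true
}
```

It could be run as: sudo dmesg | find_oom_kills (and 'free -h' shows how much memory the instance has available).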
Thank you for your assistance.
Best regards,