SageMaker Studio PyTorch 1.8 kernel has no PyTorch, NumPy, or Matplotlib module


I'm working in SageMaker Studio with the following options:

  • kernel: PyTorch 1.8 Python 3.6 GPU optimized.
  • instance: ml.g4dn.xlarge

When I try to import torch, numpy, matplotlib, or PIL, I get a No module named 'X' error. Even if I run pip install in a cell above, the module still cannot be imported. Is this a problem only I am encountering with the new PyTorch 1.8 kernel? It also happens with the CPU-optimized version. The PyTorch 1.6 kernel, however, does not throw an error.

When I run conda list, the output does not include any of the modules mentioned above.
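For reference, a minimal sketch of a check that shows which interpreter the kernel is running and which of these modules it can locate. The helper name find_missing is only illustrative, and the module list is the one from this question.

```python
import importlib.util
import sys

# Modules named in the question; adjust the list as needed.
MODULES = ["torch", "numpy", "matplotlib", "PIL"]


def find_missing(modules):
    """Return the names in `modules` that the current interpreter cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# The interpreter path matters: packages installed for a different Python
# will not be visible to this one.
print("interpreter:", sys.executable)
print("missing:", find_missing(MODULES))
```

Running this in a notebook cell and again in a terminal may show two different interpreters, which would explain why conda list and the kernel disagree.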

  • Not really sure why they are not installed.

asked 2 years ago, 1035 views
2 Answers

I followed your description to reproduce the No module named 'X' error. After the instance launched, I could import numpy, matplotlib, and PIL, but not torch, because torch is not installed initially.

I followed these steps to install and import torch:

  • Shut down your instance completely.
  • Attach the kernel to your notebook again.
  • Launch your ml.g4dn.xlarge instance again.
  • pip3 install torch
  • import torch
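
If pip3 on the PATH belongs to a different Python than the one the kernel runs, the install can succeed while the import still fails. One way to rule that out, sketched here under the assumption that pip is available for the kernel's interpreter, is to invoke pip through sys.executable. The helper names pip_install_command and install are hypothetical and not part of any SageMaker API.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command that targets the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run pip for the current interpreter; raises CalledProcessError on failure."""
    subprocess.check_call(pip_install_command(package))


# Example (downloads the package, so it is left commented out here):
# install("torch")
print(pip_install_command("torch"))
```

After installing this way, restart the kernel before running import torch again.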
AWS
Zmnako
answered 2 years ago
  • Thanks for your answer. Why is torch not installed initially, given that this is the PyTorch image?

  • Thanks for raising this, @SiouxDMA. I was also able to reproduce it and agree it looks unexpected. I will follow up internally to see what's going on and what can be done about it.


No module named 'X' is not the expected behavior. Thanks for reporting this issue, and sorry for the inconvenience it caused.

Is your use case flexible enough to use a PyTorch version prior to 1.8? If so, please try one of those versions. If PyTorch 1.8 is the only option, please try the following workaround to unblock yourself (while the service team works on a fix).

Two options:
1 - Execute the following in a notebook cell:

# Switch the Python executable to /usr/local/bin/python in the kernel.json file.
# Using && means kernel.json is only overwritten if sed succeeded.
!sed 's|^ *"python",|  "/usr/local/bin/python",|g' /usr/local/share/jupyter/kernels/python3/kernel.json > /tmp/kernel.json && cp -f /tmp/kernel.json /usr/local/share/jupyter/kernels/python3/kernel.json

2 - Execute the following shell command directly in the kernel image's terminal (not the global terminal):

sed 's|^ *"python",|  "/usr/local/bin/python",|g' /usr/local/share/jupyter/kernels/python3/kernel.json > /tmp/kernel.json && cp -f /tmp/kernel.json /usr/local/share/jupyter/kernels/python3/kernel.json

The command above only needs to be run once per kernel gateway app. Afterwards, please restart the kernel. You can verify the change by running the following in a notebook cell; /usr/local/bin/python should be shown as the Python executable.

import sys
print(sys.executable)
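
As an additional cross-check, one could read kernel.json directly and compare its first argv entry with the running interpreter. This is only a sketch: it uses the path from the sed command above and the standard Jupyter kernel spec layout, where the launch command is stored under the "argv" key. The helper name kernel_command is illustrative.

```python
import json
import sys
from pathlib import Path

# Path taken from the sed command above.
KERNEL_JSON = Path("/usr/local/share/jupyter/kernels/python3/kernel.json")


def kernel_command(path):
    """Return the argv list that Jupyter uses to start the kernel."""
    return json.loads(Path(path).read_text())["argv"]


# Only inspect the file if it exists in this image.
if KERNEL_JSON.exists():
    print("kernel.json argv[0]:", kernel_command(KERNEL_JSON)[0])
print("running interpreter:", sys.executable)
```

If argv[0] already shows /usr/local/bin/python but sys.executable does not, the kernel most likely has not been restarted yet.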
AWS
answered 2 years ago
