SageMaker Studio PyTorch 1.8 kernel has no PyTorch, NumPy, or Matplotlib module


I'm working with SageMaker Studio with the following options:

  • kernel: PyTorch 1.8 Python 3.6 GPU optimized.
  • instance: ml.g4dn.xlarge

When I run import torch, numpy, matplotlib, or PIL, I get the No module named 'X' error. Even if I run pip install in a cell above, the module still cannot be imported. Is this a problem only I am encountering with the new PyTorch 1.8 kernel? It also happens with the CPU-optimized version. However, the PyTorch 1.6 kernel does not throw an error.

When I run conda list, the output does not include any of the modules mentioned above.

  • I'm not sure why they are not installed.

Asked 2 years ago, 1061 views
2 Answers

I followed your description to reproduce the No module named 'X' error. After launching the instance, I could import numpy, matplotlib, and PIL, but not torch, because torch is not initially installed.

I followed these steps to install and import torch:

  • Shut down your instance completely.
  • Attach the kernel to your notebook again.
  • Launch your ml.g4dn.xlarge instance again.
  • pip3 install torch
  • import torch
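
A common reason an install in one cell is not visible to a later import is that the pip being called belongs to a different interpreter than the one running the kernel. Whether that is what happens in this image is an assumption, not something confirmed in this thread. A minimal sketch that sidesteps the question by invoking pip through the kernel's own interpreter (the helper names pip_install_command and install are hypothetical):

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel."""
    # `python -m pip` uses the pip of that exact interpreter, not whichever
    # `pip` or `pip3` happens to come first on PATH.
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run the install; raises CalledProcessError if pip exits non-zero."""
    subprocess.check_call(pip_install_command(package))


# Example (commented out so nothing is installed by accident):
# install("torch")
```

After the install finishes, restarting the kernel before importing is usually the safest way to pick up the new package.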
AWS
Zmnako
Answered 2 years ago
  • Thanks for your answer. Why is it not installed by default, given that this is the PyTorch image?

  • Thanks for raising this @SiouxDMA - I was also able to reproduce and agree it looks unexpected, will follow up internally to see what's going on and what can be done about it


No module named 'X' is not the expected behavior. Thanks for reporting this issue, and sorry for the inconvenience it caused.

Is your use case flexible enough to use a PyTorch version earlier than 1.8? If so, please try another version. If PT1.8 is the only choice, please try the following workaround to unblock yourself (while the service team is working on the fix).

Two options:
1 - Execute the following in a notebook cell:

# Switch the Python executable to /usr/local/bin/python in the kernel.json file.
!sed 's|^ *"python",|  "/usr/local/bin/python",|g' /usr/local/share/jupyter/kernels/python3/kernel.json>/tmp/kernel.json; cp -f /tmp/kernel.json /usr/local/share/jupyter/kernels/python3/kernel.json;

2 - Directly execute the following shell command in the kernel image-specific terminal (not the global terminal):

sed 's|^ *"python",|  "/usr/local/bin/python",|g' /usr/local/share/jupyter/kernels/python3/kernel.json>/tmp/kernel.json; cp -f /tmp/kernel.json /usr/local/share/jupyter/kernels/python3/kernel.json;
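
For readers who would rather not edit JSON with a regular expression, here is a rough Python sketch of the same change. It assumes kernel.json follows the standard Jupyter kernel spec layout, with an "argv" list whose first entry is the launcher. Unlike the sed pattern, it only touches that first entry. The function name point_kernel_at is hypothetical.

```python
import json


def point_kernel_at(kernel_json_path, python_path="/usr/local/bin/python"):
    """Replace a bare "python" launcher in a kernel spec with python_path.

    Returns True if the file was changed, False if there was nothing to do.
    """
    with open(kernel_json_path) as f:
        spec = json.load(f)

    argv = spec.get("argv", [])
    # Only rewrite the spec when the launcher is the bare name "python",
    # which is the entry the sed pattern is aimed at.
    if not argv or argv[0] != "python":
        return False

    argv[0] = python_path
    with open(kernel_json_path, "w") as f:
        json.dump(spec, f, indent=1)
    return True


# Example, using the path from this answer:
# point_kernel_at("/usr/local/share/jupyter/kernels/python3/kernel.json")
```

Running it a second time is harmless: once the launcher is no longer the bare "python", the function returns False and leaves the file alone.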

The sed command above is only needed once per kernel gateway app. After running it, please restart the kernel. You can verify the change by running the following in a notebook cell; '/usr/local/bin/python' should be shown as the Python executable.

import sys
print(sys.executable)
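
To go one step further than the interpreter path, a small sketch like the following can report which of the modules from the question the current interpreter can actually locate. The helper name missing_modules is hypothetical, and the module list is simply the one mentioned in the question.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names that the current interpreter cannot locate."""
    # find_spec returns None for a top-level module that is not installed;
    # it does not import the module itself.
    return [n for n in names if importlib.util.find_spec(n) is None]


print("interpreter:", sys.executable)
print("missing:", missing_modules(["torch", "numpy", "matplotlib", "PIL"]))
```

An empty "missing" list after the kernel restart would suggest the workaround took effect.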
AWS
Answered 2 years ago
