I have a Python package saved in CodeCommit and I need it to run in the notebook linked to an EMR cluster.

I have a Python package stored in CodeCommit and need to use it in the notebook attached to my EMR cluster workspace. The package already installs successfully through a bootstrap action: in my .sh file, I configure git to access CodeCommit and then run pip install git+https://my_package.foo. This part works fine.

However, in the PySpark notebook already attached to the cluster, when I install it with sc.install_pypi_package("my_package"), the package is found and the installation completes. It even appears in the output of sc.list_packages(). But when I try to import it, I get this error:

An error was encountered:
No module named 'notebook.notebookapp'
Traceback (most recent call last):
File "/mnt2/yarn/usercache/livy/appcache/application_1710938070423_0006/container_1710938070423_0006_01_000001/tmp/spark-436a127f-1c84-491c-ad55-d60e244939d5/lib/python3.9/site-packages/my_package/__init__.py", line 31, in <module>
from notebook.notebookapp import NotebookApp
ModuleNotFoundError: No module named 'notebook.notebookapp'
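
To narrow this down, here is a minimal diagnostic sketch that could be run in a notebook cell. It only reports whether notebook.notebookapp is importable and which version of the notebook distribution is installed in the current environment. The helper name describe_module is hypothetical and not part of my package. My unconfirmed assumption is that the notebook-scoped environment pulled in notebook 7 or later, which as far as I know no longer provides the notebook.notebookapp module, while the bootstrap environment has an older release.

```python
import importlib.util
from importlib import metadata


def describe_module(module_name: str, dist_name: str) -> dict:
    """Report whether module_name can be located and which version of dist_name is installed.

    describe_module is a hypothetical helper written for this question.
    """
    try:
        # find_spec imports the parent package first, so this can raise
        # ImportError (or its subclass ModuleNotFoundError) if the parent
        # package is missing or fails to import.
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        spec = None

    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        version = None

    return {
        "module": module_name,
        "importable": spec is not None,
        "dist_version": version,
    }


# The module my_package tries to import, and the distribution that should provide it.
print(describe_module("notebook.notebookapp", "notebook"))
```

If this reports importable as False together with a notebook version of 7 or higher, pinning an older release in the notebook session (for example passing "notebook==6.5.6" to sc.install_pypi_package, where the exact version is only an illustration) might be worth trying before my_package is installed.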

Any help is welcome, including other installation methods.
