Zeppelin Notebook on EMR - how to do pip install

0

I am trying to install the happybase package (or, for that matter, any package) in a Zeppelin notebook. How do I run pip install from a Zeppelin cell? Neither %pip nor !pip is recognized.

asked 4 months ago, 177 views
2 Answers
3

Hello,

You can follow the steps below to install packages at runtime from Zeppelin. This method works in client deploy mode.

  1. Using a bootstrap script, give other users (here, the zeppelin user) access to /home with the following command, so that the Zeppelin service can install packages under the /home/.local directory.
sudo chmod 757 /home
  2. Add the settings below to the spark interpreter from the Zeppelin UI, then restart the interpreter.
spark.pyspark.virtualenv.enabled        true
spark.pyspark.virtualenv.bin.path       /usr/bin/virtualenv
spark.pyspark.virtualenv.type           native
spark.pyspark.python                    python3
  3. Now install the packages from the notebook with the command below.
%spark.pyspark
sc.install_pypi_package("xgboost")
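For the happybase package from the question, the same call could be wrapped in a small helper. This is only a sketch: install_packages is a hypothetical name, and it assumes the interpreter settings above are in place and that sc is the SparkContext Zeppelin provides in a %spark.pyspark paragraph.

```python
def install_packages(sc, packages):
    """Request installation of each PyPI package through the SparkContext.

    `sc` is assumed to be the SparkContext that Zeppelin injects into
    %spark.pyspark paragraphs on EMR. `install_packages` itself is a
    hypothetical helper, not part of any library.
    Returns the list of package specs that were requested.
    """
    requested = []
    for spec in packages:
        # Same call as in the last step of the list; it relies on the
        # virtualenv settings having been added to the spark interpreter.
        sc.install_pypi_package(spec)
        requested.append(spec)
    return requested


# Usage inside a Zeppelin paragraph that starts with %spark.pyspark:
#   install_packages(sc, ["happybase"])
#   import happybase  # should succeed once the install has finished
```

Passing a list keeps one paragraph responsible for all runtime dependencies of the note, which makes it easier to rerun after an interpreter restart.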
AWS
SUPPORT ENGINEER
answered 20 days ago
0

Hi,

Have a look at https://medium.com/@techboomph/getting-zeppelin-to-work-with-emr-93e237ac446a

The author proposes a way to do the pip install you need for a Zeppelin notebook on EMR.

Didier

AWS
EXPERT
answered 4 months ago
  • Thanks for sharing, David. It looks like he suggests including the pip install in the bootstrap script, which means the cluster would need to be recreated. I could try that, but I believe something similar to Jupyter, where you can do a pip install in the notebook itself, should be available in Zeppelin. I see that the %conda interpreter is loaded, but I am unable to make it work: if I type %conda install happybase, it just says command (install) not found.
