Zeppelin Notebook on EMR - how to do pip install

0

I am trying to install the happybase package in a Zeppelin notebook (or, for that matter, any package). How do I do a pip install from a Zeppelin cell? Neither %pip nor !pip is recognized.

Asked 4 months ago · 198 views
2 Answers
3

Hello,

You can follow the steps below in Zeppelin to install packages at runtime. This method works in client deploy mode.

  1. Use a bootstrap script to give other users (including the zeppelin user) access to /home, so that the Zeppelin service can install packages in the /home/.local directory:
       sudo chmod 757 /home
  2. Add the settings below to the Spark interpreter in the Zeppelin UI, then restart the interpreter:
       spark.pyspark.virtualenv.enabled     true
       spark.pyspark.virtualenv.bin.path    /usr/bin/virtualenv
       spark.pyspark.virtualenv.type        native
       spark.pyspark.python                 python3
  3. Install the package from a notebook paragraph (xgboost is only an example; substitute the package you need):
       %spark.pyspark
       sc.install_pypi_package("xgboost")
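For the happybase package from the question, the install step could be wrapped roughly as follows. This is only a sketch: `install_packages` is a hypothetical helper name, not part of Zeppelin or EMR, and it assumes that `sc` is the SparkContext Zeppelin provides in a %spark.pyspark paragraph and that the virtualenv settings listed above are already active. The only EMR-specific call it relies on is `sc.install_pypi_package`, the same one used in the install step.

```python
# Paste into a %spark.pyspark paragraph, where Zeppelin already defines `sc`.

def install_packages(spark_context, packages):
    """Install each package via install_pypi_package and collect failures.

    Hypothetical helper for illustration; returns {package_name: error_message}
    for every package that could not be installed (empty dict on full success).
    """
    if not hasattr(spark_context, "install_pypi_package"):
        # Assumption: a missing method means this is not the EMR build of PySpark.
        raise RuntimeError(
            "install_pypi_package is not available on this SparkContext"
        )
    failed = {}
    for name in packages:
        try:
            spark_context.install_pypi_package(name)
        except Exception as exc:
            # Keep going so one bad package does not block the others.
            failed[name] = str(exc)
    return failed


# In the notebook paragraph:
# failed = install_packages(sc, ["happybase"])
# print(failed or "all packages installed")
```

If the returned dictionary is not empty, the error messages are usually the quickest hint as to whether the interpreter settings or the /home permissions are the problem.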
AWS
Support Engineer
Answered a month ago
0

Hi,

Have a look at https://medium.com/@techboomph/getting-zeppelin-to-work-with-emr-93e237ac446a

The author proposes a way to do the pip install for a Zeppelin notebook on EMR, which is what you need.

Didier

AWS
Expert
Answered 4 months ago
  • Thanks for sharing, David. So it looks like he is suggesting including the pip install in the bootstrap script, which means the cluster would need to be recreated. I could try that; however, I believe something similar to Jupyter, where you can do a pip install in the notebook itself, should be available in Zeppelin. I see that the %conda interpreter is loaded, but I am unable to make it work: if I type %conda install happybase, it just says command (install) not found.
