2 Answers
Hello,
You can follow the steps below in Zeppelin to install packages at runtime. This method works in client deploy mode.
- Using a bootstrap script, give the other user (zeppelin) access to the home directory with the following command, so that the Zeppelin service can install packages in the /home/.local directory.
sudo chmod 757 /home
- Add the settings below to the Spark interpreter from the Zeppelin UI, then restart the interpreter.
spark.pyspark.virtualenv.enabled true
spark.pyspark.virtualenv.bin.path /usr/bin/virtualenv
spark.pyspark.virtualenv.type native
spark.pyspark.python python3
- Now try installing the packages from a notebook with the command below:
%spark.pyspark
sc.install_pypi_package("xgboost")
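To confirm that the install took effect, a quick check like the one below may help. This is only a minimal sketch and not part of the steps above: `is_installed` is a hypothetical helper name, and it only tells you whether the module can be found by the Python process that runs it (the driver side in a notebook paragraph), not whether the executors have it.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if module_name can be located in the current Python environment.

    Hypothetical helper for illustration; it looks the module up without
    fully importing it.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names (e.g. "pkg.sub") when the parent package is missing.
        return False


# Example: check for the package installed in the step above.
print(is_installed("xgboost"))
```

In Zeppelin this would go in its own paragraph starting with `%spark.pyspark`. If it prints False right after the install, restarting the interpreter and re-running the install step is probably the first thing to try.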
Hi,
Have a look at https://medium.com/@techboomph/getting-zeppelin-to-work-with-emr-93e237ac446a
The author proposes a way to do the pip install you need for a Zeppelin notebook on EMR.
Didier
Thanks for sharing, David. So it looks like he is suggesting including the pip install in the bootstrap script, which means the cluster would need to be recreated. I could try that; however, I believe something similar to Jupyter, where you can do a pip install in the notebook itself, should be available in Zeppelin. I see that the %conda interpreter is loaded, but I am unable to make it work: if I type %conda install happybase ... it just says command ( install ) not found