aws-glue-libs:glue_libs_3.0.0_image_01 image issue
I am running into problems with the aws-glue-libs:glue_libs_3.0.0_image_01 image. I start the container with:
docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v C:/Docker/jupyter_workspace:/home/glue_user/workspace/jupyter_workspace/ --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh
The container starts locally, but when I try to read a CSV file stored locally, I get this error:
An error was encountered:
Path does not exist: file:/home/glue_user/workspace/employees.csv
Traceback (most recent call last):
  File "/home/glue_user/spark/python/pyspark/sql/readwriter.py", line 737, in csv
    return self._df(self._jreader.csv(self._spark._sc._jvm.PythonUtils.toSeq(path)))
  File "/home/glue_user/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/glue_user/spark/python/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.AnalysisException: Path does not exist: file:/home/glue_user/workspace/employees.csv
Alternatively, when I mount the directory directly onto /home/glue_user/workspace and start the container with:
docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v C:/Docker/jupyter_workspace:/home/glue_user/workspace --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh
the container does not start, and I get the following error:
Bad config encountered during initialization: No such directory: ''/home/glue_user/workspace/jupyter_workspace''
Hello,
Assuming your file employees.csv is present in your local path C:/Docker/jupyter_workspace, I understand that you expect it to be mounted at /home/glue_user/workspace/jupyter_workspace/ inside the Docker container by the command below.
docker run -it -p 8888:8888 -p 4040:4040 -e DISABLE_SSL="true" -v C:/Docker/jupyter_workspace:/home/glue_user/workspace/jupyter_workspace/ --name glue_jupyter amazon/aws-glue-libs:glue_libs_3.0.0_image_01 /home/glue_user/jupyter/jupyter_start.sh
However, when you then read the file with a relative path, for example:
df = spark.read.csv("employees.csv")
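the error message shows that Spark resolves the path against /home/glue_user/workspace/, so it looks for /home/glue_user/workspace/employees.csv, which is not where the file was mounted.
So, can you try using the full path of the file, or a path relative to that directory such as the one below?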
df = spark.read.csv("jupyter_workspace/employees.csv")
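If it helps, here is a rough sketch of the path handling described above, in plain Python so it can be checked without a Spark session. The helper names resolve_container_path and to_container_path are hypothetical (they are not part of Glue or PySpark), and the working directory /home/glue_user/workspace is an assumption taken from the error message in the question.

```python
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

# Assumption: the notebook's working directory inside the container,
# inferred from the "Path does not exist" message in the question.
CONTAINER_CWD = "/home/glue_user/workspace"


def resolve_container_path(path, cwd=CONTAINER_CWD):
    """Return the file:// URI that a (possibly relative) path points to
    when it is resolved against the container's working directory."""
    absolute = posixpath.normpath(posixpath.join(cwd, path))
    return "file://" + absolute


def to_container_path(host_path, host_root, container_root):
    """Translate a Windows host path under host_root into the matching
    path under container_root (the two sides of the -v mount)."""
    relative = PureWindowsPath(host_path).relative_to(PureWindowsPath(host_root))
    return str(PurePosixPath(container_root).joinpath(*relative.parts))


# Where the relative path from the question ends up (the file is not there):
print(resolve_container_path("employees.csv"))

# Where the file actually lives, given the -v mount from the first command:
print(to_container_path(
    "C:/Docker/jupyter_workspace/employees.csv",
    "C:/Docker/jupyter_workspace",
    "/home/glue_user/workspace/jupyter_workspace/",
))

# Inside the notebook, the result could then be passed to Spark, e.g.:
# df = spark.read.csv(resolve_container_path("jupyter_workspace/employees.csv"))
```

The point is only that Spark inside the container never sees the Windows path; whatever is passed to spark.read.csv has to be a path that exists inside the container.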
Hello Chiranjeevi, thanks for the reply. Yes, I resolved it the same way you described. It was my mistake: even after mounting my directory into the container's working directory, I was passing the Windows path rather than the path inside the container.