AWS Docker container for Glue and Databricks JDBC connection


Hello, we are using the AWS Docker container for Glue (available here) and are trying to connect to Databricks over a JDBC connection using DatabricksJDBC42.jar (available here). We placed the jar file in the same folder as the Jupyter notebook and also in the C:/.aws/ folder. When we try to connect, we get the error "java.lang.ClassNotFoundException: com.databricks.client.jdbc.Driver".

We have used the DB2 driver without issue using the same setup. Also, when we upload the jar to AWS and attach it to the Glue job via the --extra-jars parameter, it works fine.

Has anyone gotten this to successfully work?

Asked a year ago · 694 views
3 Answers

Hello,

I understand that you are receiving the following error while trying to connect to your Databricks cluster when following the blog post “Develop and test AWS Glue version 3.0 and 4.0 jobs locally using a Docker container”:

java.lang.ClassNotFoundException: com.databricks.client.jdbc.Driver

Since you are using the updated DatabricksJDBC42.jar driver, please ensure that the JDBC URL follows the naming conventions for DatabricksJDBC42.jar and not those of the legacy SparkJDBC42.jar.

Refer to: https://docs.databricks.com/integrations/jdbc-odbc-bi.html#building-the-connection-url-for-the-databricks-driver

Modified parameters:

  • Use the jdbc:databricks:// URL prefix
  • Include the HttpPath of your cluster or SQL warehouse in the URL
  • Supply the driver class name as 'com.databricks.client.jdbc.Driver' (a sketch follows below)
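
For reference, a minimal PySpark sketch along those lines might look like the following. The workspace host, HttpPath, token, and table name are placeholders rather than values from this thread, and it assumes DatabricksJDBC42.jar is already on the Spark classpath:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# JDBC URL built per the DatabricksJDBC42 conventions (placeholder host and HttpPath)
jdbc_url = (
    "jdbc:databricks://dbc-a1b2c3d4-e5f6.cloud.databricks.com:443;"
    "transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/0/0123-456789-abcdefgh;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "com.databricks.client.jdbc.Driver")  # new driver class name
    .option("dbtable", "default.my_table")                  # placeholder table
    .load()
)
df.show(5)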

If the issue still persists, please open a support case with AWS, providing the connection details and the code snippet used: https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

Thank you.

AWS
SUPPORT ENGINEER
answered a year ago

If it works with --extra-jars, it means that Glue inside the Docker container cannot find the jar; placing it in the notebook folder or in .aws won't help.
The safest approach is to exec into the container (e.g. with docker exec) and put the jar under /home/glue_user/spark/jars.
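
If you would rather not modify the container's jars folder, a possible alternative (not part of this answer, just a minimal sketch) is to point Spark at the jar when the session is created, assuming the jar is reachable at some path inside the container; the path below is hypothetical:

from pyspark.sql import SparkSession

# Hypothetical path where the mounted jar is visible inside the container
spark = (
    SparkSession.builder
    .appName("databricks-jdbc-local-test")
    .config("spark.jars", "/home/glue_user/workspace/jars/DatabricksJDBC42.jar")
    .getOrCreate()
)

Note that spark.jars only takes effect if no Spark session is already running in the notebook, so it needs to be set before the first session is created.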

AWS
EXPERT
answered a year ago

Gonzalo's answer worked, but I also found that adding the jar in the docker run command was the easiest option; there was no need to commit a modified Docker container image. However, I am now facing a new error related to SSL (PKIX path building failed), which I will post as a separate question. Thanks for your attention, team! Appreciate the inputs. :)

docker run -it \
  -v ~/.aws:/home/glue_user/.aws \
  -v $WORKSPACE_LOCATION:/home/glue_user/workspace/ \
  -e AWS_PROFILE=$PROFILE_NAME \
  -e DISABLE_SSL=true \
  -e PYSPARK_SUBMIT_ARGS="--jars /root/.aws/db2jcc4.jar,/root/.aws/DatabricksJDBC42.jar,/root/.aws/AthenaJDBC42-2.0.35.1000.jar,/root/.aws/presto-jdbc-0.225-SNAPSHOT.jar pyspark-shell" \
  --rm -p 4040:4040 -p 18080:18080 \
  --name glue_spark_submit \
  amazon/aws-glue-libs:glue_libs_3.0.0_image_01 \
  spark-submit /home/glue_user/workspace/src/$SCRIPT_FILE_NAME

answered a year ago
