Is there a way to query a Microsoft SQL Server programmatically in a python job shell in AWS Glue?

1

We have been able to connect to a Microsoft SQL Server DB using both Glue's DynamicFrame and Spark's own JDBC write option due to the Glue connection option. However, considering the transactional nature of the data, we wanted to switch this to sending data programmatically using a python library, thus allowing us to move from Glue ETL Job to Python Shell Job. Our initial option was pyodbc, however, due to our inability to integrate the required driver with Glue, we were unsuccessful. Another option that we looked at was pymssql. Our experience with connecting to Microsoft SQL server using pymssql was seamless. However, it was restricted to Python 3.6 and we were unable to import it with Python 3.9. Our requirement for Python 3.9 was due to the boto3 version compatibility with the said python version as we are also trying to access the boto3 redshift-data client in the same script. Having considered all of the above, is it possible to query Microsoft SQL Server using a python library with Python 3.9 in AWS Glue Python Shell Job?

1 Respuesta
0

Hi

You can use the AWS SDK for pandas - these a tutorial on how to connect to MS SQL Server here: https://aws-sdk-pandas.readthedocs.io/en/stable/tutorials/007%20-%20Redshift%2C%20MySQL%2C%20PostgreSQL%2C%20SQL%20Server%2C%20Oracle.html

HTH

Nick

AWS
Nick
respondido hace un año
  • Hi Nick, Firstly thanks so much for responding. Secondly, when connecting to my SQL Server using wr.sqlserver.connect(), I get an error similar to when I used pyodbc: pyodbc.Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)") Could you please guide further on this? I have used a glue catalog connection that we have used before to connect to server using spark.jdbc and glue dynamic frame.

  • There are some specific instructions if you want to use the AWS SDK for pandas with MS SQL (https://aws-sdk-pandas.readthedocs.io/en/stable/install.html#notes-for-microsoft-sql-server) - its to keep the package size down for customers that don't choose this data source.

No has iniciado sesión. Iniciar sesión para publicar una respuesta.

Una buena respuesta responde claramente a la pregunta, proporciona comentarios constructivos y fomenta el crecimiento profesional en la persona que hace la pregunta.

Pautas para responder preguntas