Is there a way to query a Microsoft SQL Server programmatically in a python job shell in AWS Glue?

1

We have been able to connect to a Microsoft SQL Server DB using both Glue's DynamicFrame and Spark's own JDBC write option due to the Glue connection option. However, considering the transactional nature of the data, we wanted to switch this to sending data programmatically using a python library, thus allowing us to move from Glue ETL Job to Python Shell Job. Our initial option was pyodbc, however, due to our inability to integrate the required driver with Glue, we were unsuccessful. Another option that we looked at was pymssql. Our experience with connecting to Microsoft SQL server using pymssql was seamless. However, it was restricted to Python 3.6 and we were unable to import it with Python 3.9. Our requirement for Python 3.9 was due to the boto3 version compatibility with the said python version as we are also trying to access the boto3 redshift-data client in the same script. Having considered all of the above, is it possible to query Microsoft SQL Server using a python library with Python 3.9 in AWS Glue Python Shell Job?

已提問 1 年前檢視次數 1242 次
1 個回答
0

Hi

You can use the AWS SDK for pandas - these a tutorial on how to connect to MS SQL Server here: https://aws-sdk-pandas.readthedocs.io/en/stable/tutorials/007%20-%20Redshift%2C%20MySQL%2C%20PostgreSQL%2C%20SQL%20Server%2C%20Oracle.html

HTH

Nick

AWS
Nick
已回答 1 年前
  • Hi Nick, Firstly thanks so much for responding. Secondly, when connecting to my SQL Server using wr.sqlserver.connect(), I get an error similar to when I used pyodbc: pyodbc.Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)") Could you please guide further on this? I have used a glue catalog connection that we have used before to connect to server using spark.jdbc and glue dynamic frame.

  • There are some specific instructions if you want to use the AWS SDK for pandas with MS SQL (https://aws-sdk-pandas.readthedocs.io/en/stable/install.html#notes-for-microsoft-sql-server) - its to keep the package size down for customers that don't choose this data source.

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南