Is there a way to query a Microsoft SQL Server programmatically in a python job shell in AWS Glue?

1

We have been able to connect to a Microsoft SQL Server DB using both Glue's DynamicFrame and Spark's own JDBC write option due to the Glue connection option. However, considering the transactional nature of the data, we wanted to switch this to sending data programmatically using a python library, thus allowing us to move from Glue ETL Job to Python Shell Job. Our initial option was pyodbc, however, due to our inability to integrate the required driver with Glue, we were unsuccessful. Another option that we looked at was pymssql. Our experience with connecting to Microsoft SQL server using pymssql was seamless. However, it was restricted to Python 3.6 and we were unable to import it with Python 3.9. Our requirement for Python 3.9 was due to the boto3 version compatibility with the said python version as we are also trying to access the boto3 redshift-data client in the same script. Having considered all of the above, is it possible to query Microsoft SQL Server using a python library with Python 3.9 in AWS Glue Python Shell Job?

1개 답변
0

Hi

You can use the AWS SDK for pandas - these a tutorial on how to connect to MS SQL Server here: https://aws-sdk-pandas.readthedocs.io/en/stable/tutorials/007%20-%20Redshift%2C%20MySQL%2C%20PostgreSQL%2C%20SQL%20Server%2C%20Oracle.html

HTH

Nick

AWS
Nick
답변함 일 년 전
  • Hi Nick, Firstly thanks so much for responding. Secondly, when connecting to my SQL Server using wr.sqlserver.connect(), I get an error similar to when I used pyodbc: pyodbc.Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)") Could you please guide further on this? I have used a glue catalog connection that we have used before to connect to server using spark.jdbc and glue dynamic frame.

  • There are some specific instructions if you want to use the AWS SDK for pandas with MS SQL (https://aws-sdk-pandas.readthedocs.io/en/stable/install.html#notes-for-microsoft-sql-server) - its to keep the package size down for customers that don't choose this data source.

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠