Call a stored procedure from within a Glue job


Hi team,

I want to call my stored procedure at the end of my Glue job (inside the Glue job itself).

Can I connect to a MySQL RDS database and call this stored procedure from my PySpark Glue job without using py4j?

I can't find examples online (the only one I found uses py4j).

Any code example would be helpful. Thanks a lot.

  • if your Glue ETL job needs multiple workers and the stored procedure is not immediate, you might be wasting resources. It might be better to run the stored procedure from a Glue Python Shell job that gets executed as soon as your Glue Spark job completes, in a workflow.
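The workflow idea in the comment above can be sketched with a conditional Glue trigger that starts a lightweight Python Shell job once the Spark ETL job succeeds. The job and trigger names below are hypothetical placeholders, not values from this thread:

```python
# Sketch of a CONDITIONAL Glue trigger: start a Python Shell job
# ("run_stored_procedure") when the Spark ETL job ("my_spark_etl")
# reaches SUCCEEDED. All names here are hypothetical.
trigger_definition = {
    "Name": "run-proc-after-etl",
    "Type": "CONDITIONAL",
    "StartOnCreation": True,
    "Predicate": {
        "Logical": "ANY",
        "Conditions": [{
            "LogicalOperator": "EQUALS",
            "JobName": "my_spark_etl",
            "State": "SUCCEEDED",
        }],
    },
    "Actions": [{"JobName": "run_stored_procedure"}],
}

# With AWS credentials configured, this definition would be registered via:
# import boto3
# boto3.client("glue").create_trigger(**trigger_definition)
```

This keeps the expensive multi-worker Spark job from sitting idle while the stored procedure runs.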

1 Answer
Accepted Answer

I used pymysql and it worked for me.

This is an example of the code I used:

import pymysql

rds_host = ""
name = "userName"
password = "secretPassword"
db_name = "myDb"

conn = pymysql.connect(host=rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)

try:
    with conn.cursor() as cur:
        # Execute the stored procedure; without execute() the CALL never runs
        cur.execute("CALL mystoredProcedure()")
    conn.commit()
finally:
    conn.close()
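If the procedure takes arguments, pymysql can build the placeholder list for you via `cursor.callproc`, or the CALL statement can be composed by hand so arguments flow through `cursor.execute(sql, args)` rather than string concatenation. A minimal sketch; the helper name and procedure arguments are hypothetical:

```python
def build_call_statement(proc_name, num_args):
    # Compose "CALL proc(%s, %s, ...)" so arguments are passed as
    # parameters to cursor.execute() instead of being interpolated
    # into the SQL. proc_name is assumed to be a trusted identifier.
    placeholders = ", ".join(["%s"] * num_args)
    return "CALL {}({})".format(proc_name, placeholders)

# e.g. cur.execute(build_call_statement("mystoredProcedure", 2), (arg1, arg2))
```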
answered 2 years ago
reviewed 2 years ago
  • @Jess, is there any security group or VPC setting required for this? I am doing the same thing but the connection is failing. My DB is in a VPC.

  • Hi, as pointed out in the comment above, I am also facing RDS instance connectivity issues when using this solution. My RDS instance is in a VPC and I have configured inbound rules for the security group, but I still cannot connect. Below is the exact error message:

    OperationalError: (20009, b'DB-Lib error message 20009, severity 9:\nUnable to connect: Adaptive Server is unavailable or does not exist (\nNet-Lib error during Connection timed out (110)\nDB-Lib error message 20009, severity 9:\nUnable to connect: Adaptive Server is unavailable or does not exist (\nNet-Lib error during Connection timed out (110)\n')

  • you need to prefix the connection URL with jdbc:mysql://


    maybe try this :

    import sys
    from awsglue.transforms import *
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from botocore.exceptions import ClientError
    import boto3
    from awsglue.dynamicframe import DynamicFrame
    import pyspark.sql.functions as f

    glue_connection_name = args['GLUE_CONNECTION_NAME']  # your Glue connection
    database_name = args['DB_NAME']

    print('Getting details for connection ' + glue_connection_name)
    source_jdbc_conf = glueContext.extract_jdbc_conf(glue_connection_name)

    from py4j.java_gateway import java_import
    java_import(sc._gateway.jvm, "java.sql.Connection")
    java_import(sc._gateway.jvm, "java.sql.DatabaseMetaData")
    java_import(sc._gateway.jvm, "java.sql.DriverManager")
    java_import(sc._gateway.jvm, "java.sql.SQLException")

    conn = sc._gateway.jvm.DriverManager.getConnection(
        source_jdbc_conf.get('url') + '/' + database_name,
        source_jdbc_conf.get('user'),
        source_jdbc_conf.get('password'))
    print('Connected to ' + conn.getMetaData().getDatabaseProductName() + ', ' + source_jdbc_conf.get('url'))
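The jdbc:mysql:// prefix mentioned in this comment can be added defensively before handing the URL to DriverManager. A small helper sketch; the function name is an assumption, not part of the Glue API:

```python
def ensure_jdbc_mysql_url(endpoint):
    # extract_jdbc_conf normally returns a full JDBC URL, but if you
    # only have a bare RDS endpoint, java.sql.DriverManager needs the
    # jdbc:mysql:// scheme prepended to resolve the MySQL driver.
    if endpoint.startswith("jdbc:mysql://"):
        return endpoint
    return "jdbc:mysql://" + endpoint
```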
