Glue Custom Visual Script Running indefinitely

0

I am trying to create a Custom Visual transform but unfortunately facing some issues.

Here my motive it to truncate a MySQL table before loading the data into it and I want to do it with the help of Visual Tranforms not by changing the auto generated script.

My script is running continueously with the same log:

23/05/14 04:25:00 INFO MultipartUploadOutputStream: close closed:false s3://aws-glue-assets-849950158560-ap-south-1/sparkHistoryLogs/spark-application-1684037765713.inprogress

However removing all the code except this code is working:

from awsglue import DynamicFrame

def truncate_mysql_table(self, database_name, table_name, connection_name):
    return self.filter(lambda row: row['age'] == '21')

DynamicFrame.truncate_mysql_table = truncate_mysql_table

This the code I am using:

import pymysql
import boto3
import json

from awsglue import DynamicFrame

def truncate_mysql_table(self, database_name, table_name, connection_name):
    client = boto3.client('glue')
    response = client.get_connection(Name=connection_name, HidePassword=False)
    connection_props = response.get("Connection").get("ConnectionProperties")
    host_name = connection_props.get("JDBC_CONNECTION_URL").rsplit(":", 1)[0].split("//")[1]
    port = connection_props.get("JDBC_CONNECTION_URL").rsplit(":", 1)[1].split("/", 1)[0]
    secret_id = connection_props.get("SECRET_ID")

    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_id)

    secret_data = json.loads(response.get("SecretString"))
    username = secret_data.get("username")
    password = secret_data.get("password")

    con = pymysql.connect(host=host_name,
                          user=username,
                          passwd=password,
                          db=database_name,
                          port=port,
                          connect_timeout=60)

    with con.cursor() as cur:
        cur.execute(f"TRUNCATE TABLE {database_name.strip()}.{table_name.strip()}")
        con.commit()
        con.close()

    # print("Table Truncated")
    return self.filter(lambda row: row['age'] == '21')

DynamicFrame.truncate_mysql_table = truncate_mysql_table

My Glue Connection and MySQL RDS is in the same VPC also I am having VPC endpoints for s3 and secret manager. This shouldn't be a problem becuase after changing (or simplifying) the code it is giving the expected output.

Please help.

Piyush
asked a year ago255 views
1 Answer
0

Put print statements to see exactly where it gets stuck, did it open the connection ok?, was it on the execute or later?
In the connection is better if you do conn.autocommit(True), no reason to hold a transaction for that.
Also, you should make sure the connection is closed in a finally block.

My guess is that the it cannot open the connection, the way you are parsing the JDBC looks fragile to me. Maybe consider using the JDBC driver with Py4J instead of having to parse the url and needing an extra library.

profile pictureAWS
EXPERT
answered a year ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions