Glue Custom Visual Script Running indefinitely

0

I am trying to create a Custom Visual transform but unfortunately facing some issues.

Here my motive it to truncate a MySQL table before loading the data into it and I want to do it with the help of Visual Tranforms not by changing the auto generated script.

My script is running continueously with the same log:

23/05/14 04:25:00 INFO MultipartUploadOutputStream: close closed:false s3://aws-glue-assets-849950158560-ap-south-1/sparkHistoryLogs/spark-application-1684037765713.inprogress

However removing all the code except this code is working:

from awsglue import DynamicFrame

def truncate_mysql_table(self, database_name, table_name, connection_name):
    return self.filter(lambda row: row['age'] == '21')

DynamicFrame.truncate_mysql_table = truncate_mysql_table

This the code I am using:

import pymysql
import boto3
import json

from awsglue import DynamicFrame

def truncate_mysql_table(self, database_name, table_name, connection_name):
    client = boto3.client('glue')
    response = client.get_connection(Name=connection_name, HidePassword=False)
    connection_props = response.get("Connection").get("ConnectionProperties")
    host_name = connection_props.get("JDBC_CONNECTION_URL").rsplit(":", 1)[0].split("//")[1]
    port = connection_props.get("JDBC_CONNECTION_URL").rsplit(":", 1)[1].split("/", 1)[0]
    secret_id = connection_props.get("SECRET_ID")

    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_id)

    secret_data = json.loads(response.get("SecretString"))
    username = secret_data.get("username")
    password = secret_data.get("password")

    con = pymysql.connect(host=host_name,
                          user=username,
                          passwd=password,
                          db=database_name,
                          port=port,
                          connect_timeout=60)

    with con.cursor() as cur:
        cur.execute(f"TRUNCATE TABLE {database_name.strip()}.{table_name.strip()}")
        con.commit()
        con.close()

    # print("Table Truncated")
    return self.filter(lambda row: row['age'] == '21')

DynamicFrame.truncate_mysql_table = truncate_mysql_table

My Glue Connection and MySQL RDS is in the same VPC also I am having VPC endpoints for s3 and secret manager. This shouldn't be a problem becuase after changing (or simplifying) the code it is giving the expected output.

Please help.

Piyush
질문됨 일 년 전267회 조회
1개 답변
0

Put print statements to see exactly where it gets stuck, did it open the connection ok?, was it on the execute or later?
In the connection is better if you do conn.autocommit(True), no reason to hold a transaction for that.
Also, you should make sure the connection is closed in a finally block.

My guess is that the it cannot open the connection, the way you are parsing the JDBC looks fragile to me. Maybe consider using the JDBC driver with Py4J instead of having to parse the url and needing an extra library.

profile pictureAWS
전문가
답변함 일 년 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠