2 answers
Check the trace of the stuck job and any errors around it that tell you what went wrong.
Often it's a problem parsing the CSV data when Redshift runs COPY internally; query the Redshift system table stl_load_errors for the error details.
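For reference, here is a minimal sketch of that check, assuming you can reach the database from Python with the redshift_connector package; the endpoint, database name, and credentials below are placeholders, and the same SELECT can also be run from any SQL client:

import redshift_connector
# Connect to Redshift (replace the placeholders with your own endpoint and credentials)
conn = redshift_connector.connect(
    host="<your-endpoint>.ap-southeast-1.redshift.amazonaws.com",
    database="dev",
    user="admin",
    password="<Redshift password>",
)
cursor = conn.cursor()
# Most recent load errors first; err_reason, filename and line_number point at the failing CSV row
cursor.execute(
    "SELECT starttime, filename, line_number, colname, err_reason "
    "FROM stl_load_errors ORDER BY starttime DESC LIMIT 10"
)
for row in cursor.fetchall():
    print(row)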
Hi there, I have tested and written a script for you:
from pyspark.sql import SparkSession
# Create SparkSession
spark = SparkSession.builder.getOrCreate()
# Redshift connection properties (replace the placeholders with your own credentials)
redshift_properties = {
    "user": "admin",
    "password": "<Redshift password>",
}
# Define the path to the CSV file in S3
csv_file_path = "s3://<your-bucket>/employees.csv"
# Read the CSV file from S3 into a DataFrame
df = spark.read.format("csv").option("header", "true").load(csv_file_path)
# Write the DataFrame to the Redshift table
df.write.jdbc(
    url="jdbc:redshift://<your-workgroup>.<your-account>.ap-southeast-1.redshift-serverless.amazonaws.com:5439/dev",
    table="public.<your-table>",
    mode="overwrite",
    properties=redshift_properties,
)
Use "Author code with a script editor" instead of Visual ETL; it gives you more control over your code. Remember to create a Glue connection to Redshift and test that connection. Also attach an IAM role to your job, the same IAM role you used when testing the connection (see the sketch below).
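If you prefer to create the job outside the console, here is a rough boto3 sketch of that setup; the job name, script location, connection name, and role ARN are placeholders, not values from your environment:

import boto3
glue = boto3.client("glue", region_name="ap-southeast-1")
# Create the Glue job, attaching the tested Redshift connection and the same IAM role
glue.create_job(
    Name="csv-to-redshift",  # placeholder job name
    Role="arn:aws:iam::<account-id>:role/<glue-job-role>",  # same role used for the connection test
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://<your-bucket>/scripts/csv_to_redshift.py",
        "PythonVersion": "3",
    },
    Connections={"Connections": ["<your-redshift-connection>"]},
    GlueVersion="4.0",
    WorkerType="G.1X",
    NumberOfWorkers=2,
)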
I have tested this and it works. If you need more help, please let me know. If this was helpful, please vote for my answer 🤣
answered 4 months ago
Hi. Thank you for your response.
I checked the logs but couldn't find any errors around the one I was getting; I couldn't even find the specific error itself. Also, I'm using DBeaver to connect to my Redshift cluster and queried stl_load_errors, but I'm getting zero results. Maybe I'm doing something wrong. Do you have any suggestions?