AWS Glue unclassified error: File already exists in S3 temporary folder

I'm trying to run a Visual ETL Glue job which is pretty simple: the source node is a Glue Catalog table from RDS MySQL, a DataBrew recipe transform replaces invalid special characters, and the result is loaded into a Redshift table. After several attempts, and after emptying the temporary S3 folder, I keep getting this error:

Error Category: UNCLASSIFIED_ERROR; An error occurred while calling o127.pyWriteDynamicFrame. File already exists:s3://aws-glue-assets-*******-us-east-1/temporary/part-00005********.csv**

asked 3 months ago, 216 views
1 Answer
Hello.

The answer at the URL below may not match your situation, but it seems the poster there avoided the error by using an S3 event trigger to load the data with Lambda.
https://repost.aws/ja/questions/QU2cuK87vHSyCNOBE439xqDQ/aws-glue-error-file-already-exists

Quoting from that question: "I contacted AWS Support and we worked on this issue together, but we were not able to fix it. The query I am running through this logic is very large, and for some reason I get this error when the query is very large. The support team told me that a node failure happens internally when the data is copied from S3 to Redshift, and that this failure surfaces as the error I described. I have not received a solution so far, but I found a workaround: I write the output to an S3 file, and a Lambda function loads it into Redshift through an event trigger. This is working smoothly."
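To make that workaround more concrete, here is a minimal sketch of what such a Lambda function could look like. It is not the code from the linked thread. It assumes the job output lands in S3 as CSV, that the function is subscribed to the bucket's object-created events, and that the load is submitted through the Redshift Data API (`execute_statement` on the boto3 `redshift-data` client). The cluster identifier, database, user, table name, and IAM role ARN are all hypothetical placeholders, and `build_copy_sql` and `s3_objects` are illustrative helper names.

```python
import urllib.parse

# Hypothetical settings; replace every value with your own.
CLUSTER_ID = "my-redshift-cluster"
DATABASE = "dev"
DB_USER = "awsuser"
TARGET_TABLE = "public.my_table"
COPY_ROLE_ARN = "arn:aws:iam::123456789012:role/my-redshift-copy-role"


def build_copy_sql(bucket, key, table=TARGET_TABLE, role_arn=COPY_ROLE_ARN):
    """Build a Redshift COPY statement for one CSV object in S3."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{role_arn}' FORMAT AS CSV;"
    )


def s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def lambda_handler(event, context, client=None):
    """Submit one COPY per new S3 object through the Redshift Data API."""
    if client is None:
        import boto3  # imported lazily so the helpers can be exercised offline

        client = boto3.client("redshift-data")
    statement_ids = []
    for bucket, key in s3_objects(event):
        response = client.execute_statement(
            ClusterIdentifier=CLUSTER_ID,
            Database=DATABASE,
            DbUser=DB_USER,
            Sql=build_copy_sql(bucket, key),
        )
        statement_ids.append(response["Id"])
    return {"statement_ids": statement_ids}
```

Note that `execute_statement` only submits the statement and returns immediately, so a failed COPY would not raise here. Checking the returned statement IDs with `describe_statement` is one way to confirm that each load finished.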

EXPERT
answered 3 months ago
  • I've read that, thanks. Mine is not a huge query: it's about 1.6M rows, which is not a big deal for Glue. I'll try to fix it within Glue; if I don't succeed, I'll try using S3 as a staging area.

  • Follow-up: I loaded the full table to S3 with no problem. Then I created a Glue job using the S3 bucket catalog table as the source, cleaned up the data, and tried to load it to Redshift. Same error: An error occurred while calling o127.pyWriteDynamicFrame. File already exists:s3://aws-glue-assets-*******
