I'm testing a relatively simple PySpark script that I first wrote (and tested) on EMR. On EMR the script works as intended, but in Glue it starts writing output to the desired S3 location and stops midway with this error:
An error occurred while calling o285.save. File already exists:s3://bucket/prefix/part-xxxx.json
The syntax I'm using to write the DataFrame:
df \
.write.format('json') \
.option('header', 'false') \
.save('s3://...')
The prefix didn't exist on S3 before running the script. I'd appreciate any and all help on how to get this fixed.
Can you post the full script?
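While waiting for the full script, one thing that might be worth trying is setting the save mode explicitly instead of relying on the default. Below is a minimal sketch, not a confirmed fix: `write_json` is a hypothetical helper name, `df` is assumed to be a regular `pyspark.sql.DataFrame`, and the path is whatever you currently pass to `save`. It also leaves out the `header` option, which as far as I know is a CSV setting and has no effect on JSON output.

```python
def write_json(df, path, mode="overwrite"):
    """Write a DataFrame as JSON with an explicit save mode.

    Hypothetical helper for illustration; `df` is assumed to be a
    pyspark.sql.DataFrame and `path` the S3 prefix used in the question.
    """
    (
        df.write
        .format("json")
        .mode(mode)  # "overwrite" replaces any existing output under `path`
        .save(path)
    )
```

Note that `overwrite` removes whatever is already under the target prefix, so only use it if that is acceptable. If the error still appears, it may be that a task failed for some other reason and the retry hit the partial file left by the first attempt; in that case the earliest task failure in the Glue logs is probably the one to look at.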