AWS Glue generating multiple files instead of one

Hello,

I'm using AWS Glue Studio to apply a custom transformation to data I have in CSV files. The job is structured roughly as shown in the attached image (Glue Scheme).

After running the job, I can see that the logic is correct, but it generates multiple output files with names like "part", etc. I suppose this is due to parallel processing. Can I configure the job to produce only one output file for each document in my input folder?

Cheers, Tassio

tassio
asked 1 year ago · 455 views
1 Answer

From the looks of it, your DynamicFrames are split into several partitions, and each partition is written out as its own file. You can repartition them.

Try the following: https://repost.aws/knowledge-center/glue-job-output-large-files

AWS
vtjean
answered 1 year ago
  • Inside that "Custom transform", you can just call "repartition(1)" before you return the DF
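
To make the comment above concrete, here is a minimal sketch of what such a custom transform could look like. It assumes the node receives a single input frame; the helper name `single_partition` and the output key `"single_partition"` are made up for illustration, and `MyTransform` is just the placeholder name for the transform function. Adapt it to the logic you already have.

```python
def single_partition(dynamic_frame):
    """Collapse a frame to one partition so the sink writes one part file."""
    # DynamicFrame.repartition(n) returns a new frame with n partitions.
    return dynamic_frame.repartition(1)


def MyTransform(glueContext, dfc):
    # Imported inside the function so this sketch can be loaded outside a
    # Glue job; in a Glue script the import normally sits at the top.
    from awsglue.dynamicframe import DynamicFrameCollection

    # A custom transform receives a DynamicFrameCollection.
    # Assumption: there is exactly one input frame, so take the first key.
    first_key = list(dfc.keys())[0]
    dyf = dfc.select(first_key)

    # ... your existing custom logic on dyf goes here ...

    # Repartition right before returning, as suggested in the comment.
    out = single_partition(dyf)
    return DynamicFrameCollection({"single_partition": out}, glueContext)
```

Note that `repartition(1)` produces one output file for the whole frame, not one per input document, and it pushes all the data through a single partition, which can be slow for large inputs. Getting one output file per input file would likely need the inputs to be processed separately.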
