Hello,
I have checked internally and, unfortunately, there is no way to control the file size. There is currently no parameter or setting that you can change to produce larger files and avoid creating too many small Parquet files.
The reason is that the export automation allocates the file size automatically in order to speed up the process, and there is currently no way to customize this. However, please be aware that this behavior is under review by our internal teams. I would kindly advise that you check our What's New page [1] and Blog [2] for the latest updates, where we announce all new features when we release them.
To reduce the number of files being exported, you may consider using the "Partial" [3] option when exporting, so that only the required databases, tables, etc. are exported. By selecting "Partial", you move only the data that is necessary for your analysis rather than the entire database.
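If you start exports from a script rather than the console, a partial export could look roughly like the sketch below. It assumes this is an RDS snapshot export to S3 started through boto3's `start_export_task`, where the `ExportOnly` list plays the role of the "Partial" option. The helper names and all ARNs, bucket names, and table names are hypothetical placeholders, so please adapt them to your own setup.

```python
from typing import Dict, List, Optional


def build_partial_export_request(
    task_id: str,
    snapshot_arn: str,
    bucket: str,
    iam_role_arn: str,
    kms_key_id: str,
    export_only: List[str],
    s3_prefix: Optional[str] = None,
) -> Dict[str, object]:
    """Build keyword arguments for the RDS start_export_task call.

    export_only holds the identifiers to export, for example
    "database.table" or "database.schema.table" depending on the engine.
    This helper is a hypothetical convenience wrapper, not part of boto3.
    """
    if not export_only:
        # An empty list would mean a full export, which is what we want to avoid.
        raise ValueError("export_only must name at least one database or table")

    request: Dict[str, object] = {
        "ExportTaskIdentifier": task_id,
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "IamRoleArn": iam_role_arn,
        "KmsKeyId": kms_key_id,
        # De-duplicate and sort so the request is stable between runs.
        "ExportOnly": sorted(set(export_only)),
    }
    if s3_prefix:
        request["S3Prefix"] = s3_prefix
    return request


def start_partial_export(request: Dict[str, object]) -> Dict[str, object]:
    """Send the request to RDS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    return boto3.client("rds").start_export_task(**request)
```

You would call `build_partial_export_request(...)` with your own values and pass the result to `start_partial_export(...)`. Listing only the tables you need keeps the number of exported objects down, although it does not change the size of the individual Parquet files.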
On behalf of AWS, I would like to apologize for any inconvenience caused by this.
References:
[1] https://aws.amazon.com/new/
Thank you for the answer. I am in fact using the partial export filter. Exporting the entire database ran into even bigger problems. We have some partitioned tables in our DB and the S3 Export gets very confused by these and does a full export at every level of the hierarchy. It was creating literally millions of S3 objects.