It really depends on how your data is structured. If it's a single 1 GB file, the job won't benefit from Glue's ability to fan out across executors. If it's 1,024 1 MB files, you'll see the benefit of that parallelism. It will also depend on the Parquet block size, which governs optimal I/O (see tip #5 here: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/).
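As a rough illustration, here is a minimal Glue ETL sketch that reads a cataloged source and writes it back out as Parquet. The database, table, and S3 path are hypothetical placeholders, and the repartition count is just an example knob for controlling how many output files (and therefore what file sizes) you end up with:

```python
import sys
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glueContext = GlueContext(SparkContext())
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read the source table; many small input files let Glue parallelize the read
dyf = glueContext.create_dynamic_frame.from_catalog(
    database="my_database",        # hypothetical Data Catalog database
    table_name="my_source_table",  # hypothetical source table
)

# Repartitioning controls how many Parquet files get written, and thus their size
dyf = dyf.repartition(32)  # example value; tune toward the block size guidance above

# Write the output as Parquet to S3
glueContext.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/parquet-output/"},  # hypothetical path
    format="parquet",
)

job.commit()
```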
I could only find limited guidance on how to tune your DPUs optimally. The example given there was converting 428 gzipped JSON files to Parquet.
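If you want to experiment with DPU capacity yourself, one approach is to launch the same job at different capacities and compare run times. A minimal sketch with boto3, assuming a standard Glue Spark job with the hypothetical name "my-parquet-job":

```python
import boto3

glue = boto3.client("glue")

# Start a run with an explicit DPU allocation; MaxCapacity applies to
# standard Glue Spark jobs (not jobs configured with WorkerType/NumberOfWorkers)
response = glue.start_job_run(
    JobName="my-parquet-job",  # hypothetical job name
    MaxCapacity=10.0,          # example value; try e.g. 5, 10, 20 and compare durations
)
print(response["JobRunId"])
```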