
Glue performance issue in recent times


I'm using Glue ETL version 4.0 with 50 workers of the G.1X worker type. For the past two weeks my job has been taking longer than expected when writing Parquet and CSV files to an S3 bucket, even though the amount of data has stayed the same. Is there any known issue with Glue recently, or has AWS made any performance-related changes to Glue on their side?

Asked 5 months ago · 95 views
1 Answer

There haven't been any reported widespread issues with AWS Glue 4.0 in recent times that would specifically cause performance degradation when writing Parquet and CSV files to S3. However, there are several potential causes and optimizations to consider for your specific situation:

  1. Repartitioning: Check whether your job is repartitioning appropriately. Using repartition(1) forces all of the output through a single task, which significantly slows down the write. Consider a partition count that matches your worker configuration (with 50 G.1X workers, something closer to the total number of executor cores); see the sketch after this list.

  2. Resource allocation: Ensure your Glue job has sufficient resources. With g.1x worker type and 50 workers, verify that this is still appropriate for your data volume.

  3. S3 bucket permissions: Verify that the IAM role associated with your Glue job has the necessary permissions to write to the S3 bucket.

  4. Network connectivity: If your Glue job is running in a VPC, ensure proper network connectivity to S3. You might need an S3 gateway VPC endpoint if using private subnets.

  5. Spark configuration: Check your Spark configuration settings. Misconfigurations like incorrect spark.dynamicAllocation.minExecutors can lead to suboptimal resource allocation and slower job execution.

  6. Logging and monitoring: Enable detailed logging for your Glue job and use Amazon CloudWatch Logs Insights to analyze performance. Look for patterns in error messages or warnings that might indicate configuration issues.

  7. Checkpoint behavior: If you're running a streaming job, excessive checkpointing can cause performance issues when writing to S3.
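
To make point 1 concrete, here is a minimal, hypothetical sketch of what the write step might look like in a Glue 4.0 PySpark script. The bucket paths, the input format, and the partition count of 200 are assumptions for illustration only, not details taken from your job:

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Hypothetical input path for illustration.
df = spark.read.parquet("s3://my-bucket/input/")

# Avoid repartition(1): it funnels all output through a single task.
# With 50 G.1X workers (4 vCPUs each), a partition count around 2-3x the
# total executor cores is a common starting point; tune for your data volume.
partitions = 200

(df.repartition(partitions)
   .write.mode("overwrite")
   .parquet("s3://my-bucket/output/parquet/"))

(df.repartition(partitions)
   .write.mode("overwrite")
   .option("header", "true")
   .csv("s3://my-bucket/output/csv/"))

job.commit()
```

The exact partition count is a trade-off: too few leaves workers idle during the write, while too many produces lots of small files in S3.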

To troubleshoot further, I recommend reviewing the full Glue job logs in CloudWatch for any error messages or stack traces, and trying a smaller subset of the data to isolate the issue (a minimal sketch of that follows below).
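
For that last point, a hedged sketch along these lines can help you see whether the slowdown scales with data volume or is fixed overhead. It continues from the hypothetical `df` above; the sample fraction and debug path are placeholders, not values from your job:

```python
import time

# Take roughly 1% of the data for a timing test.
sample_df = df.sample(fraction=0.01, seed=42)

start = time.time()
(sample_df.repartition(50)
          .write.mode("overwrite")
          .parquet("s3://my-bucket/debug-sample/"))
print(f"Sample write took {time.time() - start:.1f}s")
```

If the sample write is proportionally fast, the problem likely lies in data volume or partitioning; if it is still slow, look at job startup, networking, or S3 configuration instead.
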
Sources
AWS Glue job takes very long time to write a spark dataframe on S3 | AWS re:Post
Troubleshooting AWS Glue ETL Jobs using Amazon CloudWatch Logs Insights enhanced queries | AWS Cloud Operations Blog
How to reduce S3 cost caused by Glue 4.0 streaming job checkpointing | AWS re:Post

Answered 5 months ago
EXPERT
Reviewed 5 months ago

