Glue performance issue in recent times

I'm using AWS Glue ETL version 4.0 with 50 workers of the G.1X worker type. For the past two weeks, my job has been taking longer than expected to write Parquet and CSV files to an S3 bucket, even though the amount of data is unchanged. Has there been a recent issue with Glue, or a performance-related change to Glue on the AWS side?

Asked 4 months ago · 95 views
1 Answer
There haven't been any reported widespread issues with AWS Glue 4.0 in recent times that would specifically cause performance degradation when writing Parquet and CSV files to S3. However, there are several potential causes and optimizations to consider for your specific situation:

  1. Repartitioning: Check how your job repartitions data before writing. Using repartition(1) forces a single task to handle all of the writing, which significantly slows down the process. Consider a partition count that matches your worker configuration (e.g., with 50 workers, a count close to the total number of executor cores rather than 1).

  2. Resource allocation: Ensure your Glue job has sufficient resources. With the G.1X worker type and 50 workers, verify that this is still appropriate for your data volume.

  3. S3 bucket permissions: Verify that the IAM role associated with your Glue job has the necessary permissions to write to the S3 bucket.

  4. Network connectivity: If your Glue job is running in a VPC, ensure proper network connectivity to S3. You might need an S3 gateway VPC endpoint if using private subnets.

  5. Spark configuration: Check your Spark configuration settings. Misconfigurations like incorrect spark.dynamicAllocation.minExecutors can lead to suboptimal resource allocation and slower job execution.

  6. Logging and monitoring: Enable detailed logging for your Glue job and use Amazon CloudWatch Logs Insights to analyze performance. Look for patterns in error messages or warnings that might indicate configuration issues.

  7. Checkpoint behavior: If you're running a streaming job, excessive checkpointing can cause performance issues when writing to S3.
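To make the repartitioning advice in item 1 concrete, here is a minimal sketch of deriving a partition count from the worker count before writing. It assumes one worker is used by the driver and that each G.1X executor runs 4 concurrent tasks on Glue 4.0; both numbers are assumptions to verify against your own job's configuration. The function names and the `tasks_per_core` knob are hypothetical, and `df` is expected to be a Spark DataFrame.

```python
def target_partitions(workers, cores_per_executor=4, tasks_per_core=1):
    """Estimate a partition count from the Glue worker count.

    Assumptions (not taken from the thread): one worker hosts the driver,
    and each remaining worker runs one executor with `cores_per_executor`
    task slots.
    """
    executors = max(workers - 1, 1)
    return executors * cores_per_executor * tasks_per_core


def write_repartitioned(df, output_path, workers, fmt="parquet"):
    """Repartition a Spark DataFrame to the estimated count, then write it."""
    n = target_partitions(workers)
    # repartition(n) triggers a shuffle but lets all task slots write in parallel,
    # unlike repartition(1), which funnels everything through one task.
    df.repartition(n).write.mode("overwrite").format(fmt).save(output_path)
    return n
```

For the 50-worker job in the question, `target_partitions(50)` gives 196 under these assumptions. Treat the result as a starting point: very small output files can also hurt, so it may be worth lowering the count if each partition holds only a few megabytes.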

To troubleshoot further, I recommend reviewing the full Glue job logs in CloudWatch for any error messages or stack traces, and potentially trying a smaller subset of data to isolate the issue.
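One way to try a smaller subset is to time the write of a random sample on its own, which helps separate slow writes from slow upstream transformations. The sketch below is illustrative only: the function name, the default fraction, and the fixed seed are hypothetical choices, and `df` is assumed to be a Spark DataFrame.

```python
import time


def time_sample_write(df, output_path, fraction=0.01, fmt="parquet"):
    """Write a random sample of `df` and return the elapsed seconds.

    Spark is lazy, so the measured time includes computing the sampled rows
    as well as writing them. Cache the DataFrame beforehand if you want to
    isolate the write itself.
    """
    sample = df.sample(fraction=fraction, seed=42)
    start = time.perf_counter()
    sample.write.mode("overwrite").format(fmt).save(output_path)
    return time.perf_counter() - start
```

Running this once with `fmt="parquet"` and once with `fmt="csv"` against a scratch prefix in the same bucket gives a rough comparison of the two output formats. If the sample writes quickly while the full job is slow, partitioning or data skew is a more likely cause than S3 connectivity.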
Sources
AWS Glue job takes very long time to write a spark dataframe on S3 | AWS re:Post
Troubleshooting AWS Glue ETL Jobs using Amazon CloudWatch Logs Insights enhanced queries | AWS Cloud Operations Blog
How to reduce S3 cost caused by Glue 4.0 streaming job checkpointing | AWS re:Post

Answered 4 months ago
Expert
Reviewed 4 months ago