All Content tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

2010 results
My AWS Glue job keeps failing on a daily basis because there is no data to be processed, which is not true. I have a system in place that adds new data to my S3 every morning at 1 AM. Then I use Job to r...
1 answer, 0 votes, 35 views, asked 2 months ago
The [migration notes for Glue 5.0 (under "breaking changes")](https://docs.aws.amazon.com/glue/latest/dg/migrating-version-50.html#migrating-version-50-features) state that continuous logs are no long...
1 answer, 0 votes, 29 views, asked 2 months ago
Hi, We are using the AWS Athena MSK Connector to query Kafka messages via AWS Glue Schema Registry. However, I am encountering frequent throttling errors from AWS Glue when running queries in Athena....
1 answer, 0 votes, 36 views, asked 2 months ago
I have a file in S3 with the following contents: > {"foo-bar": {"name":"Mercury","distanceFromSun":0.39,"orbitalPeriod":0.24,"dayLength":58.65}}{"foo-bar": {"name":"Venus","distanceFromSun":0.72,"or...
1 answer, 0 votes, 141 views, asked 2 months ago
I am working with AWS Athena and QuickSight. I have a dataset with boolean fields, and I need to ensure that these fields remain intact as true and false instead of being converted into integers (0 an...
1 answer, 0 votes, 85 views, asked 3 months ago
I have multiple CSV files arriving in an S3 location. I crawl the files using the Glue crawler and store the data in an AWS Glue Data Catalog table. Recently in a column of the CSV files we are rece...
1 answer, 1 vote, 203 views, asked 3 months ago
I have created an Iceberg table in the Glue catalog. No partition spec was defined at the time of creation, but as we are ingesting more data, a few partition fields must be added. It seems like Athena pow...
1 answer, 0 votes, 419 views, asked 3 months ago
I have an AWS Glue resource in the us-east-2 region. The Glue job runs daily, and I have integrated my DocumentDB connection into it. Each time the job runs, it creates Elastic Network Interfaces (ENI...
1 answer, 0 votes, 88 views, asked 3 months ago
We have a zero-ETL job streaming from DynamoDB to an S3 Glue data lake. 1. When items are deleted in DynamoDB they are not deleted from the S3 data lake (even after waiting several hours) 2. Lists of i...
1 answer, 1 vote, 61 views, asked 3 months ago
I have an Iceberg table that was created as a custom resource in CDK. Every hour I do a merge update to the table to load new data and update existing data. To avoid exponential growth of table data vol...
1 answer, 0 votes, 248 views, asked 3 months ago
"Glue version information is as follows. The result of the print statement comes out as '??'. How do I solve it??" ``` %idle_timeout 120 %glue_version 5.0 %worker_type G.1X %number_of_workers 2 # 추가 라...
1 answer, 0 votes, 17 views, asked 3 months ago