All Content tagged with AWS Glue

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

We are trying to read a CSV file and process the data using AWS Glue, and we are getting the error message below: Py4JJavaError: An error occurred while calling o91.schema. :...
1
answers
0
votes
291
views
sravan
asked 2 months ago
How can I monitor who is querying which Glue tables? After some trial and error I found that the BatchGetTable Glue API event is recorded by CloudTrail every time I run an Athena query, and...
2
answers
0
votes
483
views
Soumaya
asked 2 months ago
I have an Athena Iceberg table. The table has 2 partitions. Each hour I update it with `MERGE` and `DELETE` commands. ``` SELECT count(*) FROM "my_table$files" ``` now gives 16. Meanwhile data...
1
answers
0
votes
438
views
Smotrov
asked 2 months ago
I have a bunch of parquet files in a flat S3 folder, no partitions:...
1
answers
0
votes
165
views
ecmons
asked 2 months ago
Hi. I had a table that was created by a crawler, then I deleted the table (in Athena) and created it by DDL. After running the crawler, it could not find the table and create a new table. Note: The S3...
1
answers
0
votes
344
views
gh02
asked 2 months ago
I have a few text files on S3 that I need to add to the Glue Catalog in order to use them in a job. None of them have separators, they are all fixed-width files. I have the schemas, but the crawler...
Accepted Answer
AWS Glue
1
answers
0
votes
87
views
asked 2 months ago
Hello, I am currently working with AWS Glue ETL Jobs and encountered an issue where the "Push to repository" and "Pull from repository" options are disabled when trying to push the script/job to...
1
answers
0
votes
200
views
asked 2 months ago
I created a custom visual transform component and put the needed JSON and Python files in S3. The component loaded up as expected. Later, I needed to make some more adjustments to the parameters...
2
answers
0
votes
149
views
EdwardR
asked 2 months ago
I have a Glue PySpark script that processes DDB data exported to S3 and writes it to Redshift. Initially, it was using the logic below: ``` redshiftConnectionOptions = { "postactions": "BEGIN; MERGE...
1
answers
0
votes
223
views
asked 2 months ago
I just can't understand what I'm doing wrong. I have a table. ``` CREATE EXTERNAL TABLE test ( originalrequest string, requeststarted string ) PARTITIONED BY ( req_start_partition...
Accepted Answer
Amazon Athena
AWS Glue
2
answers
0
votes
349
views
Smotrov
asked 2 months ago
I have been writing a CloudFormation stack in `yaml` and deploying it to AWS infrastructure. (For legacy reasons, I unfortunately cannot switch to CDK ;)) The following YAML code is part of the...
2
answers
0
votes
259
views
asked 2 months ago
I got this error when trying to insert from a temp internal table into an external table. ERROR: Invalid DataCatalog response for external table "reportdb"."logs_aggregated": Cannot deserialize Table. Error:...
1
answers
0
votes
206
views
shivank
asked 2 months ago