Questions tagged with AWS Glue
I am working with a .sas7bdat file stored in my S3 bucket.
I want to convert the sas7bdat file to CSV, but in Glue visual ETL I cannot see an option for the sas7bdat file format.
Can someone please help me...
1 answer, 0 votes, 301 views, asked 4 months ago
Trying to figure out if it's possible to use AWS Glue crawler to parse the spark stderr logs that are dumped from emr-serverless.
The logs are space delimited. I tried running a crawler against the...
0 answers, 0 votes, 163 views, asked 4 months ago
Hello,
While building a job in AWS Glue (Amazon S3, Change Schema, AWS Glue Data Catalog), I had a surprising cost for data preview session (AWS Glue GlueInteractiveSession) of 91% of the total...
1 answer, 0 votes, 206 views, asked 4 months ago
I encountered the following error: “Parquet column cannot be converted in file, PySpark Expected string Found: INT32.”
I tried to convert the column to INT32 (applying withColumn()), but the error...
1 answer, 0 votes, 845 views, asked 4 months ago
Hi All,
I set up a crawler, which is giving me headaches when it comes to the "Include path". My path currently looks something like this:
databaseName/schema/%_qt_%
This works fine, meaning that the...
1 answer, 0 votes, 163 views, asked 4 months ago
I want to use Glue Studio to create a Glue ETL job. In its first step, this job needs to filter the data based on the input parameters given to it at run time. Is there a way with visual ETL...
Accepted Answer
AWS Glue
2 answers, 0 votes, 404 views, asked 4 months ago
I have data currently partitioned on a key (say cluster) and I'm repartitioning to a new key 'date'. So I do (in Python)
```
df = glueContext.create_dynamic_frame.from_options(...)
df =...
```
1 answer, 0 votes, 177 views, asked 4 months ago
Hello,
For an AWS Data Catalog table, I ran Glue (structure: Amazon S3 -> Change Schema -> AWS Glue Data Catalog) and populated the table with only string records. All the actions were done from the...
1 answer, 0 votes, 165 views, asked 4 months ago
We have a file that we used the default XML crawler to crawl the data for, and it correctly created a table and schema for the data (relevant column shown):
![Correct...
0 answers, 0 votes, 139 views, asked 4 months ago
Hello
I am using PySpark in a Glue job to do ETL on a table sourced from S3, which is in turn sourced from MySQL via DMS (table schema as below; columns 'op', 'row_updated_timestamp' & 'row_commit_timestamp' are...
1 answer, 0 votes, 128 views, asked 4 months ago
There was a data source (JSON files) in S3. The JSON structure is as follows.
I used AWS Glue Crawler to build the Glue table based on this S3 data source.
I think the "data" column should be "Struct"...
Accepted Answer
AWS Glue
2 answers, 0 votes, 383 views, asked 4 months ago
Crawler Error:
Insufficient Lake Formation permission(s) on mock_data_patient (Database name: crawl_db, Table Name: mock_data_patient) (Service: AWSGlue; Status Code: 400; Error Code:...
1 answer, 0 votes, 194 views, asked 4 months ago