Questions tagged with Extract Transform & Load Data
Content language: English
Select up to 5 tags to filter
Sort by most recent
Browse through the questions and answers listed below or filter and sort to narrow down your results.
Hey Guys
I want to run my pyspark on EMR Serverless but it has some dependencies/libraries which are needed by the pyspark script to run. Please suggest a optimized approach to import the...
1
answers
0
votes
398
views
asked 9 months agolg...
Hello,
I have gone through the recommended changes provided in [this](https://repost.aws/knowledge-center/glue-crawler-internal-service-exception) article. However, I continue to get the same...
1
answers
0
votes
240
views
asked 9 months agolg...
Hi there I managed to convert csv files to parquet files using glue job, my crawler does see the parquet files in the s3 bucket and crawls it and present me with the proper schema and adds for each...
1
answers
0
votes
510
views
asked 10 months agolg...
Are there any known, recently (~07/18/2023) introduced performance issues with Glue crawlers?
We have recently observed excessive slowness with Glue crawlers that had been running for months without...
0
answers
0
votes
28
views
asked 10 months agolg...
Glue 4 Hudi supportlg...
I am trying to store a data stream from kafka using the hudi format. I am following this doc https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html and I even tried to...
3
answers
0
votes
279
views
asked 10 months agolg...
We are moving our content from one developer to another. I am trying to figure out not to stop my Images and pdf uploads from being rasterized.
My old developer had figured it out, but I can't.
Any...
1
answers
0
votes
206
views
asked 10 months agolg...
I am trying to create a Delta Table from spark sql using the Glue meta catalog.
I can correctly query a Delta table using the Glue metastore:
```
%%sql
select * from `my_table` VERSION AS OF 1 limit...
2
answers
0
votes
1437
views
asked 10 months agolg...
Hi Team,
I have a complex nested xml file which I want to read using AWS Glue and convert it to parquet format. I want to use pandas read_xml function to read the xml file. But, I get error lxml not...
1
answers
0
votes
301
views
asked 10 months agolg...
I read data from s3 using as follow.
```
sec_id_dyf = glueContext.create_dynamic_frame.from_options(
connection_type = 's3',
...
0
answers
0
votes
99
views
asked 10 months agolg...
Hello, I have been experimenting with Aws glue, and created some crawlers to crawl the data but the behavior wasn't what I expected,
Question 1) I had an S3 bucket
with 3...
1
answers
0
votes
320
views
asked 10 months agolg...
Looks like attempting to write to a Delta Lake table from a DynamicFrame is not working. The Visual Glue interface generates a script like:
```
s3 =...
2
answers
0
votes
426
views
asked 10 months agolg...
We are trying to use Glue to query and aggregate some Parquet files in S3.
We get this error related to schema mismatch:
```
An error occurred while calling o106.pyWriteDynamicFrame....
2
answers
0
votes
249
views
asked 10 months agolg...