Questions tagged with Extract Transform & Load Data
Hello everyone.
JSON data from a REST API is loaded daily by a Lambda function into s3-bucket-1.
This data should then be stored in s3-bucket-2 as a flat Parquet table.
I did it in...
0 answers, 0 votes, 74 views, asked 9 months ago
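For the JSON-to-flat-Parquet question above, the flattening step might look something like the sketch below. It is only an illustration under assumptions: the `flatten` helper and the sample payload are hypothetical, nested objects are joined with an underscore, and lists are left untouched. Writing the resulting rows to Parquet (for example with a Glue job or a Parquet library) is not shown.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent_key: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts into a single level; lists are kept as-is."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        # Build the column name from the path of keys leading to this value.
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical API payload, for illustration only.
sample = {"id": 1, "user": {"name": "a", "address": {"city": "x"}}, "tags": ["p", "q"]}
row = flatten(sample)
print(row)
```

Each flattened dict corresponds to one row of the target table; how list-valued fields should be handled (exploded into rows, serialized, or dropped) depends on the actual data.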
Hey guys,
I want to run my PySpark script on EMR Serverless, but it depends on some libraries that the script needs in order to run. Please suggest an optimized approach to import the...
1 answer, 0 votes, 398 views, asked 9 months ago
Hello,
I have gone through the recommended changes provided in [this](https://repost.aws/knowledge-center/glue-crawler-internal-service-exception) article. However, I continue to get the same...
1 answer, 0 votes, 240 views, asked 9 months ago
Hi there, I managed to convert CSV files to Parquet files using a Glue job. My crawler sees the Parquet files in the S3 bucket, crawls them, presents me with the proper schema, and adds for each...
1 answer, 0 votes, 512 views, asked 10 months ago
Are there any known, recently (~07/18/2023) introduced performance issues with Glue crawlers?
We have recently observed excessive slowness with Glue crawlers that had been running for months without...
0 answers, 0 votes, 28 views, asked 10 months ago
Glue 4 Hudi support
I am trying to store a data stream from Kafka using the Hudi format. I am following this doc https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html and I even tried to...
3 answers, 0 votes, 279 views, asked 10 months ago
We are moving our content from one developer to another. I am trying to figure out how to stop my image and PDF uploads from being rasterized.
My old developer had figured it out, but I can't.
Any...
1 answer, 0 votes, 206 views, asked 10 months ago
I am trying to create a Delta table from Spark SQL using the Glue meta catalog.
I can correctly query a Delta table using the Glue metastore:
```
%%sql
select * from `my_table` VERSION AS OF 1 limit...
```
2 answers, 0 votes, 1439 views, asked 10 months ago
Hi team,
I have a complex nested XML file that I want to read using AWS Glue and convert to Parquet format. I want to use the pandas read_xml function to read the XML file, but I get error lxml not...
1 answer, 0 votes, 301 views, asked 10 months ago
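For the nested XML question above, one possible workaround when lxml is unavailable is to parse with the Python standard library instead. The pandas `read_xml` function also accepts `parser="etree"`, which uses the standard library parser rather than lxml, though with more limited XPath support. The sketch below avoids pandas entirely; the XML document, the element names, and the `order_rows` helper are hypothetical and only illustrate turning nested elements into flat rows.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested XML, for illustration only.
XML_TEXT = """
<orders>
  <order id="1">
    <customer><name>Ann</name></customer>
    <total>9.50</total>
  </order>
  <order id="2">
    <customer><name>Bob</name></customer>
    <total>12.00</total>
  </order>
</orders>
"""


def order_rows(xml_text):
    """Turn each <order> element into one flat dict (one row per order)."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for order in root.findall("order"):
        rows.append({
            "id": int(order.get("id")),
            # findtext accepts a simple path to reach nested child elements.
            "customer_name": order.findtext("customer/name"),
            "total": float(order.findtext("total")),
        })
    return rows


rows = order_rows(XML_TEXT)
print(rows)
```

The resulting list of dicts can be handed to whatever writes the Parquet output; that step depends on the libraries available in the Glue job and is not shown here.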
I read data from S3 as follows.
```
sec_id_dyf = glueContext.create_dynamic_frame.from_options(
    connection_type = 's3',
    ...
```
0 answers, 0 votes, 99 views, asked 10 months ago
Hello, I have been experimenting with AWS Glue and created some crawlers to crawl the data, but the behavior wasn't what I expected.
Question 1) I had an S3 bucket with 3...
1 answer, 0 votes, 320 views, asked 10 months ago
It looks like writing to a Delta Lake table from a DynamicFrame is not working. The Visual Glue interface generates a script like:
```
s3 =...
```
2 answers, 0 votes, 426 views, asked 10 months ago