Questions tagged with Extract Transform & Load Data
How do I connect to Amazon RDS for Microsoft SQL Server from a Glue Spark job using Python?
1 answer, 0 votes, 564 views, asked 8 months ago
I want to create a Glue job that processes multiple tables in parallel. If all the tables are to be processed in the same manner, is it possible to do this in a single Glue job?
1 answer, 0 votes, 362 views, asked 8 months ago
I have a CSV file delivered to S3 by an external vendor, and this file contains some non-ASCII/junk characters. Before loading it into a Redshift table, I need to remove these characters. I tried TRIMBLANKS...
1 answer, 0 votes, 379 views, asked 8 months ago
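One possible way to do the cleanup described in the question above is to strip the characters in plain Python before the file is loaded. This is only a minimal sketch, not an answer taken from the thread: the helper names `strip_non_ascii` and `clean_csv` are hypothetical, and it assumes the file fits in memory and that dropping the offending characters (rather than replacing them) is acceptable.

```python
import csv
import io


def strip_non_ascii(text: str) -> str:
    """Drop every character outside the 7-bit ASCII range."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def clean_csv(raw: str) -> str:
    """Return the CSV text with non-ASCII characters removed from each cell.

    Hypothetical helper: parses the input with the csv module so that
    quoting and embedded delimiters are preserved while cells are cleaned.
    """
    reader = csv.reader(io.StringIO(raw))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([strip_non_ascii(cell) for cell in row])
    return out.getvalue()


if __name__ == "__main__":
    sample = "id,name\n1,caf\u00e9\u2603\n"
    print(clean_csv(sample))
```

In a Glue job the same per-cell function could be applied to the file contents after reading them from S3; how the file is read and written back is left out here because the question does not say.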
I was trying to perform a Glue ETL transformation and store the result in both an Amazon Redshift Serverless database and S3. However, even the console-generated PySpark script fails. Almost none of the methods...
0 answers, 0 votes, 163 views, asked 8 months ago
I'm seeking guidance on a specific requirement and the recommended approach to achieve it.
My data is currently stored in an AWS Aurora SQL database (let's say host1/db1). The objective is to...
2 answers, 0 votes, 352 views, asked 8 months ago
Hi,
I have an architecture like the one below:
user uploads file -> S3 -> Lambda triggers Glue job -> Glue job pulls the file, reads its content, and saves it to a record in a table in Aurora Postgres
Everything is...
1 answer, 0 votes, 393 views, asked 8 months ago
We are in the process of importing data. The data will be provided as flat files or CSV files and stored in an Amazon S3 bucket. Each file is expected to be approximately 2 GB in size with around 200k...
1 answer, 0 votes, 412 views, asked 8 months ago
Hi,
In my environment, there are multiple JSON files in an S3 bucket. I would like to add all of them to an Athena table with the filtered values. I used ChatGPT for the Athena query to create...
2 answers, 0 votes, 2815 views, asked 8 months ago
Hello All Experts,
Please help with the below scenario.
Data is stored in the raw zone, and a column "ga4_dt" is extracted as a string in the format 'yyyymmdd', for example 20230108. I can't update the...
2 answers, 0 votes, 1130 views, asked 9 months ago
I am still learning to use the Glue ETL process for building new aggregate tables and need help optimizing my ETL job.
My ETL job is designed to run once per day in the mornings and pull in all the...
1 answer, 0 votes, 358 views, asked 9 months ago
Good day!
Is there any way to connect to a Greenplum database using AWS Glue? I need to perform DML as well as DDL operations in Greenplum.
I tried to use the psycopg2 library, because it worked fine in local...
1 answer, 0 votes, 271 views, asked 9 months ago
**How do I install the Python package dependencies that a user-defined package needs in a Glue Spark job?**
For example, I have used the redshift_connector package inside my custom package. In my Spark job,...
2 answers, 0 votes, 551 views, asked 9 months ago