Questions tagged with Extract Transform & Load Data
I'm seeking guidance on a specific requirement and the recommended approach to achieve it.
My data is currently stored in an AWS Aurora SQL database (let's say host1/db1). The objective is to...
2 answers, 0 votes, 351 views, asked 8 months ago
Hi
I have an architecture like below
user upload file -> S3 -> lambda trigger glue job -> glue job pull the file, read content, and save to a record in a table in Aurora Postgres
Everything is...
1 answer, 0 votes, 390 views, asked 8 months ago
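For readers following the S3 -> Lambda -> Glue flow described in the question above, here is a minimal sketch of the Lambda side. It assumes the standard S3 event notification shape and the boto3 Glue client's `start_job_run` call; the job name `example-ingest-job` and the argument names `--source_bucket` and `--source_key` are hypothetical and not taken from the original post.

```python
from urllib.parse import unquote_plus


def build_job_arguments(event):
    """Pull the bucket and object key out of the first record of an S3 event."""
    s3_info = event["Records"][0]["s3"]
    return {
        # Argument names are illustrative; a Glue job reads whatever names it defines.
        "--source_bucket": s3_info["bucket"]["name"],
        # Object keys arrive URL-encoded in S3 event notifications.
        "--source_key": unquote_plus(s3_info["object"]["key"]),
    }


def start_glue_job(event, glue_client, job_name="example-ingest-job"):
    """Start the (hypothetical) Glue job, passing the uploaded file's location."""
    return glue_client.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(event),
    )
```

Inside a real Lambda handler, `glue_client` would typically be `boto3.client("glue")`; it is passed in here so the logic can be exercised without AWS access.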
We are in the process of importing data. The data will be provided as flat files or csv and stored in an Amazon S3 bucket. Each file is expected to be approximately 2GB in size with around 200k...
1 answer, 0 votes, 411 views, asked 8 months ago
Hi,
My environment has multiple JSON files in an S3 bucket. I would like to add all of them to the Athena table with the filtered values. I used ChatGPT for the Athena query to create...
2 answers, 0 votes, 2776 views, asked 8 months ago
Hello All Experts,
Please help with the below scenario.
Data is stored in the raw zone, and a column "ga4_dt" is extracted as a string in the format 'yyyymmdd', for example 20230108. I can't update the...
2 answers, 0 votes, 1118 views, asked 8 months ago
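The question above is cut off, so its exact goal is not known. Assuming the aim is to interpret a 'yyyymmdd' string such as 20230108 as a date, a minimal plain-Python sketch follows. The helper name `parse_ga4_dt` is hypothetical; in Spark or Athena the equivalent would be a built-in date-parsing function rather than this helper.

```python
from datetime import date, datetime


def parse_ga4_dt(value: str) -> date:
    """Convert a 'yyyymmdd' string (for example '20230108') into a date.

    Raises ValueError if the string does not match the expected format.
    """
    return datetime.strptime(value.strip(), "%Y%m%d").date()


if __name__ == "__main__":
    print(parse_ga4_dt("20230108").isoformat())
```

Failing with `ValueError` on malformed input is a deliberate choice in this sketch: it surfaces bad rows in the raw zone instead of silently producing nulls.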
I am still learning to use the Glue ETL process for building new aggregate tables and need help optimizing my ETL job.
My ETL job is designed to run once per day in the mornings and pull in all the...
1 answer, 0 votes, 355 views, asked 8 months ago
Good day!
Is there any way to connect to a Greenplum database using AWS Glue? I need to perform DML as well as DDL operations in Greenplum.
I tried to use the psycopg2 library, because it worked fine in local...
1 answer, 0 votes, 269 views, asked 9 months ago
**How do I install the Python package dependencies that a user-defined package needs in a Glue Spark job?**
For example, I have used the redshift_connector package inside my custom package. In my Spark job,...
2 answers, 0 votes, 542 views, asked 9 months ago
Hello Members,
Please help with the issue below. I am not able to find it in the documentation.
I am trying the following:
_____________________________________________
silver_target = glueContext.getSink(
...
2 answers, 1 vote, 335 views, asked 9 months ago
Unable to install external Python libraries (e.g. redshift_connector==2.0.913) in an AWS Glue Spark job through the job parameter "--additional-python-modules".
1 answer, 0 votes, 388 views, asked 9 months ago
Hello,
I'm writing a custom transform where I want to use mode within pyspark.sql.functions but I get the same issue irrespective of whether I use * or import the specific module. How can I resolve...
0 answers, 0 votes, 89 views, asked 9 months ago
I am running PySpark in a Glue 3 notebook. The %additional_python_modules magic works well in the absence of %connections. But if I add %connections, the job ignores %additional_python_modules and doesn't...
1 answer, 0 votes, 266 views, asked 9 months ago