Questions tagged with Extract Transform & Load Data
Hi,
There are multiple JSON files in an S3 bucket, and I would like to load all of them into an Athena table with filtered values. I used ChatGPT to write the Athena query to create...
2 answers, 0 votes, 3051 views, asked 9 months ago
Hello All Experts,
Please help with the below scenario.
Data is stored in the raw zone and a column "ga4_dt" is extracted as a string in the format 'yyyymmdd', for example 20230108. I can't update the...
2 answers, 0 votes, 1207 views, asked 9 months ago
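One possible way to handle a 'yyyymmdd' string such as 20230108 is to parse it into a real date before filtering or partitioning on it. The following is only a minimal sketch in plain Python, assuming the value arrives as a string; the helper name `parse_ga4_dt` is hypothetical and does not come from the question.

```python
from datetime import date, datetime


def parse_ga4_dt(value: str) -> date:
    """Parse a 'yyyymmdd' string (hypothetical helper) into a date."""
    # strptime raises ValueError when the input does not match the
    # format, for example '2023-01-08' or 'not-a-date'.
    return datetime.strptime(value.strip(), "%Y%m%d").date()


print(parse_ga4_dt("20230108").isoformat())  # 2023-01-08
```

In a Spark job, the same conversion could likely be expressed with `to_date(col("ga4_dt"), "yyyyMMdd")` from `pyspark.sql.functions`, though whether that fits the asker's pipeline is an assumption.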
I am still learning to use the Glue ETL process for building new aggregate tables and need help optimizing my ETL job.
My ETL job is designed to run once per day in the mornings and pull in all the...
1 answer, 0 votes, 391 views, asked 9 months ago
Good day!
Is there any way to connect to a Greenplum database using AWS Glue? I need to perform DML as well as DDL operations in Greenplum.
I tried to use the psycopg2 library, because it worked fine in local...
1 answer, 0 votes, 289 views, asked 9 months ago
**How do I install Python package dependencies that support a user-defined package for a Glue Spark job?**
For example, I have used the redshift_connector package inside my custom package. In my Spark job,...
2 answers, 0 votes, 591 views, asked 9 months ago
Hello Members,
Please help with the issue below.
I am not able to find it in the documentation.
I am trying the following:
_____________________________________________
silver_target = glueContext.getSink(
...
2 answers, 1 vote, 357 views, asked 10 months ago
Unable to install external Python libraries (e.g. redshift_connector==2.0.913) in an AWS Glue Spark job through the job parameter "--additional-python-modules".
1 answer, 0 votes, 411 views, asked 10 months ago
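For context, `--additional-python-modules` is a job parameter whose value is a comma-separated list of pip-style requirements. Below is a minimal sketch of building that parameter map in Python; the helper name `build_default_arguments` is hypothetical, and whether a correctly formed value resolves the failure in this question (which may depend on network access or the Glue version) is not known.

```python
from typing import Dict, List


def build_default_arguments(modules: List[str]) -> Dict[str, str]:
    """Build a job-parameter map requesting extra Python modules.

    Hypothetical helper: it only formats the value and does not call AWS.
    """
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [m.strip() for m in modules if m.strip()]
    if not cleaned:
        return {}
    return {"--additional-python-modules": ",".join(cleaned)}


args = build_default_arguments(["redshift_connector==2.0.913"])
print(args)
```

A map like this could then be supplied as the job's default arguments, for instance through the `DefaultArguments` field when creating the job with boto3, or entered by hand under Job parameters in the console.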
Hello,
I'm writing a custom transform where I want to use mode within pyspark.sql.functions but I get the same issue irrespective of whether I use * or import the specific module. How can I resolve...
0 answers, 0 votes, 92 views, asked 10 months ago
I am running PySpark in a Glue 3 notebook. %additional_python_modules works well in the absence of %connections, but if I add %connections the job ignores %additional_python_modules and doesn't...
1 answer, 0 votes, 291 views, asked 10 months ago
Trying to write the records from an S3 text file to Redshift. It runs when the record count is around 10000, but it runs long and the connection then times out when trying to write the entire file (50K...
1 answer, 0 votes, 204 views, asked 10 months ago
I am trying to use Amazon Grafana with files on Amazon S3. For that, I need to change the date format of the original file to suit Grafana. I use the following command:
SELECT...
1 answer, 0 votes, 313 views, asked 10 months ago
Hi,
I have a list of Glue jobs that are up and running. Starting from 2023/08/14, I'm getting a lot of errors from CoarseGrainedExecutorBackend like this:
**ERROR CoarseGrainedExecutorBackend:**...
1 answer, 0 votes, 1155 views, asked 10 months ago