Questions tagged with Extract Transform & Load Data
Browse through the questions and answers listed below.
**Spun up an EMR instance:** emr-6.10.0 with Spark 3.3.1, HBase 2.4.15, Hive 3.1.3, JupyterHub 1.5.0, Hadoop 3.3.3, ZooKeeper 3.5.10, Zeppelin 0.10.1, Phoenix 5.1.2, Presto 0.278, ...
1 answer · 1 vote · 313 views · asked 9 months ago
Hi team, can I ask why Glue is generating so many parquet files from my ETL job?
2 answers · 0 votes · 401 views
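A likely explanation for the question above: Spark writes one Parquet file per partition, so a job whose DataFrame has hundreds of partitions emits hundreds of files. A minimal sketch of the usual fix, with placeholder S3 paths, is to reduce the partition count before writing:

```python
# Minimal sketch: fewer output files by reducing partitions before write.
# S3 paths are placeholders.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

df = spark.read.parquet("s3://my-bucket/input/")  # placeholder source

# Spark writes one Parquet file per partition; coalesce to a small,
# fixed count to get fewer, larger files.
df.coalesce(8).write.mode("overwrite").parquet("s3://my-bucket/output/")
```

`coalesce(8)` merges partitions without a full shuffle; `repartition(8)` shuffles but produces more evenly sized files.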
I am using AWS Glue, via the Glue console, to create ETL jobs that transfer data between Salesforce and an AWS S3 bucket. I am using third-party (Progress DataDirect and CData) connectors to connect...
1 answer · 0 votes · 304 views · asked 9 months ago
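For the connector question above, a generic JDBC read from a Glue Spark job might look like the sketch below. The URL and driver class are placeholders to be taken from the DataDirect or CData documentation, and the connector JAR must be attached to the job (for example via `--extra-jars`):

```python
# Sketch of a generic JDBC read from a Glue Spark job. The URL, driver
# class, and credentials below are placeholders; substitute whatever the
# DataDirect/CData connector documentation specifies.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:datadirect:sforce://login.salesforce.com")  # placeholder
    .option("driver", "com.ddtek.jdbc.sforce.SForceDriver")          # placeholder
    .option("dbtable", "Account")
    .option("user", "my-user")          # placeholder credentials
    .option("password", "my-password")
    .load()
)

df.write.mode("overwrite").parquet("s3://my-bucket/salesforce/account/")
```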
Our current setup has AWS Glue in operation: data is extracted from one SQL Server and loaded into another SQL Server through AWS Glue Studio for selected tables.
Is there a...
1 answer · 0 votes · 197 views · asked 9 months ago
How do I connect to Amazon RDS - Microsoft SQL Server from Glue Spark-type jobs using Python?
1 answer · 0 votes · 593 views · asked 9 months ago
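A minimal sketch for the question above, assuming the standard SQL Server JDBC driver and a Glue connection that gives the job network access to the RDS instance; host, database, table, and credentials are placeholders:

```python
# Minimal sketch: read an RDS SQL Server table from a Glue Spark job.
# Host, database, table, and credentials are placeholders; in practice the
# password should come from Secrets Manager, and the job needs a Glue
# connection for network access to the RDS instance.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://my-rds-host:1433;databaseName=mydb")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .option("dbtable", "dbo.my_table")
    .option("user", "admin")            # placeholder credentials
    .option("password", "my-password")
    .load()
)

df.show(5)
```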
I want to create a Glue job to process multiple tables in parallel. If all the tables are to be processed in the same manner, is it possible to do it in only one Glue job?
1 answer · 0 votes · 403 views · asked 9 months ago
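For the question above: yes, one Glue job can apply the same logic to many tables. A sketch, with hypothetical table names and S3 paths, that issues the per-table work from a thread pool so Spark runs the loads concurrently:

```python
# Sketch: one Glue job applying the same transform to several tables.
# Table names and S3 paths are hypothetical.
from concurrent.futures import ThreadPoolExecutor

from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

TABLES = ["orders", "customers", "products"]

def process(table: str) -> None:
    # Same transform for every table: read, deduplicate, write.
    df = spark.read.parquet(f"s3://my-bucket/raw/{table}/")
    df.dropDuplicates().write.mode("overwrite").parquet(
        f"s3://my-bucket/clean/{table}/"
    )

# Actions issued from separate threads become concurrent Spark jobs that
# share the Glue job's executors.
with ThreadPoolExecutor(max_workers=3) as pool:
    list(pool.map(process, TABLES))
```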
I have a CSV file delivered by an external vendor to S3, and this file has some non-ASCII/junk characters. Before loading this into a Redshift table, I need to remove these characters. I tried TRIMBLANKS...
1 answer · 0 votes · 409 views · asked 9 months ago
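On the question above: TRIMBLANKS only strips trailing whitespace during COPY; it does not remove non-ASCII bytes. One option is to clean the file before loading, as in this sketch with placeholder bucket and key names (Redshift's ACCEPTINVCHARS COPY option is an alternative that replaces invalid characters instead of failing the load):

```python
# Sketch: strip non-ASCII bytes from a CSV in S3 before the Redshift COPY.
# Bucket and key names are placeholders.
import boto3

s3 = boto3.client("s3")

obj = s3.get_object(Bucket="my-bucket", Key="incoming/file.csv")
body = obj["Body"].read()

# Keep tab, newline, carriage return, and printable ASCII; drop the rest.
cleaned = bytes(b for b in body if b in (9, 10, 13) or 32 <= b <= 126)

s3.put_object(Bucket="my-bucket", Key="clean/file.csv", Body=cleaned)
```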
I was trying to perform a Glue ETL transformation and store the result in both an Amazon Redshift Serverless database and S3. However, even the console-generated PySpark script fails. Almost none of the methods...
0 answers · 0 votes · 169 views · asked 9 months ago
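For the question above, a sketch of writing one frame to both targets from a Glue job. The Glue connection name, database, table, and S3 paths are placeholders; Redshift Serverless is reached through an ordinary Glue JDBC connection plus a temporary S3 staging directory:

```python
# Sketch: write one frame to S3 (Parquet) and to Redshift from a Glue job.
# Connection name, database, table, and paths are placeholders.
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

df = spark.read.parquet("s3://my-bucket/input/")  # placeholder source
dyf = DynamicFrame.fromDF(df, glue_context, "dyf")

# Target 1: S3 as Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/output/"},
    format="parquet",
)

# Target 2: Redshift through a Glue JDBC connection; Glue stages the data
# in the temporary S3 directory and issues a COPY.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=dyf,
    catalog_connection="my-redshift-connection",
    connection_options={"dbtable": "public.my_table", "database": "dev"},
    redshift_tmp_dir="s3://my-bucket/tmp/",
)
```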
I'm seeking guidance on a specific requirement and the recommended approach to achieve it.
My data is currently stored in an AWS Aurora SQL database (let's say host1/db1). The objective is to...
2 answers · 0 votes · 381 views · asked 9 months ago
Hi, I have an architecture like below:
user uploads file -> S3 -> Lambda triggers Glue job -> Glue job pulls the file, reads the content, and saves it to a record in a table in Aurora Postgres
Everything is...
1 answer · 0 votes · 432 views · asked 9 months ago
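A sketch of the Lambda piece of the architecture above, assuming a hypothetical job name and argument keys; the S3 event supplies the bucket and key, which are forwarded to the Glue job as arguments:

```python
# Sketch of the Lambda handler: an S3 upload event starts the Glue job and
# forwards the object location. Job name and argument keys are hypothetical.
import boto3

glue = boto3.client("glue")

def handler(event, context):
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    # The Glue script reads these via awsglue.utils.getResolvedOptions.
    response = glue.start_job_run(
        JobName="load-file-to-aurora",
        Arguments={"--bucket": bucket, "--key": key},
    )
    return response["JobRunId"]
```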
We are in the process of importing data. The data will be provided as flat files or CSV and stored in an Amazon S3 bucket. Each file is expected to be approximately 2 GB in size with around 200k...
1 answer · 0 votes · 439 views · asked 9 months ago
Hi,
The environment is this: there are multiple JSON files in an S3 bucket. I would like to add all of them to an Athena table with the filtered values. I used ChatGPT for the Athena query to create...
2 answers · 0 votes · 3034 views · asked 9 months ago
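For the question above, one common pattern is a single Athena table whose LOCATION is the S3 prefix holding all the JSON files (one JSON object per line), with the filter applied in the query. A sketch with made-up database, table, column, and bucket names, submitted through boto3:

```python
# Sketch: one Athena table over every newline-delimited JSON file under an
# S3 prefix, then a filtered query. Database, table, columns, and buckets
# are made up.
import boto3

athena = boto3.client("athena")

ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS mydb.events (
  id string,
  status string,
  payload string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://my-bucket/json/'
"""

query = "SELECT id, payload FROM mydb.events WHERE status = 'ACTIVE'"

for sql in (ddl, query):
    athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
    )
```

Athena reads every object under LOCATION, so newly arrived JSON files are picked up automatically; the SerDe expects one JSON object per line.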