Questions tagged with Extract Transform & Load Data
Hello team,
So, I built an ETL in Python using PySpark. I have a bastion EC2 MySQL database that is a copy of a production environment.
Every day it is copying the prod at around 2 o'clock, and my...
1 answer, 0 votes, 205 views, asked 3 months ago
Hello! According to the [documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-kinesis-home.html), it should be possible to write data to Kinesis from Glue...
2 answers, 0 votes, 1218 views, asked 3 months ago
I have a Glue job (job_a) that is started by a Lambda function whenever a file is placed in an S3 bucket. My requirement is, once this Glue job (job_a) is...
1 answer, 0 votes, 346 views, asked 3 months ago
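For readers following the question above, the S3-to-Lambda-to-Glue trigger it describes could look roughly like the sketch below. It covers only the part of the setup that is stated in the question; the rest of the requirement is cut off in the snippet. The job name `job_a` comes from the question, while the argument names `--source_bucket` and `--source_key` and the helper function names are hypothetical choices made for illustration.

```python
import urllib.parse


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def start_job_for_event(event, glue_client, job_name="job_a"):
    """Start one Glue job run per uploaded object and return the run IDs.

    glue_client is expected to behave like boto3.client("glue").
    The argument names below are hypothetical; use whatever the job reads.
    """
    run_ids = []
    for bucket, key in extract_s3_objects(event):
        response = glue_client.start_job_run(
            JobName=job_name,
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(response["JobRunId"])
    return run_ids


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above stay testable without it.
    import boto3

    return {"job_run_ids": start_job_for_event(event, boto3.client("glue"))}
```

Passing the Glue client in as a parameter keeps the event-parsing logic separate from the AWS call, so it can be exercised locally with a stub client.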
I am interested particularly in `%additional_python_modules` and I always get this error:
`` UsageError: Line magic function `%additional_python_modules` not found. ``
The same error is thrown when I...
2 answers, 0 votes, 136 views, asked 3 months ago
I am running a PoC around integrating Glue lineage into [DataHub](https://datahubproject.io/). I have based my research on this set of AWS blog posts...
1 answer, 0 votes, 553 views, asked 3 months ago
Hi, I am using AWS Glue Studio to read from a DDB table with a direct DDB connection. So far my visual diagram has two nodes:
1. Source DDB table node -> Here preview takes 5 minutes for only 2 rows of...
1 answer, 0 votes, 253 views, asked 3 months ago
Is it possible to wildcard the include path for a MongoDB crawler? I've tried a number of different options similar to the options available for JDBC and other relational database connections, but...
1 answer, 0 votes, 130 views, asked 3 months ago
I receive a file from an external vendor. The file is in ***.dat*** format. Once the file arrives in my S3 bucket, I have to trigger an AWS Glue job to read the file and load it into my Redshift table. I...
2 answers, 0 votes, 205 views, asked 3 months ago
My DataFrame has 2 columns: name and age. If the name Manish appears in 2 rows, one with age 16 and another with age 23, will AWS data quality fail both, pass both, or fail one and pass the other? For below...
1 answer, 0 votes, 220 views, asked 3 months ago
I have a Glue job that transforms data from a Glue table, and I encounter the following error. It does not occur on every run of the job.
I have looked at a few documentation pages; it seems to be coming...
1 answer, 0 votes, 353 views, asked 3 months ago
Hello
I am using Glue PySpark to handle ETL, but when I tried running a script with bookmarks, I found that if one script handles more than one table and one of them doesn't have changes or...
2 answers, 0 votes, 371 views, asked 3 months ago
When I try to add a new BigQuery connection as a sink for Glue, I get the following error:
InvalidInputException: jdbcEnforceSsl: is not defined in the schema and the schema does not allow...
1 answer, 0 votes, 171 views, asked 3 months ago