Questions tagged with Extract Transform & Load Data
Content language: English
Select up to 5 tags to filter
Sort by most recent
Browse through the questions and answers listed below or filter and sort to narrow down your results.
I have a glue job (job_a) that starts through a Lambda. When a file is placed inside an S3 bucket, I am triggering a glue job (job_a) through Lambda. My requirement is, once this glue job (job_a), is...
1
answers
0
votes
256
views
asked 23 days agolg...
I am interested particularly in `%additional_python_modules` and I always get this error:
`UsageError: Line magic function `%additional_python_modules` not found.`
The same error is thrown when I...
2
answers
0
votes
71
views
asked 23 days agolg...
I am running a PoC around integrating the Glue lineage into the [DataHub](https://datahubproject.io/). I have based my research on this set of AWS blog posts...
1
answers
0
votes
452
views
asked a month agolg...
Hi, I am using AWS glue studio to read from a DDB table with direct DDB connection. So far my visual diagram has two nodes:
1. Source DDB table node -> Here preview takes 5 minutes for only 2 rows of...
1
answers
0
votes
126
views
asked a month agolg...
Is it possible to wildcard the include path for a MongoDB crawler. I've tried a number of different options similar to the options available for JDBC and other relational database connections, but...
1
answers
0
votes
77
views
asked a month agolg...
I receive a file from external vendor. The file is in ***.dat*** format. Once the file arrives into my S3 bucket, I have to trigger a AWS Glue job to read the file and load into my Redshift table. I...
2
answers
0
votes
108
views
asked a month agolg...
My dataframe has 2 columns - name and age. If there is name Manish with 2 rows one with age 16 and another with age 23 , will AWS data quality fail both, pass both or one fail one pass. for below...
1
answers
0
votes
87
views
asked a month agolg...
I have a glue job that transforms data from glue table. And I encounter the following error. It does not occur for every run of the job.
I have looked at a few documentarians, it seems to be coming...
1
answers
0
votes
285
views
asked a month agolg...
Hello
I am using Glue Pyspark to handle ETL, but when I tried running script with bookmark, I found out that if one script handles more than one table and one of them doesn't have changes or...
2
answers
0
votes
273
views
asked a month agolg...
When I try and add a new BigQuery connection as a sink for glue I am getting the following error:
InvalidInputException: jdbcEnforceSsl: is not defined in the schema and the schema does not allow...
1
answers
0
votes
93
views
asked a month agolg...
I tried AWS Glue data quality dynamic rules in my AWS Glue pipeline. I wrote below rule
RowCount > avg(last(3))
Then I processed 3 csv files with 1000,10000 and 100 rows. Then in 4th run I again...
1
answers
0
votes
69
views
asked a month agolg...
It is not at present possible to modify placeholder at all, including deletion. Deleting them is at least in principle possible as it doesn’t actually modify the object, but since the placeholders...
1
answers
0
votes
102
views
asked a month agolg...