Unanswered Questions tagged with Extract Transform & Load Data
Facing this error while running an AWS Glue job
0 answers, 0 votes, 49 views, asked 2 years ago
I need to pre-process some data on S3 before the Glue Crawler crawls the data. For this I created an S3 Object Lambda to do the pre-processing. If I test the Object Lambda using the CLI, it provides...
0 answers, 1 vote, 181 views, asked 2 years ago
In the [documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html) I can see that we can add additional modules from pip using the following...
0 answers, 0 votes, 81 views, asked 2 years ago
Hello,
Currently I am trying to read CSV files in my S3 bucket with the following format:
```
header 1, header 2, header 3
value 1, value 2, value 3
header 1, header 2, header 3, header 4
value 1,...
```
0 answers, 1 vote, 136 views, asked 2 years ago
I am using AWS PyDeequ in Databricks to perform data quality checks. When I run the code below, it provides only metrics results as output (like Check_level, check_status, constraint,...
0 answers, 0 votes, 166 views, asked 2 years ago
Hello, I am trying to use Glue to take an input file, do my required transformations, then output the columns in a specific order. I also want to output columns that may not be present in the input...
0 answers, 0 votes, 63 views, asked 2 years ago
I have a parameterized Glue job that is called in parallel (25 Glue jobs) through Step Functions. When bookmarks are enabled, a version mismatch exception is thrown; when they are disabled, it runs fine.
...
0 answers, 0 votes, 154 views, asked 2 years ago
I have a Glue job that writes to a Data Catalog. In the Data Catalog I originally set it up as CSV, and all works fine. Now I would like to try to use Parquet for the Data Catalog. I thought I would...
0 answers, 0 votes, 136 views, asked 2 years ago
I have a Glue ETL job which creates partitions during the job
```
additionalOptions = {"enableUpdateCatalog": True, "updateBehavior": "LOG"}
additionalOptions["partitionKeys"] = ["year",...
```
0 answers, 0 votes, 116 views, asked 2 years ago
When developing some Glue scripts from a successful Crawler run from a JDBC Oracle data source, I am encountering an error that I cannot resolve.
```
An error occurred while calling...
```
0 answers, 0 votes, 131 views, asked 2 years ago
I created a Data Catalog with a table that I manually defined.
I run my ETL job and all works well.
I added partitions to both the table in the Data Catalog and the ETL job. It creates the...
0 answers, 0 votes, 275 views, asked 2 years ago
Hi! I am doing a longitudinal study and am trying to prevent previous workers from taking the second part of my survey. In order to do this, I have been following instructions on the MTurk blog...
0 answers, 0 votes, 75 views, asked 2 years ago