Questions tagged with Extract Transform & Load Data
I tried AWS Glue Data Quality dynamic rules in my AWS Glue pipeline. I wrote the rule below:
RowCount > avg(last(3))
Then I processed 3 CSV files with 1000, 10000 and 100 rows. Then in the 4th run I again...
1 answer, 0 votes, 208 views, asked 3 months ago
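To make the numbers in the question above concrete, here is a minimal sketch of the arithmetic only, not of how AWS Glue Data Quality evaluates dynamic rules internally. It assumes `avg(last(3))` means the arithmetic mean of the row counts from the three most recent runs. The helper names `dynamic_threshold` and `row_count_rule_passes` are hypothetical.

```python
from statistics import mean


def dynamic_threshold(history, k=3):
    """Mean of the k most recent values (assumed meaning of avg(last(k)))."""
    if len(history) < k:
        raise ValueError("not enough history to evaluate the rule")
    return mean(history[-k:])


def row_count_rule_passes(current, history, k=3):
    """Assumed reading of the rule: RowCount > avg(last(k))."""
    return current > dynamic_threshold(history, k)


# Row counts of the three runs described in the question.
row_counts = [1000, 10000, 100]
threshold = dynamic_threshold(row_counts)  # (1000 + 10000 + 100) / 3
```

Under that assumption, a fourth run would need more than 3700 rows for the rule to pass.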
It is not at present possible to modify placeholder at all, including deletion. Deleting them is at least in principle possible as it doesn’t actually modify the object, but since the placeholders...
1 answer, 0 votes, 212 views, asked 3 months ago
I am running an AWS Glue job that reads from Redshift (schema_1) and writes back to Redshift (schema_2). This is done as shown below:
```
Redshift_read =...
```
1 answer, 0 votes, 453 views, asked 3 months ago
Hi,
I have recently started working with AWS Glue. I have created a Visual ETL job and it ran successfully. I noticed it had somehow created an extra S3 bucket instead of using the desired bucket I...
1 answer, 0 votes, 157 views, asked 3 months ago
How do I add a sort key / dist key when writing a Glue DynamicFrame to Redshift?
1 answer, 0 votes, 177 views, asked 3 months ago
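One possible approach to the sort/dist key question above, sketched under assumptions: create the target table with the desired keys in a `preactions` SQL statement, so that the table already exists with those keys when the write happens. The helper `redshift_write_options` and every table, column, and database name below are hypothetical. The listing does not show the answer given in the thread, so this may differ from it.

```python
def redshift_write_options(table, database, columns_ddl, dist_key, sort_keys):
    """Build connection options whose preaction creates the table with keys.

    All argument values are illustrative; adapt the DDL to the real schema.
    """
    sort_clause = ", ".join(sort_keys)
    create_sql = (
        f"CREATE TABLE IF NOT EXISTS {table} ({columns_ddl}) "
        f"DISTKEY({dist_key}) SORTKEY({sort_clause});"
    )
    return {
        "dbtable": table,
        "database": database,
        # SQL run before the load; the key definitions live here.
        "preactions": create_sql,
    }


options = redshift_write_options(
    table="schema_2.orders",
    database="dev",
    columns_ddl="customer_id BIGINT, order_date DATE, amount DECIMAL(12,2)",
    dist_key="customer_id",
    sort_keys=["order_date", "customer_id"],
)

# Inside a Glue job (not runnable outside Glue), the dict would be passed as:
# glueContext.write_dynamic_frame.from_jdbc_conf(
#     frame=dyf,                               # hypothetical DynamicFrame
#     catalog_connection="my-redshift-conn",   # hypothetical connection name
#     connection_options=options,
#     redshift_tmp_dir="s3://my-temp-bucket/tmp/",  # hypothetical path
# )
```

Because the statement uses `CREATE TABLE IF NOT EXISTS`, it does not change the keys of a table that already exists.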
Error Category: RESOURCE_NOT_FOUND_ERROR; An error occurred while calling o123.pyWriteDynamicFrame. Requested resource not found: Table: datacatalog_table_name not found (Service: AmazonDynamoDBv2;...
1 answer, 0 votes, 185 views, asked 3 months ago
I have crawled the schema of my DynamoDB table using AWS Glue crawler and the table is now shown under the tables section in AWS Glue. However, the table is not being shown under Athena database...
1 answer, 0 votes, 258 views, asked 3 months ago
Good day,
I am trying to build a no/zero-code architecture. I was planning to use MWAA for orchestration, but I was tasked with looking at alternatives to keep it simple.
I was hoping Redshift query...
2 answers, 0 votes, 415 views, asked 4 months ago
Bug: SageMaker Canvas can't import parquet files with numpy.nan/None/pandas.NA as first row value
I'm trying to create a tabular dataset in SageMaker Canvas Data Wrangler by importing a local parquet file created with the pandas Python library. I succeed in loading the file and can preview it....
1 answer, 0 votes, 147 views, asked 4 months ago
I have an existing AWS Glue script that has been successfully running in Glue 2.4 for some time. I went in today to upgrade it to Glue 3.0 and am unable to connect to my database. I am simply reading...
1 answer, 0 votes, 465 views, asked 4 months ago
We're using S3 Select SelectObjectContent to convert CSV input to JSON output.
The input CSV files are very large, so we're passing chunks using ScanRange. Recently we ran into an issue with CSV files...
1 answer, 0 votes, 280 views, asked 4 months ago
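The chunking described in the question above can be sketched as follows. This is only an illustration of splitting an object into byte ranges, assuming that `ScanRange` takes `Start` and `End` byte offsets and that `End` is inclusive. The generator name `scan_ranges` and the sizes used are hypothetical, and the sketch does not address the truncated issue the poster ran into.

```python
def scan_ranges(object_size, chunk_size):
    """Yield non-overlapping, inclusive byte ranges that cover the object."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size > 0")
    start = 0
    while start < object_size:
        # The last chunk is clipped to the final byte of the object.
        end = min(start + chunk_size - 1, object_size - 1)
        yield {"Start": start, "End": end}
        start = end + 1


# Tiny illustrative sizes; real CSV objects would use much larger chunks.
ranges = list(scan_ranges(object_size=25, chunk_size=10))
```

Each dictionary has the shape that would be supplied as the `ScanRange` argument of a `SelectObjectContent` request, one request per chunk.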
Hi,
I am considering Glue to connect to a third party application's database (Oracle) and bring a data set (in excess of 1M rows) obtained by joining multiple tables at source end. The destination...
1 answer, 0 votes, 352 views, asked 4 months ago