Questions tagged with Extract Transform & Load Data
We have a Glue job that writes a large number of items to DynamoDB.
If a write to DynamoDB fails, how can we access the individual failed records in order to attempt to resolve and...
1 answer, 0 votes, 275 views, asked 24 days ago
Hi, I have created an external table in an AWS Glue Data Catalog database.
The table points to an LZ4-compressed file on S3.
The table definition looks like this:
```
CREATE EXTERNAL TABLE `myapplogs`(
...
```
1 answer, 0 votes, 281 views, asked 24 days ago
Why don't Glue Jobs and Glue Workflows have version control and aliases like Lambda?
I tried to develop data orchestration with S3, Glue Jobs, and Glue Workflows. After I developed it, I found that Glue Jobs and Glue Workflows don't have version control and alias...
0 answers, 0 votes, 171 views, asked a month ago
Hi team, first post, let me know if it provides a good explanation.
I'd like to know a way to minimize the effort for data ingestion.
We have two options as follows:
(1) CSV files from a file...
0 answers, 0 votes, 298 views, asked a month ago
I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data. However, I have not been able to load data into Apache Spark. Whenever I try to run...
1 answer, 0 votes, 326 views, asked a month ago
Hello!
I am new to AWS Glue and am starting to create data monitoring rules in it. I have tried multiple options with CustomSQL but cannot seem to find a solution.
My problem: I want to check...
1 answer, 0 votes, 154 views, asked a month ago
I am getting an error while connecting streaming data from Kinesis to Redshift with a few transformations using visual ETL (using an Amazon Kinesis - Glue Data Catalog table as the source). Schema is already...
1 answer, 0 votes, 201 views, asked a month ago
We are using Tableau, which runs a scheduled query against Athena.
It worked well until yesterday, but today I got the error below.
> HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split...
1 answer, 0 votes, 323 views, asked a month ago
Hello,
I have an AWS Glue job that is only supposed to perform an SQL query on the current status. Unfortunately, I always get the following error: "Error Category: QUERY_ERROR; AnalysisException:...
1 answer, 0 votes, 278 views, asked a month ago
Question:
We currently have approximately 100 tables in Delta format, partitioned by yyyy, mm, dd, hh, mm. Our current process involves reading these Delta tables via a crawler, cataloging them, and...
0 answers, 0 votes, 359 views, asked a month ago
I am reading a few GB (say 15 GB) of skewed Parquet data, applying a few transformations such as changing the data type of some columns, and then repartitioning (dataframe.repartition(120)) before writing it to S3 in...
1 answer, 0 votes, 292 views, asked 2 months ago
I have a Glue job that pushes data from Glue into OpenSearch.
The index ID column is created automatically when the data is inserted into OpenSearch.
I would like to pass the index ID _id...
1 answer, 0 votes, 330 views, asked 2 months ago