Questions tagged with Extract Transform & Load Data
I am reading multiple files from S3 and writing the output to a Redshift database. Below is my code to read all the files from an S3 location (s3://abc/oms/YFS_CATEGORY_ITEM/):
```
yfs_category_item_df =...
```
2 answers, 0 votes, 493 views, asked a month ago
We have a Glue job that writes a large number of items to DynamoDB.
**If a write to DynamoDB fails, how can we access the individual failed records in order to attempt to resolve and...**
1 answer, 0 votes, 277 views, asked a month ago
Hi, I have created an external table in an AWS Glue Data Catalog database.
The table points to an LZ4-compressed file on S3.
The table definition looks like this:
```
CREATE EXTERNAL TABLE `myapplogs`(
...
```
1 answer, 0 votes, 283 views, asked a month ago
Why don't Glue Jobs and Glue Workflows have version control and aliases like Lambda?
I tried to develop data orchestration with S3, Glue Jobs, and Glue Workflows. After developing it, I found that Glue Jobs and Glue Workflows don't have version control and alias...
0 answers, 0 votes, 174 views, asked a month ago
Hi team, first post, let me know if it provides a good explanation.
I'd like to know a way to minimize the effort for data ingestion.
We have two options as follows:
(1) CSV files from a file...
0 answers, 0 votes, 299 views, asked a month ago
I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data; however, I have not been able to load data into Apache. Whenever I try to run...
1 answer, 0 votes, 330 views, asked a month ago
Hello!
I am new to AWS Glue and am starting to create data monitoring rules in it. I have tried multiple options with CustomSQL but cannot seem to find a solution.
My problem: I want to check...
1 answer, 0 votes, 155 views, asked a month ago
I am getting an error while connecting streaming data from Kinesis to Redshift with a few transformations using visual ETL (using an Amazon Kinesis Glue Data Catalog table as the source). Schema is already...
1 answer, 0 votes, 206 views, asked a month ago
We are using Tableau, which has a scheduled query against Athena.
It worked well until yesterday, but today I got the issue below.
> HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split...
1 answer, 0 votes, 328 views, asked a month ago
Hello,
I have an AWS Glue job that is only supposed to perform an SQL query on the current status. Unfortunately, I always get the following error: "Error Category: QUERY_ERROR; AnalysisException:...
1 answer, 0 votes, 280 views, asked 2 months ago
Question:
We currently have approximately 100 tables in Delta format, partitioned by yyyy, mm, dd, hh, mm. Our current process involves reading these Delta tables via a crawler, cataloging them, and...
0 answers, 0 votes, 361 views, asked 2 months ago
I am reading a few GB (say 15 GB) of skewed Parquet data, applying a few transformations such as data type changes for some columns, and then repartitioning (dataframe.repartition(120)) before writing it to S3 in...
1 answer, 0 votes, 294 views, asked 2 months ago