Questions tagged with Extract Transform & Load Data
I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data. However, I have not been able to load data into Apache. Whenever I try to run...
1 answer, 0 votes, 326 views, asked a month ago
Hello!
I am new to AWS Glue and I am starting to create data monitoring rules in AWS Glue. I have tried multiple options with CustomSQL but cannot seem to find the solution.
My problem: I want to check...
1 answer, 0 votes, 151 views, asked a month ago
I am getting an error while connecting streaming data from Kinesis to Redshift with a few transformations using visual ETL (using an Amazon Kinesis Glue Data Catalog table as the source). Schema is already...
1 answer, 0 votes, 197 views, asked a month ago
We are using Tableau, and Tableau has a schedule that queries Athena.
It worked well until yesterday, but I got the issue below today.
> HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split...
1 answer, 0 votes, 316 views, asked a month ago
Hello,
I have an AWS Glue job that is only supposed to perform an SQL query on the current status. Unfortunately, I always get the following error: "Error Category: QUERY_ERROR; AnalysisException:...
1 answer, 0 votes, 274 views, asked a month ago
Question:
We currently have approximately 100 tables in Delta format, partitioned by yyyy, mm, dd, hh, mm. Our current process involves reading these Delta tables via a crawler, cataloging them, and...
0 answers, 0 votes, 358 views, asked a month ago
I am reading a few GB (say 15 GB) of skewed Parquet data, applying a few transformations such as data type changes for some columns, and then repartitioning (dataframe.repartition(120)) before writing it to S3 in...
1 answer, 0 votes, 289 views, asked a month ago
I have a Glue job which pushes data from Glue into OpenSearch.
The index ID column is automatically created while inserting the data into OpenSearch.
I would like to pass the index id _id...
1 answer, 0 votes, 324 views, asked 2 months ago
How can I "automatically" add new partitions to a Glue table based on a Hive-formatted S3 bucket?
I have a bucket containing AWS AppStream logs in the format `s3://appstream-logs.../sessions/schedule=DAILY/year=2024/month=04/day=03/daily-session-report-2024-04-03.csv`. I have made this data available...
2 answers, 0 votes, 128 views, asked 2 months ago
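For readers unfamiliar with the Hive-style layout mentioned in the question above, here is a minimal sketch of how the `key=value` path segments map to partition values. It only parses a path string and does not touch Glue or S3. The function name `hive_partitions` and the bucket name `example-bucket` are hypothetical, since the real bucket name is truncated in the question.

```python
from urllib.parse import urlparse


def hive_partitions(s3_uri: str) -> dict:
    """Return the key=value partition segments found in a Hive-style S3 URI."""
    # For "s3://bucket/a/b", urlparse puts the bucket in netloc and "/a/b" in path.
    path = urlparse(s3_uri).path
    partitions = {}
    # Skip the last segment, which is the object (file) name.
    for segment in path.strip("/").split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:  # keep only segments shaped like key=value
            partitions[key] = value
    return partitions


# Hypothetical bucket name; the path layout follows the one quoted in the question.
uri = (
    "s3://example-bucket/sessions/schedule=DAILY/year=2024/month=04/day=03/"
    "daily-session-report-2024-04-03.csv"
)
print(hive_partitions(uri))
```

Each `key=value` segment becomes one partition column, which is what a partitioned table over this prefix would expose; the plain `sessions` segment is ignored because it has no `=`.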
I am working on migrating data from MySQL to S3 using AWS DMS. I want to employ wildcard mapping for the schema name in the DMS task's selection rules. Specifically, I aim to include tables from...
6 answers, 0 votes, 188 views, asked 2 months ago
We are encountering an issue where we're utilizing the "super" datatype. The column in the Parquet file we receive has a maximum length of 192K. How should we handle this data? Are there alternative...
2 answers, 0 votes, 252 views, asked 2 months ago
Example: s3://bucket1/mytable/ --> east-2 bucket folder with same schema
s3://bucket2/mytable/ --> west-2 bucket folder with same schema
Can we create a single table from these two...
3 answers, 0 votes, 549 views, asked 2 months ago