Questions tagged with Extract Transform & Load Data

Best practices for data ingestion - csv or mongodb

Hi team, this is my first post, so let me know whether it explains things well. I'd like to find a way to minimize the effort of data ingestion. We have two options: (1) CSV files from a file...

Analytics AWS Lambda AWS Glue Extract Transform & Load Data

Felipe Vaz

asked 25 minutes ago

Unable to load data to apache in EMR cluster notebook

I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data; however, I have not been able to load data into Apache. Whenever I try to run...

Analytics Amazon EMR Extract Transform & Load Data Amazon EMR Studio

Music Dev

asked 2 days ago

AWS Glue two date comparison data quality

Hello! I am new to AWS Glue and I am starting to create data monitoring rules in AWS Glue. I have tried multiple options with CustomSQL but cannot seem to find the solution. My problem: I want to check...

Analytics Database AWS Glue Extract Transform & Load Data

kristalo

asked 6 days ago

Glue job error - Error while List shards

I am getting an error while connecting streaming data from Kinesis to Redshift with a few transformations using visual ETL (using an Amazon Kinesis Glue Data Catalog table as the source). Schema is already...

Accepted Answer

AWS Glue Extract Transform & Load Data Amazon Redshift Amazon Kinesis

126 views

Vanishree

asked 9 days ago

Amazon Athena HIVE_CANNOT_OPEN_SPLIT Error

We are using Tableau, and Tableau has a schedule querying Athena. It worked well until yesterday, but I got the issue below today. > HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split...

Accepted Answer

Amazon Simple Storage Service Amazon Athena Analytics Extract Transform & Load Data

254 views

kusrc

asked 14 days ago

AWS Glue job failed while connecting to AWS Glue because of timeout

Hello, I have an AWS Glue job that is only supposed to perform an SQL query on the current status. Unfortunately, I always get the following error: "Error Category: QUERY_ERROR; AnalysisException:...

Amazon VPC AWS Glue Extract Transform & Load Data

229 views

Jona

asked 16 days ago

AWS Glue Crawler Scalability for Large Number of Delta Tables

Question: We currently have approximately 100 tables in delta format, partitioned by yyyy, mm, dd, hh, mm. Our current process involves reading these delta tables via a crawler, cataloging them, and...

Analytics AWS Glue Extract Transform & Load Data Amazon Redshift

342 views

pkgp-aws

asked 17 days ago

Spark shuffles a huge amount of data even though the data read is not huge

Reading a few GB (say 15 GB) of skewed Parquet data, after a few transformations such as data type changes for some columns, and then repartitioning (dataframe.repartition(120)) before writing it to S3 in...

AWS Glue Extract Transform & Load Data Amazon GameSparks S3 Select

256 views

Bibhu

asked 19 days ago

AWS GLUE to Open Search Index custom Index Id

I have a Glue job which pushes data from Glue into OpenSearch. The index Id column is automatically created while inserting the data into OpenSearch. I would like to pass the index id _id...

Accepted Answer

Analytics AWS Glue Amazon OpenSearch Service Extract Transform & Load Data

242 views

srm

asked 22 days ago

How can I "automatically" add new partitions to a Glue table based on a Hive formatted S3 bucket?

I have a bucket containing AWS AppStream logs in the format `s3://appstream-logs.../sessions/schedule=DAILY/year=2024/month=04/day=03/daily-session-report-2024-04-03.csv`. I have made this data available...

Accepted Answer

AWS Glue Extract Transform & Load Data

103 views

Andreax

asked 22 days ago

AWS DMS Migration Task Fails with "No Tables Found" Using Wildcard in Schema Mapping for MySQL

I am working on migrating data from MySQL to S3 using AWS DMS. I want to employ wildcard mapping for the schema name in the DMS task's selection rules. Specifically, I aim to include tables from...

AWS Database Migration Service AWS Glue Extract Transform & Load Data

160 views

Bhavesh

asked 24 days ago

Redshift super datatype not enough to store json data type column from Postgres

We are encountering an issue where we're utilizing the "super" datatype. The column in the Parquet file we receive has a maximum length of 192K. How should we handle this data? Are there alternative...

AWS Glue Extract Transform & Load Data Amazon Redshift

211 views

msve

asked a month ago
