Questions tagged with Extract Transform & Load Data

Athena error while running UNLOAD to PARQUET query using column names with spaces in -- GENERIC_INTERNAL_ERROR: field ended by ';': expected ';'

I have a table in Athena with the following column names: ["column space 1", "column space 2"]. I...

Amazon Athena, AWS Glue, Extract Transform & Load Data, Amazon Redshift

604 views

toby

asked 2 months ago

Glue Data Catalog configuration when updating with Database Migration Service

I set up a replication task with AWS Database Migration Service to implement full load + CDC from an RDS instance to an S3 bucket. Since I want to use Athena to query the data in S3, I set the option...

Accepted Answer

AWS Database Migration Service, AWS Glue, Extract Transform & Load Data

201 views

Simona B

asked 2 months ago

Redshift Procedure System Catalog

In Amazon Redshift, the general syntax for creating a procedure is as follows: ``` CREATE [ OR REPLACE ] PROCEDURE sp_procedure_name ( [ [ argname ] [ argmode ] argtype [, ...] ] ) [ NONATOMIC...

Accepted Answer

Analytics, Database, Extract Transform & Load Data, Amazon Redshift

332 views

dashline

asked 2 months ago

In this simple test, why does Athena fail to prune partitions?

I have defined two tables: ``` CREATE EXTERNAL TABLE `event_data`( `systemid` string COMMENT 'from deserializer', `eventtime` string COMMENT 'from deserializer', `eventtype` string COMMENT...

Amazon Athena, Extract Transform & Load Data

560 views

AlexR

asked 2 months ago

How do I replicate an Iceberg table used with Athena SQL and Athena PySpark?

I need to replicate an Iceberg data lake stored in S3 from one bucket to another. However, a multi-region access point doesn't work with an Athena table, and I don't see any PySpark procedure that could...

Amazon Simple Storage Service, Amazon Athena, AWS Glue, Extract Transform & Load Data

381 views

DarkCenobyte

asked 2 months ago

Import Custom Python Modules on EMR Serverless through Spark Configuration

Hello everyone, I created a spark_ready.py module that hosts multiple classes that I want to use as a template. I've seen in multiple configurations online that using the "spark.submit.pyFiles" will...

Serverless, Extract Transform & Load Data, Amazon EMR Serverless

460 views

Justine

asked 2 months ago

current_time minus 1hr in Glue Pyspark

I need to fetch files that have arrived in my S3 bucket within the last hour (current_time - 1hr) for processing. My file names will be in the format yyyymmdd-hhmmsssss.parquet (including milliseconds). So I am running a...

Accepted Answer

Amazon Simple Storage Service, Analytics, AWS Glue, Extract Transform & Load Data

411 views

Joe

asked 2 months ago
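
For the question above, the one-hour window can be sketched in plain Python before any Glue or PySpark specifics come into play. This is only a minimal sketch under stated assumptions: the file name is read as yyyymmdd-hhmmss followed by three millisecond digits, timestamps are treated as UTC, and the function names `parse_file_time` and `keys_from_last_hour` are hypothetical helpers, not part of any AWS library.

```python
from datetime import datetime, timedelta, timezone

# Assumed reading of "yyyymmdd-hhmmsssss": date, then hhmmss plus 3 millisecond digits.
FILENAME_FORMAT = "%Y%m%d-%H%M%S%f"


def parse_file_time(key: str) -> datetime:
    """Extract the timestamp encoded in an object key's file name (assumed UTC)."""
    name = key.rsplit("/", 1)[-1]   # drop any prefix such as "data/"
    stem = name.split(".", 1)[0]    # drop the ".parquet" extension
    parsed = datetime.strptime(stem, FILENAME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def keys_from_last_hour(keys, now=None):
    """Return the keys whose encoded timestamp falls in [now - 1 hour, now]."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=1)
    return [k for k in keys if cutoff <= parse_file_time(k) <= now]
```

The list of keys would come from whatever S3 listing the job already performs; passing `now` explicitly keeps the filter easy to test and avoids depending on the worker's clock.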

Athena Iceberg creates 100,000 files where just a few dozen were expected

I have an iceberg table defined like this: CREATE TABLE IF NOT EXISTS staging ( id STRING, staging_timestamp BIGINT, ... blah blah blah ... ) PARTITIONED BY...

Amazon Athena, Extract Transform & Load Data

173 views

AlexR

asked 2 months ago

AWS Glue & PySpark: How to improve performance on a medium-to-large table

Hello team, I built an ETL in Python using PySpark. I have a bastion EC2 MySQL database that is a copy of a production environment. Every day it copies prod at around 2 o'clock, and my...

Accepted Answer

Database, AWS Glue, Extract Transform & Load Data

194 views

Ted

asked 2 months ago

How to write data from Glue ETL Streaming Job to Kinesis Data Stream?

Hello! According to the [documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-kinesis-home.html), it should be possible to write data to Kinesis from Glue...

AWS Glue, Extract Transform & Load Data, Amazon Kinesis Data Streams, Amazon Kinesis

1107 views

Wojtek1902

asked 2 months ago

AWS Glue Workflow Trigger

I have a Glue job (job_a) that is started by a Lambda function when a file is placed in an S3 bucket. My requirement is, once this Glue job (job_a) is...

AWS Lambda, AWS CloudFormation, Amazon EventBridge, AWS Glue, Extract Transform & Load Data

336 views

Joe

asked 2 months ago

Line magics in Glue Docker container not found

I am particularly interested in `%additional_python_modules`, and I always get this error: ``UsageError: Line magic function `%additional_python_modules` not found.`` The same error is thrown when I...

Accepted Answer

AWS Glue, Extract Transform & Load Data

127 views

siyala

asked 2 months ago
