Questions tagged with Extract Transform & Load Data
In Amazon Redshift, the general syntax for creating a procedure is as follows:
```
CREATE [ OR REPLACE ] PROCEDURE sp_procedure_name
( [ [ argname ] [ argmode ] argtype [, ...] ] )
[ NONATOMIC...
```
2 answers, 0 votes, 302 views, asked a month ago
I have defined two tables:
```
CREATE EXTERNAL TABLE `event_data`(
`systemid` string COMMENT 'from deserializer',
`eventtime` string COMMENT 'from deserializer',
`eventtype` string COMMENT...
```
1 answer, 0 votes, 536 views, asked a month ago
I need to replicate an Iceberg data lake stored in S3 from one bucket to another. However, a multi-region access point doesn't work with an Athena table, and I don't see any PySpark procedure that could...
1 answer, 0 votes, 272 views, asked a month ago
Hello everyone,
I created a spark_ready.py module that hosts multiple classes that I want to use as a template. I've seen in multiple configurations online that using the "spark.submit.pyFiles" will...
2 answers, 0 votes, 282 views, asked a month ago
I need to fetch files that have arrived in my S3 bucket within the last hour (current_time - 1hr) for processing. My file names will be in the format yyyymmdd-hhmmsssss.parquet (including milliseconds). So I am running a...
1 answer, 0 votes, 381 views, asked a month ago
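For the one-hour window described in the question above, here is a minimal sketch of filtering object keys by the timestamp embedded in the file name. It assumes the name encodes the arrival time in UTC as yyyymmdd-hhmmss followed by three millisecond digits, which the question does not confirm. The helper names `parse_key_timestamp` and `keys_from_last_hour` are hypothetical, and the keys are passed in as plain strings rather than listed from S3.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional

SUFFIX = ".parquet"


def parse_key_timestamp(key: str) -> Optional[datetime]:
    """Return the UTC timestamp encoded in a yyyymmdd-hhmmsssss.parquet name.

    Returns None when the key does not match the expected pattern.
    Assumption: the timestamp in the name is in UTC.
    """
    name = key.rsplit("/", 1)[-1]  # drop any prefix ("folder") part
    if not name.endswith(SUFFIX):
        return None
    stem = name[: -len(SUFFIX)]
    # 8 date digits + "-" + 6 time digits + 3 millisecond digits = 18 chars
    if len(stem) != 18 or not stem[15:].isdigit():
        return None
    try:
        base = datetime.strptime(stem[:15], "%Y%m%d-%H%M%S")
    except ValueError:
        return None
    millis = int(stem[15:])
    return base.replace(tzinfo=timezone.utc) + timedelta(milliseconds=millis)


def keys_from_last_hour(
    keys: Iterable[str], now: Optional[datetime] = None
) -> List[str]:
    """Keep keys whose embedded timestamp falls within [now - 1h, now]."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=1)
    selected = []
    for key in keys:
        ts = parse_key_timestamp(key)
        if ts is not None and cutoff <= ts <= now:
            selected.append(key)
    return selected
```

In a real job the keys would likely come from an S3 listing restricted to the current date prefix. If the file name reflects creation time rather than arrival time, comparing against each object's last-modified time may be the better fit.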
I have an iceberg table defined like this:
```
CREATE TABLE IF NOT EXISTS staging (
id STRING,
staging_timestamp BIGINT,
... blah blah blah ...
)
PARTITIONED BY...
```
0 answers, 0 votes, 159 views, asked a month ago
Hello team,
So, I built an ETL in Python using PySpark. I have a bastion EC2 MySQL database that is a copy of a production environment.
Every day it copies prod at around 2 o'clock, and my...
1 answer, 0 votes, 152 views, asked a month ago
Hello! According to the [documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-kinesis-home.html), it should be possible to write data to Kinesis from Glue...
2 answers, 0 votes, 669 views, asked a month ago
I have a Glue job (job_a) that is started by a Lambda function: when a file is placed in an S3 bucket, the Lambda triggers the job. My requirement is, once this Glue job (job_a) is...
1 answer, 0 votes, 286 views, asked a month ago
I am interested particularly in `%additional_python_modules` and I always get this error:
``UsageError: Line magic function `%additional_python_modules` not found.``
The same error is thrown when I...
2 answers, 0 votes, 92 views, asked a month ago
I am running a PoC around integrating the Glue lineage into the [DataHub](https://datahubproject.io/). I have based my research on this set of AWS blog posts...
1 answer, 0 votes, 490 views, asked 2 months ago
Hi, I am using AWS Glue Studio to read from a DDB table with a direct DDB connection. So far my visual diagram has two nodes:
1. Source DDB table node -> Here the preview takes 5 minutes for only 2 rows of...
1 answer, 0 votes, 168 views, asked 2 months ago