Questions tagged with Extract Transform & Load Data
Hi
I created a Glue Data Connector using its AWS RDS option
and I also created a proper IAM role that has full access to "rds-data", "s3" and "glue"
but whenever I tried to connect (using test...
0 answers, 0 votes, 119 views, asked 8 months ago
# Background
We were getting a "HIVE_CURSOR_ERROR: Failed to read Parquet file" error when trying to run an Athena query using `SELECT * FROM mydb`. The underlying data that we were querying was...
1 answer, 0 votes, 580 views, asked 8 months ago
I want to run my Glue Streaming job locally in a Docker container (amazon/aws-glue-streaming-libs:glue_streaming_libs_4.0.0_image_01) to better troubleshoot memory issues, but I encountered this issue...
1 answer, 0 votes, 285 views, asked 8 months ago
Hi all,
I'm trying to connect to an external MariaDB database instance using an AWS Glue Spark script and a JDBC Glue connection.
The code snippet from the Spark script is:
dyf =...
2 answers, 0 votes, 204 views, asked 8 months ago
I'm using DMS to capture CDC from an RDS PostgreSQL Database, then writing the changes to a Kinesis Data Stream and finally using a Glue Streaming Job to process the data and write it to a Hudi Data...
2 answers, 0 votes, 400 views, asked 8 months ago
I am currently using a Glue job to read data from one Amazon S3 source, perform some transformations and write the transformed data into another S3 bucket in parquet format. While writing data to the...
1 answer, 0 votes, 566 views, asked 8 months ago
Hi,
I am trying to migrate a table from Postgres to Redshift using a migration task
Simplified table structure:
| Name | Type |
| --- | --- |
| id | integer |
| time | timestamp with time zone |
|...
0 answers, 0 votes, 117 views, asked 8 months ago
In a glue job that is using bookmarks, I'm including the transformation_ctx parameter in each of the create dynamic frame methods (where I read data).
If I then do a join and a select and then an...
1 answer, 0 votes, 425 views, asked 8 months ago
I have a Glue job that performs a column mapping (a different question!), the job fails at the final stage where it is time to persist the results back to the...
3 answers, 0 votes, 544 views, asked 8 months ago
My Glue 4.0 jobs have suddenly stopped working with the error message below. As it is related to boto3, I am unable to make any changes to the library config. Pls advise.
NB: I noticed that urllib3 released...
0 answers, 0 votes, 251 views, asked 8 months ago
I have converted a JSON file to Parquet. I can see the Parquet file and its columns, but querying it with Athena returns an error.
HIVE_UNKNOWN_ERROR: Path is not absolute:...
1 answer, 0 votes, 289 views, asked 8 months ago
1. **Spun up an EMR instance:**
emr-6.10.0
Spark 3.3.1, HBASE 2.4.15, Hive 3.1.3, JupyterHub 1.5.0, Hadoop 3.3.3, ZooKeeper 3.5.10, Zeppelin 0.10.1, Phoenix 5.1.2, Presto 0.278,
...
1 answer, 1 vote, 301 views, asked 8 months ago