Questions tagged with Extract Transform & Load Data
I added PCA to my flow, chose the numeric values, added a value for components, and clicked preview. This error pops up: OperatorCustomerError: Py4JJavaError: breeze.linalg.NotConvergedException:
I...
0 answers, 0 votes, 64 views, asked a year ago
I work for a company that generates large amounts of business and sensor data and stores it in different databases (Prometheus, InfluxDB, Postgres, Timestream). Currently the querying for analytics...
2 answers, 0 votes, 373 views, asked a year ago
I am following the steps outlined in the link below:
https://aws.amazon.com/blogs/big-data/introducing-native-delta-lake-table-support-with-aws-glue-crawlers/
(1) No issue with Query Delta Lake...
1 answer, 0 votes, 409 views, asked a year ago
I am trying to use a Glue Crawler to read CSV files from S3 and create a catalog table from them. The crawler runs successfully and creates catalog tables, but those tables are empty (without columns) if I have...
3 answers, 0 votes, 974 views, asked a year ago
I started an ETL job, mapped one table to an S3 bucket, and changed some data types. Two columns came out empty because they included some null values. How can I skip the null values in...
0 answers, 0 votes, 60 views, asked a year ago
I converted a CSV (from S3) to Parquet (to S3) using AWS Glue, and the converted Parquet file was named randomly. How do I choose the name of the file that is to be converted to Parquet from...
1 answer, 0 votes, 762 views, asked a year ago
Macie provides detailed positions of sensitive data in the output file, but I want to extract that data using the positions from the output file. Also, Macie reveals only 10 samples.
Is there any way to get more...
1 answer, 0 votes, 316 views, asked a year ago
I'm writing partitioned parquet data using a Spark data frame and mode=overwrite to update stale partitions. I have this set: spark.conf.set('spark.sql.sources.partitionOverwriteMode','dynamic')
The...
1 answer, 0 votes, 845 views, asked a year ago
How can one set up Execution Class = FLEX on a Jupyter job run? I'm using the %magic on my %%configure cell like below and also setting the input arguments with --execution_class = FLEX
But still...
2 answers, 0 votes, 585 views, asked a year ago
Hi, I'd appreciate AWS Athena support for TIMESTAMP data type with microsecond precision for all row formats and table engines. Currently, the support is very inconsistent. See the SQL script below....
0 answers, 0 votes, 143 views, asked a year ago
Started getting this error today when querying data from Athena in a table created from Parquet files in our S3 bucket:
[image]...
0 answers, 0 votes, 96 views, asked a year ago
Hi community,
I am trying to perform an ETL job using AWS Glue.
Our data is stored in MongoDB Atlas, inside a VPC.
Our AWS is connected to our MongoDB Atlas using VPC peering.
To perform the ETL...
1 answer, 1 vote, 427 views, asked a year ago