Questions tagged with Extract Transform & Load Data
Hello,
I'm writing a custom transform where I want to use `mode` from pyspark.sql.functions, but I get the same issue irrespective of whether I use `*` or import the specific function. How can I resolve...
0
answers
0
votes
81
views
asked 7 months ago
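One likely cause, sketched here as an assumption: `pyspark.sql.functions.mode` was only added in Spark 3.4, while Glue 3.0 ships an older Spark, so the import fails no matter how it is written. The pure-Python snippet below shows what the mode computes; a hypothetical PySpark workaround via `groupBy` is sketched in the trailing comment.

```python
# Minimal sketch of the mode computation, assuming
# pyspark.sql.functions.mode is unavailable on this Glue/Spark version.
from collections import Counter

def mode_of(values):
    """Return the most frequent value in the sequence."""
    value, _count = Counter(values).most_common(1)[0]
    return value

print(mode_of(["a", "b", "b", "c"]))  # b

# Hypothetical PySpark equivalent (untested sketch):
#   from pyspark.sql import functions as F
#   df.groupBy("col").count().orderBy(F.desc("count")).first()["col"]
```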
I am running pyspark on a Glue 3 notebook. The %additional_python_modules magic works well in the absence of %connections, but if I add %connections the job ignores %additional_python_modules and doesn't...
1
answers
0
votes
202
views
asked 7 months ago
Trying to write the records from an S3 text file to Redshift. It runs when the record count is around 10,000, but it runs long and the connection eventually times out when trying to write the entire file (50K...
1
answers
0
votes
159
views
asked 7 months ago
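A common fix for this kind of timeout, offered here as a hedged sketch: instead of inserting 50K rows over one long-lived connection, stage the file in S3 (it already is) and issue a single Redshift COPY. The table name, bucket path, delimiter, and IAM role ARN below are placeholders, not details from the question.

```python
# Hedged sketch: one bulk COPY instead of row-by-row inserts.
# All identifiers below are placeholders.
copy_sql = (
    "COPY my_table "
    "FROM 's3://my-bucket/input/records.txt' "
    "IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole' "
    "DELIMITER '|' "
    "IGNOREHEADER 1"
)
print(copy_sql)

# The statement could then be run without a persistent connection via the
# Redshift Data API, e.g.:
#   boto3.client("redshift-data").execute_statement(
#       ClusterIdentifier="my-cluster", Database="dev", Sql=copy_sql)
```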
I am trying to use Amazon Managed Grafana with files on Amazon S3. For that, I need to change the date format of the original file to fit Grafana. I use the following command:
SELECT...
1
answers
0
votes
254
views
asked 7 months ago
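Grafana generally parses ISO 8601 timestamps reliably, so one approach is to normalize the file's dates into that form before querying. A minimal stdlib sketch follows; the source format `"%d/%m/%Y %H:%M"` is an assumption and would need to match the actual file.

```python
# Sketch: convert a date string to ISO 8601 for Grafana.
# The input format below is assumed, not taken from the question.
from datetime import datetime

def to_iso(raw: str) -> str:
    return datetime.strptime(raw, "%d/%m/%Y %H:%M").isoformat()

print(to_iso("14/08/2023 09:30"))  # 2023-08-14T09:30:00
```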
Hi,
I have a list of Glue jobs, and they are up and running. Starting from 2023/08/14 I've been getting a lot of errors from CoarseGrainedExecutorBackend like this:
**ERROR CoarseGrainedExecutorBackend:**...
1
answers
0
votes
815
views
asked 7 months ago
Hi all, I created an EventBridge rule with the following event pattern that is supposed to match all Glue crawler state changes and send them to a Lambda function; however, the rule is only sending...
2
answers
0
votes
562
views
asked 7 months ago
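For reference, a pattern that matches every Glue crawler state change only needs the source and detail-type fields; a common pitfall is also filtering on `detail.state`, which silently drops any states not listed. A hedged sketch, with the rule name as a placeholder:

```python
# Event pattern matching all Glue Crawler State Change events
# (Started, Succeeded, Failed), with no detail.state filter.
import json

pattern = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Crawler State Change"],
}
print(json.dumps(pattern, indent=2))

# Hypothetical boto3 call (rule name is a placeholder):
#   boto3.client("events").put_rule(
#       Name="glue-crawler-state-change",
#       EventPattern=json.dumps(pattern))
```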
Hi,
I am searching for a transformation engine that supports multi-tenancy with the following requirements:
* Each tenant must be transformed every 10 minutes.
* One tenant transformation transforms...
1
answers
0
votes
294
views
asked 7 months ago
Hello, we have an S3 bucket with various CSV files, an AWS Glue crawler to update the Data Catalog, and finally an AWS Glue job to move the data to Redshift. The handling of data and target table is...
0
answers
0
votes
120
views
asked 7 months ago
Hi all. I ran a batch transform job. The process finished correctly with no issues, but when I check the output data, the directory is empty.
This is the manifest file format (RecordIO)...
1
answers
0
votes
210
views
asked 7 months ago
Importing form-encoded/CSV data from a data logger using AWS API Gateway, Lambda function and DynamoDB
Greetings, for the past few weeks I have been trying to ingest data from a solar data logger, which sends data to my API Gateway endpoint that is integrated with my Lambda function, but the problem...
4
answers
0
votes
359
views
asked 7 months ago
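A minimal sketch of the Lambda side, assuming an API Gateway proxy integration that delivers the logger's form-encoded payload in `event["body"]`. The field names and table name are placeholders, and the DynamoDB write is left commented out.

```python
# Sketch: parse a form-encoded body in a Lambda handler.
# Field names and the DynamoDB table are hypothetical.
import json
from urllib.parse import parse_qs

def lambda_handler(event, context):
    body = event.get("body") or ""
    # parse_qs returns lists; keep the first value of each field
    fields = {k: v[0] for k, v in parse_qs(body).items()}
    # boto3.resource("dynamodb").Table("solar_readings").put_item(Item=fields)
    return {"statusCode": 200, "body": json.dumps(fields)}

print(lambda_handler({"body": "voltage=230&current=4.2"}, None))
```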
Hello everyone.
Data from a REST API, in the form of JSON, is loaded daily by a Lambda into s3-bucket-1.
This data should then be stored in s3-bucket-2 as a flat Parquet table.
I did it in...
0
answers
0
votes
59
views
asked 8 months ago
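The flattening step can be sketched in plain Python: collapse nested JSON into dot-separated columns, which can then be written as Parquet (e.g. with `pandas.DataFrame.to_parquet` or `awswrangler.s3.to_parquet`, not shown here). The sample record is hypothetical.

```python
# Sketch: flatten nested JSON into a single-level dict with
# dot-separated keys, ready to become flat Parquet columns.
def flatten(obj, prefix=""):
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

print(flatten({"id": 1, "meta": {"source": "rest-api", "ts": "2023-08-14"}}))
```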
Hey Guys
I want to run my PySpark script on EMR Serverless, but it has some dependencies/libraries that the script needs in order to run. Please suggest an optimized approach to import the...
1
answers
0
votes
237
views
asked 8 months ago
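One common approach, offered as a hedged sketch: package a virtualenv (e.g. with `venv-pack`), upload the archive to S3, and point Spark at it via `spark.archives`. The bucket and archive names below are placeholders; the conf keys follow the EMR Serverless custom-Python pattern.

```python
# Sketch: spark-submit parameters for shipping a packaged virtualenv
# to EMR Serverless. Bucket and archive names are placeholders.
spark_submit_parameters = " ".join([
    "--conf spark.archives=s3://my-bucket/pyspark_venv.tar.gz#environment",
    "--conf spark.emr-serverless.driverEnv.PYSPARK_DRIVER_PYTHON=./environment/bin/python",
    "--conf spark.emr-serverless.driverEnv.PYSPARK_PYTHON=./environment/bin/python",
    "--conf spark.executorEnv.PYSPARK_PYTHON=./environment/bin/python",
])
print(spark_submit_parameters)

# This string would be passed as sparkSubmitParameters in the job driver
# config of boto3's emr-serverless start_job_run call.
```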