Questions tagged with Amazon EMR Serverless
Getting this error when trying to run a simple Spark job which reads a JSON file from S3 and prints the...
2 answers, 0 votes, 26 views, asked 3 days ago
I am using EMR 6.13.0, which uses Python 3.7. My code uses boto3, and boto3 support for Python 3.7 will be discontinued from December 2023.
And as we are aware, Python 3.7 support stopped...
2 answers, 1 vote, 93 views, asked 7 days ago
I'm new to EMR Serverless.
I found out that EMR Serverless uses client mode as the default deploy mode,
and there's no information on how to use cluster mode in EMR Serverless.
Is there any way to use 'cluster...
1 answer, 0 votes, 55 views, asked a month ago
Hey guys,
I want to run my PySpark script on EMR Serverless, but it has some dependencies/libraries that the script needs in order to run. Please suggest an optimized approach to import the...
1 answer, 0 votes, 36 views, asked a month ago
Hello everyone!
I'm seeking advice on architecture design using AWS, specifically regarding the feature store process. Currently, I'm in the prototyping phase and using the tsfresh library for...
1 answer, 0 votes, 65 views, asked 2 months ago
I have data of 225+ million in my Redshift table. This data is the activity logs of users who are coming and going at that time after scanning at the door, like the access logs of the user at what...
2 answers, 0 votes, 64 views, asked 2 months ago
I've been reading through the documentation, but I'm not able to find clear instructions on setting up a Hive metastore in S3 for EMR Serverless. I only see examples that use the Glue catalog or an Aurora RDS SQL database....
1 answer, 0 votes, 224 views, asked 2 months ago
JSON data is being considered as a string while loading data from Postgres to a JSON file by AWS Glue
I want to migrate Postgres data to Redshift, but I have a lot of jsonb data in Postgres, so I used the SUPER data type in Redshift. The problem is that while loading the data to Redshift...
1 answer, 0 votes, 213 views, asked 3 months ago
I'm having a lot of problems with disk space in EMR Serverless:
````
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@1101b82b : No space left on device
````
I have set disk space...
Accepted Answer · Amazon EMR Serverless
2 answers, 0 votes, 120 views, asked 3 months ago
I've followed the methods for adding Python libraries. Documentation here:
https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/using-python-libraries.html
Boto installs and loads...
1 answer, 0 votes, 162 views, asked 4 months ago
Hi,
We have a task to move Spark jobs from on-prem to AWS, using lift-and-shift of the code as much as possible. Our Spark is based on Scala 2.11 and Spark 2.4.0. I know EMR supports this version but...
1 answer, 1 vote, 220 views, asked 4 months ago
My job is still running even though I have received the results. Why is that?
I have successfully got the results.
Can I manually set the "Run status" to "Success" after I get the result? I mean are...
1 answer, 0 votes, 193 views, asked 4 months ago