All Content tagged with Amazon EMR
Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.
464 results
Hello Team,
Followed https://github.com/aws-samples/aws-emr-utilities/blob/main/utilities/emr-ec2-custom-python3/README.md#2-container-images-on-yarn
Getting issues when we followed to deploy with d...
2
answers
0
votes
167
views
asked a year ago
Hello,
We are having issues after switching to a custom AMI on an EMR on EC2 cluster: spark-submit results in a failure.
```confs: [default]
0 artifacts copied, 60 already retrieved (0kB/30ms)
25/01/23 13:11:37 WAR...
1
answers
0
votes
233
views
asked a year ago
Dear
I have an EMR cluster running an old version, and our security tool flagged some security issues, so I want to update the software on the EMR cluster.
What is the best way to do this?
Thanks
1
answers
0
votes
398
views
asked a year ago
hello everyone!
I could not find documentation that explicitly mentions a timeout for an EMR cluster. Does EMR execution itself have a timeout?
would like to know how long the step is allow...
1
answers
0
votes
684
views
asked a year ago
I'm working with AWS EMR Serverless, and I need to construct a job URL for an EMR Serverless job to be sent in a message notification in case of state change. The desired URL includes the associated E...
1
answers
0
votes
400
views
asked a year ago
My PySpark job is failing at the map partition function with a "JavaPackage object is not callable" error. I have verified that the function I am passing to the map partition function is callable and objects pass...
1
answers
1
votes
731
views
asked a year ago
Aim: Create an EMR cluster and attach it to a workspace, to use with JupyterLab.
EMR cluster created with default options: see end of this post for full description.
Creating the studio:
`aws emr crea...
2
answers
0
votes
312
views
asked a year ago
When trying to access the Oozie UI on the stated release label, the following error shows up, even with Oozie installed and running. How can this be solved?
```
HTTP Status 500 - java.lang.NoSuchMethodE...
1
answers
0
votes
81
views
asked a year ago
I am trying to launch a cluster using a JSON script, and we are able to launch it successfully. However, when I attempt to add an 'AutoScalingPolicy' as part of the same JSON file in the Step Function...
2
answers
0
votes
408
views
asked a year ago
Since upgrading from EMR 6.X to EMR 7.X, the Spark History Server produces three distinct Error 500s, the first being a basic JSON exception:
com.fasterxml.jackson.core.io.JsonEOFException: Unexpected...
Accepted Answer
Amazon EMR
1
answers
0
votes
1.5K
views
asked a year ago
Hello,
I am currently working with Spark and HDFS (on-premise) and have recently started learning AWS (covering various services such as S3, DynamoDB, RDS, Redshift, EMR, etc.). I have two main quest...
4
answers
0
votes
221
views
asked a year ago
GaganBrahmi (AWS, EXPERT)
published a year ago, 2 votes, 773 views
Performance testing for big data analytics tools and engines at petabyte scale is an increasingly challenging avenue. Using traditional sample test datasets may not reflect the actual production-grade...