All Content tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

437 results
Hello team, we followed https://github.com/aws-samples/aws-emr-utilities/blob/main/utilities/emr-ec2-custom-python3/README.md#2-container-images-on-yarn and are running into issues when we deploy with d...
2 answers, 0 votes, 17 views, asked a day ago
Hello, after switching to a custom AMI on an EMR on EC2 cluster, spark-submit fails: ```confs: [default] 0 artifacts copied, 60 already retrieved (0kB/30ms) 25/01/23 13:11:37 WAR...
1 answer, 0 votes, 13 views, asked 2 days ago
Hello, I have an EMR cluster running an old version, and our security tool has flagged some security issues. I want to update the software on the EMR cluster. What is the best way to do this? Thanks.
1 answer, 0 votes, 18 views, asked 3 days ago
Hello everyone! I could not find documentation that explicitly mentions a timeout for EMR clusters. Does EMR execution itself have a timeout? I would like to know how long a step is allow...
1 answer, 0 votes, 16 views, asked 9 days ago
I'm working with AWS EMR Serverless, and I need to construct a job URL for an EMR Serverless job to be sent in a message notification in case of state change. The desired URL includes the associated E...
1 answer, 0 votes, 30 views, asked 13 days ago
My PySpark job is failing in the mapPartitions function with a `'JavaPackage' object is not callable` error. I have verified that the function I am passing to mapPartitions is callable and objects pass...
1 answer, 1 vote, 48 views, asked 20 days ago
Aim: Create an EMR cluster and attach it to a workspace, for use with JupyterLab. EMR cluster created with default options: see end of this post for full description. Creating the studio: `aws emr crea...
2 answers, 0 votes, 45 views, asked 22 days ago
When I try to access the Oozie UI on the stated release label, the following error appears even though Oozie is installed and running. How can I solve this? ``` HTTP Status 500 - java.lang.NoSuchMethodE...
1 answer, 0 votes, 23 views, asked a month ago
I am launching a cluster using a JSON script, and the launch succeeds. However, when I attempt to add an 'AutoScalingPolicy' as part of the same JSON file in the Step Function...
2 answers, 0 votes, 133 views, asked 2 months ago
Since upgrading from EMR 6.X to EMR 7.X, the Spark History Server produces three distinct HTTP 500 errors, the first being a basic JSON exception: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected...
Accepted Answer, Amazon EMR
1 answer, 0 votes, 217 views, asked 2 months ago
Hello, I am currently working with Spark and HDFS (on-premise) and have recently started learning AWS (covering various services such as S3, DynamoDB, RDS, Redshift, EMR, etc.). I have two main quest...
4 answers, 0 votes, 84 views, asked 2 months ago
Performance testing for big data analytics tools and engines at petabyte scale is increasingly challenging. Traditional sample test datasets may not reflect the actual production-grade...