All Content tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

430 results
I am trying to launch a cluster using a JSON script, and we are able to launch it successfully. However, when I attempt to add an 'AutoScalingPolicy' as part of the same JSON file in the Step Function...
2 answers, 0 votes, 80 views, asked 12 days ago
Since upgrading from EMR 6.X to EMR 7.X, the Spark History Server produces three distinct HTTP 500 errors, the first being a basic JSON exception: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected...
Accepted Answer, Amazon EMR
1 answer, 0 votes, 74 views, asked 19 days ago
Hello, I am currently working with Spark and HDFS (on-premises) and have recently started learning AWS (covering various services such as S3, DynamoDB, RDS, Redshift, EMR, etc.). I have two main quest...
4 answers, 0 votes, 66 views, asked 23 days ago
Performance testing for big data analytics tools and engines at petabyte scale is an increasingly challenging avenue. Using traditional sample test datasets may not reflect the actual production-grade...
AWS, published a month ago, 0 votes, 146 views
This spotlight on Amazon EMR equips you with the skills and troubleshooting tips to get the most out of a cloud big data platform service.
I have a Trino on EMR setup and need help accessing Glue tables from EMR. Athena can access those tables. Below is the error message after running the Trino CLI `show tables;` command: ``` dev-dsk...
1 answer, 0 votes, 24 views, asked 2 months ago
We use AWS EMR 7.2.0 on EC2 with instance fleets (only Primary and Core, no Spot Instances) and managed scaling for long-term use (weeks). On each of the 3 clusters we started so far, we observed the foll...
2 answers, 0 votes, 136 views, asked 2 months ago
I am using the EMR image 711395599931.dkr.ecr.us-east-2.amazonaws.com/spark/emr-6.14.0:latest from SparkSubmitOperator and passing it to a jar in which I execute a user-defined function (UDF) in Spark. I am ge...
2 answers, 0 votes, 303 views, asked 2 months ago
Hi. I have set up an EMR Serverless application and I am using my custom image. I've configured everything properly. Next, I've created a custom image according to the official docs: https://docs.a...
2 answers, 0 votes, 150 views, asked 3 months ago