Questions tagged with Amazon EMR
Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.
317 results
I am trying to launch a cluster using a JSON script, and we are able to launch it successfully. However, when I attempt to add an 'AutoScalingPolicy' as part of the same JSON file in the Step Function...
Since upgrading from EMR 6.X to EMR 7.X, the Spark History Server produces three distinct 500 errors, the first being a basic JSON exception:
com.fasterxml.jackson.core.io.JsonEOFException: Unexpected...
Hello,
I am currently working with Spark and HDFS (on-premises) and have recently started learning AWS (covering various services such as S3, DynamoDB, RDS, Redshift, EMR, etc.). I have two main quest...
I have a Trino on EMR setup and need help accessing Glue tables from EMR.
Athena can access those tables.
Below is the error message after running the Trino CLI `show tables;` command:
```
dev-dsk...
```
We use AWS EMR 7.2.0 on EC2 with instance fleets (only Primary and Core, no Spot Instances) and managed scaling for long-term use (weeks). On each of the 3 clusters we have started so far, we observed the foll...
I am using EMR 711395599931.dkr.ecr.us-east-2.amazonaws.com/spark/emr-6.14.0:latest from SparkSubmitOperator and passing this to jar
where I am executing a user-defined function (UDF) in Spark.
I am ge...
Hi.
I have set up an EMR Serverless application and I am using my custom image. I've configured everything properly.
Next, I've created a custom image according to the official docs:
https://docs.a...
I launched an EMR cluster from a CloudFormation template stored as a Service Catalog template **from SageMaker**. In the template, KeepJobFlowAliveWhenNoSteps was not specified in JobFlowInstancesConf...
Hi everyone,
I'm having trouble connecting to my MySQL RDS instance from an EMR cluster, even though both are in the same VPC and port 3306 is open in the security group. Here’s the setup:
RDS Datab...
Hello Community,
I’m trying to run Apache Superset on an EMR cluster and I’m facing an issue with accessing the Superset web interface through SSH tunneling. Here’s a summary of my setup and the issu...
Hello
As part of a cloud migration and modernization approach using AWS, the requirement is to migrate HBase data directly to S3 and then read the data from S3 using Java microservices. (EMR would not...
I have a use case where I need to run a batch EMR job on a schedule (daily). I can create folders by date for the data coming from IoT, or I can create folders for each device sending IoT data and put da...