All Content tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

464 results
I am deploying an EMR HBase cluster with EMR WAL enabled using Terraform. The cluster is created successfully and the WALs are visible using the emrwal CLI. When I change some configuration of my clus...
1 answer · 0 votes · 136 views · asked 9 months ago
Hi everyone, I currently have an EMR cluster (emr-6.9.0) running a real-time ingestion process. To save disk space, I’ve been using the **Cloud Shuffle Storage Plugin** for Apache Spark. Now, I need ...
1 answer · 0 votes · 289 views · asked 9 months ago
Hi, recently I started facing issues with EMR (EC2 is out of capacity), with an error stating "EC2 is out of capacity for m6a.12xlarge in availability zone us-east-1c". I tried different machines in the same ...
Accepted Answer · Amazon EC2 · Amazon EMR
1 answer · 0 votes · 199 views · asked a year ago
Hi Team, I'm getting the error below while writing a table to Postgres using a Glue Spark script. Can you please help me with this issue? "Table or view "asset_aircraft_rpt" already exists. SaveMode: ...
1 answer · 0 votes · 147 views · asked a year ago
Hello, I'm facing a pretty annoying error. Whenever I try to execute a UDF on an EMR Notebook I get the following error: py4j.protocol.Py4JJavaError: An error occurred while calling o157...
2 answers · 0 votes · 163 views · asked a year ago
I'm trying to build a simple Collaborative Filtering Recommendation Engine using Apache Spark MLlib on Amazon EMR. So I created an EMR on EC2 cluster with the following configuration: ...
1 answer · 0 votes · 124 views · asked a year ago
Hello! We're trying to migrate from a stand-alone Hive Metastore to Glue. We've modified the definition of some EMR clusters (v7.0.0) to use Glue as the metastore, we use Spark on Hadoop to process da...
2 answers · 0 votes · 265 views · asked a year ago
We're looking for native API/CLI/SDK support to manage EMR Studio workspaces programmatically. Currently, these operations (create, list, delete, etc.) are only possible via the UI, making it difficul...
1 answer · 0 votes · 209 views · asked a year ago
We have a Lambda function that creates a cluster on demand. The cluster logs go by default to the S3 bucket specified in loguri; now we have a requirement to get application logs in CloudWatch wheneve...
1 answer · 0 votes · 429 views · asked a year ago
I have 2. 1. Issue Summary My EMR cluster fails with errors indicating "data source not found" in logs. Cluster apps (Spark, Hive, Livy) seem unable to locate input data, but the exact cause is uncl...
3 answers · 0 votes · 216 views · asked a year ago
Is it possible to reuse a single Load Balancer when deploying multiple interactive endpoints that will be associated with different users using EMR Studio? Implementation is on Amazon EMR on EKS. Refe...
1 answer · 0 votes · 147 views · AWS · asked a year ago