Questions tagged with Amazon EMR

Unable to load data to Apache in EMR cluster notebook

I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data; however, I have not been able to load data into Apache. Whenever I try to run...

Analytics Amazon EMR Extract Transform & Load Data Amazon EMR Studio

Music Dev

asked 3 days ago

Spark application takes longer than expected in EMR 7

I have a Spark application running on EMR 7 that took 15+ hours; the same application took 9 hours on EMR 6.14. There are no code changes or data volume changes. One observation is that the application was attempted thrice...

Accepted Answer

Amazon EMR

144 views

Vaas

asked 6 days ago

How should I configure my EMR cluster to handle large data?

I have an EMR cluster and I have used the Treasure Data connector to read data from a table into a DataFrame using PySpark. Now these tables that I'm trying to read have approximately 100 million to 500...

Amazon EMR

204 views

Nakshtra

asked 10 days ago

EMR Jupyter Notebook: PySpark Imports Work in Shell, Not in Notebook - Issue is importing custom files

Issue: PySpark works in the first cells (likely SparkSession creation) but throws import errors when using my Python files in later cells. Environment: AWS EMR (Amazon EMR...

Amazon EMR

214 views

Harish

asked 15 days ago

Studio Workspace can't see my running EMR EC2 cluster to attach to

Let me know if this is something AWS EMR Studio does: 1. In Databricks Community Edition and in Google Colab, one can fire up a simple Jupyter notebook with an automatically started cluster (small...

Amazon WorkSpaces Amazon EMR Amazon EMR Serverless

242 views

ken cottrell

asked 22 days ago

AWS EMR - YARN Resource Issue

Hi everyone, I am using AWS EMR to do some ETL operations on very large datasets (millions or billions of records). I am using PySpark and reading the CSV files using *spark.read.csv*. The results...

Amazon EMR Compute

297 views

vsk95

asked 24 days ago

Serverless job failure

While running the serverless job, I get the error below: "Number of cores specified by 'spark.driver.cores '7' is invalid".

Amazon EMR Amazon EMR Serverless

301 views

Akash

asked a month ago

refresh_hfiles not working

Hi, I have an EMR cluster with HBase in S3 storage mode. I have a read replica cluster pointing to the same S3 bucket. Now when I add a record on the primary cluster, flush the table on the primary, and then run refresh_hfiles...

Amazon EMR Database AWS IAM Identity Center Amazon S3 Access Grants

321 views

shushant

asked a month ago

AWS EMR WAL creation error

Hi, I am getting an error while launching EMR with HBase in S3 storage mode and WAL backup enabled. Caused by: java.lang.RuntimeException: createWal failed for wal WALMetadata(WALWorkspace=testworkspace2,...

AWS Identity and Access Management Developer Tools Amazon EMR IAM Policies

454 views

shushant

asked a month ago

I have a Python package saved in CodeCommit and I need it to run in the notebook linked to an EMR cluster.

I have a Python package saved in CodeCommit and need to use it in the notebook attached to my EMR cluster workspace. The package is already successfully installed via bootstrap. To do this, in my .sh...

AWS CodeCommit Amazon EC2 Amazon EMR Amazon EMR Studio

359 views

amanda_oliveira

asked a month ago

How do I connect Amazon MQ to AWS EMR Serverless?

I have an EMR Serverless application, and I am submitting a Spark job via a Python script. I have packaged all the dependencies and the script to an S3 bucket. When I execute the job, the Spark job is...

Amazon EMR Amazon MQ Amazon EMR Serverless

403 views

Tushar

asked a month ago

Unable to run Iceberg insert in Hive deployed on EMR

Hello, I configured an Iceberg-formatted table with transactions in Hive on EMR 6.4.1. When I insert data into the table, the operation gets stuck without any error. Any insights are highly...

Accepted Answer

Amazon EMR

405 views

Mark

asked a month ago
