
All Content tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

431 results
AWS SUPPORT ENGINEER, published 7 months ago, 2 votes, 1.6K views
This article provides guidance on configuring and accessing the Spark application UI for Interactive Endpoints used from either self-hosted notebooks or EMR Studio managed notebooks.
I have an EMR workspace containing 4 Jupyter notebooks that run PySpark code blocks. I want to get the time of the last code block execution across all 4 notebooks to determine the ti...
1 answer, 0 votes, 611 views, asked 7 months ago
I want to change the default S3 storage class of the Hive connector in EMR Trino 426 (EMR 6.15.0) to INTELLIGENT_TIERING. I found the [hive.s3.storage-class option in the Trino 426 official manual](https:...
Accepted Answer
Amazon EMR
2 answers, 0 votes, 718 views, asked 7 months ago
I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data; however, I have not been able to load data into Spark. Whenever I try to run spark.read.csv('s3://buc...
2 answers, 0 votes, 912 views, asked 8 months ago
AWS SUPPORT ENGINEER, published 8 months ago, 3 votes, 1.3K views
The guidance in this article can support a systematic evaluation of the log data, potentially leading to the identification and resolution of the u...
Amazon EMR
I have a Spark application running on EMR 7 that took 15+ hours, whereas it took 9 hours on EMR 6.14. There are no code changes or data volume changes. One observation is the application attempted thrice ...
Accepted Answer
Amazon EMR
3 answers, 0 votes, 889 views, asked 8 months ago
AWS SUPPORT ENGINEER, published 8 months ago, 3 votes, 1.6K views
This article helps investigate an EMR cluster that terminated with the error "On the master instance, application provisioning failed".
Amazon EMR
AWS SUPPORT ENGINEER, published 8 months ago, 3 votes, 1.1K views
This article helps investigate an EMR cluster that terminated with the error "Master instance startup failed due to an internal error", especially when using a custom AMI.
Amazon EMR
AWS SUPPORT ENGINEER, published 8 months ago, 3 votes, 1.5K views
This article helps investigate an EMR cluster that terminated with the error "Failed to start the job flow due to an internal error", especially when using a custom AMI.
Amazon EMR
AWS SUPPORT ENGINEER, published 8 months ago, 3 votes, 1.2K views
The instance-state log available in Amazon EMR on EC2 provides valuable information for troubleshooting application failures or investigating system details. This article describes the detailed i...
Amazon EMR
AWS EXPERT, published 8 months ago, 0 votes, 1.7K views
Assists with building and installing prerequisite software for TensorFlow on Amazon Linux 2023 for Graviton.
I have an EMR cluster and have used the Treasure Data connector to read data from a table into a DataFrame using PySpark. The tables that I'm trying to read have approximately 100 million to 500 m...
1 answer, 0 votes, 920 views, asked 8 months ago