All Content tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

431 results
I need to load data from Kinesis Data Streams to EMR via EMR Studio. I followed this sample, but it doesn't work: https://github.com/awslabs/spark-sql-kinesis-connector
1 answer, 0 votes, 1.6K views
AWS
asked 9 months ago
I want to run EMR on premises, no Spark. Is it possible to run EMR (https://aws.amazon.com/emr/) on EKS Anywhere (https://aws.amazon.com/es/eks/eks-anywhere/)? Also, we don't have support ...
1 answer, 0 votes, 493 views
asked 9 months ago
I am working with Step Functions, and I have a Map state to which I pass an S3 path containing a CSV file that the Map has to iterate over. In each iteration of the map, a script is executed with the c...
2 answers, 0 votes, 723 views
asked 9 months ago
AWS EXPERT, published 9 months ago, 1 vote, 1.5K views
Assist with build and install of prerequisite software for GeoPandas on Amazon Linux 2023 for Graviton
**Overview:** The Spark application in question is deployed within AWS Account A, specifically in the us-west-2 region. This application reads data from and writes data to Amazon S3 buckets hosted in ...
1 answer, 0 votes, 314 views
asked 10 months ago
EMR API call? I am trying to determine whether there is an API call that shows if "Automatically apply latest Amazon Linux updates" was checked for an EMR cluster.
1 answer, 0 votes, 434 views
AWS
asked 10 months ago
Hi team, we want to use an EMR cluster to process data with Spark jobs. We have 30,000 files per day and approximately 2 GB of information, and this is expected to grow later. We have a small cluste...
Accepted Answer | Amazon EC2 | Amazon EMR
1 answer, 0 votes, 452 views
asked 10 months ago
I am using a Jupyter notebook within Amazon EMR Studio. When I try to run my notebook code, I get a kernel-related error (see attached screenshot). Also, my EMR instance is using an EC2 cluster. I...
1 answer, 0 votes, 617 views
asked 10 months ago
* EMR Version: 6.15.0
* Spark Conf
  * "spark.sql.catalog.spark_catalog": "org.apache.iceberg.spark.SparkSessionCatalog"
  * "spark.sql.catalog.spark_catalog.catalog-impl": "org.apache.iceberg.aws...
1 answer, 0 votes, 602 views
asked 10 months ago
Hi Team, we are trying to set up Hive with an external metastore running in Aurora MySQL 8. We are using EMR 6.15.0 and followed the instructions from the AWS documentation. We are able to successfully ...
1 answer, 0 votes, 427 views
asked 10 months ago
The Zero-ETL integration for replicating data to Redshift from Aurora PostgreSQL is currently in "Preview", as [this post specifies](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/zero-...
1 answer, 0 votes, 556 views
asked 10 months ago
Is there a way to use s3-dist-cp to copy files from a bucket that uses Requester Pays?
2 answers, 0 votes, 442 views
asked a year ago