Questions tagged with Amazon EMR
I am trying to read data from a 3-node MongoDB cluster (replica set) using PySpark and native Python in AWS EMR. I am facing issues while executing the code within the AWS EMR cluster, as explained below...
2 answers, 0 votes, 661 views, asked 2 years ago
Hello,
I have deployed EMR on EKS and it works correctly. I have tested submitting simple jobs following the AWS guide:...
1 answer, 0 votes, 297 views, asked 2 years ago
When selecting a PySpark kernel for a notebook in EMR Studio, tab completion and tooltips (with Shift+Tab) do not work as expected. This is especially true for attribute listing after a dot...
1 answer, 0 votes, 92 views, asked 2 years ago
The following script does not create the table in the S3 location indicated by the query.
I tested it locally, and the Delta JSON file is created and contains the information about the created...
0 answers, 0 votes, 157 views, asked 2 years ago
I developed a data processing application using Hive on EMR EC2. I'm trying to run the same code on EMR Serverless and am getting the following exception:
`Job failed with Execution Error, return...
1 answer, 0 votes, 351 views, asked 2 years ago
When configuring an EMR Serverless application, you can choose the disk size for your preinitialized capacity and a maximum disk limit for the application:
![Disk...
1 answer, 0 votes, 1205 views, asked 2 years ago
We are trying to copy a dataset from EMR to Redshift that consists of around 13 billion records and 20-25 columns. I tried copying the dataset with the traditional method using the COPY command...
3 answers, 0 votes, 498 views, asked 2 years ago
Hi,
We are comparing the EMR on EKS and EMR Serverless offerings. We are trying to run the TPC-DS performance benchmark using https://github.com/aws-samples/emr-on-eks-benchmark. We are not able to do it on...
1 answer, 0 votes, 885 views, asked 2 years ago
I am processing a dataset and need to submit a job to EMR Serverless so that the dataset is processed in a distributed way. I have created an application in EMR Studio. I would like to submit jobs to...
2 answers, 0 votes, 1696 views, asked 2 years ago
Was trying to upgrade to the latest r6 instances from r5s and ran into an issue with installing numpy in our bootstrap script via pip.
Found [this...
1 answer, 0 votes, 243 views, asked 2 years ago
We have an external partitioned table and want to perform a delete operation on it.
Is it possible to delete data from an external partitioned table?
As per my understanding, we can...
1 answer, 0 votes, 230 views, asked 2 years ago
On EMR, I am trying to add a yarn command to cron.
I tested calling the yarn command directly in cron, but cron seems to skip that command.
This is the command added to cron:
*/1 * * * * hadoop...
Accepted Answer
Amazon EMR
2 answers, 0 votes, 317 views, asked 2 years ago