Questions tagged with Amazon EMR
[This doc page](https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/using-ddb-connector.html#using-ddb-connector-query) describes connecting to DynamoDB from Spark, however it is...
0
answers
0
votes
99
views
asked a year ago
Hello,
From the new interface, how do we attach a cluster to an existing Workspace? In the old UI we could do this by clicking on the Workspace, but now clicking on it tries to start it. Thanks in...
1
answers
2
votes
99
views
asked a year ago
I have a 10GB dataset loaded in a PySpark dataframe.
```
df.coalesce(1).write.mode('overwrite').parquet("s3://xxxxxxxxxx-eu-west-1-athena-results-bucket-h1snx89wnc/output-data-parquet2")
```
...
1
answers
0
votes
232
views
asked a year ago
I understand there are application limits for EMR serverless. But when there are multiple jobs running at the same time on the same application, is it possible for them to share workers' available...
1
answers
0
votes
708
views
asked a year ago
Hi,
We have a vulnerability risk on our EMR cluster. We would like to know whether the ALAS2-2023-1909 (https://alas.aws.amazon.com/AL2/ALAS-2023-1909.html) patch has been validated for the default Amazon EMR AMI.
We...
1
answers
0
votes
318
views
asked a year ago
Is there a way to utilize EMR Serverless to run S3DistCp? Looking at the base Docker images, I can see that the `s3-dist-cp` command is included in the Hive image. How can I submit a job run that runs...
1
answers
0
votes
482
views
asked a year ago
Hello Team, good morning. My customer is using Spark 2.4 on EMR for their batch workloads. They are planning to migrate to Spark 3.3 and are looking for some guidance/best practices for this...
1
answers
0
votes
327
views
asked a year ago
Hi,
One point I don't understand about EMR Notebooks:
this tool is mainly intended for developers, whom we don't want to allow to connect to the AWS console...
How can we provide EMR Notebooks without a...
1
answers
0
votes
265
views
asked a year ago
If the cluster is terminated while processing a file through EMR, only a few records are updated. When processing it again, should we delete the file at the target location, so we can process the file...
1
answers
0
votes
184
views
asked a year ago
Hi Experts, I am trying to use box.api in an EMR Notebook (SparkR kernel), using an HTTP proxy on the EMR host to route traffic to the internet. The connection to box.api.com is established on the EMR...
1
answers
0
votes
264
views
asked a year ago
When creating an EMR cluster from Airflow or manually from the EMR console, it remains in the starting state, and after approximately an hour the cluster terminates with errors and the only detail it shows is...
1
answers
0
votes
1069
views
asked a year ago
I am trying to use Hue as a web interface hosted on the EMR server to issue HiveQL. The file connection works fine: I can explore S3 files with no problem (which probably doesn't require controlled core...
2
answers
0
votes
424
views
asked a year ago