All Content tagged with Amazon EMR

Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.

464 results
Hello, I am attempting to write to AWS Neptune using Neo4j Connector for Spark, as stated in the [compatibility document](https://docs.aws.amazon.com/neptune/latest/userguide/migration-compatibility....
1 answer, 0 votes, 217 views, asked a year ago
In my account, I have two Glue Catalogs (one is the default catalog, AWSDataCatalog, and another catalog is shared from a different account). How can I access the databases in both catalogs from EMR E...
2 answers, 0 votes, 458 views, asked a year ago
Hi, I have been looking into a solution option that uses the Athena invoker_principal to get the ARN of the IAM role being used into the SQL query. Is there a way to do the same if EMR or Redshift...
1 answer, 0 votes, 169 views, asked a year ago
We're currently running EMR clusters with release version 6.10.0 where instances are patched using SSM "AWS-RunPatchBaseline" during bootstrap. We're experiencing several critical issues: cluster fail...
1 answer, 0 votes, 213 views, asked a year ago (AWS)
Approximately how long should EMR batch processing and storing data in Redshift take for 1 TB of data with simple transformations? My data has the following characteristics: * File size varies from...
1 answer, 0 votes, 226 views, asked a year ago
I have a use case with * 60 MB/sec data volume * Near-real-time AI/data science use cases should be supported as downstream applications * It's not an ultra-low-latency use case; even 60 seconds of...
1 answer, 0 votes, 242 views, asked a year ago
After upgrading EMR from 6.5 to 7.5, I am getting the following error: OpensslCipher: Failed to load OpenSSL Cipher. java.lang.UnsatisfiedLinkError: EVP_CIPHER_CTX_block_size Based on the HADOOP-18994 Failed...
1 answer, 0 votes, 133 views, asked a year ago
I would like to confirm whether it is possible to configure an Amazon EMR cluster with mixed instance types, combining both Graviton-based and non-Graviton instances within the same cluster. I'm going...
2 answers, 0 votes, 412 views, asked a year ago
I'm trying to run an EMR notebook to create a delta table in S3. EMR Cluster Version: emr-7.7.0 Installed Applications: Hadoop 3.4.0, Hive 3.1.3, JupyterEnterpriseGateway 2.6.0, Livy 0.8.0, Spark 3.5...
1 answer, 0 votes, 77 views, asked a year ago
Hi everyone, I am researching S3 backups, and one question I have is: what is the impact on the system or users? I think with backup solutions (S3 versioning, replication, AWS Backup, custom solution like ...
2 answers, 0 votes, 126 views, asked a year ago
Here's a link to my sample calculation: https://calculator.aws/#/estimate?id=e1754f12531b5a51f332143cb5e5a53e4a626f34 I read in another answer that short Serverless workloads are cheaper in general t...
1 answer, 0 votes, 1.6K views, asked a year ago
Hi Mate, I have steps running on EMR that were working until 13th January 2025. When I tried running the job today, it started failing with an error like: **AttributeError: module 'awscrt.chec...
3 answers, 0 votes, 797 views, asked a year ago