EMR Serverless Custom Image not running


Hi Team,

I want to run a Spark application built with JDK 11 on EMR Serverless. Since the default image does not support JDK 11, I created a custom image based on the following links:

Use case 2 : https://aws.amazon.com/ru/blogs/big-data/add-your-own-libraries-and-application-dependencies-to-spark-and-hive-on-amazon-emr-serverless-with-custom-images/

https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/using-custom-images.html

This is the content of my Dockerfile (I have an M1 Mac):

FROM --platform=linux/amd64 public.ecr.aws/emr-serverless/spark/emr-6.9.0:latest
USER root
# install JDK 11
RUN sudo amazon-linux-extras install java-openjdk11
# EMRS will run the image as hadoop
USER hadoop:hadoop

After pushing the image to ECR, I created an EMR Serverless application (x86_64) that uses this custom image. Next I submitted a job with my jar built with JDK 11, but it failed with the following error:

The job run failed to be submitted to the application or it completed unsuccessfully.

Then, following the second link above, I tried passing these two Spark configuration properties when configuring the job:

--conf spark.executorEnv.JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.16.0.8-1.amzn2.0.1.x86_64 
--conf spark.driverEnv.JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.16.0.8-1.amzn2.0.1.x86_64

I am still getting the error below:

Job failed, please check complete logs in configured logging destination. ExitCode: 1. Last few exceptions:
Caused by: java.lang.UnsupportedClassVersionError: <ClassName> has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
Exception in thread "main" java.lang.BootstrapMethodError: java.lang.UnsupportedClassVersionError: <ClassName> has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0...

-- Update --

I have also tried running the job with JAVA_HOME specified in the application configuration, like this:

{
  "applicationConfiguration": [
    {
      "classification": "spark-defaults",
      "configurations": [],
      "properties": {
        "spark.driverEnv.JAVA_HOME": "/usr/lib/jvm/java-11-openjdk-11.0.18.0.10-1.amzn2.0.1.x86_64",
        "spark.executorEnv.JAVA_HOME": "/usr/lib/jvm/java-11-openjdk-11.0.18.0.10-1.amzn2.0.1.x86_64"
      }
    }
  ]
}

Am I missing any step?

Regards, Tapan

asked a year ago, 756 views
1 Answer
Accepted Answer

I was able to run my jar in a JDK 11 environment. The correct property for setting JAVA_HOME on the driver is spark.emr-serverless.driverEnv.JAVA_HOME, not spark.driverEnv.JAVA_HOME.
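
For anyone reading the error in the question: class file version 55 corresponds to Java 11 and 52 to Java 8, so the message suggests the process that loaded the class was still running on the image's default Java 8. A minimal Python sketch of that mapping follows; the function name is made up for illustration, and it relies on the usual rule that the Java release is the class-file major version minus 44.

```python
def java_release_for_class_major(major: int) -> int:
    """Map a class-file major version to the Java release that introduced it.

    Hypothetical helper for illustration; uses release = major - 44,
    which holds for Java 5 (major 49) and later.
    """
    if major < 49:
        raise ValueError("only major versions 49 (Java 5) and later are handled")
    return major - 44


# Numbers taken from the UnsupportedClassVersionError in the question.
compiled_for = java_release_for_class_major(55)  # what the jar was built for
runtime_max = java_release_for_class_major(52)   # newest release the running JVM accepts

print(f"jar targets Java {compiled_for}, running JVM supports up to Java {runtime_max}")
# prints: jar targets Java 11, running JVM supports up to Java 8
```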

Also, the JDK that currently gets installed in the image is java-11-openjdk-11.0.18.0.10-1.amzn2.0.1.x86_64, so JAVA_HOME must point to that directory. The applicationConfiguration block is not needed.
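
Putting that together, the job's Spark submit parameters might look like the output of the sketch below. It is only an illustration: the helper names and the S3 entry point are hypothetical, the executor property is the one already used in the question, and the JAVA_HOME path should be checked against what is actually installed in your image.

```python
# Path reported in this thread; verify it inside your own image before relying on it.
JAVA_HOME = "/usr/lib/jvm/java-11-openjdk-11.0.18.0.10-1.amzn2.0.1.x86_64"


def build_spark_submit_parameters(java_home: str) -> str:
    """Build the --conf flags that point the driver and executors at java_home."""
    confs = {
        # Driver property named in the accepted answer.
        "spark.emr-serverless.driverEnv.JAVA_HOME": java_home,
        # Executor property as used in the question.
        "spark.executorEnv.JAVA_HOME": java_home,
    }
    return " ".join(f"--conf {key}={value}" for key, value in confs.items())


def build_job_driver(entry_point: str, java_home: str) -> dict:
    """Assemble a jobDriver payload in the shape used when starting a job run."""
    return {
        "sparkSubmit": {
            "entryPoint": entry_point,
            "sparkSubmitParameters": build_spark_submit_parameters(java_home),
        }
    }


# Hypothetical jar location, for illustration only.
job_driver = build_job_driver("s3://my-bucket/my-app.jar", JAVA_HOME)
print(job_driver["sparkSubmit"]["sparkSubmitParameters"])
```

The resulting dictionary can be passed as the job driver when starting the job run, or the two --conf flags can be pasted into the Spark properties field in the console.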

answered a year ago
EXPERT
reviewed 2 months ago
