EMR Serverless on arm64


I want to use a JAR (deequ-2.0.1-spark-3.2.jar) on EMR Serverless with the arm64 architecture. It works on x86_64 but not on arm64. Could you please look into this? It fails with the following error:

  https://repos.spark-packages.org/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.jar

	::::::::::::::::::::::::::::::::::::::::::::::

	::          UNRESOLVED DEPENDENCIES         ::

	::::::::::::::::::::::::::::::::::::::::::::::

	:: com.amazon.deequ#deequ;2.0.1-spark-3.2: not found

:::: ERRORS Server access error at url https://repo1.maven.org/maven2/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))

Server access error at url https://repo1.maven.org/maven2/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))

Server access error at url https://repos.spark-packages.org/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))

Server access error at url https://repos.spark-packages.org/com/amazon/deequ/deequ/2.0.1-spark-3.2/deequ-2.0.1-spark-3.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.amazon.deequ#deequ;2.0.1-spark-3.2: not found]
	at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1494)
	at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
	at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:311)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:944)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1090)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1099)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

  • Make sure you've configured your EMR Serverless application with VPC connectivity. By default, EMR Serverless only has access to a few AWS services in the same region.

Asked 1 year ago, 521 views
2 Answers
Accepted Answer

Hi,

The error shows that Spark could not resolve the dependency com.amazon.deequ#deequ;2.0.1-spark-3.2. Every attempt to reach Maven Central (repo1.maven.org) and Spark Packages (repos.spark-packages.org) ended in "Connection timed out", which points to a network problem rather than a missing artifact: the job never reached the repositories, so the files were never downloaded.

To address this issue, you can try the following solutions:

Check internet connectivity: Make sure the EMR Serverless arm64 application has outbound internet access so it can reach the Maven repositories. EMR Serverless has no instances you can log in to, so connectivity depends on the application's network (VPC) configuration. One way to test it is to run a small job that makes an outbound request.

Check the repository configuration: If the application has internet access, verify that the repositories Spark resolves packages from are correct. Spark resolves package coordinates through Ivy (the format of the error output above), and extra repositories can be supplied with --repositories or spark.jars.repositories. Make sure the repositories you need, such as Maven Central and Spark Packages, are included and reachable.

Try a different repository: If the default repositories are unreachable, you can point Spark at an alternative repository that the application can reach.

Alternatively, you can download the required JAR file and its dependencies manually and make them available to the EMR Serverless arm64 application yourself, for example by uploading them to Amazon S3 and referencing them from the job.
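As a rough illustration of the manual-JAR option, the sketch below builds the jobDriver argument for the EMR Serverless StartJobRun API so that the JAR is read from S3 through spark.jars instead of being resolved from Maven coordinates. The bucket, script path, and helper names (build_job_driver, submit_job) are hypothetical, and it assumes the JAR has already been uploaded to S3 and that the job's execution role can read it.

```python
def build_job_driver(entry_point: str, jar_s3_uri: str) -> dict:
    """Build the jobDriver argument for EMR Serverless StartJobRun.

    The JAR is referenced from S3 via spark.jars, so Spark does not have
    to resolve Maven coordinates over the internet at submit time.
    """
    if not jar_s3_uri.startswith("s3://"):
        raise ValueError("jar_s3_uri must be an s3:// URI")
    return {
        "sparkSubmit": {
            "entryPoint": entry_point,
            "sparkSubmitParameters": f"--conf spark.jars={jar_s3_uri}",
        }
    }


def submit_job(application_id: str, execution_role_arn: str, job_driver: dict) -> str:
    """Submit the job. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("emr-serverless")
    response = client.start_job_run(
        applicationId=application_id,
        executionRoleArn=execution_role_arn,
        jobDriver=job_driver,
    )
    return response["jobRunId"]


# Hypothetical bucket and script names, for illustration only.
job_driver = build_job_driver(
    entry_point="s3://my-bucket/scripts/deequ_job.py",
    jar_s3_uri="s3://my-bucket/jars/deequ-2.0.1-spark-3.2.jar",
)
```

Calling submit_job with your own application ID and execution role ARN would start the run. Any transitive dependencies that are not already on the cluster would need to be uploaded and listed in spark.jars (comma-separated) as well.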

AWS
Answered 1 year ago
Reviewed by an expert 7 months ago

Thanks. This was the issue: "Check internet connectivity". I hadn't configured a VPC.
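For anyone hitting the same error, here is a minimal sketch of what creating an arm64 application with VPC connectivity might look like through boto3's create_application call. The subnet and security group IDs, the application name, and the release label are placeholders I made up, and the helper names are hypothetical. It assumes the subnets are private subnets with a route to a NAT gateway, which is what gives the job outbound internet access.

```python
def build_application_request(name, release_label, subnet_ids, security_group_ids):
    """Build keyword arguments for the EMR Serverless CreateApplication API."""
    if not subnet_ids:
        raise ValueError("at least one subnet ID is required for VPC connectivity")
    return {
        "name": name,
        "releaseLabel": release_label,
        "type": "SPARK",
        "architecture": "ARM64",
        # Without networkConfiguration the application has no VPC attached.
        "networkConfiguration": {
            "subnetIds": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
        },
    }


def create_application(request: dict) -> str:
    """Create the application. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("emr-serverless")
    response = client.create_application(**request)
    return response["applicationId"]


# Placeholder values, for illustration only.
request = build_application_request(
    name="deequ-arm64",
    release_label="emr-6.6.0",  # assumed: a release that ships Spark 3.2
    subnet_ids=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
)
```

The security group also has to allow outbound HTTPS traffic so that Spark can reach repo1.maven.org and repos.spark-packages.org.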

Answered 1 year ago
