
Running notebook against EMR cluster trying to create delta table in S3


I'm trying to run an EMR notebook to create a delta table in S3.

EMR Cluster Version: emr-7.7.0
Installed Applications: Hadoop 3.4.0, Hive 3.1.3, JupyterEnterpriseGateway 2.6.0, Livy 0.8.0, Spark 3.5.3

Code:

%%configure -f
{
    "conf": {
        "spark.jars": "s3://<bucket>/jars/mssql-jdbc-12.8.1.jre8.jar",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension,com.amazonaws.emr.recordserver.connector.spark.sql.RecordServerSQLExtension",
        "spark.sql.catalog.spark_catalog.lf.managed": "true"
    }
}
# Separate notebook cell, run after the %%configure cell above
spark.sql("""
CREATE TABLE IF NOT EXISTS ThisIsATable (
    ColumnName BIGINT NOT NULL
)
USING delta
LOCATION 's3://<bucket>/ThisIsATable'
""")

I can run the notebook on this cluster to query other data sources via JDBC, but when I run the spark.sql command I get this error: "org.apache.spark.SparkException: Cannot find catalog plugin class for catalog 'spark_catalog': org.apache.spark.sql.delta.catalog.DeltaCatalog."

Thank you!

asked a year ago, 80 views
1 answer

Hi Armen,

The error "Cannot find catalog plugin class for catalog 'spark_catalog': org.apache.spark.sql.delta.catalog.DeltaCatalog" means that Spark could not load the Delta Lake catalog class, which typically indicates that Delta Lake is not installed on your EMR cluster.

Delta Lake is not automatically installed with EMR 7.7.0 when you only select Hadoop, Hive, JupyterEnterpriseGateway, Livy, and Spark. You must explicitly select Delta Lake as an application to install during cluster creation.

To fix the error you are getting, create a new EMR cluster with Delta Lake selected as an application.
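If you create clusters from code rather than the console, the request could be sketched roughly as follows. This is only an illustration under assumptions: the helper names `delta_enabled_configurations` and `build_run_job_flow_args` and the cluster name are hypothetical, and it assumes Delta Lake is switched on through the `delta-defaults` configuration classification with `delta.enabled` set to `true`, which is my understanding of the EMR Delta Lake documentation referenced below. Please verify that against the documentation for your release. The sketch only builds the arguments and does not call AWS.

```python
import json


def delta_enabled_configurations():
    """Return an EMR `Configurations` list that (by assumption) enables Delta Lake."""
    return [
        {
            # Assumption: classification and property names as described
            # in the EMR Delta Lake documentation; verify for your release.
            "Classification": "delta-defaults",
            "Properties": {"delta.enabled": "true"},
        }
    ]


def build_run_job_flow_args(name, release_label="emr-7.7.0"):
    """Build a partial argument dict for boto3's EMR `run_job_flow` call.

    Instances, JobFlowRole, ServiceRole and similar required settings are
    deliberately omitted; add them before sending the request.
    """
    applications = ["Hadoop", "Hive", "JupyterEnterpriseGateway", "Livy", "Spark"]
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": app} for app in applications],
        "Configurations": delta_enabled_configurations(),
    }


if __name__ == "__main__":
    # Hypothetical cluster name, for illustration only.
    args = build_run_job_flow_args("delta-notebook-cluster")
    print(json.dumps(args, indent=2))
```

After adding your instance and role settings, the resulting dict could be passed to `boto3.client("emr").run_job_flow(**args)`. The `%%configure` cell from the question should then be able to resolve the DeltaCatalog class on the new cluster.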

References:

EMR Delta Lake

AWS EXPERT, answered 3 months ago
