Running notebook against EMR cluster trying to create delta table in S3

I'm trying to run an EMR notebook to create a delta table in S3.

EMR Cluster Version: emr-7.7.0
Installed Applications: Hadoop 3.4.0, Hive 3.1.3, JupyterEnterpriseGateway 2.6.0, Livy 0.8.0, Spark 3.5.3

Code:

%%configure -f
{
    "conf": {
        "spark.jars": "s3://<bucket>/jars/mssql-jdbc-12.8.1.jre8.jar",
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension,com.amazonaws.emr.recordserver.connector.spark.sql.RecordServerSQLExtension",
        "spark.sql.catalog.spark_catalog.lf.managed": "true"
    }
}

spark.sql("""CREATE TABLE IF NOT EXISTS ThisIsATable (
    ColumnName BIGINT NOT NULL
)
USING delta
LOCATION 's3://<bucket>/ThisIsATable'""")

I can run the notebook on this cluster to query other data sources via JDBC, but when I run the spark.sql command, I get this error: "org.apache.spark.SparkException: Cannot find catalog plugin class for catalog 'spark_catalog': org.apache.spark.sql.delta.catalog.DeltaCatalog."

Thank you!

asked a year ago · 82 views

1 Answer

Hi Armen,

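The error "Cannot find catalog plugin class for catalog 'spark_catalog': org.apache.spark.sql.delta.catalog.DeltaCatalog" means that Spark could not load the Delta Lake catalog class named in your configuration, which indicates that Delta Lake is not installed on your EMR cluster.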
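If you want to confirm this from the notebook before rebuilding anything, one option is to ask the driver JVM to load the class directly. The snippet below is only a rough sketch: `delta_catalog_loadable` is a made-up helper name, and `spark._jvm` is PySpark's internal py4j handle rather than a public API, so class-loader differences could make the result misleading. Treat a `False` as a hint that the Delta Lake jars are missing from the driver classpath, not as proof.

```python
DELTA_CATALOG_CLASS = "org.apache.spark.sql.delta.catalog.DeltaCatalog"


def delta_catalog_loadable(spark, class_name=DELTA_CATALOG_CLASS):
    """Return True if the driver JVM can load class_name, otherwise False.

    Hypothetical diagnostic helper. It relies on spark._jvm, which is an
    internal PySpark attribute (assumption: it is reachable in your session).
    """
    try:
        # Ask the JVM to resolve the class by name through py4j.
        spark._jvm.java.lang.Class.forName(class_name)
        return True
    except Exception:
        # py4j surfaces a Java ClassNotFoundException as a Python exception.
        return False


# In a notebook cell where `spark` is the active session, you could run:
# print(delta_catalog_loadable(spark))
```

If this returns `False` on your cluster, it is consistent with the error message above.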

Delta Lake is not automatically installed with EMR 7.7.0 when you select only Hadoop, Hive, JupyterEnterpriseGateway, Livy, and Spark. You must explicitly select Delta Lake as an application to install during cluster creation.

To fix the error you are getting, create a new EMR cluster with Delta Lake selected as an application.

References:

EMR Delta Lake

AWS
EXPERT

answered 3 months ago
