Neo4j Spark connector not working for loading data into AWS Neptune


Hello,

I am attempting to write to AWS Neptune using the Neo4j Connector for Apache Spark, since Neptune's Neo4j compatibility documentation lists it as an option for loading data.

When I try to load data using the Spark connector, I get the following error:

Py4JJavaError: An error occurred while calling o561.save.
: org.neo4j.driver.exceptions.DatabaseException: "Unexpected server exception 'There is no procedure with the name `dbms.components` registered.'"
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:111)

Below is the repro:

# Minimal repro (runs on Databricks, where `spark` is the active session)
peopleDF = spark.createDataFrame(
    [
        {"name": "John", "surname": "Doe", "age": 42},
        {"name": "Jane", "surname": "Doe", "age": 40},
    ]
)

(
    peopleDF
    .write.format("org.neo4j.spark.DataSource")
    .option("url", "<NEPTUNE_ENDPOINT>")  # placeholder for the cluster endpoint
    .mode("Append")
    .option("encryption.enabled", "true")
    .option("labels", ":Person")
    .save()
)

Platform details:

  1. connector jar being used: org.neo4j:neo4j-connector-apache-spark_2.12:5.3.6_for_spark_3
  2. Databricks 15.4
  3. Apache Spark: 3.5.0
  4. Scala: 2.12
  5. Python: 3.11.11
  6. Neptune Status:
{"status":"healthy","startTime":"Thu Apr 24 17:58:03 UTC 2025","dbEngineVersion":"1.4.4.0.R1","role":"writer","dfeQueryEngine":"viaQueryHint","gremlin":{"version":"tinkerpop-3.7.1"},"sparql":{"version":"sparql-1.1"},"opencypher":{"version":"Neptune-9.0.20190305-1.0"},"labMode":{"ObjectIndex":"disabled","ReadWriteConflictDetection":"enabled"},"features":{"SlowQueryLogs":"disabled","InlineServerGeneratedEdgeId":"disabled","ResultCache":{"status":"disabled"},"IAMAuthentication":"disabled","Streams":"disabled","AuditLog":"disabled"},"settings":{"StrictTimeoutValidation":"true","clusterQueryTimeoutInMs":"120000","SlowQueryLogsThreshold":"5000"},"serverlessConfiguration":{"minCapacity":"1.0","maxCapacity":"128.0"}}

Are there any working examples of loading data into Neptune with the Neo4j Spark connector?

  • It looks like you are using the 5.x connector series. Can you try with the 4.x series as shown in the docs and see if that solves the issue?

    <dependency>
        <groupId>org.neo4j</groupId>
        <artifactId>neo4j-connector-apache-spark_2.12-4.1.0</artifactId>
        <version>4.0.1_for_spark_3</version>
    </dependency>
asked 24 days ago · 69 views
1 Answer

I apologize, but there seems to be a misunderstanding in your approach. The Neo4j Connector for Apache Spark is not directly compatible with Amazon Neptune for loading data. While Neptune does provide some compatibility with Neo4j tools, the Spark connector is not one of them.

For loading data into Neptune from Spark, you should use Neptune's own integration methods. Neptune integrates with Apache Spark through an open-source library called neptune-python-utils, which can be used for running Python and Gremlin jobs.
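
To make the per-record insert route concrete, here is a minimal sketch using the gremlinpython client directly against Neptune's Gremlin endpoint. This is an illustration only, not neptune-python-utils' own API, and the endpoint below is a hypothetical placeholder:

from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# Placeholder endpoint; use your cluster's writer endpoint on port 8182.
conn = DriverRemoteConnection("wss://my-neptune-endpoint:8182/gremlin", "g")
g = traversal().withRemote(conn)

# Insert the two example vertices from the question, one traversal each.
(g.addV("Person")
  .property("name", "John").property("surname", "Doe").property("age", 42)
  .iterate())
(g.addV("Person")
  .property("name", "Jane").property("surname", "Doe").property("age", 40)
  .iterate())

conn.close()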

The error you're encountering ("There is no procedure with the name dbms.components registered") is a symptom of exactly this: the connector calls Neo4j's dbms.components procedure at startup to detect the server version, and Neptune doesn't implement Neo4j-specific procedures like that one.
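
You can reproduce this outside Spark. Here is a minimal sketch using the plain neo4j Python driver; the endpoint is a placeholder, and since your status output shows IAM authentication disabled, dummy credentials should suffice:

from neo4j import GraphDatabase

# Placeholder endpoint; Neptune serves Bolt on port 8182 (bolt+ssc = TLS
# with a self-signed/unverified certificate).
driver = GraphDatabase.driver(
    "bolt+ssc://my-neptune-endpoint:8182",
    auth=("ignored", "ignored"),  # Neptune ignores these when IAM auth is off
)
with driver.session() as session:
    # The Spark connector issues this call at startup to detect the server
    # version; on Neptune it fails with the missing-procedure error above.
    print(session.run("CALL dbms.components()").single())
driver.close()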

To load data into Neptune using Spark, you would typically use Neptune's bulk loader API or other AWS services like AWS Glue. Here's a general approach you could consider:

  1. Prepare your data in Spark as a DataFrame.
  2. Use AWS Glue or custom ETL processes to transform your data into a format compatible with Neptune (CSV for bulk loading or Gremlin/SPARQL for individual inserts).
  3. Use Neptune's bulk loader API, or write custom code to insert the data using Gremlin or SPARQL queries. (A rough sketch of the CSV-plus-bulk-loader route follows this list.)
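
Here is a minimal sketch of that flow, reusing the peopleDF from your question. It assumes an S3 bucket the Spark cluster can write to and an IAM role attached to the Neptune cluster that can read from it; the bucket, role ARN, and endpoint below are all hypothetical placeholders:

import json
import urllib.request

# Step 2: reshape the DataFrame into Neptune's bulk-load CSV layout for
# vertices (~id and ~label system columns plus typed property columns).
nodes = peopleDF.selectExpr(
    "concat(name, '_', surname) AS `~id`",  # any unique id scheme works
    "'Person' AS `~label`",
    "name AS `name:String`",
    "surname AS `surname:String`",
    "age AS `age:Int`",
)
nodes.coalesce(1).write.mode("overwrite").option("header", True).csv(
    "s3://my-bucket/neptune/people/"  # placeholder bucket/prefix
)

# Step 3: start a bulk load job against the staged CSV files.
payload = json.dumps({
    "source": "s3://my-bucket/neptune/people/",
    "format": "csv",
    "iamRoleArn": "arn:aws:iam::123456789012:role/NeptuneLoadFromS3",  # placeholder
    "region": "us-east-1",
    "failOnError": "FALSE",
}).encode("utf-8")
req = urllib.request.Request(
    "https://my-neptune-endpoint:8182/loader",  # placeholder writer endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())

The response contains a loadId, which you can poll via GET https://<endpoint>:8182/loader/<loadId> to track progress.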

If you need to use Spark specifically, you might want to look into the AWS Glue Neptune connector, which provides a way to read from and write to Neptune from Spark jobs running in AWS Glue.

For the most up-to-date and accurate information on loading data into Neptune, I recommend checking the official AWS documentation or contacting AWS support for guidance tailored to your specific use case.
Sources
Neptune compatibility with Neo4j - Amazon Neptune
Working with other AWS services - Amazon Neptune

answered 24 days ago
EXPERT reviewed 23 days ago
  • In that case, I think we might want to remove the compatibility claim from the documentation.
