
Neptune and Cypher - Poor Query Performance


I want to use Neptune for an application with openCypher as my query language. I have a fairly small dataset of around 8,500 nodes and 8,500 edges. I am running what seem to be fairly straightforward queries, but the latency is very high (~6-8 seconds for around 1,000 rows). I have tried various instance types, enabling and disabling caches, and enabling and disabling the OSGP index, all to no avail. I'm really at a loss as to why the query performance is so poor.

Does anyone have experience with poor query performance on Neptune? I feel I must be doing something incorrectly to see such high query latency.

Here is some more detailed information on my graph structure and my query.

I have a graph with two node types, A and B, and a single edge type, MAPS_TO, which is always directed from an A node to a B node. The MAPS_TO relation is many-to-many, but with the current dataset it is primarily one-to-one, i.e. the graph is mostly disconnected subgraphs of the form:

(A)-[MAPS_TO]-(B)

For every A node satisfying some conditions, I would like to collect the distinct B nodes it maps to. I've experimented with my queries a bit, and the fastest one I've been able to arrive at is:

MATCH (a:A)
WHERE a.Owner = $owner AND a.IsPublic = true
WITH a
MATCH (a)-[r:MAPS_TO]->(b:B)
WHERE (b)<-[:MAPS_TO {CreationReason: "origin"}]-(:A {Owner: $owner})
      OR (b)<-[:MAPS_TO {CreationReason: "origin"}]-(:A {IsPublic: true})
WITH a, r, b ORDER BY a.AId SKIP 0 LIMIT 1000
RETURN a {
    .AId
} AS A, collect(distinct b {
    B: {BId: b.BId, Name: b.Name, other properties on B nodes...},
    R: {CreationReason: r.CreationReason, other relation properties}
})

The above query takes ~6 seconds (per explain and HTTP timing) on the t4g.medium instance type. Moving up to an r5d.2xlarge instance cut the query time in half, to 3-4 seconds, but such a large instance type seems excessive for so little data.

Really I am just trying to understand why my query performs so poorly. Given how little data I have, I would not expect any Neptune configuration to be this slow.

EDIT for more info:

We are using the t4g.medium instance type with 3 reader instances, and the queries go to the readers. Again, we have around 8,500 nodes, split approximately equally between A and B nodes, and around 8,500 edges of the single type MAPS_TO, all going from A to B. The output of the status endpoint for openCypher is:

{'status': 'healthy',
 'startTime': 'Mon Sep 19 18:56:50 UTC 2022',
 'dbEngineVersion': '1.1.1.0.R5',
 'role': 'reader',
 'dfeQueryEngine': 'viaQueryHint',
 'gremlin': {'version': 'tinkerpop-3.5.2'},
 'sparql': {'version': 'sparql-1.1'},
 'opencypher': {'version': 'Neptune-9.0.20190305-1.0'},
 'labMode': {'ObjectIndex': 'enabled',
  'ReadWriteConflictDetection': 'enabled'},
 'features': {'ResultCache': {'status': 'enabled'},
  'IAMAuthentication': 'disabled',
  'Streams': 'disabled',
  'AuditLog': 'enabled'},
 'settings': {'clusterQueryTimeoutInMs': '120000'}}

I have tried this with the ObjectIndex both enabled and disabled and do not see much difference in performance. I have also tried the query on a larger instance type, the r5d.2xlarge, to see whether performance was improved by the result cache. The response time was roughly cut in half, but that still seems very slow, and a larger instance type than should be necessary. The only things currently running against the database are the queries above, so I do not see how it could be a concurrency issue.

We have looked at the output of explain (too long to post). It is not clear to me that there is any single place where the query spends a large amount of time. The DFEPipelineJoin steps taking the longest makes sense to me based on the description in the documentation; what is not clear is how to eliminate the DFEPipelineJoin steps from the query.
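For concreteness, one reformulation I have considered (untested against our dataset, and it may or may not reduce the number of DFEPipelineJoin steps) is to materialize the set of qualifying B nodes once up front, instead of re-testing the existence pattern for every (a, b) pair. Label and property names are the same as in the query above; the query string here is shown embedded in a small Python snippet:

```python
# Hedged sketch: compute the qualifying B nodes once, then join A -> B.
# Label/property names are taken from the question; this rewrite is
# untested and is only one possible way to restructure the filter.

REWRITTEN_QUERY = """
MATCH (src:A)-[:MAPS_TO {CreationReason: 'origin'}]->(b:B)
WHERE src.Owner = $owner OR src.IsPublic = true
WITH collect(distinct b) AS allowed
MATCH (a:A)-[r:MAPS_TO]->(b:B)
WHERE a.Owner = $owner AND a.IsPublic = true AND b IN allowed
WITH a, r, b ORDER BY a.AId SKIP 0 LIMIT 1000
RETURN a.AId AS AId,
       collect(distinct {BId: b.BId, CreationReason: r.CreationReason}) AS Mapped
"""

# Print the first line of the query as a quick sanity check.
print(REWRITTEN_QUERY.strip().splitlines()[0])
```

Whether this helps depends on how the planner handles the `b IN allowed` membership test, so it would need to be checked against explain output.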

asked 7 days ago · 33 views
2 Answers

Unfortunately, there are many possible reasons for poor performance (instance size, data not in the buffer cache, concurrent processes, query optimization, etc.), so it is hard to provide specific suggestions with the information available.

To better understand the issue, I'd suggest looking at how the query is being processed. These details can be found using the openCypher explain feature, which provides low-level details on what the query is doing and where the time is being spent. If possible, I also suggest opening a case with AWS Support.
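For reference, explain output is requested by adding an `explain` parameter to an ordinary openCypher HTTPS request against the cluster endpoint. A minimal sketch of forming such a request (the endpoint hostname is a placeholder; `static`, `dynamic`, and `details` are the documented modes):

```python
# Sketch: building a Neptune openCypher explain request.
# "your-neptune-endpoint" is a placeholder for your cluster endpoint.

def build_explain_request(endpoint, query, mode="details"):
    """Return the URL and form payload for an openCypher explain call."""
    url = f"https://{endpoint}:8182/openCypher"
    payload = {"query": query, "explain": mode}  # mode: static | dynamic | details
    return url, payload

url, payload = build_explain_request(
    "your-neptune-endpoint",
    "MATCH (a:A) WHERE a.Owner = $owner RETURN a LIMIT 10",
)
print(url)
# To actually send it (requires network access to the cluster):
#   import requests
#   print(requests.post(url, data=payload).text)
```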

answered 7 days ago
  • Hi Dave! Thanks for the response. I can provide more detailed information. Please see the edited question.


Based on the additional information you provided, I attempted to reproduce what you are seeing, but the query you posted, run against a graph with 10k one-to-one disconnected subgraphs, takes ~200 ms on a t4g instance. I suspect you may be hitting a supernode or something similar in your data, but without the explain plan this is just a guess.

In general, query performance on the t3 and t4g instance classes suffers from the lack of statistics generation, which the DFE query optimizer uses to optimize the query plan; statistics are disabled on those instances due to memory restrictions. Beyond statistics, the DFE engine that runs openCypher also benefits from the additional vCPUs of larger instances, since it parallelizes queries. For example, the same query and data ran in ~70 ms on an r6g.large.
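As a quick way to confirm whether statistics are being generated on a given cluster, the property-graph statistics endpoint can be queried over HTTPS (engine versions that support DFE statistics; the hostname below is a placeholder):

```python
# Sketch: checking DFE statistics status on a Neptune cluster.
# "your-neptune-endpoint" is a placeholder; the path follows the
# Neptune statistics documentation and requires an engine version
# that supports DFE statistics.

def statistics_status_url(endpoint):
    """URL of the property-graph statistics status endpoint (HTTP GET)."""
    return f"https://{endpoint}:8182/propertygraph/statistics"

print(statistics_status_url("your-neptune-endpoint"))
# To actually query it (requires network access to the cluster):
#   import requests
#   print(requests.get(statistics_status_url("your-neptune-endpoint")).json())
```

On t3/t4g instances the response should indicate that statistics are not being computed, which is one concrete signal that moving to a larger instance class would change the query plans.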

answered 5 days ago
