AWS neptune - hierarchical data model - query execution time is longer

0

Hello team, I am trying to model hierarchical data model using neptune db which has a DAG structure and using development/testing cluster environment with buffer cache enabled. Even a simple node query using opencypher in notebook is taking longer than expected. Both notebook and the cluster share the same VPC. For ex: %%oc MATCH (n:Group :group_67) RETURN n. is taking 35-70 ms and a path query from leaf node to trunk is taking more than 200ms %%oc MATCH P = (n:Group :group_41)-[:inherits*]->() RETURN P

Its a simple test graph with just around 200 nodes with a model like: group(200 nodes) -> Domain->infra->tenant

Please help me identify why the queries are taking longer?

Query plan:
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	SolutionInjection	solutions=[{}]	-	0	1	0.00	0
1	2	-	DFESubquery	subQuery=subQuery1	-	0	3	0.00	331
2	-	-	TermResolution	vars=[?P]	id2value_opencypher	3	3	1.00	11.00
subQuery1
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFEPipelineScan	pattern=Node(?n) with property 'ALL' and label 'group_67'
patternEstimate=1	-	0	1	0.00	0.23
1	2	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_1
coordinationTime(ms)=0.051	-	1	1	1.00	0.91
2	3	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_2
coordinationTime(ms)=0.037	-	1	1	1.00	3.80
3	4	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_3
coordinationTime(ms)=0.064	-	1	3	3.00	108
4	5	-	DFEBindRelation	inputVars=[?__gen_path6, ?n, ?anon_rel28, ?anon_node34, ?__gen_path6]
outputVars=[?__gen_path6, ?n, ?anon_rel28, ?anon_node34, ?P]	-	3	3	1.00	1.53
5	6	-	DFEProject	columns=[?P]	-	3	3	1.00	1.41
6	-	-	DFEDrain	-	-	3	0	0.00	17.17
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_1
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	outSchema=[?n]	-	0	1	0.00	0.02
1	2	3	DFETee	-	-	1	2	2.00	0.02
2	4	-	DFEDistinctColumn	column=?n
ordered=false	-	1	1	1.00	0.31
3	12	-	DFEHashIndexBuild	vars=[?n]	-	1	1	1.00	0.03
4	5	-	DFESplitChunks	-	-	1	1	1.00	0.02
5	6 to 10	-	DFERouteChunks	-	-	1	1	1.00	0.01
6	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label 'Group'
patternEstimate=100	-	1	1	1.00	0.22
7	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label 'Group'
patternEstimate=100	-	0	0	0.00	0
8	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label 'Group'
patternEstimate=100	-	0	0	0.00	0
9	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label 'Group'
patternEstimate=100	-	0	0	0.00	0
10	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label 'Group'
patternEstimate=100	-	0	0	0.00	0
11	12	-	DFEUnion	-	-	1	1	1.00	0.08
12	13	14	DFESync	-	-	2	2	1.00	0.02
13	15	-	DFEForwardValue	-	-	1	1	1.00	0.01
14	15	-	DFEForwardValue	-	-	1	1	1.00	0.01
15	16	-	DFEHashIndexJoin	-	-	2	1	0.50	0.13
16	-	-	DFEDrain	-	-	1	0	0.00	0.01
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_2
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	outSchema=[?n]	-	0	1	0.00	0.01
1	2	3	DFETee	-	-	1	2	2.00	0.01
2	4	-	DFEDistinctColumn	column=?n
ordered=false	-	1	1	1.00	0.24
3	12	-	DFEHashIndexBuild	vars=[?n]	-	1	1	1.00	0.03
4	5	-	DFESplitChunks	-	-	1	1	1.00	0.01
5	6 to 10	-	DFERouteChunks	-	-	1	1	1.00	0.01
6	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label '?n_label1'
patternEstimate=306	-	1	1	1.00	3.21
7	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label '?n_label1'
patternEstimate=306	-	0	0	0.00	0
8	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label '?n_label1'
patternEstimate=306	-	0	0	0.00	0
9	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label '?n_label1'
patternEstimate=306	-	0	0	0.00	0
10	11	-	DFEPipelineJoin	pattern=Node(?n) with property 'ALL' and label '?n_label1'
patternEstimate=306	-	0	0	0.00	0
11	12	-	DFEUnion	-	-	1	1	1.00	0.08
12	13	14	DFESync	-	-	2	2	1.00	0.02
13	15	-	DFEForwardValue	-	-	1	1	1.00	0.01
14	15	-	DFEForwardValue	-	-	1	1	1.00	0.01
15	16	-	DFEHashIndexJoin	-	-	2	1	0.50	0.13
16	-	-	DFEDrain	-	-	1	0	0.00	0.01
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_3
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	outSchema=[?n]	-	0	1	0.00	0.01
1	2	3	DFETee	-	-	1	2	2.00	0.01
2	4	-	DFEDistinctRelation	-	-	1	1	1.00	0.13
3	19	-	DFEHashIndexBuild	vars=[?n]	-	1	1	1.00	0.03
4	5	-	DFEBindRelation	inputVars=[?n, ?n]
outputVars=[?n, ?node3]	-	1	1	1.00	0.19
5	6	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_4
coordinationTime(ms)=0.017	-	1	1	1.00	0.32
6	7	-	DFEApply	functor=empty()	-	1	1	1.00	0.68
7	8	-	DFEApply	functor=toPath1(?n)	-	1	1	1.00	0.58
8	9	10	DFETee	-	-	1	2	2.00	0.05
9	11	-	DFEBindRelation	inputVars=[?anon_rel28]
outputVars=[?102]	-	1	1	1.00	0.07
10	12	-	DFEBindRelation	inputVars=[?iterationCounterVar0, ?__gen_path6, ?anon_rel28, ?n, ?node3, ?102]
outputVars=[?iterationCounterVar0, ?__gen_path6, ?anon_rel28, ?n, ?node3, ?102]	-	2	1	0.50	0.15
11	10	-	DFETermResolution	column=?102	value2id	1	1	1.00	0.18
12	13	-	DFEBindRelation	inputVars=[?iterationCounterVar0, ?__gen_path6, ?anon_rel28, ?n, ?node3, ?102]
outputVars=[?iterationCounterVar0, ?__gen_path6, ?102, ?n, ?node3, ?anon_rel28]	-	1	1	1.00	0.38
13	14	-	DFEProject	columns=[?__gen_path6, ?n, ?node3, ?iterationCounterVar0, ?anon_rel28]	-	1	1	1.00	0.17
14	15	-	DFEDistinctRelation	-	-	1	1	1.00	0.37
15	16	-	DFESplitChunks	-	-	1	1	1.00	0.18
16	17	-	DFELoopSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_5
coordinationTime(ms)=0.562	-	1	3	3.00	92.02
17	18	-	DFEBindRelation	inputVars=[?__gen_path6, ?iterationCounterVar0, ?n, ?node3, ?anon_rel28, ?node3]
outputVars=[?__gen_path6, ?iterationCounterVar0, ?n, ?node3, ?anon_rel28, ?anon_node34]	-	3	3	1.00	0.94
18	19	-	DFEProject	columns=[?__gen_path6, ?n, ?anon_rel28, ?anon_node34]	-	3	3	1.00	0.59
19	20	21	DFESync	-	-	4	7	1.75	0.27
20	22	-	DFEForwardValue	-	-	4	4	1.00	0.03
21	22	-	DFEForwardValue	-	-	3	3	1.00	0.10
22	23	-	DFEHashIndexJoin	-	-	7	3	0.43	10.59
23	-	-	DFEDrain	-	-	3	0	0.00	0.37
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_4
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	solutions=[?iterationCounterVar0 -> [0^^<LONG>]]
outSchema=[?iterationCounterVar0]	-	0	1	0.00	0.01
1	3	-	DFERelationalJoin	joinVars=[]	-	2	1	0.50	0.25
2	1	-	DFESolutionInjection	outSchema=[?n, ?node3]	-	0	1	0.00	0.01
3	-	-	DFEDrain	-	-	1	0	0.00	0.03
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_5
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	outSchema=[?__gen_path6, ?n, ?node3, ?iterationCounterVar0, ?anon_rel28]	-	0	4	0.00	0.10
1	2	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_6
coordinationTime(ms)=0.833	-	4	3	0.75	10.81
2	3	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_7
coordinationTime(ms)=0.140	-	3	3	1.00	2.61
3	4	5	DFETee	-	-	3	6	2.00	0.28
4	6	-	DFEApply	functor=ne(?rel2, ?104)	-	3	3	1.00	2.24
5	7	-	DFEBindRelation	inputVars=[?iterationCounterVar0, ?node4, ?__gen_path6, ?rel2, ?n, ?rel2_type5, ?node3, ?anon_rel28]
outputVars=[?iterationCounterVar0, ?node4, ?__gen_path6, ?rel2, ?n, ?rel2_type5, ?node3, ?anon_rel28]	-	3	3	1.00	1.38
6	7	-	DFEApply	functor=trueToRowId(?106)	-	3	3	1.00	1.86
7	8	-	DFESelectRows	-	-	6	3	0.50	2.04
8	9	10	DFETee	-	-	3	6	2.00	0.41
9	11	-	DFEApply	functor=in(?rel2, ?anon_rel28)	-	3	3	1.00	3.05
10	13	-	DFEBindRelation	inputVars=[?iterationCounterVar0, ?node4, ?__gen_path6, ?rel2, ?n, ?rel2_type5, ?node3, ?anon_rel28]
outputVars=[?iterationCounterVar0, ?node4, ?__gen_path6, ?rel2, ?n, ?rel2_type5, ?node3, ?anon_rel28]	-	3	3	1.00	8.18
11	12	-	DFEApply	functor=not(?110)	-	3	3	1.00	2.90
12	13	-	DFEApply	functor=trueToRowId(?109)	-	3	3	1.00	1.82
13	14	-	DFESelectRows	-	-	6	3	0.50	7.09
14	15	-	DFEApply	functor=increment(?iterationCounterVar0)	-	3	3	1.00	1.79
15	16	-	DFEApply	functor=toList1(?rel2)	-	3	3	1.00	3.48
16	17	-	DFEApply	functor=merge(?anon_rel28, ?113)	-	3	3	1.00	5.33
17	18	-	DFEApply	functor=toPath3(?node3, ?rel2, ?node4)	-	3	3	1.00	3.39
18	19	-	DFEApply	functor=merge(?__gen_path6, ?115)	-	3	3	1.00	2.77
19	20	-	DFEProject	columns=[?n, ?rel2, ?rel2_type5, ?node4, ?iterationCounterVar_tmp1, ?uniqueListVar_tmp2, ?pathVar_tmp3]	-	3	3	1.00	2.40
20	21	-	DFEBindRelation	inputVars=[?node4, ?rel2, ?iterationCounterVar_tmp1, ?uniqueListVar_tmp2, ?n, ?rel2_type5, ?pathVar_tmp3]
outputVars=[?node3, ?rel2, ?iterationCounterVar0, ?anon_rel28, ?n, ?rel2_type5, ?__gen_path6]	-	3	3	1.00	1.24
21	22	-	DFEProject	columns=[?__gen_path6, ?n, ?node3, ?iterationCounterVar0, ?anon_rel28]	-	3	3	1.00	0.80
22	23	24	DFETee	-	-	3	6	2.00	0.17
23	25	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_8
coordinationTime(ms)=0.257	-	3	3	1.00	3.10
24	38	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_9
coordinationTime(ms)=0.156	-	3	3	1.00	1.42
25	26	27	DFETee	-	-	3	6	2.00	0.35
26	28	-	DFEApply	functor=lt(?iterationCounterVar0, ?117)	-	3	3	1.00	2.27
27	29	-	DFEBindRelation	inputVars=[?__gen_path6, ?iterationCounterVar0, ?n, ?node3, ?anon_rel28]
outputVars=[?__gen_path6, ?iterationCounterVar0, ?n, ?node3, ?anon_rel28]	-	3	3	1.00	1.53
28	29	-	DFEApply	functor=trueToRowId(?119)	-	3	3	1.00	1.87
29	30	-	DFESelectRows	-	-	6	3	0.50	1.14
30	31	43	DFETee	-	-	3	6	2.00	0.28
31	32	-	DFEProject	columns=[?__gen_path6]	-	3	3	1.00	0.27
32	33	-	DFEMergeChunks	-	-	3	3	1.00	0.05
33	34	-	DFEReduce	functor=count(?1)	-	3	4	1.33	3.02
34	35	-	DFEBindRelation	inputVars=[?124]
outputVars=[?124]	-	4	4	1.00	0.32
35	36	-	DFEChunkLocalSubQuery	subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_10
coordinationTime(ms)=0.075	-	4	4	1.00	0.83
36	37	-	DFEApply	functor=gt(?124, ?126)	-	4	4	1.00	1.69
37	44	-	DFEProject	columns=[?125]	-	4	4	1.00	0.56
38	39	40	DFETee	-	-	3	6	2.00	0.26
39	41	-	DFEApply	functor=ge(?iterationCounterVar0, ?121)	-	3	3	1.00	2.31
40	42	-	DFEBindRelation	inputVars=[?__gen_path6, ?iterationCounterVar0, ?n, ?node3, ?anon_rel28]
outputVars=[?__gen_path6, ?iterationCounterVar0, ?n, ?node3, ?anon_rel28]	-	3	3	1.00	0.80
41	42	-	DFEApply	functor=trueToRowId(?123)	-	3	3	1.00	1.91
42	45	-	DFESelectRows	-	-	6	3	0.50	1.06
43	-	-	DFEDrain	-	-	3	0	0.00	0.10
44	-	-	DFEDrain	-	-	4	0	0.00	0.04
45	-	-	DFEDrain	-	-	3	0	0.00	0.13
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_6
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	outSchema=[?__gen_path6, ?n, ?node3, ?iterationCounterVar0, ?anon_rel28]	-	0	4	0.00	0.06
1	2	3	DFETee	-	-	4	8	2.00	0.18
2	4	-	DFEDistinctColumn	column=?node3
ordered=false	-	4	4	1.00	1.66
3	12	-	DFEHashIndexBuild	vars=[?node3]	-	4	4	1.00	0.27
4	5	-	DFESplitChunks	-	-	4	4	1.00	0.20
5	6 to 10	-	DFERouteChunks	-	-	4	4	1.00	0.04
6	11	-	DFEPipelineJoin	pattern=Edge((?node3)-[?rel2:?rel2_type5]->(?node4))
constraints=[]
patternEstimate=832	-	1	1	1.00	0.90
7	11	-	DFEPipelineJoin	pattern=Edge((?node3)-[?rel2:?rel2_type5]->(?node4))
constraints=[]
patternEstimate=832	-	1	1	1.00	0.48
8	11	-	DFEPipelineJoin	pattern=Edge((?node3)-[?rel2:?rel2_type5]->(?node4))
constraints=[]
patternEstimate=832	-	1	1	1.00	0.93
9	11	-	DFEPipelineJoin	pattern=Edge((?node3)-[?rel2:?rel2_type5]->(?node4))
constraints=[]
patternEstimate=832	-	1	0	0.00	0.67
10	11	-	DFEPipelineJoin	pattern=Edge((?node3)-[?rel2:?rel2_type5]->(?node4))
constraints=[]
patternEstimate=832	-	0	0	0.00	0
11	12	-	DFEUnion	-	-	3	3	1.00	1.26
12	13	14	DFESync	-	-	7	7	1.00	0.41
13	15	-	DFEForwardValue	-	-	4	4	1.00	0.22
14	15	-	DFEForwardValue	-	-	3	3	1.00	0.22
15	16	-	DFEHashIndexJoin	-	-	7	3	0.43	2.29
16	-	-	DFEDrain	-	-	3	0	0.00	0.18
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_7
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	solutions=[?104 -> [DictionaryValue [type=URI, termId=526]]]
outSchema=[?104]	-	0	4	0.00	0.09
1	3	-	DFERelationalJoin	joinVars=[]	-	7	3	0.43	2.08
2	1	-	DFESolutionInjection	outSchema=[?__gen_path6, ?n, ?rel2, ?node3, ?rel2_type5, ?node4, ?iterationCounterVar0, ?anon_rel28]	-	0	3	0.00	0.09
3	-	-	DFEDrain	-	-	3	0	0.00	0.20
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_8
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	solutions=[?117 -> [9223372036854775807^^<LONG>]]
outSchema=[?117]	-	0	4	0.00	0.06
1	3	-	DFERelationalJoin	joinVars=[]	-	7	3	0.43	2.57
2	1	-	DFESolutionInjection	outSchema=[?__gen_path6, ?n, ?node3, ?iterationCounterVar0, ?anon_rel28]	-	0	3	0.00	0.06
3	-	-	DFEDrain	-	-	3	0	0.00	0.16
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_9
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	solutions=[?121 -> [1^^<LONG>]]
outSchema=[?121]	-	0	4	0.00	0.05
1	3	-	DFERelationalJoin	joinVars=[]	-	7	3	0.43	1.01
2	1	-	DFESolutionInjection	outSchema=[?__gen_path6, ?n, ?node3, ?iterationCounterVar0, ?anon_rel28]	-	0	3	0.00	0.06
3	-	-	DFEDrain	-	-	3	0	0.00	0.15
subQuery=http://aws.amazon.com/neptune/vocab/v01/dfe/past/graph#7634c836-4f0d-4f64-bdae-f3ba1a921e6c/graph_10
ID	Out #1	Out #2	Name	Arguments	Mode	Units In	Units Out	Ratio	Time (ms)
0	1	-	DFESolutionInjection	solutions=[?126 -> [0^^<LONG>]]
outSchema=[?126]	-	0	4	0.00	0.06
1	3	-	DFERelationalJoin	joinVars=[]	-	8	4	0.50	0.61
2	1	-	DFESolutionInjection	outSchema=[?124]	-	0	4	0.00	0.02
3	-	-	DFEDrain	-	-	4	0	0.00	0.07

Thanks, Lakshmi Sharma

  • Are you seeing these latency values using the Query Metadata tab in the notebooks? Or using the Neptune openCypher query explain API?

    If you used the %%oc explain parameter on the %%oc magic, it will output actual query times and the query plan. The response time listed in the Query Metadata tab in the notebooks includes other factors related network latency, the notebook creating a connection, etc.

  • For a query: %%oc explain MATCH P=(n:Group :group_67)-[*]->() RETURN P here is the query explain, which again is showing around 219 ms. ID Out #1 Out #2 Name Arguments Mode Units In Units Out Ratio Time (ms) 0 1 - SolutionInjection solutions=[{}] - 0 1 0.00 0 1 2 - DFESubquery subQuery=subQuery1 - 0 3 0.00 219 2 - - TermResolution vars=[?P] id2value_opencypher 3 3 1.00 4.00

  • Can you please provide the entire explain plan? There should be more than this, including the subquery plan. Seeing the entire plan would allow me to see where the bottleneck is in the query.

  • Also, what is the purpose of the :Group label? Is that label attached to every node in the graph? I'd like to better understand the data model and how you are using labels. That could certainly have an effect on query performance.

  • I have updated the entire query plan above and to answer you second question,

    what is the purpose of the :Group label? Is that label attached to every node in the graph?
    yes its a label, each node of type group will have that label grouping them together and there are two other labels which uniquely identifies a group also added.

asked a year ago330 views
2 Answers
0

So labels are generally lower in cardinality. They are there to group like entities together. If you want to uniquely identify a node, it is better to use the node ~id field. The ~id field in Neptune is required for each node/edge and must be globally unique. A lookup for a node or edge by its ~id is the fastest operation inside of Neptune (generally 1-2ms, at most). So instead of using:

MATCH P = (n:Group :group_41)-[:inherits*]->() 
RETURN P

If you move that label to the ID for the node, you could do:

MATCH P= (n:Group)-[:inherits*]->()
WHERE id(n) = 'group_41'
RETURN P

That should show you an improvement in query time.

To create the node using openCypher in Neptune with a custom ID [1]:

CREATE (n:Group {`~id`: 'group_41'})
RETURN n

[1] https://docs.aws.amazon.com/neptune/latest/userguide/feature-opencypher-compliance.html#opencypher-compliance-differences

profile pictureAWS
answered a year ago
  • I have changed the label to ~id and used WHERE id(n) and still not seeing significant difference, it reduced may be by 30-40ms still over 150ms.

  • What is your target latency for this query? Also, what version of the Neptune engine are you using? Also curious if you can run the Gremlin version of this same query and see how it performs (I just want to see this as a baseline of potential performance, if this happens to perform better):

    %%gremlin
    
    g.V('group_41').repeat(out('inherits').simplePath()).until(__.not(out())).path().by(valueMap(true))
    

    Use %%gremlin profile to get the performance info.

  • neptune Engine version 1.2.0.2 Dev/Test cluster t3.medium I am expecting less than 50ms response time for a 4 node path query.

    I ran gremlin query with profile, here is the profile output: Query mode | profile Query execution time (ms) | 6.938 Request execution time (ms) | 31.47 Status code | 200 Status OK? | True

    of predicates | 12

    of results | 1

    Response size (bytes) | 6410 Serialization execution time (ms) | 1.084 Serializer type | application/vnd.gremlin-v3.0+json Results size (bytes) | 3532 [Query] # of statement index ops | 7 [Query] # of unique statement index ops | 7 [Query] Duplication ratio | 1.0 [Query] # of terms materialized | 0 [Serialization] # of statement index ops | 0 [Serialization] # of unique statement index ops | N/A [Serialization] Duplication ratio | N/A [Serialization] # of terms materialized | 0


  •             Neptune Gremlin Profile
    

    Query String

    g.V('group_41').repeat(out('inherits').simplePath()).until(__.not(out())).path().by(valueMap(true))

    Original Traversal

    [GraphStep(vertex,[group_41]), RepeatStep([VertexStep(OUT,[inherits],vertex), PathFilterStep(simple,null,null), RepeatEndStep],until([NotStep([VertexStep(OUT,vertex)])]),emit(false)), PathStep([[PropertyMapStep(value)]])]

    Optimized Traversal

    Neptune steps: [ NeptuneGraphQueryStep(Vertex) { JoinGroupNode { PatternNode[(?1=<group_41>, <label>, ?2, <>) . project distinct ?1 .], {estimatedCardinality=INFINITY, expectedTotalInput=-1, expectedTotalOutput=1, indexTime=0, joinTime=1, numSearches=1, actualTotalOutput=1} RepeatNode { Repeat { JoinGroupNode { PatternNode[(?3, ?6=<inherits>, ?4, ?7) . project ?3,?4 . IsEdgeIdFilter(?7) .], {hashJoin=true, estimatedCardinality=102, expectedTotalInput=-1, indexTime=2, joinTime=0} SimplePathFilter(?1, ?4)) . } } Until { PatternNode[(?4, ?10, ?8, ?11) . project not . IsEdgeIdFilter(?11) .], {hashJoin=true, estimatedCardinality=5, expectedTotalInput=-1, indexTime=0, joinTime=0} } }, i

  • Seems like gremlin is significantly faster, how do we get same execution time with open cypher? its too slow.

0

We're currently investigating. After an initial review of the query plan that you provided, we noticed some gaps in the timings of the plan. Would you, so kindly, also provide us with the detailed explain plan, please? %%oc explain --explain-type details

It was also noted that you're using a t3.medium sized instance. The t3/t4g instances are very resource constrained and we disable things like statistics on these instances. They are generally only good for initial development and not for performance assessments. If you don't mind, it would be beneficial to see how this query performs on at least an r5.large instance.

profile pictureAWS
answered a year ago
  • I have a ticket opened with AWS for the same, I have uploaded the entire document over there, requested to relay the same to you in the ticket.

  • Thanks. I found the attachment on the internal case. We'll take a look and get back to you as soon as possible.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions