Unexpected slow query

0

Hi,

I recently have started using slow query logs for performance evaluation of my queries. Due to recent launch of slow query logs in Neptune, there are not lot of resources available to understand it. Few things that would be very helpful from the community are:

  1. Explaining the slow query logs parameters and what could impact performance.
  2. For below two queries(slow query logs), can you explain how the first query is taking so much time than the other two queries, considering larger number of operations are being performed in other two:

Instance: Neptune serverless (Min 2.5NCU, Max 8NCU)

Query1:

{
    "requestResponseMetadata": {
        "requestId": "940d78d0-2d38-4d44-a928-fe22c0deaba5",
        "requestType": "websocket"
    },
    "queryStats": {
        "query": "g.V(\"97233#US\").outE(\"LANE\").has(\"shipMode\",\"LINE_HAUL\").has(\"equipmentType\",P.within([\"TRAILER_53FT\"])).inV().hasId(\"97203#US\").simplePath().limit(1L).path().by(__.elementMap())",
        "queryFingerprint": "g.V(string0).outE(string1).has(string2,string3).has(string4,P.within([string5])).inV().hasId(string6).simplePath().limit(long0).path().by(__.elementMap())",
        "queryLanguage": "Gremlin"
    },
    "queryTimeStats": {
        "startTimeEpoch": 1683151985609,
        "startTime": "2023-05-03T22:13:05.609Z",
        "overallRunTimeMs": 1625,
        "waitingTimeMs": 44,
        "executionTimeMs": 955
    },
    "statementCounters": {
        "read": 1
    },
    "transactionCounters": {
        "committed": 1
    },
    "concurrentExecutionStats": {
        "acceptedQueryCountAtStart": 1,
        "runningQueryCountAtStart": 1,
        "acceptedQueryCountAtEnd": 1,
        "runningQueryCountAtEnd": 1
    },
    "queryBatchStats": {
        "queryProcessingBatchSize": 1,
        "querySerialisationBatchSize": 1000
    },
    "storageCounters": {
        "statementsScannedInAllIndexes": 1,
        "statementsScannedSPOGIndex": 1,
        "accessPathSearches": 15,
        "fullyBoundedAccessPathSearches": 15,
        "dictionaryReadsFromValueToIdTable": 15,
        "rangeCountsInAllIndexes": 3
    }
}

Profiling:

*******************************************************
                Neptune Gremlin Profile
*******************************************************

Query String
==================
g.V("97233#US")
    .outE("LANE")
    .has("shipMode","LINE_HAUL")
    .has("equipmentType", within(["TRAILER_53FT"]))
    .inV()
    .hasId("97203#US")
    .simplePath().limit(1).path().by(elementMap())


Original Traversal
==================
[GraphStep(vertex,[97233#US]), VertexStep(OUT,[LANE],edge), HasStep([shipMode.eq(LINE_HAUL), equipmentType.within([TRAILER_53FT])]), EdgeVertexStep(IN), HasStep([~id.eq(97203#US)]), PathFilterStep(simple,null,null), RangeGlobalStep(0,1), PathStep([[ElementMapStep]])]

Optimized Traversal
===================
Neptune steps:
[
    NeptuneGraphQueryStep(Vertex) {
        JoinGroupNode {
            PatternNode[(?1=<97233#US>, ?5=<LANE>, ?3=<97203#US>, ?6) . project ?1,?6,?3 . IsEdgeIdFilter(?6) . SimplePathFilter(?1, ?6, ?3)) .], {estimatedCardinality=0, indexTime=0, joinTime=2, numSearches=1, actualTotalOutput=0}
            PatternNode[(?6, <shipMode>, "LINE_HAUL", <~>) . project ask .], {estimatedCardinality=73261, actualTotalOutput=0}
            PatternNode[(?6, <equipmentType>, ?7="TRAILER_53FT", <~>) . project ask .], {estimatedCardinality=82530}
        }, finishers=[limit(1)], {path=[Vertex(?1):GraphStep, Edge(?6,?1,?3):VertexStep, Vertex(?3):EdgeVertexStep], joinStats=true, optimizationTime=7, maxVarId=9, executionTime=7}
    },
    NeptuneTraverserConverterStep
]
+ not converted into Neptune steps: PathStep([[ElementMapStep]]),

WARNING: >> [PathStep([[ElementMapStep]])] << (or one of the children for each step) is not supported natively yet

Physical Pipeline
=================
NeptuneGraphQueryStep
    |-- StartOp
    |-- JoinGroupOp@55bd60b9
        |-- DynamicJoinOp@3ffaa2e9(PipelineJoinOp(PatternNode[(?1=<97233#US>, ?5=<LANE>, ?3=<97203#US>, ?6) . project ?1,?6,?3 . IsEdgeIdFilter(?6) . SimplePathFilter(?1, ?6, ?3)) .], {estimatedCardinality=0}))
        |-- SpoolerOp(100, @3ffaa2e9, null)
        |-- DynamicJoinOp@5222cf63(<null delegate>(PatternNode[(?6, <shipMode>, "LINE_HAUL", <~>) . project ask .], {estimatedCardinality=73261}))
        |-- SpoolerOp(100, @5222cf63, @3ffaa2e9)
        |-- DynamicJoinOp@3f26b255(<null delegate>(PatternNode[(?6, <equipmentType>, ?7="TRAILER_53FT", <~>) . project ask .], {estimatedCardinality=82530}))
        |-- LimitOp(1)

Runtime (ms)
============
Query Execution: 7.191
Serialization:   6.729

Traversal Metrics
=================
Step                                                               Count  Traversers       Time (ms)    % Dur
-------------------------------------------------------------------------------------------------------------
NeptuneGraphQueryStep(Vertex)                                                                  2.571    98.59
NeptuneTraverserConverterStep                                                                  0.021     0.81
PathStep([[ElementMapStep]])                                                                   0.015     0.60
  ElementMapStep                                                                               0.000
                                            >TOTAL                     -           -           2.608        -

Predicates
==========
# of predicates: 19

Results
=======
Count: 0
Output: []
Response serializer: application/vnd.gremlin-v3.0+json
Response size (bytes): 216


Index Operations
================
Query execution:
    # of statement index ops: 1
    # of unique statement index ops: 1
    Duplication ratio: 1.0
    # of terms materialized: 0
Serialization:
    # of statement index ops: 0
    # of terms materialized: 0

Query2:

{
    "requestResponseMetadata": {
        "requestId": "391f0416-6192-4dcf-aa63-73e35e7b2733",
        "requestType": "websocket"
    },
    "queryStats": {
        "query": "g.V(\"97233#US\").in(\"HAS_POSTAL_CODE\").hasLabel(\"DOMICILE\").out(\"HAS_POSTAL_CODE\").hasLabel(\"POSTAL_CODE\").outE(\"LANE\").has(\"shipMode\",\"LINE_HAUL\").has(\"equipmentType\",P.within([\"TRAILER_53FT\"])).inV().in(\"HAS_POSTAL_CODE\").hasLabel(\"DOMICILE\").out(\"HAS_POSTAL_CODE\").hasId(\"97203#US\").simplePath().limit(1L).path().by(__.elementMap())",
        "queryFingerprint": "g.V(string0).in(string1).hasLabel(string2).out(string1).hasLabel(string3).outE(string4).has(string5,string6).has(string7,P.within([string8])).inV().in(string1).hasLabel(string2).out(string1).hasId(string9).simplePath().limit(long0).path().by(__.elementMap())",
        "queryLanguage": "Gremlin"
    },
    "memoryStats": {
        "allocatedPermits": 20
    },
    "queryTimeStats": {
        "startTimeEpoch": 1683152007221,
        "startTime": "2023-05-03T22:13:27.221Z",
        "overallRunTimeMs": 774,
        "waitingTimeMs": 1,
        "executionTimeMs": 771
    },
    "statementCounters": {
        "read": 13630
    },
    "transactionCounters": {
        "committed": 1
    },
    "concurrentExecutionStats": {
        "acceptedQueryCountAtStart": 1,
        "runningQueryCountAtStart": 1,
        "acceptedQueryCountAtEnd": 1,
        "runningQueryCountAtEnd": 1
    },
    "queryBatchStats": {
        "queryProcessingBatchSize": 1,
        "querySerialisationBatchSize": 1000
    },
    "storageCounters": {
        "statementsScannedInAllIndexes": 20420,
        "statementsScannedSPOGIndex": 13868,
        "statementsScannedPOGSIndex": 6552,
        "statementsReadInAllIndexes": 6942,
        "statementsReadSPOGIndex": 3619,
        "statementsReadPOGSIndex": 3323,
        "accessPathSearches": 6824,
        "fullyBoundedAccessPathSearches": 6824,
        "dictionaryReadsFromValueToIdTable": 34,
        "rangeCountsInAllIndexes": 10
    }
}

Profiling:

*******************************************************
                Neptune Gremlin Profile
*******************************************************

Query String
==================
g.V("97233#US")
        .in("HAS_POSTAL_CODE")
        .hasLabel("DOMICILE")
        .out("HAS_POSTAL_CODE")
        .hasLabel("POSTAL_CODE")
        .outE("LANE").has("shipMode","LINE_HAUL").has("equipmentType",within(["TRAILER_53FT"]))
        .inV().in("HAS_POSTAL_CODE").hasLabel("DOMICILE").out("HAS_POSTAL_CODE")
        .hasId("97203#US").simplePath().limit(1).path().by(__.elementMap())


Original Traversal
==================
[GraphStep(vertex,[97233#US]), VertexStep(IN,[HAS_POSTAL_CODE],vertex), HasStep([~label.eq(DOMICILE)]), VertexStep(OUT,[HAS_POSTAL_CODE],vertex), HasStep([~label.eq(POSTAL_CODE)]), VertexStep(OUT,[LANE],edge), HasStep([shipMode.eq(LINE_HAUL), equipmentType.within([TRAILER_53FT])]), EdgeVertexStep(IN), VertexStep(IN,[HAS_POSTAL_CODE],vertex), HasStep([~label.eq(DOMICILE)]), VertexStep(OUT,[HAS_POSTAL_CODE],vertex), HasStep([~id.eq(97203#US)]), PathFilterStep(simple,null,null), RangeGlobalStep(0,1), PathStep([[ElementMapStep]])]

Optimized Traversal
===================
Neptune steps:
[
    NeptuneGraphQueryStep(Vertex) {
        JoinGroupNode {
            PatternNode[(?3, ?5=<HAS_POSTAL_CODE>, ?1=<97233#US>, ?6) . project ?1,?3 . IsEdgeIdFilter(?6) .], {estimatedCardinality=1, expectedTotalOutput=1, indexTime=0, joinTime=9, numSearches=1, actualTotalOutput=1}
            PatternNode[(?3, <~label>, ?4=<DOMICILE>, <~>) . project ask .], {estimatedCardinality=199, expectedTotalInput=1, expectedTotalOutput=1, indexTime=0, joinTime=10, numSearches=1, actualTotalOutput=1}
            PatternNode[(?3, ?9=<HAS_POSTAL_CODE>, ?7, ?10) . project ?3,?7 . IsEdgeIdFilter(?10) .], {estimatedCardinality=14680, expectedTotalInput=1, joinRatio=100.0, expectedTotalOutput=100, indexTime=1, joinTime=6, numSearches=1, actualTotalOutput=335}
            PatternNode[(?7, <~label>, ?8=<POSTAL_CODE>, <~>) . project ask .], {estimatedCardinality=15189, expectedTotalInput=100, joinRatio=1.0, expectedTotalOutput=100, indexTime=34, joinTime=75, numSearches=335, actualTotalOutput=335}
            PatternNode[(?7, ?13=<LANE>, ?11, ?14) . project ?7,?14,?11 . IsEdgeIdFilter(?14) .], {estimatedCardinality=150981, expectedTotalInput=100, joinRatio=1.23, expectedTotalOutput=123, indexTime=19, joinTime=134, numSearches=335, actualTotalOutput=3227}
            PatternNode[(?14, <shipMode>, "LINE_HAUL", <~>) . project ask .], {estimatedCardinality=73261, expectedTotalInput=123, joinRatio=1.0, expectedTotalOutput=123, indexTime=96, joinTime=420, numSearches=3227, actualTotalOutput=3227}
            PatternNode[(?14, <equipmentType>, ?15="TRAILER_53FT", <~>) . project ask .], {estimatedCardinality=82530, expectedTotalInput=123, joinRatio=1.0, expectedTotalOutput=123, indexTime=118, joinTime=243, numSearches=3227, actualTotalOutput=3227}
            PatternNode[(?17, ?19=<HAS_POSTAL_CODE>, ?11, ?20) . project ?11,?17 . IsEdgeIdFilter(?20) .], {estimatedCardinality=14680, expectedTotalInput=123, joinRatio=0.5, expectedTotalOutput=61, indexTime=135, joinTime=633, numSearches=3227, actualTotalOutput=3123}
            PatternNode[(?17, <~label>, ?18=<DOMICILE>, <~>) . project ask .], {estimatedCardinality=199, expectedTotalInput=61, indexTime=40, hashJoin=true, joinRatio=1.0, expectedTotalOutput=61, joinTime=33, actualTotalOutput=3123}
            PatternNode[(?17, ?23=<HAS_POSTAL_CODE>, ?21=<97203#US>, ?24) . project ?17,?21 . IsEdgeIdFilter(?24) . SimplePathFilter(?1, ?3, ?7, ?14, ?11, ?17, ?21)) .], {estimatedCardinality=14680, expectedTotalInput=61, indexTime=118, joinTime=421, numSearches=3123}
        }, finishers=[limit(1)], {path=[Vertex(?1):GraphStep, Vertex(?3):VertexStep, Vertex(?7):VertexStep, Edge(?14,?7,?11):VertexStep, Vertex(?11):EdgeVertexStep, Vertex(?17):VertexStep, Vertex(?21):VertexStep], joinStats=true, optimizationTime=64, maxVarId=25, executionTime=2697}
    },
    NeptuneTraverserConverterStep
]
+ not converted into Neptune steps: PathStep([[ElementMapStep]]),

WARNING: >> [PathStep([[ElementMapStep]])] << (or one of the children for each step) is not supported natively yet

Physical Pipeline
=================
NeptuneGraphQueryStep
    |-- StartOp
    |-- JoinGroupOp@1d6c7eb8
        |-- DynamicJoinOp@67e0c557(PipelineJoinOp(PatternNode[(?3, ?5=<HAS_POSTAL_CODE>, ?1=<97233#US>, ?6) . project ?1,?3 . IsEdgeIdFilter(?6) .], {estimatedCardinality=1, expectedTotalOutput=1}))
        |-- SpoolerOp(100, @67e0c557, null)
        |-- DynamicJoinOp@51897da1(PipelineJoinOp(PatternNode[(?3, <~label>, ?4=<DOMICILE>, <~>) . project ask .], {estimatedCardinality=199, expectedTotalInput=1, expectedTotalOutput=1}))
        |-- SpoolerOp(100, @51897da1, @67e0c557)
        |-- DynamicJoinOp@73e8402(PipelineJoinOp(PatternNode[(?3, ?9=<HAS_POSTAL_CODE>, ?7, ?10) . project ?3,?7 . IsEdgeIdFilter(?10) .], {estimatedCardinality=14680, expectedTotalInput=1, joinRatio=100.0, expectedTotalOutput=100}))
        |-- SpoolerOp(100, @73e8402, @51897da1)
        |-- DynamicJoinOp@7959b3e3(PipelineJoinOp(PatternNode[(?7, <~label>, ?8=<POSTAL_CODE>, <~>) . project ask .], {estimatedCardinality=15189, expect

Query3:

{
    "requestResponseMetadata": {
        "requestId": "596ee4b7-cd94-4eb3-9fab-2b69a6302342",
        "requestType": "websocket",
        "responseStatusCode": 200
    },
    "queryStats": {
        "query": "g.V().hasLabel(\"DOMICILE\").order().by(\"code\").range(0L,20L).elementMap()",
        "queryFingerprint": "g.V().hasLabel(string0).order().by(string1).range(long0,long1).elementMap()",
        "queryLanguage": "Gremlin"
    },
    "memoryStats": {
        "allocatedPermits": 20
    },
    "queryTimeStats": {
        "startTimeEpoch": 1683155840958,
        "startTime": "2023-05-03T23:17:20.958Z",
        "overallRunTimeMs": 1101,
        "waitingTimeMs": 11,
        "executionTimeMs": 992,
        "serializationTimeMs": 5
    },
    "statementCounters": {
        "read": 520
    },
    "transactionCounters": {
        "committed": 1
    },
    "concurrentExecutionStats": {
        "acceptedQueryCountAtStart": 1,
        "runningQueryCountAtStart": 1,
        "acceptedQueryCountAtEnd": 1,
        "runningQueryCountAtEnd": 1
    },
    "queryBatchStats": {
        "queryProcessingBatchSize": 1000,
        "querySerialisationBatchSize": 1000
    },
    "storageCounters": {
        "statementsScannedInAllIndexes": 520,
        "statementsScannedSPOGIndex": 80,
        "statementsScannedPOGSIndex": 200,
        "statementsScannedGPSOIndex": 240,
        "statementsReadInAllIndexes": 478,
        "statementsReadSPOGIndex": 60,
        "statementsReadPOGSIndex": 199,
        "statementsReadGPSOIndex": 219,
        "accessPathSearches": 306,
        "fullyBoundedAccessPathSearches": 306,
        "dictionaryReadsFromValueToIdTable": 27,
        "dictionaryReadsFromIdToValueTable": 279,
        "rangeCountsInAllIndexes": 2
    }
}
Prakash
asked a year ago287 views
1 Answer
0

The main objective of the Slow Query Log is to find queries that are executing with a latency beyond certain query latency threshold. There are some statistics there, as you mentioned, around number of statements assessed that can provide some detail on query performance. However, to really dig into "why" a query is running slow, you'll need to run that same query through the Neptune Gremlin Profile API [1]. The output from the profile API [2] will give you further details as to what is causing the query to perform slow. This will also provide a breakdown of the different query steps/operators and where the bottleneck may be.

Anecdotally (and this may have nothing to do with performance of the query) simplePath() isn't required in Query 1, as it is only a one-hop query. simplePath() is used for detecting (and avoiding) cyclical paths.

Query 1:

g.V(\"97233#US\").
  outE(\"LANE\").
  has(\"shipMode\", \"LINE_HAUL\").
  has(\"equipmentType\", P.within([\"TRAILER_53FT\"])).
  inV().
  hasId(\"97203#US\").
  simplePath().
  limit(1L).
  path().by(__.elementMap())

Query 2:

g.V(\"97233#US\").
  in(\"HAS_POSTAL_CODE\").
  hasLabel(\"DOMICILE\").
  out(\"HAS_POSTAL_CODE\").
  hasLabel(\"POSTAL_CODE\").
  outE(\"LANE\").
  has(\"shipMode\", \"LINE_HAUL\").
  has(\"equipmentType\", P.within([\"TRAILER_53FT\"])).
  inV().
  in(\"HAS_POSTAL_CODE\").
  hasLabel(\"DOMICILE\").
  out(\"HAS_POSTAL_CODE\").
  hasId(\"97203#US\").
  simplePath().
  limit(1L).
  path().by(__.elementMap())

Query 3:

g.V().
  hasLabel(\"DOMICILE\").
  order().by(\"code\").
  range(0L, 20L).
  elementMap()

[1] https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-profile-api.html

[2] https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-traversal-tuning.html

profile pictureAWS
answered a year ago
  • Thanks Taylor. I did the profiling of the first query earlier and did not found anything off. Attached the same profiling results in first 2 queries. As you can see, when I run the first query its not taking that much of latency as it took in first time. Query1 is a straightforward query where the indexes are being used, not sure why it took this much of latency in first go.

  • Likely what you're seeing is a result of cold buffer pool cache. When a query is issued to Neptune, we first look to see if the associated statements are in memory (buffer pool cache). If not, we then have to fetch those from the persistent storage layer. Once they are in buffer pool cache, any subsequent queries can take advantage of the data already being in-memory.

  • Based on the Query#1 and Query#2 profiling, do you see any concern which can lead to this high latency from Neptune? I don't understand why such a simple query like Query#1 will take 1.6secs to evaluate even in first time.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions