Gremlin profile execution time not matching with Gremlin Request/Second

Hi,

We are running a POC with AWS Neptune. Based on the Gremlin profile output, our query takes around 10-15 ms to execute.
We are using a db.r5.8xlarge instance, which has 32 vCPUs, i.e. 64 Neptune worker threads.
So when we run this query concurrently from multiple Go clients, we expect to get around 4,200 queries/sec (1000 ms / 15 ms per query * 64 threads), but the db instance's Gremlin requests/second metric shows only about 950 queries/sec.
This is far below what the profile output of the query would suggest.

According to the Gremlin status curl output during the concurrent run, acceptedQueryCount is always around 150 and runningQueryCount is always 64. From this I assume there are enough queries waiting in the FIFO queue for the threads to process.

Despite that, the Gremlin requests/sec metric stays consistently at about 950/sec.
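To make the gap concrete, here is a rough back-of-the-envelope sketch of the numbers above. It assumes all 64 worker threads are busy the whole time (which runningQueryCount = 64 suggests but does not prove) and applies Little's Law to the observed metric. The function names are illustrative only.

```python
def expected_throughput(threads: int, exec_ms: float) -> float:
    """Upper-bound queries/sec if each thread spends only exec_ms per query."""
    return threads * 1000.0 / exec_ms


def effective_latency_ms(concurrency: int, throughput_qps: float) -> float:
    """Little's Law: mean time a query occupies a thread = concurrency / throughput."""
    return concurrency / throughput_qps * 1000.0


threads = 64           # runningQueryCount reported by the status endpoint
profile_ms = 15.0      # upper end of the profile execution time
observed_qps = 950.0   # Gremlin requests/sec metric

print(round(expected_throughput(threads, profile_ms)))        # 4267
print(round(effective_latency_ms(threads, observed_qps), 1))  # 67.4
```

Under these assumptions each query occupies a worker thread for roughly 67 ms rather than 15 ms, so about 50 ms per query is spent on something the profile output does not report.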

Clarification needed:

  1. What is included in the Gremlin profile execution time?
  2. Which other factors affect the Gremlin requests/sec metric but are not included in the profile execution time?

Data Model:

R1 ----R1R2Edge---> R2 ----R2R3Edge---> R3 (properties: x1, x2, x3)

Query:
g.withSideEffect('Neptune#typePromotion',false).V("R1:1:99").as('r1').out('R1R2Edge').out('R2R3Edge').not(and(has('x1',within('-1','tcp')),and(has('x2',lte(5432)),has('x3',gte(5432))))).select('r1').id()

Thanks

basky
asked 3 years ago, 437 views
2 Answers

Hi, basky,

I reached out to the Gremlin expert on our team, and there are a few things that may not be reflected in the execution time reported by the Gremlin profile.

  1. Results serialization time
    After query execution is complete, Neptune has to serialize the results into a transferable format, such as GraphSON or GraphBinary, to send them back to the client. This serialization time is not counted in the profile by default.
    However, as described in https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-profile-api.html , the client can add a parameter that specifies a serializer, and Neptune then also reports the time taken for serialization. The profiled query may take longer in that case because Neptune materializes all dictionary IDs in the solutions.

  2. Lock conflicts for read-write queries
    If you have read-write queries in the mix, some queries may need to wait for other queries to release locks when they access rows in the same range. This lock-wait time can increase the query execution time as well.
    If there are no read-write queries, this should not affect throughput, since all read-only queries use shared locks and do not wait on each other.

  3. Network round-trip time between the client and the Neptune server
    This of course doesn't show up in the query profile, since Neptune cannot know how long a request took to reach it. But in your case, if you can keep enough queries submitted to the Neptune server, this should hopefully not affect throughput much.
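For item 1, a minimal sketch of a profile request that also asks for serialization time might look like the following. It uses the profile.serializer parameter from the linked profile API page. The endpoint name is a placeholder, the GraphSON v3 MIME type is an assumed serializer choice, port 8182 is the default Neptune port, and IAM authentication (if enabled) is not handled here.

```python
import json
import urllib.request


def build_profile_request(endpoint: str, query: str, serializer: str) -> urllib.request.Request:
    """Build a POST to the Gremlin profile endpoint that includes a serializer."""
    body = json.dumps({
        "gremlin": query,
        "profile.serializer": serializer,  # asks the profile to report serialization time
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{endpoint}:8182/gremlin/profile",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_profile_request(
    "your-neptune-endpoint",                                # placeholder, not a real host
    "g.V('R1:1:99').out('R1R2Edge').out('R2R3Edge').id()",  # simplified form of the query
    "application/vnd.gremlin-v3.0+json",                    # assumed: GraphSON v3 MIME type
)

# To actually send it from a host that can reach the cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Comparing the profile output with and without the serializer parameter should give a rough idea of how much of the missing time is serialization.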

simonzh
answered 3 years ago

Thanks simonzh.

basky
answered 3 years ago
