Dynamo DB Batch Query execution performance

0

Hi, I want some advise on which is performant execution in the below

Running below query using Batch PartiQL

SELECT * FROM person WHERE "pk" = 1 AND (“sk” =“name” OR “sk”=“address”);

SELECT * FROM person WHERE "pk" = 2 AND (“sk” =“name” OR “sk”=“address”);

Or Running multiple of this sort in batchGetItemRequest

SELECT * FROM person WHERE "pk" = 1 AND “sk” =“name”

SELECT * FROM person WHERE "pk" = 1 AND “sk” =“address”

SELECT * FROM person WHERE "pk" = 2 AND “sk” =“name”

SELECT * FROM person WHERE "pk" = 2 AND “sk” =“address”

Bhuvi
asked 3 months ago194 views
2 Answers
0

Hello, based on the queries you have provided, here is my advice on which approach would be more performant in DynamoDB:

Using BatchPartiQL is likely going to be faster and more efficient for these queries compared to batchGetItem for a few reasons:

  • BatchPartiQL allows retrieving all the items in 2 API calls rather than 4 batchGetItem calls. This reduces the number of network roundtrips.
  • The OR condition on the sort key is easier to express in PartiQL's SQL-like syntax versus specifying individual get requests.
  • BatchPartiQL has no inherent limits like the 100 item cap per batchGetItem call.
  • PartiQL queries by partition key and sort key condition rather than requiring knowing the exact primary keys.

The advantages of batchGetItem in this case would be:

  • Per-request metrics like consumed capacity units for each get operation. BatchPartiQL does not provide these.
  • More mature API with wider client library support.

So in summary, for these particular queries focused on retrieving multiple items by query conditions, BatchPartiQL is likely going to provide much better performance by reducing the number of API calls and network roundtrips. The OR condition and lack of per-item caps make it a good fit. However, if you needed the fine-grained per-request metrics or have limitations on client library options, batchGetItem may be a better choice despite requiring more API calls.

For more information about BatchGetItem API click here or about batch operations with PartiQL click here.

AWS
answered 3 months ago
  • It would be a single network call to DB for both batchPartQL as well as in BatchGetItemRequest.

    four batchGetItemRequest will be constructed and triggered once

    BatchGetItemResult result = amazonDynamoDB.batchGetItem(batchGetItemRequest);

    Batch PartiQL execution

    dynamodbClient.send(new BatchExecuteStatementCommand({ Statements: [ { Statement: statement } ] }))

    As per my understanding, its only one db call for both partiQL batch and batchGetItemRequest

0

You should test this, but in terms of read unit consumption and total number of requests to the backing store, they are equivalent. I would expect to see very similar results. You could break this out into four individual GetItem requests (in parallel), or use a batch - the batch request may save you some CPU. The PartiQL BatchExecuteStatement API is much nicer to work with as a developer than BatchGetItem - easier to reason about any unprocessed parts of the request and retry them. Here's a blog with some additional background on the PartiQL APIs in DynamoDB: https://aws.amazon.com/blogs/database/build-faster-with-amazon-dynamodb-and-partiql-sql-compatible-operations/

answered 3 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions