Ensure batch_writer flush before read


I write to a DynamoDB table from Python, using batch_writer in a with block. After the block I run a query on the table, but it seems batch_writer has not yet updated the table: my query does not return the items written by batch_writer.

So how can I ensure that batch_writer has written the data to the table before my query runs?

asked a year ago · 236 views
2 Answers

Using boto3's batch_writer in a with block is the recommended method: when execution exits the with block, batch_writer flushes any items still in its buffer, so by the time your query runs the writes have been submitted.

If the data just written still does not appear, you are most likely making eventually consistent reads (the default for all read operations). DynamoDB replicates every item to three copies across Availability Zones, and a write is deemed successful as soon as two of the three copies have persisted it; the third copy catches up shortly after. When you read immediately after writing in the default eventually consistent mode, your read request may be served from any of the three copies, including the one that has not yet persisted the write.

If your use case requires reading immediately after writing an item, consider using strongly consistent reads just for those access patterns. DynamoDB will then always serve the read from the copy that has persisted all prior writes and is fully up to date, but it charges double the read units compared to eventually consistent reads.

Query API - https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html#DDB-Query-request-ConsistentRead

Read Consistency in DynamoDB - https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html

This may be the case here. Also note that if you are reading from a Global Secondary Index after writing to the base table, GSIs support only eventually consistent reads. In that case you can either wait a few tens of milliseconds before reading the data just written, or maintain two tables instead of a table plus an index, write to both with a transactional write, and then make a strongly consistent read against either table to get the latest items.

Hope this helps!

AWS
answered a year ago

When reading from a table right after writing to it, it may take a moment (typically well under a second) before you retrieve all of the expected items, because the data has to propagate to all storage locations and reads are eventually consistent by default [1].

I would suggest using strongly consistent reads: DynamoDB then returns the most up-to-date data, reflecting all prior successful write operations [2].

I would also suggest using unique primary-key values for your items: batch_writer buffers requests, and when de-duplication is enabled it drops a buffered request if a newly added item has the same (composite) primary-key values, keeping only the most recent one. The resulting stream of individual put/delete operations on the same item is only eventually consistent [3].

  1. Read Consistency: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html
  2. API Query: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html#API_Query_RequestSyntax
  3. Amazon DynamoDB Boto3 Document: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/dynamodb.html#creating-a-new-table
AWS
Yesto
answered a year ago
