Yes, it seems like you are running into consistency issues with S3 when invoking Textract immediately after uploading an object.
S3 has different consistency models for reads:
- Strong consistency: returns the latest version of the object every time.
- Eventual consistency: may temporarily return an older version of the object until the latest writes are propagated.
By default, S3 offers read-after-write consistency for PUTs of new objects. So a new object PUT should be immediately readable.
However, there can still be lag in propagating writes across S3 servers. So Textract may retrieve an older version or invalid state of the object if invoked very quickly after the PUT.
A few ways to deal with this:
- Add a short delay (1-2 seconds) between the upload and the Textract call.
- Use S3 object versioning and pass the latest version to Textract.
- Use S3 replication to replicate to another region first, then read from the replica.
- Retry the Textract call with exponential backoff if you get "invalid" errors.
So in summary, add a bit of waiting/retry to account for S3 consistency lag when processing right after uploads.
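The retry suggestion above could look roughly like the following. This is only a minimal sketch under a few assumptions: that the "invalid" error surfaces as a botocore `ClientError` with the code `InvalidS3ObjectException`, and that `detect_document_text` is the Textract operation being called. The helper names (`retry_with_backoff`, `is_invalid_s3_object`, `detect_text`) and the delay values are made up for illustration.

```python
import random
import time


def retry_with_backoff(call, is_retryable, max_attempts=5, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Run call() until it succeeds, hits a non-retryable error, or runs out of attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on errors we do not want to retry.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Exponential backoff: the cap doubles each attempt (base_delay, 2x, 4x, ...)
            # up to max_delay; the actual wait is a random value below that cap (jitter).
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))


def is_invalid_s3_object(exc):
    # Assumption: the failure is a botocore ClientError, which exposes the
    # service error code under exc.response["Error"]["Code"].
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "InvalidS3ObjectException"


def detect_text(bucket, key, version_id=None):
    """Hypothetical wrapper: call Textract on an S3 object, retrying 'invalid object' errors."""
    import boto3  # imported lazily so the helpers above can be used without boto3

    textract = boto3.client("textract")
    s3_object = {"Bucket": bucket, "Name": key}
    if version_id is not None:
        # Only meaningful if versioning is enabled on the bucket.
        s3_object["Version"] = version_id
    return retry_with_backoff(
        lambda: textract.detect_document_text(Document={"S3Object": s3_object}),
        is_invalid_s3_object,
    )
```

With versioning enabled on the bucket, the `VersionId` returned by the upload call could be passed as `version_id`, which also covers the versioning suggestion from the list.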
Hello Saad,
Thank you for the quick response.
Although I clearly understand the practical part of your answer, and in any case we have a way to retry failed Textract requests, I don't get the conceptual part.
How do these two consistency models coexist? Does it mean that Textract uses an eventually consistent read, while the usual S3 GetObject is strongly consistent? (I don't see a parameter for GetObject that could affect read consistency.)
Also, as I read on the AWS site: "After a successful write of a new object, or an overwrite or delete of an existing object, any subsequent read request immediately receives the latest version of the object."
Does it mean that the consistency guarantees apply only when an existing object is updated (readers will never see the old version), but not when an object is created (readers may still have to wait until the newly created object is propagated)?
But this contradicts the statement from the site quoted above. I'm a bit confused.
/aav