- Newest
- Most votes
- Most comments
It is not possible to perform an incremental 'bookmarked' load from a DynamoDB table without data modeling to design for this (i.e. a sharded-GSI that allows time based queries across the entire data set), which would then require a custom reader (Glue doesn't support GSI queries).
Using streams --> lambda --> firehose is currently the most 'managed' and cost effective way to deliver incremental changes from a DynamoDB table to S3.
Reading DynamoDB streams only has the computational cost of Lambda associated with it, and the Lambdas can read hundreds of items from a single invocation. Having Firehose buffer/package/store these changes as compressed/partitioned/queryable data on S3 is simple and cost effective.
If you are concerned about cost it could be worth opening a specreq to have a specialist take a look at the analysis - these configurations are both common and generally cost effective (the cost is not relative to the size of the table, but rather the velocity/size of the writes - which will often be more efficient than a custom reader/loader).
Export to S3 feature has been enhanced to support incremental data load https://aws.amazon.com/blogs/database/introducing-incremental-export-from-amazon-dynamodb-to-amazon-s3/
.
Relevant content
- asked 4 months ago
- Accepted Answerasked 2 years ago
- asked 5 years ago
- Accepted Answerasked 5 years ago
- AWS OFFICIALUpdated 9 months ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 9 months ago