Storage/DB solution for fast loading 3GB into memory?


A customer uses MSSQL for storing customer data. Once a day they load a single table into a memory-based cache solution they developed, and they want to make that load faster.

The data is indexed by "tenantID", and each load selects a single tenant's rows from the table. One tenant's data is at least 3 GB (~10M rows). This process takes ~10 minutes, and they're looking for a solution that brings it under 5 minutes.
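The target can be restated as a sustained read rate. A quick back-of-envelope check of the figures above:

```python
# Back-of-envelope throughput, using the figures from the question:
size_mb = 3 * 1024          # ~3 GB of tenant data, in MB
current_s = 10 * 60         # ~10 minutes today
target_s = 5 * 60           # goal: under 5 minutes
print(f"current: {size_mb / current_s:.1f} MB/s")   # current: 5.1 MB/s
print(f"needed: >{size_mb / target_s:.1f} MB/s")    # needed: >10.2 MB/s
```

So any solution needs to at least double the effective read throughput.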

They're open to suggestions and re-architecture; the only constraint is that the loading process runs once a day and is as fast as required. For example, they could change the way they write data to this table, or change their technology stack.

Nir_Sh
asked 4 years ago · 236 views
1 Answer
Accepted Answer

If natively implemented in DynamoDB with 'tenantID' as the (low-cardinality) partition key, a segmented scan will still leave one segment responsible for the entire 3 GB partition. Even so, it would most likely run faster than their current solution, since DynamoDB can sustain roughly 24 MB/sec of egress with a single-threaded, eventually consistent scan.
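Taking the 24 MB/sec figure above at face value, even the single-segment worst case lands inside the 5-minute target:

```python
# Back-of-envelope check, assuming the ~24 MB/s single-threaded
# eventually consistent scan rate quoted in the answer:
GB = 1024 ** 3
MB = 1024 ** 2
table_size = 3 * GB
single_thread_rate = 24 * MB
seconds = table_size / single_thread_rate
print(f"{seconds:.0f} s")  # 128 s, i.e. ~2 minutes for the 3 GB partition
```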

If the customer is able to use a higher-cardinality primary key (say, a GUID), then a Global Secondary Index could provide the grouping and indexing by 'tenantID'. This would allow a nearly unbounded scan rate on the base table through the use of a segmented (parallel) scan.
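The segmented-scan pattern itself is simple to orchestrate: one worker per segment, results merged at the end. A minimal sketch, with an in-memory stub standing in for DynamoDB (with boto3, `scan_segment` would wrap `table.scan(Segment=..., TotalSegments=...)` and follow `LastEvaluatedKey` pagination; the GUID keys and row shape below are assumptions for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_scan(scan_segment, total_segments):
    """Run one worker per segment and merge the results.

    scan_segment(segment, total_segments) must return the list of items
    belonging to that segment of the keyspace.
    """
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [pool.submit(scan_segment, s, total_segments)
                   for s in range(total_segments)]
        items = []
        for f in futures:
            items.extend(f.result())
    return items

# Stand-in data source so the sketch runs locally: each row gets a
# hypothetical GUID key, and the keyspace is split round-robin across
# segments, mirroring how a segmented scan partitions the table.
rows = [{"id": f"guid-{i}", "tenantID": "t1"} for i in range(1000)]

def fake_scan_segment(segment, total_segments):
    return [r for i, r in enumerate(rows) if i % total_segments == segment]

result = parallel_scan(fake_scan_segment, total_segments=8)
print(len(result))  # 1000 -- every row returned exactly once
```

With real throughput scaling roughly linearly in `total_segments`, this is what makes the scan rate on the base table "nearly unbounded".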

AWS
answered 4 years ago
