Redshift's ra3.xlplus instance type supports up to 32 TB of managed storage per node, but storage is not the only factor to consider when sizing a cluster.
Redshift nodes also come with a fixed amount of provisioned vCPU and memory, which play an important role in cluster sizing.
The following are the available configurations for all node types and sizes as of today:
Note that your "pseudo-cluster" has only 4 vCPU, which might not be enough to handle all incoming changes from DMS given the number of tables, especially if those changes include updates. Redshift has a really hard time dealing with updates, because every time a record is updated it rewrites the entire 1 MB data block in which the record's updated column value resides. Tables that are frequently updated or deleted require frequent maintenance, a.k.a. VACUUM and ANALYZE, to reclaim space and regain read performance.
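As a rough illustration, here is a minimal sketch of how that maintenance could be scripted against the Redshift Data API from Python. The cluster identifier, database, user, and table list are placeholders, not values from your setup:

```python
# Hypothetical sketch: submit VACUUM and ANALYZE for heavily updated tables
# through the Redshift Data API. All identifiers below are placeholders.
import boto3

client = boto3.client("redshift-data")

HEAVILY_UPDATED_TABLES = ["sales.orders", "sales.order_items"]  # placeholder list

for table in HEAVILY_UPDATED_TABLES:
    for sql in (f"VACUUM DELETE ONLY {table};", f"ANALYZE {table};"):
        resp = client.execute_statement(
            ClusterIdentifier="my-redshift-cluster",  # placeholder
            Database="dev",                           # placeholder
            DbUser="admin",                           # placeholder
            Sql=sql,
        )
        print(f"Submitted: {sql} (statement id: {resp['Id']})")
```

Each statement is submitted asynchronously; you would typically schedule something like this off-peak so the VACUUM passes don't compete with the DMS apply load.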
All that said, now to your question: probably not. It's highly likely that your 100 high-volume tables are responsible for most of the changes in the stream, so switching the other 800 tables to hourly overwrites will not drastically reduce CPU usage. That is, if for example 80% of the CPU load is caused by those 100 tables, then even completely removing the others won't help.
If this is their only data source, and given that the data volume is not that high, I strongly suggest leveraging a Read Replica for the analytical workload instead.
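Assuming the source is an RDS instance, provisioning such a replica is a one-call operation; a minimal sketch with placeholder identifiers and instance class:

```python
# Hypothetical sketch: create an RDS read replica to serve the analytical queries.
# Assumes the DMS source is an RDS instance; all identifiers are placeholders.
import boto3

rds = boto3.client("rds")

response = rds.create_db_instance_read_replica(
    DBInstanceIdentifier="analytics-replica",     # placeholder replica name
    SourceDBInstanceIdentifier="production-db",   # placeholder source instance
    DBInstanceClass="db.r6g.large",               # size it for the analytical workload
    PubliclyAccessible=False,
)
print(response["DBInstance"]["DBInstanceStatus"])
```

Pointing the analytical queries at the replica keeps the reporting load off the primary without the cost and maintenance overhead of a Redshift cluster for this data volume.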