Resolution
Check the minimum table size
The minimum table size is the smallest footprint that a table has on an Amazon Redshift cluster. You can check the minimal table size when you analyze the cluster storage use or when you resize an Amazon Redshift cluster.
For tables that are created using the KEY, EVEN, or Auto (EVEN) distribution style, use the following formula:
Minimum table size = block_size (1 MB) *(number_of_user_columns + 3 system columns) * number_of_populated_slices * number_of_table_segments
For tables created using the ALL or Auto (ALL) distribution style, use the following formula:
Minimum table size = block_size (1 MB) *(number_of_user_columns + 3 system columns) * number_of_cluster_nodes * number_of_table_segments
For the table formulas, the number of segments is based on whether the table has a defined sort key. If an Amazon Redshift table has a defined sort key, then the table has two segments: one sorted segment and one unsorted segment. If an Amazon Redshift table has no sort key, then the table produces only one unsorted segment.
To calculate the number of populated slices, run the following query:
select count(distinct a.slice) asnumber_of_populated_slices, b."table" from stv_blocklist a,
svv_table_info b where a.tbl = b.table_id group by b."table" ;
Example: six dc2.large cluster
For example, you can create a six dc2.large cluster with four small tables of the same structure and number of rows. If the cluster uses three different distribution styles and one distribution style with a sort key, then different queries are used.
The following query creates a table with an ALL distribution style and an output of six populated slices:
create table testsize_all (a varchar(100),b varchar(100), cvarchar(100)) diststyle all;
insert into testsize_all values
('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c');
The following query creates a table with an EVEN distribution style and an output of six populated slices:
create table testsize_even (a varchar(100),b varchar(100), cvarchar(100)) diststyle even;
insert into testsize_even values
('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c');
The following query creates a table with a distribution key and an output of one populated slice:
create table testsize_key (a varchar(100),b varchar(100), cvarchar(100)) distkey (a);
insert into testsize_key values
('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c');
The following query creates a table with a distribution key and sort key and an output of six populated slices:
create table testsize_sort_even (a varchar(100),bvarchar(100), c varchar(100) ) diststyle even sortkey (a);
insert into testsize_sort_even values
('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c'),('a','b','c');
Calculate the minimum table size
To calculate the minimum table size for an EVEN distribution style, use the following formula:
Minimum table size = block_size (1 MB) *(number_of_user_columns + 3 system columns) * number_of_populated_slices *
number_of_table_segments
1MB * (3+3) * 6 *1 = 36MB
To calculate the minimum table size for an ALL distribution style, use the following formula:
Minimum table size = block_size (1 MB) *(number_of_user_columns + 3 system columns) * number_of_cluster_nodes *
number_of_table_segments
1MB * (3+3) * 6 *1 = 36 MB
To calculate the minimum table size for a KEY distribution style, use the following formula:
Minimum table size = block_size (1 MB) * (number_of_user_columns+ 3 system columns) * number_of_populated_slices *
number_of_table_segments
1MB * (3+3) * 1 *1 = 6MB
To calculate the minimum table size for an even distribution with a sort key, use the following formula:
Minimum table size = block_size (1 MB) *(number_of_user_columns + 3 system columns) * number_of_populated_slices *
number_of_table_segments
1MB * (3+3) * 6 *2 = 72MB
With a small number of rows inserted, the size of your table is larger than expected. The size of the table continues to grow as the number of rows are inserted and the number of populated slices grow.
When you perform a classic resize, the number of populated slices grows without a growth in data volume for the table. As a result, the amount of free space after the resize doesn't linearly grow.
Related information
Amazon Redshift best practices for designing tables
Working with automatic table optimization
Columnar storage
Choose the best sort key
Query planning and execution workflow