Insight into Accuracy/Reliability of Redshift SVV_TABLE_INFO.estimated_visible_rows


I am looking at the documentation for SVV_TABLE_INFO and know that the estimated_visible_rows column doesn't include rows marked for deletion (unlike the tbl_rows column, which does include these rows). However, I couldn't find any additional information on the overall accuracy/reliability of the estimated_visible_rows column. Is anything else estimated, besides rows that may or may not be marked for deletion? What is the guaranteed maximum delay before this number is updated (hours, days)?

asked 2 months ago, 26 views
1 Answer

This is an advanced topic. SVV_TABLE_INFO is a system view, so you can check its DDL to find where the column comes from.

First, run show view SVV_TABLE_INFO; to see the view definition. According to the DDL, estimated_visible_rows is defined as follows:

(psi.stairows) :: numeric(38, 0) AS estimated_visible_rows,

Elsewhere in the DDL, you can see that psi is an alias for pg_statistic_indicator.

The documentation for PG_STATISTIC_INDICATOR says:

Stores information about the number of rows inserted or deleted since the last ANALYZE. The PG_STATISTIC_INDICATOR table is updated frequently following DML operations, so statistics are approximate.
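If you want to look at those counters next to the SVV_TABLE_INFO values, a rough sketch follows. It assumes a hypothetical table my_schema.my_table and a user who is allowed to read PG_STATISTIC_INDICATOR (as far as I know this needs superuser). The stairelid, staiins and staidels column names come from the PG_STATISTIC_INDICATOR documentation, and matching stairelid to table_id is my assumption about how the rows line up, so treat this as a starting point rather than the view's actual join.

```sql
-- Sketch: show the raw PG_STATISTIC_INDICATOR counters beside the
-- SVV_TABLE_INFO row counts for one table.
-- 'my_schema' and 'my_table' are placeholder names.
-- Joining stairelid to table_id is an assumption, not taken from the DDL.
SELECT ti."schema",
       ti."table",
       ti.tbl_rows,                -- includes rows marked for deletion
       ti.estimated_visible_rows,  -- derived from psi.stairows per the DDL
       psi.stairows,
       psi.staiins,                -- rows inserted since the last ANALYZE
       psi.staidels                -- rows deleted since the last ANALYZE
FROM svv_table_info AS ti
JOIN pg_statistic_indicator AS psi
  ON psi.stairelid = ti.table_id
WHERE ti."schema" = 'my_schema'
  AND ti."table" = 'my_table';
```

A large gap between tbl_rows and estimated_visible_rows would suggest many rows are still marked for deletion and not yet vacuumed.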

To answer your question: this number is approximate, and there is no guaranteed maximum delay, because ANALYZE runs either automatically or when a user triggers it. Check STL_ANALYZE for the history of ANALYZE operations.
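As a minimal sketch of that check, the first query below lists recent ANALYZE runs for one table, and the second gets an exact count to compare by hand against estimated_visible_rows. The names my_schema and my_table are placeholders, and STL_ANALYZE only keeps a limited window of history, so an empty result does not necessarily mean the table was never analyzed.

```sql
-- Sketch: recent ANALYZE runs for one table, newest first.
-- 'my_schema' and 'my_table' are placeholder names.
SELECT a.table_id,
       a.status,         -- result of the run, for example whether it was skipped
       a.is_auto,        -- whether the run was automatic or user-triggered
       a.rows,           -- total rows in the table at analyze time
       a.modified_rows,  -- rows changed since the previous ANALYZE
       a.starttime,
       a.endtime
FROM stl_analyze AS a
JOIN svv_table_info AS ti
  ON ti.table_id = a.table_id
WHERE ti."schema" = 'my_schema'
  AND ti."table" = 'my_table'
ORDER BY a.starttime DESC
LIMIT 10;

-- Exact visible row count, to compare manually with estimated_visible_rows.
-- This scans the table, so it can be expensive on large tables.
SELECT COUNT(*) AS exact_rows
FROM my_schema.my_table;
```

If the estimate has drifted further than you can tolerate, running ANALYZE on the table yourself should refresh the statistics instead of waiting for the automatic run.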

answered a month ago
