Query Plan EXPLAIN

0

Hi !

I have two tables in redshift with a one to many relation, each table has ~ 300K rows

If i execute this request :

EXPLAIN
Select Table1.field, Table2.field
FROM Table1 
INNER JOIN Table2 On Table1.Reference = Table2.ReferenceTable1

Here is the query plan

XN Hash Join DS_DIST_ALL_NONE  (cost=3929.08..29669880.27 rows=323034878 width=104)	
  Hash Cond: (("outer".Reference )::text = ("inner".ReferenceTable1)::text)	
  ->  XN Seq Scan on Table1 e  (cost=0.00..3143.26 rows=314326 width=104)	
  ->  XN Hash  (cost=3143.26..3143.26 rows=314326 width=17)	
        ->  XN Seq Scan on Table2 o  (cost=0.00..3143.26 rows=314326 width=17)

I don't understand what does it mean rows=323034878 ?

已提问 1 年前248 查看次数
2 回答
0

Nice and interesting one

The rows estimate in the query plan refers to the estimated number of rows that will be produced by the query. In this case, the query plan estimates that the INNER JOIN operation will produce a total of 323034878 rows.

This estimate is based on the number of rows in the two tables that are being joined and the distribution of the join keys. The query plan uses statistical information about the distribution of data in the tables to make this estimate.

It is important to note that this estimate is just that – an estimate. The actual number of rows produced by the query may be different.

SeanSi
已回答 1 年前
0

You mentioned there is a one to many relation. Maybe that is the reason for such big number of result rows.

You can check it this way to see how many rows are actually returned:

Select COUNT(*)
FROM Table1 
INNER JOIN Table2 On Table1.Reference = Table2.ReferenceTable1

I guess it will be much more than 300K.

XN Seq Scan on Table1 e (cost=0.00..3143.26 rows=314326 width=104)

it reads all rows from Table 1. ~ 300K

XN Seq Scan on Table2 o (cost=0.00..3143.26 rows=314326 width=17

it reads all rows from Table 2. ~ 300K

XN Hash Join DS_DIST_ALL_NONE (cost=3929.08..29669880.27 rows=323034878 width=104) Hash Cond: (("outer".Reference )::text = ("inner".ReferenceTable1)::text)

joins the tables, applies the condition and returns ~ 323034878 rows. Maybe it is somehow related to distribution key. I wonder what COUNT(*) in above example would return.

Let me know if it helps :)

profile picture
Max
已回答 1 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则