Query Plan EXPLAIN

0

Hi !

I have two tables in redshift with a one to many relation, each table has ~ 300K rows

If i execute this request :

EXPLAIN
Select Table1.field, Table2.field
FROM Table1 
INNER JOIN Table2 On Table1.Reference = Table2.ReferenceTable1

Here is the query plan

XN Hash Join DS_DIST_ALL_NONE  (cost=3929.08..29669880.27 rows=323034878 width=104)	
  Hash Cond: (("outer".Reference )::text = ("inner".ReferenceTable1)::text)	
  ->  XN Seq Scan on Table1 e  (cost=0.00..3143.26 rows=314326 width=104)	
  ->  XN Hash  (cost=3143.26..3143.26 rows=314326 width=17)	
        ->  XN Seq Scan on Table2 o  (cost=0.00..3143.26 rows=314326 width=17)

I don't understand what does it mean rows=323034878 ?

asked a year ago228 views
2 Answers
0

Nice and interesting one

The rows estimate in the query plan refers to the estimated number of rows that will be produced by the query. In this case, the query plan estimates that the INNER JOIN operation will produce a total of 323034878 rows.

This estimate is based on the number of rows in the two tables that are being joined and the distribution of the join keys. The query plan uses statistical information about the distribution of data in the tables to make this estimate.

It is important to note that this estimate is just that – an estimate. The actual number of rows produced by the query may be different.

SeanSi
answered a year ago
0

You mentioned there is a one to many relation. Maybe that is the reason for such big number of result rows.

You can check it this way to see how many rows are actually returned:

Select COUNT(*)
FROM Table1 
INNER JOIN Table2 On Table1.Reference = Table2.ReferenceTable1

I guess it will be much more than 300K.

XN Seq Scan on Table1 e (cost=0.00..3143.26 rows=314326 width=104)

it reads all rows from Table 1. ~ 300K

XN Seq Scan on Table2 o (cost=0.00..3143.26 rows=314326 width=17

it reads all rows from Table 2. ~ 300K

XN Hash Join DS_DIST_ALL_NONE (cost=3929.08..29669880.27 rows=323034878 width=104) Hash Cond: (("outer".Reference )::text = ("inner".ReferenceTable1)::text)

joins the tables, applies the condition and returns ~ 323034878 rows. Maybe it is somehow related to distribution key. I wonder what COUNT(*) in above example would return.

Let me know if it helps :)

profile picture
Max
answered a year ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions