Query Plan EXPLAIN

0

Hi !

I have two tables in redshift with a one to many relation, each table has ~ 300K rows

If i execute this request :

EXPLAIN
Select Table1.field, Table2.field
FROM Table1 
INNER JOIN Table2 On Table1.Reference = Table2.ReferenceTable1

Here is the query plan

XN Hash Join DS_DIST_ALL_NONE  (cost=3929.08..29669880.27 rows=323034878 width=104)	
  Hash Cond: (("outer".Reference )::text = ("inner".ReferenceTable1)::text)	
  ->  XN Seq Scan on Table1 e  (cost=0.00..3143.26 rows=314326 width=104)	
  ->  XN Hash  (cost=3143.26..3143.26 rows=314326 width=17)	
        ->  XN Seq Scan on Table2 o  (cost=0.00..3143.26 rows=314326 width=17)

I don't understand what does it mean rows=323034878 ?

feita há um ano248 visualizações
2 Respostas
0

Nice and interesting one

The rows estimate in the query plan refers to the estimated number of rows that will be produced by the query. In this case, the query plan estimates that the INNER JOIN operation will produce a total of 323034878 rows.

This estimate is based on the number of rows in the two tables that are being joined and the distribution of the join keys. The query plan uses statistical information about the distribution of data in the tables to make this estimate.

It is important to note that this estimate is just that – an estimate. The actual number of rows produced by the query may be different.

SeanSi
respondido há um ano
0

You mentioned there is a one to many relation. Maybe that is the reason for such big number of result rows.

You can check it this way to see how many rows are actually returned:

Select COUNT(*)
FROM Table1 
INNER JOIN Table2 On Table1.Reference = Table2.ReferenceTable1

I guess it will be much more than 300K.

XN Seq Scan on Table1 e (cost=0.00..3143.26 rows=314326 width=104)

it reads all rows from Table 1. ~ 300K

XN Seq Scan on Table2 o (cost=0.00..3143.26 rows=314326 width=17

it reads all rows from Table 2. ~ 300K

XN Hash Join DS_DIST_ALL_NONE (cost=3929.08..29669880.27 rows=323034878 width=104) Hash Cond: (("outer".Reference )::text = ("inner".ReferenceTable1)::text)

joins the tables, applies the condition and returns ~ 323034878 rows. Maybe it is somehow related to distribution key. I wonder what COUNT(*) in above example would return.

Let me know if it helps :)

profile picture
Max
respondido há um ano

Você não está conectado. Fazer login para postar uma resposta.

Uma boa resposta responde claramente à pergunta, dá feedback construtivo e incentiva o crescimento profissional de quem perguntou.

Diretrizes para responder a perguntas