Educated guess about RDS instance with best performance to cost ratio based on EXPLAIN ANALYZE of PostgreSQL

0

I have a table that stores 1M 384-dimensional vectors in the column called vector of type "char"[]. Consider the query:

EXPLAIN ANALYZE
WITH Vars(key) as (
    VALUES (array_fill(1, ARRAY[384])::vector)
)
SELECT content_id
FROM MyTable, Vars
ORDER BY vector::int[]::vector <#> key
LIMIT 10;

The key is just a toy vector consisting of all ones. <#> is the dot product operator of the pgvector extension, and vector is the type defined by that extension, which to my understanding is similar to real[].

Here is the output when running on db.m7g.large RDS PostgreSQL instance:

 Limit  (cost=83268.20..83269.37 rows=10 width=12) (actual time=3070.513..3070.570 rows=10 loops=1)
   ->  Gather Merge  (cost=83268.20..180496.59 rows=833328 width=12) (actual time=3070.512..3070.567 rows=10 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=82268.18..83309.84 rows=416664 width=12) (actual time=3062.863..3062.864 rows=8 loops=3)
               Sort Key: ((((mytable.vector)::integer[])::vector <#> '[1,1, ... ,1]'::vector))
               Sort Method: top-N heapsort  Memory: 25kB
               Worker 0:  Sort Method: top-N heapsort  Memory: 25kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 25kB
               ->  Parallel Seq Scan on mytable  (cost=0.00..73264.22 rows=416664 width=12) (actual time=0.017..3005.128 rows=333333 loops=3)
 Planning Time: 0.097 ms
 Execution Time: 3070.597 ms
(12 rows)

Given this output, how can I make an educated guess about which of the many instance types will give the best performance to cost ratio for this particular query?

Nessuna risposta

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande