How can I log execution plans of queries for Amazon RDS PostgreSQL or Aurora PostgreSQL to tune query performance?

Lesedauer: 7 Minute
1

I want to know which execution plans are being used for my queries, so that I can tune them accordingly. How can I log the execution plans for these queries for an Amazon Relational Database Service (Amazon RDS) PostgreSQL DB instance or Amazon Aurora PostgreSQL DB cluster?

Resolution

You can use the auto_explain module to log execution plans for queries.

1.    Add auto_explain to the shared_preload_libraries parameter.

2.    Reboot the DB instance or cluster.

3.    To enable auto_explain, you must change the auto_explain.log_min_duration parameter to a value other than -1. This value is the minimum statement execution time in milliseconds that, if it's passed and the statement runs, its execution plan is logged. If you set the parameter to 0, all plans are logged.

Important: Setting the auto_explain.log_min_duration parameter to 0 leads to degradation of performance and extensive storage space consumption. This can cause problems with the instance.

4.    Run queries on the database.

5.    View or download the PostgreSQL log files, including the execution plans. The following example execution plan has the execution time (4673.182 ms) and the execution plan with the query text:

2020-01-02 03:08:44 UTC:27.0.3.156(21356):postgres@postgres:[2040]:LOG: duration: 4673.182 ms plan:
Query Text: SELECT COUNT(*) FROM hoge ORDER BY 1;
Sort (cost=215575.00..215575.00 rows=1 width=8)
Sort Key: (count(*))
-> Aggregate (cost=215574.98..215574.99 rows=1 width=8)
-> Seq Scan on hoge (cost=0.00..190336.18 rows=10095518 width=0)

Use auto_explain parameters to tune query performance

In addition to enabling pg_autoexplain, the auto_explain.log_min_duration parameter also allows you to use other useful parameters. You can use the following auto_explain parameters (which you can change without rebooting) to tune query performance. For more information, see the PostgreSQL documentation for each parameter.

auto_explain.log_analyze

The auto_explain.log_analyze parameter prints an EXPLAIN ANALYZE output (instead of an EXPLAIN output) when an execution plan is logged. This parameter can degrade query performance due to overhead, and the default value is off. For more information, see auto_explain.log_analyze in the Postgres documentation.

See the following example:

2020-02-01 07:42:09 UTC:27.0.3.145(29943):master@postgres:[789]:LOG:  duration: 18.658 ms  plan:
	Query Text: SELECT COUNT(*) FROM t1 INNER JOIN t2 USING(id);
	Aggregate  (cost=577.45..577.46 rows=1 width=8) (actual time=18.641..18.641 rows=1 loops=1)
	  ->  Hash Join  (cost=269.98..552.45 rows=9999 width=0) (actual time=8.108..16.576 rows=9999 loops=1)
	        Hash Cond: (t1.id = t2.id)
	        ->  Seq Scan on t1  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.839..4.151 rows=9999 loops=1)
	        ->  Hash  (cost=144.99..144.99 rows=9999 width=4) (actual time=7.186..7.186 rows=9999 loops=1)
	              Buckets: 16384  Batches: 1  Memory Usage: 480kB
	              ->  Seq Scan on t2  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.667..4.289 rows=9999 loops=1)

auto_explain.log_buffers

Use the auto_explain.log_buffers parameter to determine whether buffer usage statistics are printed when you log an execution plan. Using this parameter has the same effect as of using the Buffers option with Explain. Be sure to enable auto_explain.log_analyze with auto_explain.log_buffers. By default, this parameter is off, and you can only change this only if you are a superuser. For more information, see auto_explain.log_buffers in the PostgreSQL documentation.

See the following example output:

2020-02-01 08:02:02 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 14.875 ms  plan:
	Query Text: SELECT COUNT(*) FROM t1 INNER JOIN t2 USING(id);
	Aggregate  (cost=577.45..577.46 rows=1 width=8) (actual time=14.861..14.861 rows=1 loops=1)
	  Buffers: shared hit=93
	  ->  Hash Join  (cost=269.98..552.45 rows=9999 width=0) (actual time=5.293..12.768 rows=9999 loops=1)
	        Hash Cond: (t1.id = t2.id)
	        Buffers: shared hit=93
	        ->  Seq Scan on t1  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.007..2.385 rows=9999 loops=1)
	              Buffers: shared hit=45
	        ->  Hash  (cost=144.99..144.99 rows=9999 width=4) (actual time=5.250..5.250 rows=9999 loops=1)
	              Buckets: 16384  Batches: 1  Memory Usage: 480kB
	              Buffers: shared hit=45
	              ->  Seq Scan on t2  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.009..2.348 rows=9999 loops=1)
	                    Buffers: shared hit=45

auto_explain.log_nested_statements

Use the auto_explain.log_nested_statements parameter to consider nested statements for logging. Top-level query plans are logged, by default, unless you enable this parameter. The default value is off. For more information, see auto_explain.nested_statements in the PostgreSQL documentation.

See the following example:

CREATE OR REPLACE FUNCTION count_tables() RETURNS integer AS $$
        DECLARE
         i integer;
         j integer;
        BEGIN
                SELECT COUNT(*) INTO i FROM t1;
                SELECT COUNT(*) INTO j FROM t2;
                RETURN i + j;                
        END;
$$ LANGUAGE plpgsql;

See the following example output:

2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 0.651 ms  plan:
	Query Text: SELECT COUNT(*)        FROM t1
	Aggregate  (cost=169.99..170.00 rows=1 width=8)
	  ->  Seq Scan on t1  (cost=0.00..144.99 rows=9999 width=0)
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:CONTEXT:  SQL statement "SELECT COUNT(*)        FROM t1"
	PL/pgSQL function count_tables() line 6 at SQL statement
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 0.637 ms  plan:
	Query Text: SELECT COUNT(*)        FROM t2
	Aggregate  (cost=169.99..170.00 rows=1 width=8)
	  ->  Seq Scan on t2  (cost=0.00..144.99 rows=9999 width=0)
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:CONTEXT:  SQL statement "SELECT COUNT(*)        FROM t2"
	PL/pgSQL function count_tables() line 7 at SQL statement
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 1.435 ms  plan:
	Query Text: SELECT count_tables();
	Result  (cost=0.00..0.26 rows=1 width=4)

auto_explain.log_timing

Use the auto_explain.log_timing parameter to controls whether per-node timing information is printed when you log an execution plan. Using this parameter works similarly to using Timing with Explain. Disabling auto_explain.log_timing can mitigate the overhead of repeatedly reading the system clock. The default value is on. For more information, see auto_explain.log_timing in the PostgreSQL documentation.

auto_explain.log_triggers

Use the auto_explain.log_triggers parameter to include trigger execution statistics when you log an execution plan. You must also enable auto_explain.log_analyze when using auto_explain.log_triggers The default value is off.

For more information, see auto_explain.log_triggers in the PostgreSQL documentation.

See the following example:

CREATE FUNCTION emp_log_trigger_func() RETURNS trigger AS $emp_stamp$
    DECLARE
        count integer;
    BEGIN
        SELECT COUNT(*) INTO count FROM emp WHERE empname = NEW.empname;
        INSERT INTO emp_log(created_at, operation, user_name) VALUES(current_timestamp, TG_OP, user);
        RETURN NEW;
    END;
$emp_stamp$ LANGUAGE plpgsql;
CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW EXECUTE PROCEDURE emp_log_trigger_func();

See the following example output:

2020-02-01 08:57:44 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 1.463 ms  plan:
	Query Text: INSERT INTO emp_log(created_at, operation, user_name) VALUES(current_timestamp, TG_OP, user)
	Insert on emp_log  (cost=0.00..0.03 rows=1 width=168) (actual time=1.461..1.461 rows=0 loops=1)
	  ->  Result  (cost=0.00..0.03 rows=1 width=168) (actual time=0.009..0.010 rows=1 loops=1)
2020-02-01 08:57:44 UTC:27.0.3.145(27477):master@postgres:[15514]:CONTEXT:  SQL statement "INSERT INTO emp_log(created_at, operation, user_name) VALUES(current_timestamp, TG_OP, user)"
	PL/pgSQL function emp_log_trigger_func() line 3 at SQL statement
2020-02-01 08:57:44 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 1.602 ms  plan:
	Query Text: INSERT INTO emp VALUES('name', 100, current_timestamp, 'hoge');
	Insert on emp  (cost=0.00..0.01 rows=1 width=76) (actual time=1.600..1.600 rows=0 loops=1)
	  ->  Result  (cost=0.00..0.01 rows=1 width=76) (actual time=0.006..0.006 rows=1 loops=1)
	Trigger emp_stamp: time=1.584 calls=1

auto_explain.log_verbose

Use the auto_explain.log_verbose parameter to determine whether verbose details are printed when you log an execution plan. Using this parameter works similarly to using Verbose option with Explain. The default value is off. For more information, see auto_explain.log_verbose in the PostgreSQL documentation.

See the following example:

2020-02-01 09:03:20 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 3.492 ms  plan:
	Query Text: SELECT COUNT(*) FROM t1 INNER JOIN t2 USING(id);
	Aggregate  (cost=577.45..577.46 rows=1 width=8)
	  Output: count(*)
	  ->  Hash Join  (cost=269.98..552.45 rows=9999 width=0)
	        Hash Cond: (t1.id = t2.id)
	        ->  Seq Scan on public.t1  (cost=0.00..144.99 rows=9999 width=4)
	              Output: t1.id
	        ->  Hash  (cost=144.99..144.99 rows=9999 width=4)
	              Output: t2.id
	              Buckets: 16384  Batches: 1  Memory Usage: 480kB
	              ->  Seq Scan on public.t2  (cost=0.00..144.99 rows=9999 width=4)
	                    Output: t2.id

Or, if you want to see the explain plan of fewer queries for tuning, you could run EXPLAIN ANALYZE on them. This eliminates the overhead of logging, and shows the plan of the query step by step, along with the execution time of each step. If you manually invoke the EXPLAIN ANALYZE statement, then you can identify events such as which step produced the temporary files, which step is performing seq scan, and so on.

postgres=> EXPLAIN ANALYZE SELECT * FROM hoge ORDER BY 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Sort (cost=7343904.60..7444904.60 rows=40400000 width=4) (actual time=37795.033..45133.431 rows=40400000 loops=1)
Sort Key: id
Sort Method: external merge Disk: 553512kB
-> Seq Scan on hoge (cost=0.00..582762.00 rows=40400000 width=4) (actual time=0.579..8184.498 rows=40400000 loops=1)
Planning Time: 0.549 ms
Execution Time: 47161.516 ms
(6 rows)

Related information

Best practices for working with PostgreSQL

PostgreSQL documentation for auto_explain

PostgreSQL documentation for Error reporting and logging

PostgreSQL documentation for Using EXPLAIN