In our performance tuning workshop, especially when attendees have an Oracle background, one question for sure pops up every time: How can I use optimizer hints in PostgreSQL. Up until today there are three answers to this:
- You simply can’t, there are no hints
- You might consider using the pg_hint_plan extension
- Not really hints, but you can tell the optimizer to make certain operations more expensive, so other operations might be chosen
Well, now we need to update the workshop material because this was committed for PostgreSQL 19 yesterday. The feature is not called “hints” but it does exactly that: Tell the optimizer what to do because you (might) know it better and you want a specific plan for a given query. Just be aware that this comes with the same issues as listed here.
The new feature comes as an extension so you need to enable it before you can use it. There are three ways to do this:
-- current session
postgres=# load 'pg_plan_advice';
LOAD
-- for all new sessions
postgres=# alter system set session_preload_libraries = 'pg_plan_advice';
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
-- instance wide
postgres=# alter system set shared_preload_libraries = 'pg_plan_advice';
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
To see this in action, let’s create a small demo setup:
postgres=# create table t1 ( a int primary key, b text );
CREATE TABLE
postgres=# create table t2 ( a int, b int references t1(a), v text );
CREATE TABLE
postgres=# insert into t1 select i, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000
postgres=# insert into t2 select i, i, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000
postgres=# insert into t2 select i, 1, md5(i::text) from generate_series(1000000,2000000) i;
INSERT 0 1000001
A simple parent child relation having a single match from one to one million and one million and one matches for the value one of the parent table.
EXPLAIN comes with a new option to generate the so called advice string for a given query, e.g.:
postgres=# explain (plan_advice) select * from t1 join t2 on t1.a = t2.b;
QUERY PLAN
------------------------------------------------------------------------------------------
Nested Loop (cost=0.43..111805.81 rows=2000001 width=78)
-> Seq Scan on t2 (cost=0.00..48038.01 rows=2000001 width=41)
-> Memoize (cost=0.43..0.47 rows=1 width=37)
Cache Key: t2.b
Cache Mode: logical
Estimates: capacity=29629 distinct keys=29629 lookups=2000001 hit percent=98.52%
-> Index Scan using t1_pkey on t1 (cost=0.42..0.46 rows=1 width=37)
Index Cond: (a = t2.b)
JIT:
Functions: 8
Options: Inlining false, Optimization false, Expressions true, Deforming true
Generated Plan Advice:
JOIN_ORDER(t2 t1)
NESTED_LOOP_MEMOIZE(t1)
SEQ_SCAN(t2)
INDEX_SCAN(t1 public.t1_pkey)
NO_GATHER(t1 t2)
(17 rows)
What you see here are advice tags, and the full list of those tags is documented in documentation of the extension. First we have the join order, then nested loop memoize, a sequential scan on t2 and an index scan on the primary key of the parent table and finally an instruction that neither t1 nor t2 should appear under a gather node.
This can be given as an advice to the optimizer/planner:
postgres=# SET pg_plan_advice.advice = 'JOIN_ORDER(t2 t1) NESTED_LOOP_MEMOIZE(t1) SEQ_SCAN(t2) INDEX_SCAN(t1 public.t1_pkey) NO_GATHER(t1 t2)';
SET
postgres=# explain (plan_advice) select * from t1 join t2 on t1.a = t2.b;
QUERY PLAN
------------------------------------------------------------------------------------------
Nested Loop (cost=0.43..111805.81 rows=2000001 width=78)
-> Seq Scan on t2 (cost=0.00..48038.01 rows=2000001 width=41)
-> Memoize (cost=0.43..0.47 rows=1 width=37)
Cache Key: t2.b
Cache Mode: logical
Estimates: capacity=29629 distinct keys=29629 lookups=2000001 hit percent=98.52%
-> Index Scan using t1_pkey on t1 (cost=0.42..0.46 rows=1 width=37)
Index Cond: (a = t2.b)
JIT:
Functions: 8
Options: Inlining false, Optimization false, Expressions true, Deforming true
Supplied Plan Advice:
SEQ_SCAN(t2) /* matched */
INDEX_SCAN(t1 public.t1_pkey) /* matched */
JOIN_ORDER(t2 t1) /* matched */
NESTED_LOOP_MEMOIZE(t1) /* matched */
NO_GATHER(t1) /* matched */
NO_GATHER(t2) /* matched */
Generated Plan Advice:
JOIN_ORDER(t2 t1)
NESTED_LOOP_MEMOIZE(t1)
SEQ_SCAN(t2)
INDEX_SCAN(t1 public.t1_pkey)
NO_GATHER(t1 t2)
(24 rows)
Running the next explain with that advice will show you what you’ve advised the planner to do and what was actually done. In this case all the advises matched and you get the same plan as before.
Once you play e.g. with the join order, the plan will change because you told the planner to do so:
postgres=# SET pg_plan_advice.advice = 'JOIN_ORDER(t1 t2)';
SET
postgres=# explain (plan_advice) select * from t1 join t2 on t1.a = t2.b;
QUERY PLAN
-----------------------------------------------------------------------------------
Merge Join (cost=323875.24..390697.00 rows=2000001 width=78)
Merge Cond: (t1.a = t2.b)
-> Index Scan using t1_pkey on t1 (cost=0.42..34317.43 rows=1000000 width=37)
-> Materialize (cost=318880.31..328880.31 rows=2000001 width=41)
-> Sort (cost=318880.31..323880.31 rows=2000001 width=41)
Sort Key: t2.b
-> Seq Scan on t2 (cost=0.00..48038.01 rows=2000001 width=41)
JIT:
Functions: 7
Options: Inlining false, Optimization false, Expressions true, Deforming true
Supplied Plan Advice:
JOIN_ORDER(t1 t2) /* matched */
Generated Plan Advice:
JOIN_ORDER(t1 t2)
MERGE_JOIN_MATERIALIZE(t2)
SEQ_SCAN(t2)
INDEX_SCAN(t1 public.t1_pkey)
NO_GATHER(t1 t2)
(18 rows)
Really nice, now there is an official way to influence the planner using advises but please be aware of the current limitations. Needless to say, that you should use this with caution, because you can easily make things slower by advising what is not optimal for a query.
Thanks to all involved with this, this is really a great improvement.