Some days ago EDB Postgres Advanced Server 10 was released and one feature which might be handy is auto pre-warming. What this does is to save all the buffers (or better a description of the buffers) which are currently loaded in to shared_buffers to disk and then re-read the buffers automatically when the instance is restarted. Lets see how it works.

Before getting the feature to work we need to look at two parameters which control the behavior:

  • pg_prewarm.autoprewarm: Enabled or disabled the feature
  • pg_prewarm.autoprewarm_interval: The interval the current state is written to disk or 0 to only write once when the instance shutsdown

Another requirement is to load the library when the instance starts:

postgres=# alter system set shared_preload_libraries ='$libdir/dbms_pipe,$libdir/edb_gen,$libdir/dbms_aq,$libdir/pg_prewarm';
ALTER SYSTEM

Once the instance is restarted we can proceed with the configuration:

postgres=# alter system set pg_prewarm.autoprewarm=true;
ALTER SYSTEM
postgres=# alter system set pg_prewarm.autoprewarm_interval='10s';
ALTER SYSTEM

By doing this we told the server to write the current state of the buffers to disk every 10 seconds. You’ll also notice a new background worker process which is responsible for doing the work:

postgres=# ! ps -ef | grep prewarm | egrep -v "ps|grep"
postgres  3682  3675  0 12:05 ?        00:00:00 postgres: bgworker: autoprewarm   

Lets load something into shared_buffers:

postgres=# insert into t1 select a, md5(a::varchar) from generate_series(1,1000) a;
INSERT 0 1000
postgres=# select count(*) from t1;
 count 
-------
  1000
(1 row)
postgres=# explain (analyze,buffers) select count(*) from t1;
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Aggregate  (cost=21.50..21.51 rows=1 width=8) (actual time=0.492..0.492 rows=1 loops=1)
   Buffers: shared hit=9
   ->  Seq Scan on t1  (cost=0.00..19.00 rows=1000 width=0) (actual time=0.019..0.254 rows=1000 loops=1)
         Buffers: shared hit=9
 Planning time: 0.070 ms
 Execution time: 0.538 ms
(6 rows)

The “shared hit” confirms that we read the buffers from shared_buffers and not from the os/file system cache. Then lets restart and do the same check again:

postgres@centos7:/u02/pgdata/PG4/ [EDB10] pg_ctl -D . restart -m fast
postgres@centos7:/u02/pgdata/PG4/ [EDB10] psql -X postgres
psql.bin (10.1.5)
Type "help" for help.

postgres=# explain (analyze,buffers) select count(*) from t1;
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Aggregate  (cost=21.50..21.51 rows=1 width=8) (actual time=0.586..0.586 rows=1 loops=1)
   Buffers: shared hit=9
   ->  Seq Scan on t1  (cost=0.00..19.00 rows=1000 width=0) (actual time=0.024..0.295 rows=1000 loops=1)
         Buffers: shared hit=9
 Planning time: 0.451 ms
 Execution time: 0.766 ms
(6 rows)

postgres=# 

… here we go. How is this information stored? When you take a look at $PGDATA you’ll notice a file with the following format:

postgres@centos7:/u02/pgdata/PG4/ [EDB10] cat $PGDATA/autoprewarm.blocks | tail
<>
0,1664,1262,0,0
15471,1663,1259,0,0
15471,1663,1259,0,1
15471,1663,1259,0,2
15471,1663,1249,0,0
15471,1663,1249,0,1
15471,1663,1249,0,2
15471,1663,1249,0,3
15471,1663,1249,0,4

The first field is the OID of the database:

postgres=# select oid,datname from pg_database where oid=15471;
  oid  | datname  
-------+----------
 15471 | postgres
(1 row)

The second one is the tablespace:

postgres=# select oid,spcname from pg_tablespace where oid=1663;
 oid  |  spcname   
------+------------
 1663 | pg_default
(1 row)

The third one is the table:

postgres=# select oid,relname from pg_class where oid = 16402;
  oid  | relname 
-------+---------
 16402 | t1
(1 row)

postgres=# ! grep 16402 $PGDATA/autoprewarm.blocks
15471,1663,16402,0,0
15471,1663,16402,0,1
15471,1663,16402,0,2
15471,1663,16402,0,3
15471,1663,16402,0,4
15471,1663,16402,0,5
15471,1663,16402,0,6
15471,1663,16402,0,7
15471,1663,16402,0,8
15471,1663,16402,1,0
15471,1663,16402,1,2

The fourth one is the fork/file (0 is the datafile, 1 is the free space map) and the last one is the actual block to load. This is also described in “./contrib/pg_prewarm/autoprewarm.c” in the PostgreSQL source code:

/* Metadata for each block we dump. */
typedef struct BlockInfoRecord
{
        Oid                     database;
        Oid                     tablespace;
        Oid                     filenode;
        ForkNumber      forknum;
        BlockNumber blocknum;
} BlockInfoRecord;

For community PostgreSQL there is the contrib module pg_prewarm you can use for that, check here.