Logical replication in PostgreSQL came with Version 10, which is already out of support. I’ve written about this a long time ago but the basic building blocks are still the same. Setting this up requires a publication on the source and a subscription on the target and a couple of parameters to bet set on both sides. Starting with PostgreSQL 16 you can setup a logical replication from a physical replica, which gives you more flexibility on how you want to distribute the load. You might want to follow this set of slides to get a general impression on how logical replication evolved over the years.
Starting with PostgreSQL 17 there’s the next evolution: Creating a logical replica out of a physical replica, or in other words: Converting a physical replica into a logical replica. Before we dive into that, let’s quickly look at what this tools solves.
When you setup a logical replication between a source and a target all the data needs to be initially copied from the source to the target. Depending on how large the tables in your setup are, this might quite take some time and the longer this process takes, the more WAL needs to be retained on the source for the replica to catch up once the initial data copy is done. With pg_createsubscriber you don’t need the initial data copy anymore, as this was already done when the physical replica was setup. You can take the physical replica as the starting point and transform it into a logical replica. The downside of this is, that the source and the target need to be on the same major version of PostgreSQL (which is obvious as physical replication cannot be done across major versions of PostgreSQL).
To see how this works lets start from scratch and create a brand new PostgreSQL 17 devel instance and prepare it for physical and logical replication:
1 2 3 4 5 6 7 8 | postgres@pgbox: /home/postgres/ [pgdev] initdb --version initdb (PostgreSQL) 17devel postgres@pgbox: /home/postgres/ [pgdev] initdb -D /var/tmp/source postgres@pgbox: /home/postgres/ [pgdev] echo "wal_level=logical" >> /var/tmp/source/postgresql .auto.conf postgres@pgbox: /home/postgres/ [pgdev] echo "max_replication_slots=10" >> /var/tmp/source/postgresql .auto.conf postgres@pgbox: /home/postgres/ [pgdev] echo "hot_standby=on" >> /var/tmp/source/postgresql .auto.conf postgres@pgbox: /home/postgres/ [pgdev] echo "port=8888" >> /var/tmp/source/postgresql .auto.conf postgres@pgbox: /home/postgres/ [pgdev] pg_ctl -D /var/tmp/source start |
Once we have that, we need a physical replica which is following this primary:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | postgres@pgbox: /home/postgres/ [pgdev] pg_basebackup -D /var/tmp/target --write-recovery-conf -p 8888 postgres@pgbox: /home/postgres/ [pgdev] tail -1 /var/tmp/target/postgresql .auto.conf port=8889 primary_conninfo = 'user=postgres passfile=' '/home/postgres/.pgpass' ' channel_binding=prefer port=8888 sslmode=prefer sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable' postgres@pgbox: /home/postgres/ [pgdev] pg_ctl -D /var/tmp/target/ start waiting for server to start....2024-03-25 14:15:02.864 CET [23697] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc -7.5.0, 64-bit 2024-03-25 14:15:02.864 CET [23697] LOG: listening on IPv6 address "::1" , port 8889 2024-03-25 14:15:02.864 CET [23697] LOG: listening on IPv4 address "127.0.0.1" , port 8889 2024-03-25 14:15:02.870 CET [23697] LOG: listening on Unix socket "/tmp/.s.PGSQL.8889" 2024-03-25 14:15:02.876 CET [23700] LOG: database system was interrupted; last known up at 2024-03-25 14:14:08 CET 2024-03-25 14:15:02.898 CET [23700] LOG: starting backup recovery with redo LSN 0 /2000028 , checkpoint LSN 0 /2000080 , on timeline ID 1 2024-03-25 14:15:02.898 CET [23700] LOG: entering standby mode 2024-03-25 14:15:02.909 CET [23700] LOG: redo starts at 0 /2000028 2024-03-25 14:15:02.912 CET [23700] LOG: completed backup recovery with redo LSN 0 /2000028 and end LSN 0 /2000120 2024-03-25 14:15:02.912 CET [23700] LOG: consistent recovery state reached at 0 /2000120 2024-03-25 14:15:02.912 CET [23697] LOG: database system is ready to accept read -only connections 2024-03-25 14:15:02.916 CET [23701] LOG: started streaming WAL from primary at 0 /3000000 on timeline 1 done server started |
To confirm that logical replication is actually working later on, lets populate the primary with pgbench:
1 2 3 4 5 6 7 8 9 10 11 | postgres@pgbox: /home/postgres/ [pgdev] pgbench -i -s 10 -p 8888 dropping old tables... NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 1.41 s (drop tables 0.00 s, create tables 0.03 s, client-side generate 1.03 s, vacuum 0.08 s, primary keys 0.28 s). |
… and check if the physical replica got the data:
1 2 3 4 5 | postgres@pgbox:/var/tmp/target/ [pgdev] psql -c "select count(*) from pgbench_accounts" -p 8889 count --------- 1000000 (1 row) |
No it is time to convert the physical replica into a logical replica using pg_createsubscriber. The first step is to stop the replica:
1 | postgres@pgbox: /home/postgres/ [pgdev] pg_ctl -D /var/tmp/target/ stop |
In the next step you could either do a dryrun with pg_createsubscriber or directly go for it without a dryrun. Here is a dryrun:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | postgres@pgbox: /home/postgres/ [pgdev] pg_createsubscriber --database=postgres \ --pgdata= /var/tmp/target \ --dry-run \ --subscriber-port=8889 \ --publisher-server= 'user=postgres passfile=' '/home/postgres/.pgpass' ' channel_binding=prefer port=8888 sslmode=prefer sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable' \ --subscriber-username=postgres \ --publication=pub1 \ --subscription=sub1 2024-03-25 15:12:58.022 CET [24607] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc -7.5.0, 64-bit 2024-03-25 15:12:58.026 CET [24607] LOG: listening on Unix socket "/home/postgres/.s.PGSQL.8889" 2024-03-25 15:12:58.034 CET [24610] LOG: database system was shut down in recovery at 2024-03-25 15:11:31 CET 2024-03-25 15:12:58.034 CET [24610] LOG: entering standby mode 2024-03-25 15:12:58.038 CET [24610] LOG: redo starts at 0 /15AF3B08 2024-03-25 15:12:58.038 CET [24610] LOG: consistent recovery state reached at 0 /15AF3C48 2024-03-25 15:12:58.038 CET [24607] LOG: database system is ready to accept read -only connections 2024-03-25 15:12:58.038 CET [24610] LOG: invalid record length at 0 /15AF3C48 : expected at least 24, got 0 2024-03-25 15:12:58.043 CET [24611] LOG: started streaming WAL from primary at 0 /15000000 on timeline 1 2024-03-25 15:12:58.096 CET [24607] LOG: received fast shutdown request 2024-03-25 15:12:58.100 CET [24607] LOG: aborting any active transactions 2024-03-25 15:12:58.100 CET [24611] FATAL: terminating walreceiver process due to administrator command 2024-03-25 15:12:58.100 CET [24608] LOG: shutting down 2024-03-25 15:12:58.105 CET [24607] LOG: database system is shut down 2024-03-25 15:12:58.219 CET [24620] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc -7.5.0, 64-bit 2024-03-25 15:12:58.221 CET [24620] LOG: listening on Unix socket "/home/postgres/.s.PGSQL.8889" 2024-03-25 15:12:58.230 CET [24623] LOG: database system was shut down in recovery at 2024-03-25 15:12:58 CET 2024-03-25 15:12:58.230 CET [24623] LOG: entering standby mode 2024-03-25 15:12:58.233 CET [24623] LOG: redo starts at 0 /15AF3B08 2024-03-25 15:12:58.233 CET [24623] LOG: consistent recovery state reached at 0 /15AF3C48 2024-03-25 15:12:58.233 CET [24620] LOG: database system is ready to accept read -only connections 2024-03-25 15:12:58.233 CET [24623] LOG: invalid record length at 0 /15AF3C48 : expected at least 24, got 0 2024-03-25 15:12:58.237 CET [24624] LOG: started streaming WAL from primary at 0 /15000000 on timeline 1 2024-03-25 15:12:58.311 CET [24620] LOG: received fast shutdown request 2024-03-25 15:12:58.315 CET [24620] LOG: aborting any active transactions 2024-03-25 15:12:58.315 CET [24624] FATAL: terminating walreceiver process due to administrator command 2024-03-25 15:12:58.316 CET [24621] LOG: shutting down 2024-03-25 15:12:58.321 CET [24620] LOG: database system is shut down 15:12:58 postgres@pgbox: /home/postgres/ [pgdev] echo $? 0 |
Don’t count too much on the two “FATAL” messages when the walreiver was shutdown. Important is the exit code, and 0 means success.
Doing the same without the “dryrun” option:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | postgres@pgbox: /home/postgres/ [pgdev] pg_createsubscriber --database=postgres --pgdata= /var/tmp/target --subscriber-port=8889 --publisher-server= 'user=postgres passfile=' '/home/postgres/.pgpass' ' channel_binding=prefer port=8888 sslmode=prefer sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable' --subscriber-username=postgres --publication=pub1 --subscription=sub1 2024-03-25 15:20:25.575 CET [24669] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc -7.5.0, 64-bit 2024-03-25 15:20:25.577 CET [24669] LOG: listening on Unix socket "/home/postgres/.s.PGSQL.8889" 2024-03-25 15:20:25.591 CET [24672] LOG: database system was shut down in recovery at 2024-03-25 15:12:58 CET 2024-03-25 15:20:25.591 CET [24672] LOG: entering standby mode 2024-03-25 15:20:25.595 CET [24672] LOG: redo starts at 0 /15AF3B08 2024-03-25 15:20:25.595 CET [24672] LOG: consistent recovery state reached at 0 /15AF3C48 2024-03-25 15:20:25.595 CET [24669] LOG: database system is ready to accept read -only connections 2024-03-25 15:20:25.595 CET [24672] LOG: invalid record length at 0 /15AF3C48 : expected at least 24, got 0 2024-03-25 15:20:25.600 CET [24673] LOG: started streaming WAL from primary at 0 /15000000 on timeline 1 2024-03-25 15:20:25.670 CET [24669] LOG: received fast shutdown request 2024-03-25 15:20:25.674 CET [24669] LOG: aborting any active transactions 2024-03-25 15:20:25.675 CET [24673] FATAL: terminating walreceiver process due to administrator command 2024-03-25 15:20:25.675 CET [24670] LOG: shutting down 2024-03-25 15:20:25.680 CET [24669] LOG: database system is shut down 2024-03-25 15:20:25.807 CET [24678] LOG: logical decoding found consistent point at 0 /15AF3F28 2024-03-25 15:20:25.807 CET [24678] DETAIL: There are no running transactions. 2024-03-25 15:20:25.807 CET [24678] STATEMENT: SELECT lsn FROM pg_catalog.pg_create_logical_replication_slot( 'sub1' , 'pgoutput' , false , false , false ) 2024-03-25 15:20:25.840 CET [24682] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc -7.5.0, 64-bit 2024-03-25 15:20:25.844 CET [24682] LOG: listening on Unix socket "/home/postgres/.s.PGSQL.8889" 2024-03-25 15:20:25.853 CET [24685] LOG: database system was shut down in recovery at 2024-03-25 15:20:25 CET 2024-03-25 15:20:25.856 CET [24685] LOG: entering standby mode 2024-03-25 15:20:25.859 CET [24685] LOG: redo starts at 0 /15AF3B08 2024-03-25 15:20:25.859 CET [24685] LOG: consistent recovery state reached at 0 /15AF3C48 2024-03-25 15:20:25.859 CET [24682] LOG: database system is ready to accept read -only connections 2024-03-25 15:20:25.859 CET [24685] LOG: invalid record length at 0 /15AF3C48 : expected at least 24, got 0 2024-03-25 15:20:25.864 CET [24686] LOG: started streaming WAL from primary at 0 /15000000 on timeline 1 2024-03-25 15:20:26.231 CET [24685] LOG: recovery stopping after WAL location (LSN) "0/15AF3F60" 2024-03-25 15:20:26.231 CET [24685] LOG: redo done at 0 /15AF3F60 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.37 s 2024-03-25 15:20:26.231 CET [24685] LOG: last completed transaction was at log time 2024-03-25 15:20:25.77245+01 2024-03-25 15:20:26.231 CET [24686] FATAL: terminating walreceiver process due to administrator command 2024-03-25 15:20:26.234 CET [24685] LOG: selected new timeline ID: 2 2024-03-25 15:20:26.278 CET [24685] LOG: archive recovery complete 2024-03-25 15:20:26.282 CET [24683] LOG: checkpoint starting: end-of-recovery immediate wait 2024-03-25 15:20:26.315 CET [24683] LOG: checkpoint complete: wrote 10 buffers (0.1%); 0 WAL file (s) added, 0 removed, 0 recycled; write=0.001 s, sync =0.020 s, total=0.037 s; sync files=7, longest=0.004 s, average=0.003 s; distance=6 kB, estimate=6 kB; lsn=0 /15AF5680 , redo lsn=0 /15AF5680 2024-03-25 15:20:26.363 CET [24682] LOG: database system is ready to accept connections 2024-03-25 15:20:26.949 CET [24696] LOG: logical replication apply worker for subscription "sub1" has started 2024-03-25 15:20:26.949 CET [24682] LOG: received fast shutdown request 2024-03-25 15:20:26.950 CET [24698] LOG: starting logical decoding for slot "sub1" 2024-03-25 15:20:26.950 CET [24698] DETAIL: Streaming transactions committing after 0 /15AF3F60 , reading WAL from 0 /15AF3F28 . 2024-03-25 15:20:26.950 CET [24698] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0 /15AF3F60 (proto_version '4' , origin 'any' , publication_names '"pub1"' ) 2024-03-25 15:20:26.950 CET [24698] LOG: logical decoding found consistent point at 0 /15AF3F28 2024-03-25 15:20:26.950 CET [24698] DETAIL: There are no running transactions. 2024-03-25 15:20:26.950 CET [24698] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0 /15AF3F60 (proto_version '4' , origin 'any' , publication_names '"pub1"' ) 2024-03-25 15:20:26.957 CET [24682] LOG: aborting any active transactions 2024-03-25 15:20:26.957 CET [24696] FATAL: terminating logical replication worker due to administrator command 2024-03-25 15:20:26.958 CET [24682] LOG: background worker "logical replication launcher" (PID 24692) exited with exit code 1 2024-03-25 15:20:26.958 CET [24682] LOG: background worker "logical replication apply worker" (PID 24696) exited with exit code 1 2024-03-25 15:20:26.959 CET [24683] LOG: shutting down 2024-03-25 15:20:26.962 CET [24683] LOG: checkpoint starting: shutdown immediate 2024-03-25 15:20:27.013 CET [24683] LOG: checkpoint complete: wrote 20 buffers (0.1%); 0 WAL file (s) added, 0 removed, 0 recycled; write=0.001 s, sync =0.040 s, total=0.055 s; sync files=14, longest=0.004 s, average=0.003 s; distance=3 kB, estimate=6 kB; lsn=0 /15AF64C8 , redo lsn=0 /15AF64C8 2024-03-25 15:20:27.015 CET [24682] LOG: database system is shut down |
There is a lot of output here but if you follow the lines you’ll see the process of converting the physical replica into a logical replica. Lets start it up and check what we’ve got:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | postgres@pgbox: /home/postgres/ [pgdev] pg_ctl -D /var/tmp/target/ start waiting for server to start....2024-03-25 15:23:09.137 CET [24714] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc -7.5.0, 64-bit 2024-03-25 15:23:09.137 CET [24714] LOG: listening on IPv6 address "::1" , port 8889 2024-03-25 15:23:09.137 CET [24714] LOG: listening on IPv4 address "127.0.0.1" , port 8889 2024-03-25 15:23:09.144 CET [24714] LOG: listening on Unix socket "/tmp/.s.PGSQL.8889" 2024-03-25 15:23:09.150 CET [24717] LOG: database system was shut down at 2024-03-25 15:20:27 CET 2024-03-25 15:23:09.150 CET [24717] LOG: recovered replication state of node 1 to 0 /15AF3F60 2024-03-25 15:23:09.159 CET [24714] LOG: database system is ready to accept connections 2024-03-25 15:23:09.165 CET [24721] LOG: logical replication apply worker for subscription "sub1" has started 2024-03-25 15:23:09.167 CET [24722] LOG: 0 /15AF3F60 has been already streamed, forwarding to 0 /15AF5680 2024-03-25 15:23:09.167 CET [24722] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0 /15AF3F60 (proto_version '4' , origin 'any' , publication_names '"pub1"' ) 2024-03-25 15:23:09.167 CET [24722] LOG: starting logical decoding for slot "sub1" 2024-03-25 15:23:09.167 CET [24722] DETAIL: Streaming transactions committing after 0 /15AF5680 , reading WAL from 0 /15AF3F28 . 2024-03-25 15:23:09.167 CET [24722] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0 /15AF3F60 (proto_version '4' , origin 'any' , publication_names '"pub1"' ) 2024-03-25 15:23:09.167 CET [24722] LOG: logical decoding found consistent point at 0 /15AF3F28 2024-03-25 15:23:09.167 CET [24722] DETAIL: There are no running transactions. 2024-03-25 15:23:09.167 CET [24722] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0 /15AF3F60 (proto_version '4' , origin 'any' , publication_names '"pub1"' ) done server started |
We got the publication on the source and the subscription on the target, as expected:
1 2 3 4 5 6 7 8 9 10 | postgres@pgbox: /home/postgres/ [pgdev] psql -p 8888 -c "select * from pg_publication" oid | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot -------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------ 16438 | pub1 | 10 | t | t | t | t | t | f (1 row) postgres@pgbox: /home/postgres/ [pgdev] psql -p 8889 -c "select * from pg_subscription" oid | subdbid | subskiplsn | subname | subowner | subenabled | subbinary | substream | subtwophasestate | subdisableonerr | subpasswordrequired | subrunasowner | subfailover | > -------+---------+------------+---------+----------+------------+-----------+-----------+------------------+-----------------+---------------------+---------------+-------------+-------------------------------------------------------------------------------------------> 24576 | 5 | 0 /0 | sub1 | 10 | t | f | f | d | f | t | f | f | user=postgres passfile= /home/postgres/ .pgpass channel_binding=prefer port=8888 sslmode=pre> (1 row) |
Ongoing logical replication can easily be verified by adding a row into the source and checking the same row in the target:
1 2 3 4 5 6 7 | postgres@pgbox: /home/postgres/ [pgdev] psql -p 8888 -c "insert into pgbench_accounts values (-1,-1,-1,'aaa')" INSERT 0 1 postgres@pgbox: /home/postgres/ [pgdev] psql -p 8889 -c "select * from pgbench_accounts where aid=-1" aid | bid | abalance | filler -----+-----+----------+-------------------------------------------------------------------------------------- -1 | -1 | -1 | aaa (1 row) |
This is really great stuff and all the credits go to the people involved with this.
x4m
29.01.2025But...how do we upgrade Postgres 17? It's most recent major version :)