While preparing for pgconf.eu in Berlin this week, I did a lot of research on Patroni and pgBackRest. I was aware that it is possible to recreate a replica from a pgBackRest backup. But I was also wondering: what happens if the complete Patroni cluster is gone? I know this is not the most common case, but never say never.

Starting point

Let’s assume we have (or rather had) a running Patroni cluster on three nodes, using etcd as the key-value store. Everything was working fine.

postgres@hero3:/home/postgres/ [PG1] patronictl list
+--------+----------------+---------+---------+----+-----------+
| Member | Host           | Role    | State   | TL | Lag in MB |
+ Cluster: PG1 (7157976722952275157) ---------+----+-----------+
| hero1  | 192.168.22.251 | Replica | running |  6 |         0 |
| hero2  | 192.168.22.252 | Leader  | running |  6 |           |
| hero3  | 192.168.22.253 | Replica | running |  6 |         0 |
+--------+----------------+---------+---------+----+-----------+

The cluster uses pgBackRest to create backups, and there are already backups available for this cluster.

postgres@backup_hero:/home/postgres/ [pg14] pgbackrest --stanza=hero2hero info
stanza: hero2hero
    status: ok
    cipher: none

    db (current)
        wal archive min/max (14): 000000010000000000000001/00000007000000000000001D

        full backup: 20221024-094106F
            timestamp start/stop: 2022-10-24 09:41:06 / 2022-10-24 09:42:15
            wal start/stop: 00000001000000000000000B / 00000001000000000000000D
            database size: 105MB, database backup size: 105MB
            repo1: backup set size: 30.8MB, backup size: 30.8MB

        full backup: 20221024-094900F
            timestamp start/stop: 2022-10-24 09:49:00 / 2022-10-24 09:49:11
            wal start/stop: 000000010000000000000011 / 000000010000000000000011
            database size: 105MB, database backup size: 105MB
            repo1: backup set size: 30.8MB, backup size: 30.8MB

        full backup: 20221024-145020F
            timestamp start/stop: 2022-10-24 14:50:20 / 2022-10-24 14:50:34
            wal start/stop: 000000050000000000000018 / 000000050000000000000018
            database size: 105.2MB, database backup size: 105.2MB
            repo1: backup set size: 30.8MB, backup size: 30.8MB
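
For reference, the repository side of this stanza could look roughly like the following pgbackrest.conf on the backup host. This is only a minimal sketch derived from the paths and hosts that show up in the restore logs later on (repo1-path=/u99/backups, pg1-path=/u02/pgdata/14/PG1/); the retention setting and the per-node entries are assumptions.

[global]
repo1-path=/u99/backups
repo1-retention-full=2

[hero2hero]
pg1-host=192.168.22.251
pg1-path=/u02/pgdata/14/PG1/
pg2-host=192.168.22.252
pg2-path=/u02/pgdata/14/PG1/
pg3-host=192.168.22.253
pg3-path=/u02/pgdata/14/PG1/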

Due to a mistake, the PGDATA directory was deleted on all three nodes.

postgres@hero1:/u02/pgdata/14/PG1/ [PG1] ls -al
total 0
drwx------. 2 postgres postgres 6 Oct 24 15:28 .
drwxr-x---. 3 postgres postgres 17 Sep 5 10:20 ..

To have a clean cut, we stop the Patroni service on all three nodes.

postgres@hero3:/u02/pgdata/14/PG1/ [PG1] sudo systemctl stop patroni
postgres@hero3:/u02/pgdata/14/PG1/ [PG1] sudo systemctl status patroni
● patroni.service - dbi services patroni service
Loaded: loaded (/etc/systemd/system/patroni.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Oct 24 15:29:08 hero3 patroni[2014]: 2022-10-24 15:29:08,144 INFO: waiting for leader to bootstrap
Oct 24 15:29:18 hero3 patroni[2014]: 2022-10-24 15:29:18,138 INFO: Lock owner: None; I am hero3
Oct 24 15:29:47 hero3 systemd[1]: Stopping dbi services patroni service…
Oct 24 15:29:47 hero3 systemd[1]: patroni.service: Succeeded.
Oct 24 15:29:47 hero3 systemd[1]: Stopped dbi services patroni service.

Custom Bootstrap Method

As a next step, we need to create a custom bootstrap method that bootstraps from the latest pgBackRest backup. According to the Patroni documentation, this is quite easy.

First of all, let’s create a custom bootstrap script in the postgres user’s home directory, called bootstrap_pgbackrest.sh. This script contains the command to restore from the pgBackRest backup.

postgres@hero2:/home/postgres/ [PG1] cat bootstrap_pgbackrest.sh
#!/bin/bash
pgbackrest --stanza=hero2hero --log-level-console=info restore
postgres@hero2:/home/postgres/ [PG1] chmod +x bootstrap_pgbackrest.sh

In the next step, we need to adjust the patroni.yml on one host to use the new custom bootstrap method.

Add the following entries to the bootstrap section; they point to the custom bootstrap script we just created.

bootstrap:
  # this section will be written into Etcd:///config after initializing new cluster
  # and all other cluster members will use it as a global configuration
  method: pgbackrest
  pgbackrest:
    command: /home/postgres/bootstrap_pgbackrest.sh
    keep_existing_recovery_conf: False
    no_params: False
    recovery_conf:
        recovery_target_action: promote
        recovery_target_timeline: latest
        restore_command: pgbackrest --stanza=hero2hero archive-get %f %p
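
Two details are worth pointing out here: unless no_params is set to True, Patroni appends --scope and --datadir to the configured command when it executes it (you can see both parameters in the process list further below), and the recovery_conf entries are applied as recovery settings once the restore has finished. Our one-liner simply ignores the extra parameters. A slightly more defensive variant of the script could pick up the data directory and hand it to pgBackRest; this is just a sketch under that assumption, not what was used in this setup:

#!/bin/bash
# Patroni calls this script as: bootstrap_pgbackrest.sh --scope=<cluster> --datadir=<pgdata>
# (unless no_params is set to True). Extract the data directory if it was passed.
for arg in "$@"; do
  case "$arg" in
    --datadir=*) DATADIR="${arg#--datadir=}" ;;
  esac
done
# Pass the data directory on to pgBackRest when we got one.
pgbackrest --stanza=hero2hero --log-level-console=info ${DATADIR:+--pg1-path=${DATADIR}} restore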

Let’s start the Patroni service again and check its status, to see if it really bootstraps from the pgBackRest backup.

postgres@hero3:/u01/app/postgres/local/dmk/etc/ [PG1] sudo systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2022-10-24 16:10:33 CEST; 1s ago
  Process: 3020 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 3017 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 3024 (patroni)
    Tasks: 5 (limit: 11364)
   Memory: 19.8M
   CGroup: /system.slice/patroni.service
           └─3024 /usr/bin/python3.9 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml

Oct 24 16:10:33 hero3 systemd[1]: Starting dbi services patroni service...
Oct 24 16:10:33 hero3 sudo[3017]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/modprobe softdog
Oct 24 16:10:33 hero3 sudo[3020]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/chown postgres /dev/watchdog
Oct 24 16:10:33 hero3 systemd[1]: Started dbi services patroni service.
Oct 24 16:10:34 hero3 patroni[3024]: 2022-10-24 16:10:34,395 INFO: Selected new etcd server http://192.168.22.251:2379
Oct 24 16:10:34 hero3 patroni[3024]: 2022-10-24 16:10:34,404 INFO: No PostgreSQL configuration items changed, nothing to reload.
Oct 24 16:10:34 hero3 patroni[3024]: 2022-10-24 16:10:34,433 INFO: Lock owner: None; I am hero3
Oct 24 16:10:34 hero3 patroni[3024]: 2022-10-24 16:10:34,438 INFO: waiting for leader to bootstrap

Hm… that’s not what I expected. Unfortunately, it is waiting for a leader to bootstrap. So it seems something is missing or wrong. And the explanation is really simple: when we list the Patroni members, we can see that all nodes are in “stopped” state.

 postgres@hero3:/u01/app/postgres/local/dmk/etc/ [PG1] patronictl list PG1
+--------+----------------+---------+---------+----+-----------+
| Member | Host           | Role    | State   | TL | Lag in MB |
+ Cluster: PG1 (7157976722952275157) ---------+----+-----------+
| hero1  | 192.168.22.251 | Replica | stopped |    |   unknown |
| hero2  | 192.168.22.252 | Replica | stopped |    |   unknown |
| hero3  | 192.168.22.253 | Replica | stopped |    |   unknown |
+--------+----------------+---------+---------+----+-----------+

Of course, this is the missing step: etcd is still aware that the cluster was already initialized. So we need to delete the old cluster information from the DCS before we can bootstrap a new one.
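
If you want to verify this before deleting anything, you can have a look at the keys Patroni stores in the DCS. A quick sketch, assuming Patroni’s default /service namespace and the etcd v3 API (with the older v2 API it would be ETCDCTL_API=2 etcdctl ls /service/PG1):

# list all keys Patroni keeps for the cluster PG1
etcdctl --endpoints=http://192.168.22.251:2379 get --prefix --keys-only /service/PG1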

 postgres@hero3:/u02/pgdata/etcd/member/ [PG1] patronictl remove PG1
+--------+----------------+---------+---------+----+-----------+
| Member | Host           | Role    | State   | TL | Lag in MB |
+ Cluster: PG1 (7157976722952275157) ---------+----+-----------+
| hero1  | 192.168.22.251 | Replica | stopped |    |   unknown |
| hero2  | 192.168.22.252 | Replica | stopped |    |   unknown |
| hero3  | 192.168.22.253 | Replica | stopped |    |   unknown |
+--------+----------------+---------+---------+----+-----------+
Please confirm the cluster name to remove: PG1
You are about to remove all information in DCS for PG1, please type: "Yes I am aware": Yes I am aware
16:23:06 postgres@hero3:/u02/pgdata/etcd/member/ [PG1] patronictl list PG1
+--------+------+------+-------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+ Cluster: PG1 (uninitialized) +----+-----------+
+--------+------+------+-------+----+-----------+

Let’s retry starting the Patroni service once again.

 postgres@hero2:/u01/app/postgres/local/dmk/etc/ [PG1] sudo systemctl start patroni
15:39:06 postgres@hero2:/u01/app/postgres/local/dmk/etc/ [PG1] sudo systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2022-10-24 15:39:06 CEST; 4s ago
  Process: 38620 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 38616 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 38626 (patroni)
    Tasks: 10 (limit: 11364)
   Memory: 136.6M
   CGroup: /system.slice/patroni.service
           ├─38626 /usr/bin/python3.9 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
           ├─38633 bin/bash /home/postgres/bootstrap_pgbackrest.sh --scope=PG1 --datadir=/u02/pgdata/14/PG1/
           ├─38634 pgbackrest --stanza=hero2hero --log-level-console=info --delta restore
           ├─38639 pgbackrest --exec-id=38634-e965e985 --log-level-console=off --log-level-file=off --log-level-stderr=error --process=1 --remote-type=repo --stanza=hero2hero restore:local
           └─38640 ssh -o LogLevel=error -o Compression=no -o PasswordAuthentication=no [email protected] pgbackrest --exec-id=38634-e965e985 --log-level-console=off --log-level-file=off --log-level-stderr=error --pg1-path=/u02/pgdata/14/PG1/ --process=1 --remote-type=repo --repo=1 --repo1-path=>

Oct 24 15:39:06 hero2 sudo[38620]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/chown postgres /dev/watchdog
Oct 24 15:39:06 hero2 systemd[1]: Started dbi services patroni service.
Oct 24 15:39:06 hero2 patroni[38626]: 2022-10-24 15:39:06,924 INFO: Selected new etcd server http://192.168.22.251:2379
Oct 24 15:39:06 hero2 patroni[38626]: 2022-10-24 15:39:06,933 INFO: No PostgreSQL configuration items changed, nothing to reload.
Oct 24 15:39:06 hero2 patroni[38626]: 2022-10-24 15:39:06,961 INFO: Lock owner: None; I am hero2
Oct 24 15:39:06 hero2 patroni[38626]: 2022-10-24 15:39:06,971 INFO: trying to bootstrap a new cluster
Oct 24 15:39:06 hero2 patroni[38626]: 2022-10-24 15:39:06,972 INFO: Running custom bootstrap script: /home/postgres/bootstrap_pgbackrest.sh
Oct 24 15:39:06 hero2 patroni[38634]: 2022-10-24 15:39:06.985 P00   INFO: restore command begin 2.41: --exec-id=38634-e965e985 --log-level-console=info --pg1-path=/u02/pgdata/14/PG1/ --repo1-host=192.168.22.250 --repo1-host-user=postgres --repo1-path=/u99/backups --stanza=hero2hero
Oct 24 15:39:07 hero2 patroni[38634]: 2022-10-24 15:39:07.568 P00   INFO: repo1: restore backup set 20221024-145020F, recovery will start at 2022-10-24 14:50:20

postgres@hero2:/u02/pgdata/14/PG1/ [PG1] patronictl list
+--------+----------------+--------+---------+----+-----------+
| Member | Host           | Role   | State   | TL | Lag in MB |
+ Cluster: PG1 (7157976722952275157) --------+----+-----------+
| hero2  | 192.168.22.252 | Leader | running |  8 |           |
+--------+----------------+--------+---------+----+-----------+

As you can see, it successfully bootstrapped using the custom bootstrap script.

Recreate the replicas

As the primary is finally back online, we can add the replicas again.

This is pretty easy: we just start the Patroni service again and the replicas are bootstrapped from the leader, which works well for both hosts.
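
Side note: as the process list below shows, the replica is actually restored through pgBackRest here, because the cluster has pgbackrest configured as a create_replica_method. If you want the same behaviour, the postgresql section of the patroni.yml could look roughly like this (a sketch, not the exact configuration of this setup):

postgresql:
  create_replica_methods:
    - pgbackrest
    - basebackup
  pgbackrest:
    command: pgbackrest --stanza=hero2hero restore
    keep_data: True
    no_params: True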

postgres@hero1:/home/postgres/ [PG1] sudo systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2022-10-24 18:25:33 CEST; 2s ago
  Process: 8656 ExecReload=/bin/kill -s HUP $MAINPID (code=exited, status=0/SUCCESS)
  Process: 8708 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 8705 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 8713 (patroni)
    Tasks: 9 (limit: 11364)
   Memory: 128.9M
   CGroup: /system.slice/patroni.service
           ├─8713 /usr/bin/python3.9 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
           ├─8720 pgbackrest --stanza=hero2hero restore
           ├─8725 pgbackrest --exec-id=8720-f8ab7b63 --log-level-console=off --log-level-file=off --log-level-stderr=error --process=1 --remote-type=repo --stanza=hero2hero restore:local
           └─8726 ssh -o LogLevel=error -o Compression=no -o PasswordAuthentication=no [email protected] pgbackrest --exec-id=8720-f8ab7b63 --log-level-console=off --log-level-file=off --log-level-stderr=error --pg1-path=/u02/pgdata/14/PG1/ --process=1 --remote-type=repo --repo=1 --repo1-path=/u>

Oct 24 18:25:33 hero1 systemd[1]: Starting dbi services patroni service...
Oct 24 18:25:33 hero1 sudo[8705]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/modprobe softdog
Oct 24 18:25:33 hero1 sudo[8708]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/chown postgres /dev/watchdog
Oct 24 18:25:33 hero1 systemd[1]: Started dbi services patroni service.
Oct 24 18:25:33 hero1 patroni[8713]: 2022-10-24 18:25:33,542 INFO: Selected new etcd server http://192.168.22.253:2379
Oct 24 18:25:33 hero1 patroni[8713]: 2022-10-24 18:25:33,550 INFO: No PostgreSQL configuration items changed, nothing to reload.
Oct 24 18:25:33 hero1 patroni[8713]: 2022-10-24 18:25:33,573 INFO: Lock owner: hero3; I am hero1
Oct 24 18:25:33 hero1 patroni[8713]: 2022-10-24 18:25:33,577 INFO: trying to bootstrap from leader 'hero3'


postgres@hero1:/u02/pgdata/14/PG1/ [PG1] patronictl list
+--------+----------------+---------+---------+----+-----------+
| Member | Host           | Role    | State   | TL | Lag in MB |
+ Cluster: PG1 (7157976722952275157) ---------+----+-----------+
| hero1  | 192.168.22.251 | Replica | running |  8 |         0 |
| hero2  | 192.168.22.252 | Leader  | running |  8 |           |
| hero3  | 192.168.22.253 | Replica | running |  8 |         0 |
+--------+----------------+---------+---------+----+-----------+

Conclusion

Always have a detailed look at your patroni.yml file. Small mistakes can really cost you a lot of time (especially blanks in the wrong place). And in case of a mistake while restoring the primary from the backup, don’t forget to run “patronictl remove <CLUSTERNAME>” once again before the next attempt.

In case you want to recreate your replicas from a pgBackRest backup as well, just create a new backup before you reinit the replicas (see the sketch below).
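
For completeness, that last step could look roughly like this (a sketch; hero1 stands in for whichever replica you want to rebuild):

# take a fresh full backup first, then rebuild the replica from it
pgbackrest --stanza=hero2hero --type=full backup
patronictl reinit PG1 hero1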