Last week the new Patroni 2.0 release was published, bringing many new features. One of them made me particularly curious: Patroni on pure Raft. It is only available as a BETA feature at the moment, but I had to test it. It makes it possible to run Patroni without any 3rd party dependencies, so no Etcd, Consul or Zookeeper is needed anymore. A great improvement!
In this blog we will have a look at the setup and try a failover as well.
Starting position
With Patroni on Raft it is possible to run a two node Patroni cluster as well, but I decided to set up a three node cluster.
So what you need to prepare:
- Three identical VMs with CentOS 8 installed
- All with Postgres 13 and its dependencies installed from source. I chose Postgres 13 because its support was also newly added in Patroni 2.0.
- I also created the /etc/hostname entries and exchanged the SSH keys between the three servers (a short sketch follows after this list).
- Our DMK is installed on these servers as well.
- Firewall and SELinux are disabled in this example. If you want to run the setup with both enabled, you need to configure the firewall and SELinux first.
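Since the host name and SSH key preparation is not shown in detail, here is a minimal sketch of how it could look, assuming the host names patroni1 to patroni3 and the IP addresses used later in this post (adjust to your environment):

# on every server: make the three nodes resolvable (if no DNS is available)
[root@partoni1 ~]$ cat >> /etc/hosts <<EOF
192.168.22.201 patroni1
192.168.22.202 patroni2
192.168.22.203 patroni3
EOF

# as postgres: create a key pair and copy it to the other two nodes
postgres@partoni1:/home/postgres/ [pg130] ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
postgres@partoni1:/home/postgres/ [pg130] ssh-copy-id postgres@patroni2
postgres@partoni1:/home/postgres/ [pg130] ssh-copy-id postgres@patroni3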
Setup Patroni
Let's start with the installation of Patroni. The following steps need to be performed on all three servers. With the new release, the installation is also less complicated than it used to be.
[root@partoni1 ~]$ yum install python3-psycopg2
[root@partoni1 ~]$ su - postgres
postgres@partoni1:/home/postgres/ [pg130] pip3 install patroni[raft] --user
Collecting patroni[raft]
  Using cached https://files.pythonhosted.org/packages/7c/d3/21a189f5f33ef6ce4ff9433c74aa30b70fc3aaf7fecde97c2979a3abdd06/patroni-2.0.0-py3-none-any.whl
Requirement already satisfied: cdiff in /usr/local/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: urllib3[secure]!=1.21,>=1.19.1 in /usr/local/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: PyYAML in /usr/local/lib64/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: six>=1.7 in /usr/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: prettytable>=0.7 in /usr/local/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: python-dateutil in /usr/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: click>=4.1 in /usr/local/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: psutil>=2.0.0 in /usr/local/lib64/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: pysyncobj>=0.3.5; extra == "raft" in /usr/local/lib/python3.6/site-packages (from patroni[raft])
Requirement already satisfied: idna>=2.0.0; extra == "secure" in /usr/lib/python3.6/site-packages (from urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Requirement already satisfied: pyOpenSSL>=0.14; extra == "secure" in /usr/lib/python3.6/site-packages (from urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Requirement already satisfied: certifi; extra == "secure" in /usr/local/lib/python3.6/site-packages (from urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Requirement already satisfied: cryptography>=1.3.4; extra == "secure" in /usr/lib64/python3.6/site-packages (from urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Requirement already satisfied: asn1crypto>=0.21.0 in /usr/lib/python3.6/site-packages (from cryptography>=1.3.4; extra == "secure"->urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Requirement already satisfied: cffi!=1.11.3,>=1.7 in /usr/lib64/python3.6/site-packages (from cryptography>=1.3.4; extra == "secure"->urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Requirement already satisfied: pycparser in /usr/lib/python3.6/site-packages (from cffi!=1.11.3,>=1.7->cryptography>=1.3.4; extra == "secure"->urllib3[secure]!=1.21,>=1.19.1->patroni[raft])
Installing collected packages: patroni
Successfully installed patroni-2.0.0
Configuration
As the installation was successful, we can go on with the configuration of Patroni. As I am using our DMK, the patroni.yml file is stored in the DMK home, but you can of course store it somewhere else. You only have to adjust some values like the IP addresses and the name on every server (a per-node sketch follows after the full file below).
But the most important section in the file is the raft section:
- data_dir: Directory for storing the Raft log and snapshot. It's an optional parameter.
- self_addr: The address to listen on for Raft connections. This needs to be set, otherwise the node will not become part of the consensus.
- partner_addrs: The list of the other Patroni nodes.
postgres@partoni1:/u01/app/postgres/local/dmk/etc/ [pg130] cat patroni.yml
scope: PG1
#namespace: /service/
name: patroni1

restapi:
  listen: 192.168.22.201:8008
  connect_address: 192.168.22.201:8008
#  certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#  keyfile: /etc/ssl/private/ssl-cert-snakeoil.key
#  authentication:
#    username: username
#    password: password

# ctl:
#   insecure: false # Allow connections to SSL sites without certs
#   certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
#   cacert: /etc/ssl/certs/ssl-cacert-snakeoil.pem

raft:
  data_dir: /u02/pgdata/raft
  self_addr: 192.168.22.201:5010
  partner_addrs: ['192.168.22.202:5010','192.168.22.203:5010']

bootstrap:
  # and all other cluster members will use it as a `global configuration`
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: 'hot_standby'
        hot_standby: "on"
        wal_keep_segments: 8
        max_replication_slots: 10
        wal_log_hints: "on"
        listen_addresses: '*'
        port: 5432
        logging_collector: 'on'
        log_truncate_on_rotation: 'on'
        log_filename: 'postgresql-%a.log'
        log_rotation_age: '1440'
        log_line_prefix: '%m - %l - %p - %h - %u@%d - %x'
        log_directory: 'pg_log'
        log_min_messages: 'WARNING'
        log_autovacuum_min_duration: '60s'
        log_min_error_statement: 'NOTICE'
        log_min_duration_statement: '30s'
        log_checkpoints: 'on'
        log_statement: 'ddl'
        log_lock_waits: 'on'
        log_temp_files: '0'
        log_timezone: 'Europe/Zurich'
        log_connections: 'on'
        log_disconnections: 'on'
        log_duration: 'on'
        client_min_messages: 'WARNING'
        wal_level: 'replica'
        hot_standby_feedback: 'on'
        max_wal_senders: '10'
        shared_buffers: '128MB'
        work_mem: '8MB'
        effective_cache_size: '512MB'
        maintenance_work_mem: '64MB'
        wal_compression: 'off'
        max_wal_senders: '20'
        shared_preload_libraries: 'pg_stat_statements'
        autovacuum_max_workers: '6'
        autovacuum_vacuum_scale_factor: '0.1'
        autovacuum_vacuum_threshold: '50'
        archive_mode: 'on'
        archive_command: '/bin/true'
        wal_log_hints: 'on'
#      recovery_conf:
#        restore_command: cp ../wal_archive/%f %p

  # some desired options for 'initdb'
  initdb:  # Note: It needs to be a list (some options need values, others are switches)
    - encoding: UTF8
    - data-checksums

  pg_hba:  # Add following lines to pg_hba.conf after running 'initdb'
    - host replication replicator 192.168.22.0/24 md5
    - host all all 192.168.22.0/24 md5
#    - hostssl all all 0.0.0.0/0 md5

  # Additional script to be launched after initial cluster creation (will be passed the connection URL as parameter)
#  post_init: /usr/local/bin/setup_cluster.sh

  # Some additional users users which needs to be created after initializing new cluster
  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb
    replicator:
      password: postgres
      options:
        - superuser

postgresql:
  listen: 192.168.22.201:5432
  connect_address: 192.168.22.201:5432
  data_dir: /u02/pgdata/13/PG1
  bin_dir: /u01/app/postgres/product/13/db_0/bin
#  config_dir:
  pgpass: /u01/app/postgres/local/dmk/etc/pgpass0
  authentication:
    replication:
      username: replicator
      password: *********
    superuser:
      username: postgres
      password: *********
  parameters:
    unix_socket_directories: '/tmp'

watchdog:
  mode: automatic # Allowed values: off, automatic, required
  device: /dev/watchdog
  safety_margin: 5

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false
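On patroni2 and patroni3 the file looks exactly the same except for the node-specific values. A shortened sketch of what changes on patroni2, assuming the same ports as on patroni1 (everything not shown stays identical):

postgres@partoni2:/u01/app/postgres/local/dmk/etc/ [pg130] cat patroni.yml
scope: PG1
name: patroni2
restapi:
  listen: 192.168.22.202:8008
  connect_address: 192.168.22.202:8008
raft:
  data_dir: /u02/pgdata/raft
  self_addr: 192.168.22.202:5010
  partner_addrs: ['192.168.22.201:5010','192.168.22.203:5010']
...
postgresql:
  listen: 192.168.22.202:5432
  connect_address: 192.168.22.202:5432
...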
Service
To start Patroni automatically after a reboot, let's create a systemd service.
# systemd integration for patroni
# Put this file under /etc/systemd/system/patroni.service
#     then: systemctl daemon-reload
#     then: systemctl list-unit-files | grep patroni
#     then: systemctl enable patroni.service
#
[Unit]
Description=dbi services patroni service
After=etcd.service syslog.target network.target

[Service]
User=postgres
Group=postgres
Type=simple
ExecStartPre=-/usr/bin/sudo /sbin/modprobe softdog
ExecStartPre=-/usr/bin/sudo /bin/chown postgres /dev/watchdog
ExecStart=/home/postgres/.local/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=process
Restart=no
TimeoutSec=30

[Install]
WantedBy=multi-user.target
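With the unit file in place under /etc/systemd/system/patroni.service, it only needs to be registered and enabled on each of the three servers, exactly as the comments in the file header suggest:

[root@partoni1 ~]$ systemctl daemon-reload
[root@partoni1 ~]$ systemctl enable patroni.service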
Once everything is created, and before starting Patroni, it is possible to validate the Patroni configuration file. And that's the point where it gets a bit funny at the moment.
postgres@partoni1:/u02/pgdata/raft/ [PG1] patroni --validate-config /u01/app/postgres/local/dmk/etc/patroni.yml
restapi.listen 192.168.22.201:8008 didn't pass validation: 'Port 8008 is already in use.'
Traceback (most recent call last):
  File "/u01/app/postgres/local/dmk/bin/patroni", line 11, in <module>
    sys.exit(main())
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/__init__.py", line 170, in main
    return patroni_main()
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/__init__.py", line 138, in patroni_main
    abstract_main(Patroni, schema)
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/daemon.py", line 88, in abstract_main
    Config(args.configfile, validator=validator)
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/config.py", line 102, in __init__
    error = validator(self._local_configuration)
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/validator.py", line 177, in __call__
    for i in self.validate(data):
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/validator.py", line 209, in validate
    for i in self.iter():
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/validator.py", line 217, in iter
    for i in self.iter_dict():
  File "/home/postgres/.local/lib/python3.6/site-packages/patroni/validator.py", line 244, in iter_dict
    validator = self.validator[key]._schema[d]
KeyError: 'raft'
I tried several versions of the Raft configuration, but every time I got an error. I also tried to set the system parameters for Raft and commented the raft block out in the configuration file, but then I got the following output.
postgres@partoni1:/u02/pgdata/raft/ [PG1] patroni --validate-config /u01/app/postgres/local/dmk/etc/patroni.yml
consul is not defined.
etcd is not defined.
etcd3 is not defined.
exhibitor is not defined.
kubernetes is not defined.
raft is not defined.
zookeeper is not defined.
postgresql.authentication.rewind is not defined.
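For completeness: setting the Raft parameters outside of the YAML file can be done through Patroni's environment configuration. A sketch of that variant, using the PATRONI_RAFT_* variables that mirror the raft block above (the exact quoting of the partner list is an assumption):

# sketch: Raft settings via environment variables instead of the raft: block
export PATRONI_RAFT_DATA_DIR=/u02/pgdata/raft
export PATRONI_RAFT_SELF_ADDR=192.168.22.201:5010
export PATRONI_RAFT_PARTNER_ADDRS="'192.168.22.202:5010','192.168.22.203:5010'"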
So it seems like something is not working correctly when validating the configuration file. In the end I was not sure what to test anymore, so I just tried to start the Patroni service. Full exploratory spirit.
postgres@partoni1:/u02/pgdata/raft/ [PG1] sudo systemctl start patroni
postgres@partoni1:/u02/pgdata/raft/ [PG1] sudo systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-09-10 11:36:40 CEST; 3s ago
  Process: 5232 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 5229 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 5236 (patroni)
    Tasks: 2 (limit: 11480)
   Memory: 19.7M
   CGroup: /system.slice/patroni.service
           └─5236 /usr/bin/python3.6 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml

Sep 10 11:36:40 partoni1 systemd[1]: Starting dbi services patroni service...
Sep 10 11:36:40 partoni1 sudo[5229]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/modprobe softdog
Sep 10 11:36:40 partoni1 sudo[5232]: postgres : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/chown postgres /dev/watchdog
Sep 10 11:36:40 partoni1 systemd[1]: Started dbi services patroni service.
And… it starts without any errors.
But that did not fully convince me, so let's check the Raft setup. Here the patroni[raft] installation delivers a simple command, syncobj_admin, to check the status of the Raft setup.
postgres@partoni1:/u02/pgdata/raft/ [PG1] syncobj_admin -conn 192.168.22.201:5010 -status
commit_idx: 62850
enabled_code_version: 0
last_applied: 62850
leader: 192.168.22.203:5010
leader_commit_idx: 62850
log_len: 31
match_idx_count: 0
next_node_idx_count: 0
partner_node_status_server_192.168.22.202:5010: 2
partner_node_status_server_192.168.22.203:5010: 2
partner_nodes_count: 2
raft_term: 30
readonly_nodes_count: 0
revision: deprecated
self: 192.168.22.201:5010
self_code_version: 0
state: 0
uptime: 379
version: 0.3.6
And of course we can still check the cluster status itself.
postgres@partoni1:/u02/pgdata/raft/ [PG1] patronictl list
+ Cluster: PG1 (6870147915530980670) -+---------+----+-----------+
|  Member  |      Host      |  Role   |  State  | TL | Lag in MB |
+----------+----------------+---------+---------+----+-----------+
| patroni1 | 192.168.22.201 | Replica | running |  6 |         0 |
| patroni2 | 192.168.22.202 | Leader  | running |  6 |           |
| patroni3 | 192.168.22.203 | Replica | running |  6 |         0 |
+----------+----------------+---------+---------+----+-----------+
This looks good as well. So even though the configuration validation gives us an error, the cluster is running smoothly.
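Besides patronictl, the REST API we configured on port 8008 can be queried for a quick sanity check as well; the /patroni endpoint returns the node's role and state as JSON (assuming curl is available on the host):

postgres@partoni1:/u02/pgdata/raft/ [PG1] curl -s http://192.168.22.201:8008/patroni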
Failover
So let's see what happens if we restart the Leader node.
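The restart was nothing more than rebooting the VM that currently holds the Leader role, patroni2 at this point; a minimal sketch, assuming the VM can simply be rebooted:

postgres@partoni2:/home/postgres/ [PG1] sudo systemctl reboot

Within a short time, the Leader role moves to another node: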
postgres@partoni3:/home/postgres/ [PG1] patronictl list
+ Cluster: PG1 (6870147915530980670) -+---------+----+-----------+
|  Member  |      Host      |  Role   |  State  | TL | Lag in MB |
+----------+----------------+---------+---------+----+-----------+
| patroni1 | 192.168.22.201 | Leader  | running |  7 |           |
| patroni3 | 192.168.22.203 | Replica | running |  7 |         0 |
+----------+----------------+---------+---------+----+-----------+
As soon as patroni2 is back on the network, it attaches itself to the cluster again as a Replica without any issues:
postgres@partoni3:/home/postgres/ [PG1] patronictl list
+ Cluster: PG1 (6870147915530980670) -+---------+----+-----------+
|  Member  |      Host      |  Role   |  State  | TL | Lag in MB |
+----------+----------------+---------+---------+----+-----------+
| patroni1 | 192.168.22.201 | Leader  | running |  7 |           |
| patroni2 | 192.168.22.202 | Replica | running |  7 |         0 |
| patroni3 | 192.168.22.203 | Replica | running |  7 |         0 |
+----------+----------------+---------+---------+----+-----------+
Conclusion
The configuration check was published as a first draft in version 1.6.5, and it seems there is some room for improvement here. I am still not sure where my mistake is, or whether there really is one; I also tested a patroni.yml which is configured for etcd. Maybe it's just because of the brand new Raft support.
For me, the Raft status overview is a bit cryptic. Of course, it shows the most important information, like leader, partner_node_status_server and partner_nodes_count. But compared to etcd, where I just get a simple "cluster is healthy", it takes a bit of time to get used to. Besides that, it takes some time to recognize the unavailability of a node.
Using Patroni with pure Raft works fine, and the setup is easier than the etcd setup, where you can run into member mismatches. Especially when you don't want to install an additional tool on your servers, it can become a really good option. The Patroni documentation is still a bit minimalistic regarding the configuration, but with some patience you will find whatever you're looking for.