This is maybe one the biggest steps forward for PostgreSQL: PostgreSQL 18 will come with support for asynchronous I/O. Traditionally PostgreSQL relies on the operating system to hide the latency of writing to disk, which is done synchronously and can lead to double buffering (PostgreSQL shared buffers and the OS file cache). This is most important for WAL writes, as PostgreSQL must make sure that changes are flushed to disk and needs to wait until it is confirmed.
Before we do some tests let’s see what’s new from a parameter perspective. One of the new parameters is io_method
:
1 2 3 4 5 | postgres=# show io_method; io_method ----------- worker (1 row) |
The default is “worker” and the maximum number of worker processes to perform asynchronous is controller by io_workers
:
1 2 3 4 5 | postgres=# show io_workers; io_workers ------------ 3 (1 row) |
This can also be seen on the operating system:
1 2 3 4 | postgres=# \! ps aux | grep "io worker" | grep -v grep postgres 29732 0.0 0.1 224792 7052 ? Ss Apr08 0:00 postgres: pgdev: io worker 1 postgres 29733 0.0 0.2 224792 9884 ? Ss Apr08 0:00 postgres: pgdev: io worker 0 postgres 29734 0.0 0.1 224792 7384 ? Ss Apr08 0:00 postgres: pgdev: io worker 2 |
The other possible settings for io_method
are:
io_uring
: Asynchronous I/O using io_uringsync
: The behavior before PostgreSQL 18, do synchronous I/O
io_workers
only has an effect if io_method
is set to “worker”, which is the default configuration.
As usual: What follows are just some basic tests. Test for your own, in your environment with your specific workload to get some meaningful numbers. Especially if you test in a public cloud, be aware that the numbers might not show you the full truth.
We’ll do the tests on an AWS EC2 t3.large instance running Debian 12. The storage volume is gp3 with ext4 (default settings):
1 2 3 4 5 6 7 8 9 | postgres@ip-10-0-1-209:~$ grep proc /proc/cpuinfo processor : 0 processor : 1 postgres@ip-10-0-1-209:~$ free -g total used free shared buff /cache available Mem: 7 0 4 0 3 7 Swap: 0 0 0 postgres@ip-10-0-1-209:~$ mount | grep 18 /dev/nvme1n1 on /u02/pgdata/18 type ext4 (rw,relatime) |
PostgreSQL was initialized with the default settings:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | postgres@ip-10-0-1-209:~$ /u01/app/postgres/product/18/db_0/bin/initdb --pgdata= /u02/pgdata/18/data/ The files belonging to this database system will be owned by user "postgres" . This user must also own the server process. The database cluster will be initialized with locale "C.UTF-8" . The default database encoding has accordingly been set to "UTF8" . The default text search configuration will be set to "english" . Data page checksums are enabled. fixing permissions on existing directory /u02/pgdata/18/data ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default "max_connections" ... 100 selecting default "autovacuum_worker_slots" ... 16 selecting default "shared_buffers" ... 128MB selecting default time zone ... Etc /UTC creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling "trust" authentication for local connections initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth- local and --auth-host, the next time you run initdb. Success. You can now start the database server using: /u01/app/postgres/product/18/db_0/bin/pg_ctl -D /u02/pgdata/18/data/ -l logfile start |
The following settings have been changed:
1 2 3 4 5 6 7 8 9 10 11 | postgres@ip-10-0-1-209:~$ echo "shared_buffers='2GB'" >> /u02/pgdata/18/data/postgresql .auto.conf postgres@ip-10-0-1-209:~$ echo "checkpoint_timeout='20min'" >> /u02/pgdata/18/data/postgresql .auto.conf postgres@ip-10-0-1-209:~$ echo "random_page_cost=1.1" >> /u02/pgdata/18/data/postgresql .auto.conf postgres@ip-10-0-1-209:~$ echo "max_wal_size='8GB'" >> /u02/pgdata/18/data/postgresql .auto.conf postgres@ip-10-0-1-209:~$ /u01/app/postgres/product/18/db_0/bin/pg_ctl --pgdata= /u02/pgdata/18/data/ -l /dev/null start postgres@ip-10-0-1-209:~$ /u01/app/postgres/product/18/db_0/bin/psql -c "select version()" version --------------------------------------------------------------------------------------- PostgreSQL 18devel dbi services build on x86_64-linux, compiled by gcc -12.2.0, 64-bit (1 row) postgres@ip-10-0-1-209:~$ export PATH= /u01/app/postgres/product/18/db_0/bin/ :$PATH |
The first test is data loading. How long does that take when io_method is set to worker (3 times in a row), this gives a data set of around 1536MB:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 31.85 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 24.82 s, vacuum 0.35 s, primary keys 6.68 s). postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 31.97 s (drop tables 0.24 s, create tables 0.00 s, client-side generate 25.44 s, vacuum 0.34 s, primary keys 5.93 s). postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 30.72 s (drop tables 0.26 s, create tables 0.00 s, client-side generate 23.93 s, vacuum 0.55 s, primary keys 5.98 s). |
The same test with “sync”:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | postgres@ip-10-0-1-209:~$ psql -c "alter system set io_method='sync'" ALTER SYSTEM postgres@ip-10-0-1-209:~$ pg_ctl --pgdata= /u02/pgdata/18/data/ restart -l /dev/null postgres@ip-10-0-1-209:~$ psql -c "show io_method" io_method ----------- sync (1 row) postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 20.89 s (drop tables 0.29 s, create tables 0.01 s, client-side generate 14.70 s, vacuum 0.45 s, primary keys 5.44 s). postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 21.57 s (drop tables 0.20 s, create tables 0.00 s, client-side generate 16.13 s, vacuum 0.46 s, primary keys 4.77 s). postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 21.44 s (drop tables 0.20 s, create tables 0.00 s, client-side generate 16.04 s, vacuum 0.52 s, primary keys 4.67 s). |
… and finally “io_uring”:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | postgres@ip-10-0-1-209:~$ psql -c "alter system set io_method='io_uring'" ALTER SYSTEM postgres@ip-10-0-1-209:~$ pg_ctl --pgdata= /u02/pgdata/18/data/ restart -l /dev/null waiting for server to shut down.... done server stopped waiting for server to start.... done server started postgres@ip-10-0-1-209:~$ psql -c "show io_method" io_method ----------- io_uring (1 row) postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 20.63 s (drop tables 0.35 s, create tables 0.01 s, client-side generate 14.92 s, vacuum 0.47 s, primary keys 4.88 s). postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 20.81 s (drop tables 0.29 s, create tables 0.00 s, client-side generate 14.43 s, vacuum 0.46 s, primary keys 5.63 s). postgres@ip-10-0-1-209:~$ pgbench -i -s 100 dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 21.11 s (drop tables 0.24 s, create tables 0.00 s, client-side generate 15.63 s, vacuum 0.53 s, primary keys 4.70 s). |
There not much difference for “sync” and “io_uring”, but “worker” clearly is slower for that type of workload.
Moving on, let’s see how that looks like for a standard pgbench benchmark. We’ll start with “io_uring” as this is the current setting:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type : < builtin : TPC-B ( sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 567989 number of failed transactions: 0 (0.000%) latency average = 2.113 ms initial connection time = 8.996 ms tps = 946.659673 (without initial connection time ) postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type : < builtin : TPC-B ( sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 557640 number of failed transactions: 0 (0.000%) latency average = 2.152 ms initial connection time = 6.994 ms tps = 929.408406 (without initial connection time ) postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type : < builtin : TPC-B ( sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 563613 number of failed transactions: 0 (0.000%) latency average = 2.129 ms initial connection time = 16.351 ms tps = 939.378627 (without initial connection time ) |
Same test with “worker”:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | postgres@ip-10-0-1-209:~$ psql -c "alter system set io_method='worker'" ALTER SYSTEM postgres@ip-10-0-1-209:~$ pg_ctl --pgdata=/u02/pgdata/18/data/ restart -l /dev/null waiting for server to shut down............. done server stopped waiting for server to start.... done server started postgres@ip-10-0-1-209:~$ pgbench --time=600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 549176 number of failed transactions: 0 (0.000%) latency average = 2.185 ms initial connection time = 7.189 ms tps = 915.301403 (without initial connection time) postgres@ip-10-0-1-209:~$ pgbench --time=600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 564898 number of failed transactions: 0 (0.000%) latency average = 2.124 ms initial connection time = 11.332 ms tps = 941.511304 (without initial connection time) postgres@ip-10-0-1-209:~$ pgbench --time=600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 563041 number of failed transactions: 0 (0.000%) latency average = 2.131 ms initial connection time = 9.120 ms tps = 938.412979 (without initial connection time) |
… and finally “sync”:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | postgres@ip-10-0-1-209:~$ psql -c "alter system set io_method='sync'" ALTER SYSTEM postgres@ip-10-0-1-209:~$ pg_ctl --pgdata= /u02/pgdata/18/data/ restart -l /dev/null waiting for server to shut down............ done server stopped waiting for server to start.... done server started postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type : < builtin : TPC-B ( sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 560420 number of failed transactions: 0 (0.000%) latency average = 2.141 ms initial connection time = 12.000 ms tps = 934.050237 (without initial connection time ) postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type : < builtin : TPC-B ( sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 560077 number of failed transactions: 0 (0.000%) latency average = 2.143 ms initial connection time = 7.204 ms tps = 933.469665 (without initial connection time ) postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=2 --jobs=2 pgbench (18devel dbi services build) starting vacuum...end. transaction type : < builtin : TPC-B ( sort of)> scaling factor: 100 query mode: simple number of clients: 2 number of threads: 2 maximum number of tries: 1 duration: 600 s number of transactions actually processed: 566150 number of failed transactions: 0 (0.000%) latency average = 2.120 ms initial connection time = 7.579 ms tps = 943.591451 (without initial connection time ) |
As you see there is not much difference, no matter the io_method. Let’s stress the system a bit more (only putting the summaries here):
1 2 3 4 5 6 7 8 9 10 11 12 13 | postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=10 --jobs=10 ## sync tps = 2552.785398 (without initial connection time ) tps = 2505.476064 (without initial connection time ) tps = 2542.419230 (without initial connection time ) ## io_uring tps = 2511.138931 (without initial connection time ) tps = 2529.705311 (without initial connection time ) tps = 2573.195751 (without initial connection time ) ## worker tps = 2531.657962 (without initial connection time ) tps = 2523.854335 (without initial connection time ) tps = 2515.490351 (without initial connection time ) |
Some picture, there is not much difference. One last test, hammering the system even more:
1 2 3 4 5 6 7 8 9 10 11 12 13 | postgres@ip-10-0-1-209:~$ pgbench -- time =600 --client=20 --jobs=20 ## worker tps = 2930.268033 (without initial connection time ) tps = 2799.499964 (without initial connection time ) tps = 3033.491153 (without initial connection time ) ## io_uring tps = 2942.542882 (without initial connection time ) tps = 3061.487286 (without initial connection time ) tps = 2995.175169 (without initial connection time ) ## sync tps = 2997.654084 (without initial connection time ) tps = 2924.269626 (without initial connection time ) tps = 2753.853272 (without initial connection time ) |
At least for these tests, there is not much difference between the three settings for io_method (sync seems to be a bit slower), but I think this is still great. For such a massive change getting to the same performance as before is great. Things in PostgreSQL improve all the time, and I am sure there will be a lot of improvements in this area as well.
Usually I link to the commit here, but in this case that would be a whole bunch of commits. To everyone involved in this, a big thank you.