A while ago I posted this tweet on Twitter:

While this might sound unusual, it actually was under consideration by one of our customers. Each of their customers usually gets a dedicated database and is free to do in that database whatever they want. Because those customers can also create multiple schemas, a schema-only solution was not an option. To be clear right up front: we did not implement it like this, and the reasons are in this post. The actual implementation will be covered in a follow-up post.

To see what it means to work with that many databases, let's initialize the cluster. Nothing special here:

[email protected]:/home/postgres/ [pg131] initdb -D /u02/pgdata/13
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "C.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /u02/pgdata/13 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /u02/pgdata/13 -l logfile start

Starting up:

[email protected]:/home/postgres/ [pg131] pg_ctl -D /u02/pgdata/13/ start
waiting for server to start....2021-02-10 20:51:19.878 UTC [15930] LOG:  starting PostgreSQL 13.1 dbi services build on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2021-02-10 20:51:19.879 UTC [15930] LOG:  listening on IPv4 address "", port 5432
2021-02-10 20:51:19.883 UTC [15930] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-02-10 20:51:19.889 UTC [15931] LOG:  database system was shut down at 2021-02-10 20:49:54 UTC
2021-02-10 20:51:19.895 UTC [15930] LOG:  database system is ready to accept connections
server started
[email protected]:/home/postgres/ [pg131] psql -c "select version()" postgres
                                         version                                          
------------------------------------------------------------------------------------------
 PostgreSQL 13.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
(1 row)

Generating 8000 databases in PostgreSQL is quite easy, all you need to do is something like this:

[email protected]:/home/postgres/ [pg131] for i in {1..8000}; do psql -c "create database db_${i}" postgres; done
[email protected]:/home/postgres/ [pg131] psql -c "select count(*) from pg_database" postgres
 count 
-------
  8003
(1 row)

This will take some time, but PostgreSQL quite happily creates the databases. You could actually create many more databases if you really wanted to, up to 4,294,950,911. Now that we have 8000 databases, let's do some math: a single empty PostgreSQL database consumes around 8MB of disk space, so 8000 databases come to around 64000MB of disk space, for just the empty databases:

[email protected]:/home/postgres/ [pg131] du -sh /u02/pgdata/13/
60G     /u02/pgdata/13/
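If you want to verify the roughly-8MB-per-database figure yourself, PostgreSQL's standard `pg_database_size()` and `pg_size_pretty()` functions report it directly, and the total is simple multiplication:

```shell
# Ask PostgreSQL for the size of one empty database (pg_database_size and
# pg_size_pretty are standard functions; guarded so the sketch also runs
# where no cluster is available)
if command -v psql >/dev/null 2>&1; then
  psql -c "select pg_size_pretty(pg_database_size('db_1'))" postgres
fi

# Back-of-the-envelope total: 8000 empty databases at roughly 8MB each
echo "$((8000 * 8)) MB"    # 64000 MB, in line with the ~60G reported by du
```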

That is quite some space that could be used more efficiently. As each of these databases should be owned by a separate user, that means 8000 users in addition:

[email protected]:/home/postgres/ [pg131] for i in {1..8000}; do psql -c "create user u_${i}" postgres; done
[email protected]:/home/postgres/ [pg131] for i in {1..8000}; do psql -c "alter database db_${i} owner to u_${i}" postgres; done
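Since roles and database ownership live at the cluster level, the 16000 psql invocations above can be collapsed into a single session by generating a script first (a sketch; /var/tmp/roles.sql is just a scratch file name):

```shell
# Generate every CREATE ROLE / ALTER DATABASE statement into one script;
# both are cluster-wide, so a single session can run them all
for i in {1..8000}; do
  echo "create user u_${i};"
  echo "alter database db_${i} owner to u_${i};"
done > /var/tmp/roles.sql

# One psql invocation instead of 16000 (guarded so the sketch also runs
# where no cluster is available)
if command -v psql >/dev/null 2>&1; then
  psql -f /var/tmp/roles.sql postgres
fi
rm -f /var/tmp/roles.sql
```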

Now let's assume each of those databases contains 100 tables, 100 indexes and 100 sequences. Initially I wanted to do it like this, but that is actually a very bad idea:

[email protected]:/home/postgres/ [pg131] for i in {1..8000}
> do
>     for e in {1..100}
>     do
>         psql -c "create table tab${e}(a int primary key generated always as identity, b text, c date)" db_${i}
>     done
> done

Creating 800000 connections is catastrophic for performance, so a much better way of doing it is like this (these are still 8000 connections, and it will take some time to complete):

[email protected]:/home/postgres/ [pg131] for e in {1..100}
> do
>     echo "create table tab${e}(a int primary key generated always as identity, b text, c date);" >> /var/tmp/load.sql
> done
[email protected]:/home/postgres/ [pg131] for i in {1..8000}
> do
>     psql -f /var/tmp/load.sql db_${i}
> done
[email protected]:/home/postgres/ [pg131] rm -f /var/tmp/load.sql
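An alternative sketch that avoids starting psql 8000 times: generate one script that switches databases with psql's \connect meta-command, so a single client process works through all of them (still 8000 server connections; /var/tmp/load_all.sql is just a hypothetical scratch file name):

```shell
# Build one script that switches databases with \connect and
# creates the 100 tables in each of them
for i in {1..8000}; do
  echo "\\connect db_${i}"
  for e in {1..100}; do
    echo "create table tab${e}(a int primary key generated always as identity, b text, c date);"
  done
done > /var/tmp/load_all.sql

# A single psql process now works through all 8000 databases
# (still 8000 server connections, but only one client start-up;
# guarded so the sketch also runs where no cluster is available)
if command -v psql >/dev/null 2>&1; then
  psql -f /var/tmp/load_all.sql postgres
fi
rm -f /var/tmp/load_all.sql
```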

Once that has completed, let's check the space usage again:

[email protected]:/home/postgres/ [pg131] du -sh /u02/pgdata/13/
88G     /u02/pgdata/13/

We've added 28GB without having a single user-generated row in any of these databases, which is quite a lot. From this point of view alone, it does not seem a good idea to combine as many databases as in this example into a single cluster. But there is more to consider than space usage:

  • When you have 8000 databases you need to expect 8000 concurrent sessions. How do you handle that, especially if each of those connections goes to a different database?
  • Backup and restore: physical backup and restore only work for the whole cluster. Of course you could restore the whole cluster to another system and then pick the specific databases you are interested in, but this will take some time as well.
  • Autovacuum potentially has to process many databases in parallel, so how many worker processes do you configure?
  • 8000 databases with 100 tables each mean at least 800,000 files on disk, not counting FSM, VM and index files. Depending on your file system and your segment size you might need to tweak the file system as well.
  • Access permissions: managing access permissions for at least 8000 users might become tricky. In addition, pg_hba.conf gets large.
  • I guess it all comes down to this: are you willing to invest in a monster machine with as many cores as are currently available on the market, with lots and lots of memory and the fastest disks you can get? You probably want at least two of them so you can fail over in case of a disaster, or even three if you want automated failover. And then you need at least one test system that is as close to production as possible.
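To put a number on the file-count point: the heap files alone multiply out to 800,000, and `find` can verify the real count against the data directory used in this post:

```shell
# Heap files alone: one file per table, before FSM, VM, indexes and sequences
echo "$((8000 * 100)) table files at minimum"    # 800000

# Count the actual data files below base/ (only if the cluster from
# this post exists on the machine)
if [ -d /u02/pgdata/13/base ]; then
  find /u02/pgdata/13/base -type f | wc -l
fi
```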