As PostgreSQL does not know the concept of running multiple instances against the same files on disk (e.g. like Oracle RAC) it should not be possible to start two or more instances against the same data directory. If that would work the result can only be corruption. In this post we will look at how PostgreSQL is detecting that and what mechanism are build in to avoid the situation of having multiple instances working against the same files on disk.
To start with we create a new cluster:
postgres@rhel8pg:/home/postgres/ [PGDEV] mkdir /var/tmp/pgtest 12:16:46 postgres@rhel8pg:/home/postgres/ [PGDEV] initdb -D /var/tmp/pgtest/ The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locales COLLATE: en_US.utf8 CTYPE: en_US.utf8 MESSAGES: en_US.utf8 MONETARY: de_CH.UTF-8 NUMERIC: de_CH.UTF-8 TIME: en_US.UTF-8 The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /var/tmp/pgtest ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default timezone ... Europe/Zurich creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: pg_ctl -D /var/tmp/pgtest/ -l logfile start
We use a dedicated port and then start it up:
postgres@rhel8pg:/home/postgres/ [PGDEV] export PGPORT=8888 postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest start waiting for server to start....2019-05-16 12:17:22.399 CEST [7607] LOG: starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit 2019-05-16 12:17:22.403 CEST [7607] LOG: listening on IPv6 address "::1", port 8888 2019-05-16 12:17:22.403 CEST [7607] LOG: listening on IPv4 address "127.0.0.1", port 8888 2019-05-16 12:17:22.409 CEST [7607] LOG: listening on Unix socket "/tmp/.s.PGSQL.8888" 2019-05-16 12:17:22.446 CEST [7608] LOG: database system was shut down at 2019-05-16 12:16:54 CEST 2019-05-16 12:17:22.455 CEST [7607] LOG: database system is ready to accept connections done server started postgres@rhel8pg:/home/postgres/ [PGDEV] psql -p 8888 -c "select version()" postgres version ----------------------------------------------------------------------------------------------------------- PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit (1 row)
What happens when we want to start another instance against that data directory?
postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest start pg_ctl: another server might be running; trying to start server anyway waiting for server to start....2019-05-16 12:18:26.252 CEST [7629] FATAL: lock file "postmaster.pid" already exists 2019-05-16 12:18:26.252 CEST [7629] HINT: Is another postmaster (PID 7607) running in data directory "/var/tmp/pgtest"? stopped waiting pg_ctl: could not start server Examine the log output.
When PostgreSQL is starting up it will look at a file called “postmaster.pid” which exists in the data directory once the instance is started. If that file exists PostgreSQL will not start up another instance against the same data directory. Once the instance is stopped the file is removed:
postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ stop waiting for server to shut down....2019-05-16 12:48:50.636 CEST [7896] LOG: received fast shutdown request 2019-05-16 12:48:50.641 CEST [7896] LOG: aborting any active transactions 2019-05-16 12:48:50.651 CEST [7896] LOG: background worker "logical replication launcher" (PID 7903) exited with exit code 1 2019-05-16 12:48:50.651 CEST [7898] LOG: shutting down 2019-05-16 12:48:50.685 CEST [7896] LOG: database system is shut down done server stopped postgres@rhel8pg:/home/postgres/ [PGDEV] ls -al /var/tmp/pgtest/postmaster.pid ls: cannot access '/var/tmp/pgtest/postmaster.pid': No such file or directory
At least by default this is not possible to start two or more instances as PostgreSQL checks if postmaster.pid already exists. Lets remove that file and try again:
postgres@rhel8pg:/home/postgres/ [PGDEV] rm /var/tmp/pgtest/postmaster.pid postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest start waiting for server to start....2019-05-16 12:20:17.754 CEST [7662] LOG: starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit 2019-05-16 12:20:17.756 CEST [7662] LOG: could not bind IPv6 address "::1": Address already in use 2019-05-16 12:20:17.756 CEST [7662] HINT: Is another postmaster already running on port 8888? If not, wait a few seconds and retry. 2019-05-16 12:20:17.756 CEST [7662] LOG: could not bind IPv4 address "127.0.0.1": Address already in use 2019-05-16 12:20:17.756 CEST [7662] HINT: Is another postmaster already running on port 8888? If not, wait a few seconds and retry. 2019-05-16 12:20:17.756 CEST [7662] WARNING: could not create listen socket for "localhost" 2019-05-16 12:20:17.756 CEST [7662] FATAL: could not create any TCP/IP sockets 2019-05-16 12:20:17.756 CEST [7662] LOG: database system is shut down stopped waiting pg_ctl: could not start server Examine the log output.
Again, this does not work and even the initial instance was shutdown because PostgreSQL detected that the lock file is not there anymore:
2019-05-16 12:20:22.540 CEST [7607] LOG: could not open file "postmaster.pid": No such file or directory 2019-05-16 12:20:22.540 CEST [7607] LOG: performing immediate shutdown because data directory lock file is invalid 2019-05-16 12:20:22.540 CEST [7607] LOG: received immediate shutdown request 2019-05-16 12:20:22.540 CEST [7607] LOG: could not open file "postmaster.pid": No such file or directory 2019-05-16 12:20:22.544 CEST [7612] WARNING: terminating connection because of crash of another server process 2019-05-16 12:20:22.544 CEST [7612] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2019-05-16 12:20:22.544 CEST [7612] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2019-05-16 12:20:22.549 CEST [7664] WARNING: terminating connection because of crash of another server process 2019-05-16 12:20:22.549 CEST [7664] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Lets start the first instance again:
postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ start waiting for server to start....2019-05-16 12:22:20.136 CEST [7691] LOG: starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit 2019-05-16 12:22:20.140 CEST [7691] LOG: listening on IPv6 address "::1", port 8888 2019-05-16 12:22:20.140 CEST [7691] LOG: listening on IPv4 address "127.0.0.1", port 8888 2019-05-16 12:22:20.148 CEST [7691] LOG: listening on Unix socket "/tmp/.s.PGSQL.8888" 2019-05-16 12:22:20.193 CEST [7693] LOG: database system was interrupted; last known up at 2019-05-16 12:17:22 CEST .2019-05-16 12:22:21.138 CEST [7693] LOG: database system was not properly shut down; automatic recovery in progress 2019-05-16 12:22:21.143 CEST [7693] LOG: redo starts at 0/15D3420 2019-05-16 12:22:21.143 CEST [7693] LOG: invalid record length at 0/15D3458: wanted 24, got 0 2019-05-16 12:22:21.143 CEST [7693] LOG: redo done at 0/15D3420 2019-05-16 12:22:21.173 CEST [7691] LOG: database system is ready to accept connections done server started postgres@rhel8pg:/home/postgres/ [PGDEV] psql -p 8888 -c "select version()" postgres version ----------------------------------------------------------------------------------------------------------- PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit (1 row)
Lets change the port for the second instance and then try again to start it against the same data directory:
postgres@rhel8pg:/home/postgres/ [PGDEV] export PGPORT=8889 postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ start pg_ctl: another server might be running; trying to start server anyway waiting for server to start....2019-05-16 12:24:41.700 CEST [7754] FATAL: lock file "postmaster.pid" already exists 2019-05-16 12:24:41.700 CEST [7754] HINT: Is another postmaster (PID 7741) running in data directory "/var/tmp/pgtest"? stopped waiting pg_ctl: could not start server Examine the log output.
Does not work as well, which is good. Lets be a bit more nasty and truncate the postmaster.pid file:
postgres@rhel8pg:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid 7790 /var/tmp/pgtest 1558002434 8888 /tmp localhost 8888001 819201 ready postgres@rhel8pg:/home/postgres/ [PGDEV] cat /dev/null > /var/tmp/pgtest/postmaster.pid postgres@rhel8pg:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid
The pid file is now empty and right after emptying that file we can see this in the PostgreSQL log file:
019-05-16 12:30:14.140 CEST [7790] LOG: lock file "postmaster.pid" contains wrong PID: 0 instead of 7790 2019-05-16 12:30:14.140 CEST [7790] LOG: performing immediate shutdown because data directory lock file is invalid 2019-05-16 12:30:14.140 CEST [7790] LOG: received immediate shutdown request 2019-05-16 12:30:14.149 CEST [7795] WARNING: terminating connection because of crash of another server process 2019-05-16 12:30:14.149 CEST [7795] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2019-05-16 12:30:14.149 CEST [7795] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2019-05-16 12:30:14.160 CEST [7790] LOG: database system is shut down
So even that case it is detected and PostgreSQL protects you from starting up another instance against the same data directory. Lets try something else and modify PGDATA in the postmaster.pid file:
postgres@rhel8pg:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid 7896 /var/tmp/pgtest 1558002751 8888 /tmp localhost 8888001 851969 ready postgres@rhel8pg:/home/postgres/ [PGDEV] sed -i 's//var/tmp/pgtest//var/tmp/pgtest2/g' /var/tmp/pgtest/postmaster.pid postgres@rhel8pg:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid 7896 /var/tmp/pgtest2 1558002751 8888 /tmp localhost 8888001 851969 ready
Although we changed PGDATA PostgreSQL will not start up another instance against this data directory:
postgres@rhel8pg:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ start pg_ctl: another server might be running; trying to start server anyway waiting for server to start....2019-05-16 12:35:28.540 CEST [7973] FATAL: lock file "postmaster.pid" already exists 2019-05-16 12:35:28.540 CEST [7973] HINT: Is another postmaster (PID 7896) running in data directory "/var/tmp/pgtest"? stopped waiting pg_ctl: could not start server Examine the log output.
So by default you can not get PostgreSQL to start two or even more instances against the same data directory. There is an comment about this behaviour in src/backend/postmaster/postmaster.c in the source code:
/* * Once a minute, verify that postmaster.pid hasn't been removed or * overwritten. If it has, we force a shutdown. This avoids having * postmasters and child processes hanging around after their database * is gone, and maybe causing problems if a new database cluster is * created in the same place. It also provides some protection * against a DBA foolishly removing postmaster.pid and manually * starting a new postmaster. Data corruption is likely to ensue from * that anyway, but we can minimize the damage by aborting ASAP. */
“Once a minute” might be critical and we might be able to start a second one if we are fast enough, so lets try again. This time we start the first one, remove the lock file and immediately start another one using another port:
export PGPORT=8888 pg_ctl -D /var/tmp/pgtest start rm -f /var/tmp/pgtest/postmaster.pid export PGPORT=8889 pg_ctl -D /var/tmp/pgtest start
And here you have it:
postgres@rhel8pg:/home/postgres/ [pg120] ps -ef | grep postgres postgres 1445 1 0 May27 ? 00:00:00 /usr/lib/systemd/systemd --user postgres 1456 1445 0 May27 ? 00:00:00 (sd-pam) root 9780 786 0 06:09 ? 00:00:00 sshd: postgres [priv] postgres 9783 9780 0 06:09 ? 00:00:00 sshd: postgres@pts/1 postgres 9784 9783 0 06:09 pts/1 00:00:00 -bash postgres 10302 1 0 06:19 ? 00:00:00 /u01/app/postgres/product/DEV/db_1/bin/postgres -D /var/tmp/pgtest postgres 10304 10302 0 06:19 ? 00:00:00 postgres: checkpointer postgres 10305 10302 0 06:19 ? 00:00:00 postgres: background writer postgres 10306 10302 0 06:19 ? 00:00:00 postgres: walwriter postgres 10307 10302 0 06:19 ? 00:00:00 postgres: autovacuum launcher postgres 10308 10302 0 06:19 ? 00:00:00 postgres: stats collector postgres 10309 10302 0 06:19 ? 00:00:00 postgres: logical replication launcher postgres 10313 1 0 06:19 ? 00:00:00 /u01/app/postgres/product/DEV/db_1/bin/postgres -D /var/tmp/pgtest postgres 10315 10313 0 06:19 ? 00:00:00 postgres: checkpointer postgres 10316 10313 0 06:19 ? 00:00:00 postgres: background writer postgres 10317 10313 0 06:19 ? 00:00:00 postgres: walwriter postgres 10318 10313 0 06:19 ? 00:00:00 postgres: autovacuum launcher postgres 10319 10313 0 06:19 ? 00:00:00 postgres: stats collector postgres 10320 10313 0 06:19 ? 00:00:00 postgres: logical replication launcher postgres 10327 9784 0 06:19 pts/1 00:00:00 ps -ef
Conclusion: PostgreSQL does some basic checks to avoid starting two instances against the same files on disk. But if you really want (and of course you should never do that) then you can achieve that => with all the consequences! Don’t do it!