If you followed part 1 and part 2 of this little blog series, you now have a running Greenplum system: one coordinator host and two segment hosts. In this post we’ll look at what was actually initialized, how that looks on disk, and how the PostgreSQL instances communicate with each other.

Let’s start with a simple overview of what we have now:


                               |-------------------|
                               |                   |
                               |     Segment 1     |
                               |                   |
                               |-------------------|
                                        /
   |-------------------|               /
   |                   |              /
   |   Coordinator     |--------------
   |                   |              \
   |-------------------|               \
                                        \
                               |-------------------|
                               |                   |
                               |     Segment 2     |
                               |                   |
                               |-------------------|

We have the coordinator node on the left, somehow connected to the two segment nodes on the right. As Greenplum is based on PostgreSQL, we should easily be able to find out on which port the coordinator instance is listening:

[gpadmin@cdw ~]$ psql -c "show port" postgres
 port 
------
 5432
(1 row)
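
Since the coordinator is, in this respect, a regular PostgreSQL instance, the same check also works over the network (a quick sketch, assuming pg_hba.conf on the coordinator allows the connection from your client):

[gpadmin@sdw1 ~]$ psql -h cdw -p 5432 -c "select version()" postgres

The version string returned by version() contains both the PostgreSQL base release and the Greenplum release, something we will see again in an error message further down.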

Another bit of information we can get out of the Greenplum catalog is how the instances are distributed across the nodes and on which ports they listen:

postgres=# select * from gp_segment_configuration order by dbid;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |          datadir          
------+---------+------+----------------+------+--------+------+----------+---------+---------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cdw      | cdw     | /data/coordinator/gpseg-1
    2 |       0 | p    | p              | s    | u      | 6000 | sdw1     | sdw1    | /data/primary/gpseg0
    3 |       1 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /data/primary/gpseg1
    4 |       0 | m    | m              | s    | u      | 7000 | sdw2     | sdw2    | /data/mirror/gpseg0
    5 |       1 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /data/mirror/gpseg1
(5 rows)

(All the cluster views are documented here)
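
Because the catalog is plain SQL, the topology can also be summarized with a simple query (a small sketch that only uses the columns shown above; the coordinator has content = -1 and is filtered out):

postgres=# select hostname, role, count(*) from gp_segment_configuration where content >= 0 group by hostname, role order by hostname, role;

With the configuration above this should report exactly one primary and one mirror per segment host.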

Putting this information into the picture from above gives us this:


                                        |-------------------|
                                    6000|primary            |
                                        |     Segment 1     |
                                    7000|mirror             |
                                        |-------------------|
                                                 /
            |-------------------|               /
            |                   |              /
        5432|   Coordinator     |--------------
            |                   |              \
            |-------------------|               \
                                                 \
                                        |-------------------|
                                    6000|primary            |
                                        |     Segment 2     |
                                    7000|mirror             |
                                        |-------------------|

Looking at the PostgreSQL configuration of the two mirror instances, we can see that the mirror on segment 1 is replicating from the primary on segment 2, while the mirror on segment 2 is replicating from the primary on segment 1:

[gpadmin@sdw1 gpseg1]$ hostname
sdw1
[gpadmin@sdw1 gpseg1]$ pwd
/data/mirror/gpseg1
[gpadmin@sdw1 gpseg1]$ grep conninfo postgresql.*conf
postgresql.auto.conf:primary_conninfo = 'user=gpadmin passfile=''/home/gpadmin/.pgpass'' host=sdw2 port=6000 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any application_name=gp_walreceiver'
postgresql.conf:#primary_conninfo = ''                  # connection string to sending server

[gpadmin@sdw2 gpseg0]$ hostname
sdw2
[gpadmin@sdw2 gpseg0]$ pwd
/data/mirror/gpseg0
[gpadmin@sdw2 gpseg0]$ grep conninfo postgresql.*conf
postgresql.auto.conf:primary_conninfo = 'user=gpadmin passfile=''/home/gpadmin/.pgpass'' host=sdw1 port=6000 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any application_name=gp_walreceiver'
postgresql.conf:#primary_conninfo = ''                  # connection string to sending server
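
As this is plain streaming replication, each segment host should also run the corresponding WAL processes: a walreceiver for its local mirror and a walsender serving its local primary (a quick check; the exact process titles may vary):

[gpadmin@sdw1 ~]$ ps -ef | grep -E 'walreceiver|walsender' | grep -v grep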

This means that the mirror instances are just normal PostgreSQL replicas:

                                        |-------------------|
                             |------6000|primary---------   |
                             |          |     Segment 1 |   |
                             |      7000|mirror<------| |   |
                             |          |-------------------|
                             |                        | |
            |-------------------|                     | |
            |                   |                     | |
        5432|   Coordinator     |                     | |
            |                   |                     | |
            |-------------------|                     | |
                             |                        | |
                             |          |-------------------|
                             |------6000|primary ------ |   |
                                        |     Segment 2 |   |
                                    7000|mirror<--------|   |
                                        |-------------------|

Both replicas run in synchronous mode, which can be seen in gp_stat_replication:

postgres=# select * from gp_stat_replication ;
 gp_segment_id | pid  | usesysid | usename | application_name |   client_addr   | client_hostname | client_port |         backend_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time           | sync_error 
---------------+------+----------+---------+------------------+-----------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------+------------
             1 | 6156 |       10 | gpadmin | gp_walreceiver   | 192.168.122.201 |                 |       18462 | 2024-02-29 12:51:32.070682+01 |              | streaming | 0/C000158 | 0/C000158 | 0/C000158 | 0/C000158  |           |           |            |             1 | sync       | 2024-02-29 14:04:45.448297+01 | none
             0 | 6599 |       10 | gpadmin | gp_walreceiver   | 192.168.122.202 |                 |       58176 | 2024-02-29 12:51:32.612645+01 |              | streaming | 0/C000158 | 0/C000158 | 0/C000158 | 0/C000158  |           |           |            |             1 | sync       | 2024-02-29 14:04:44.243651+01 | none

When one of the primary segments fails, the attached mirror segment (which is a replica) will take over.
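
The standard Greenplum utilities can be used to check for, and later repair, such a situation (a hedged sketch, check the utilities’ --help output for the full set of options):

[gpadmin@cdw ~]$ gpstate -e        # show primary/mirror pairs with potential issues
[gpadmin@cdw ~]$ gprecoverseg -a   # bring failed segments back into the cluster

After a recovery the current role of a segment may differ from its preferred_role in gp_segment_configuration; gprecoverseg -r rebalances the cluster back to the preferred layout.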

By default, direct connections to a mirror or a primary segment are neither possible nor allowed:

[gpadmin@sdw1 gpseg1]$ psql -p 7000
psql: error: FATAL:  the database system is in recovery mode
DETAIL:  last replayed record at 0/C000158
- VERSION: PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.
[gpadmin@sdw1 gpseg1]$ psql -p 6000 postgres
psql: error: FATAL:  connections to primary segments are not allowed
DETAIL:  This database instance is running as a primary segment in a Greenplum cluster and does not permit direct connections.
HINT:  To force a connection anyway (dangerous!), use utility mode.

The consequence is that the coordinator node is the only entry point into the Greenplum system; nothing is supposed to happen directly on any of the segments.
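
If you really need to look into a single segment instance for troubleshooting, the HINT above points to utility mode. A hedged example (depending on the Greenplum version the parameter is gp_role or the older gp_session_role), and definitely nothing to use in normal operation:

[gpadmin@sdw1 ~]$ PGOPTIONS='-c gp_role=utility' psql -p 6000 postgres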

Let’s create a new database on the coordinator host and then retrieve the OID this database got:

postgres=# create database d;
CREATE DATABASE
postgres=# \l
                                List of databases
   Name    |  Owner  | Encoding |   Collate   |    Ctype    |  Access privileges  
-----------+---------+----------+-------------+-------------+---------------------
 d         | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres  | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/gpadmin         +
           |         |          |             |             | gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/gpadmin         +
           |         |          |             |             | gpadmin=CTc/gpadmin
(4 rows)

postgres=# select oid,datname from pg_database;
  oid  |  datname  
-------+-----------
 13720 | postgres
     1 | template1
 17121 | d
 13719 | template0
(4 rows)
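
Before we look at the segments, we can already confirm the mapping between this OID and a directory in the coordinator’s own data directory (the datadir we saw in gp_segment_configuration above), as every PostgreSQL instance keeps one sub-directory per database under base/:

[gpadmin@cdw ~]$ ls -la /data/coordinator/gpseg-1/base/

This listing should contain a directory named 17121 for the new database.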

We can’t directly connect to the primary and mirror segments, but we can check on the file system of one of the segment nodes whether the database was replicated there:

[gpadmin@sdw2 data]$ ls -la mirror/gpseg0/base/
total 52
drwx------  6 gpadmin gpadmin   54 Feb 29 14:26 .
drwx------ 21 gpadmin gpadmin 4096 Feb 29 13:01 ..
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 1
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13719
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13720
drwx------  2 gpadmin gpadmin 8192 Feb 29 14:26 17121
[gpadmin@sdw2 data]$ ls -la primary/gpseg1/base/
total 52
drwx------  6 gpadmin gpadmin   54 Feb 29 14:26 .
drwx------ 21 gpadmin gpadmin 4096 Feb 29 13:01 ..
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 1
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13719
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13720
drwx------  2 gpadmin gpadmin 8192 Feb 29 14:26 17121

Not a big surprise: the new database is of course there on the segment, both in the primary (gpseg1) and in the mirror (gpseg0) hosted on sdw2. Otherwise the whole setup would not make much sense.

In the next post we’ll look at backup and recovery of such a system.