This is the next post in this little Greenplum series. This time we’ll look at how we can recover from a failed segment. If you are looking for the previous posts, here they are: Getting started with Greenplum – 1 – Installation, Getting started with Greenplum – 2 – Initializing and bringing up the cluster, Getting started with Greenplum – 3 – Behind the scenes, Getting started with Greenplum – 4 – Backup & Restore – databases.

Let’s quickly recap what we currently have deployed:

                                        |-------------------|
                             |------6000|primary---------   |
                             |          |     Segment 1 |   |
                             |      7000|mirror<------| |   |
                             |          |-------------------|
                             |                        | |
            |-------------------|                     | |
            |                   |                     | |
        5432|   Coordinator     |                     | |
            |                   |                     | |
            |-------------------|                     | |
                             |                        | |
                             |          |-------------------|
                             |------6000|primary ------ |   |
                                        |     Segment 2 |   |
                                    7000|mirror<--------|   |
                                        |-------------------|

The coordinator host is the entry point for the application and requests are routed from there to the segment hosts. The idea behind this is that you can use the power of multiple (segment) hosts to deliver what you’ve asked for: the more segment hosts you add, the more compute resources you can use.
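
You can actually see this distribution at work with a small test table (t1 is a hypothetical table used only for illustration, it is not part of the previous posts): every row is assigned to exactly one segment, and counting rows per gp_segment_id shows how the data, and therefore the work, is spread across the two primary segments:

[gpadmin@cdw ~]$ psql postgres
postgres=# create table t1 ( a int ) distributed by (a);
CREATE TABLE
postgres=# insert into t1 select * from generate_series(1,1000);
INSERT 0 1000
postgres=# select gp_segment_id, count(*) from t1 group by 1 order by 1;

The exact counts depend on the hash distribution of the values, but you should see roughly half of the rows on each of the two segments.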

The question is: How can you recover from a failed segment node? With the current deployment such a failure would reduce the compute resources by 50%, so you probably want to have it back online as soon as possible.

To get the current status of your segments you can use “gpstate”:

[gpadmin@cdw ~]$ gpstate
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: 
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Greenplum instance status summary
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Coordinator instance                                      = Active
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Coordinator standby                                       = No coordinator standby configured
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total segment instance count from metadata                = 4
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Primary Segment Status
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total primary segments                                    = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total primary segment valid (at coordinator)              = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total primary segment failures (at coordinator)           = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes found                   = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Mirror Segment Status
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total mirror segments                                     = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total mirror segment valid (at coordinator)               = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total mirror segment failures (at coordinator)            = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes found                   = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number mirror segments acting as primary segments   = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------

This confirms that all is fine as of now: there are two primary and two mirror segments, and all of them are up and running. You can also ask “gpstate” to report only segments which have issues:

[gpadmin@cdw ~]$ gpstate -e
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally

There are several scenarios which can happen to a segment: you lose the whole node, for whatever reason; a specific PostgreSQL instance, either a primary or a mirror segment, is not running anymore; or the PGDATA of a specific segment gets corrupted.

Let’s start with a segment instance which went down. To force this, let’s kill the mirror instance on the first segment node (sdw1):

[gpadmin@sdw1 ~]$ ps -ef | egrep "7000|mirror" | grep -v grep
gpadmin     1343       1  0 10:50 ?        00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /data/mirror/gpseg1 -c gp_role=execute
gpadmin     1344    1343  0 10:50 ?        00:00:00 postgres:  7000, logger process   
gpadmin     1346    1343  0 10:50 ?        00:00:00 postgres:  7000, startup   recovering 000000010000000000000004
gpadmin     1348    1343  0 10:50 ?        00:00:00 postgres:  7000, checkpointer   
gpadmin     1349    1343  0 10:50 ?        00:00:00 postgres:  7000, background writer   
gpadmin     1372    1343  0 10:50 ?        00:00:01 postgres:  7000, walreceiver   streaming 0/127CD2A8
[gpadmin@sdw1 ~]$ kill -9 1372 1349 1348 1346 1344 1343
[gpadmin@sdw1 ~]$ ps -ef | egrep "7000|mirror" | grep -v grep
[gpadmin@sdw1 ~]$ 

The coordinator should become aware of this:

[gpadmin@cdw ~]$ gpstate -e
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[WARNING]:-pg_stat_replication shows no standby connections
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Unsynchronized Segment Pairs
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   WAL sync remaining bytes   Mirror   Port
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   sdw2              6000   Unknown                    sdw1     7000
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Downed Segments (may include segments where status could not be retrieved)
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   Segment   Port   Config status   Status
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   sdw1      7000   Down            Down in configuration
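
If you prefer SQL over the command line utilities, the same information is available in the gp_segment_configuration catalog on the coordinator (status “u” means up, “d” means down). A quick check for downed segments could look like this (output omitted):

[gpadmin@cdw ~]$ psql -c "select dbid, content, role, preferred_role, hostname, port, status from gp_segment_configuration where status = 'd'" postgres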

This confirms that the segment is down and the coordinator is aware of it. In the simplest case you can just run “gprecoverseg” to recover any failed segments like this:

[gpadmin@cdw ~]$ gprecoverseg
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: 
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/primary/gpseg1 for dbid 3:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201631847636
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 10:55:32 AM CET
Latest checkpoint location:           0/127CD1E8
Latest checkpoint's REDO location:    0/127CD1B0
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 10:55:32 AM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            ae03e3dc891309211ede650a2e28c1b7dac2e510970a912d9f428761f243896e

stderr: 
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg1 for dbid 5:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201631847636
Database cluster state:               in archive recovery
pg_control last modified:             Thu 07 Mar 2024 11:00:32 AM CET
Latest checkpoint location:           0/127CD1E8
Latest checkpoint's REDO location:    0/127CD1B0
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 10:55:32 AM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/127CD2A8
Min recovery ending loc's timeline:   1
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            ae03e3dc891309211ede650a2e28c1b7dac2e510970a912d9f428761f243896e

stderr: 
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/mirror/gpseg1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 7000
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw2
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw2
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/primary/gpseg1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 6000
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):

Once we confirm this, the failed mirror is recovered incrementally from its primary segment, which runs on the other segment host (sdw2):

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw2:6000 to refresh replication connection
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw1 (dbid 5): skipping pg_rewind on mirror as standby.signal is present
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Segments successfully recovered.
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes

Checking for failed segments again confirms that all went well and the failed instance is back online:

[gpadmin@cdw ~]$ gpstate -e
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally
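
Besides “gpstate -e”, “gpstate -m” lists all mirror instances together with their current role and synchronization state, which is a quick way to verify that the recovered mirror is streaming again (output omitted):

[gpadmin@cdw ~]$ gpstate -m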

This was the easy case. What happens if we remove the PGDATA of a primary segment? We’ll use the same node for this test and remove the PGDATA of the primary instance on node sdw1:

[gpadmin@sdw1 ~]$ ps -ef | egrep "6000|primary" | grep -v grep
gpadmin     1329       1  0 10:50 ?        00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /data/primary/gpseg0 -c gp_role=execute
gpadmin     1345    1329  0 10:50 ?        00:00:00 postgres:  6000, logger process   
gpadmin     1352    1329  0 10:50 ?        00:00:00 postgres:  6000, checkpointer   
gpadmin     1353    1329  0 10:50 ?        00:00:00 postgres:  6000, background writer   
gpadmin     1354    1329  0 10:50 ?        00:00:00 postgres:  6000, walwriter   
gpadmin     1355    1329  0 10:50 ?        00:00:00 postgres:  6000, autovacuum launcher   
gpadmin     1356    1329  0 10:50 ?        00:00:00 postgres:  6000, stats collector   
gpadmin     1357    1329  0 10:50 ?        00:00:00 postgres:  6000, logical replication launcher   
gpadmin     1360    1329  0 10:50 ?        00:00:00 postgres:  6000, walsender gpadmin 192.168.122.202(40808) streaming 0/127E5FE8
[gpadmin@sdw1 ~]$ rm -rf /data/primary/gpseg0/*
[gpadmin@sdw1 ~]$ ps -ef | egrep "6000|primary" | grep -v grep
[gpadmin@sdw1 ~]$ 

Again, the coordinator is of course aware of this, and as the mirror of that segment on sdw2 has taken over the primary role, the roles of the pair are now switched:

[gpadmin@cdw ~]$ gpstate -e
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[WARNING]:-pg_stat_replication shows no standby connections
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   Mirror   Port
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   sdw2              7000   sdw1     6000
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Unsynchronized Segment Pairs
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   WAL sync remaining bytes   Mirror   Port
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   sdw2              7000   Unknown                    sdw1     6000
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Downed Segments (may include segments where status could not be retrieved)
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   Segment   Port   Config status   Status
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   sdw1      6000   Down            Down in configuration
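
If you want to see how and when the cluster got into this state, the fault detection process on the coordinator records its actions in the gp_configuration_history catalog. A quick look could be (output omitted):

[gpadmin@cdw ~]$ psql -c "select * from gp_configuration_history order by time desc limit 5" postgres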

Trying to recover in the same way as before:

[gpadmin@cdw ~]$ gprecoverseg 
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: 
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg0 for dbid 4:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 02:59:57 PM CET
Latest checkpoint location:           0/127E6088
Latest checkpoint's REDO location:    0/127E6018
Latest checkpoint's REDO WAL file:    000000020000000000000004
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 02:59:57 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[WARNING]:-cannot access pg_controldata for dbid 2 on host sdw1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/primary/gpseg0
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 6000
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw2
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw2
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/mirror/gpseg0
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 7000
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw2:7000 to refresh replication connection
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw1 (dbid 2): pg_rewind: fatal: could not open file "/data/primary/gpseg0/global/pg_control" for reading: No such file or directory
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-----------------------------------------------------------
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Failed to recover the following segments. You must run either gprecoverseg --differential or gprecoverseg -F for all incremental failures
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:- hostname: sdw1; port: 6000; logfile: /home/gpadmin/gpAdminLogs/pg_rewind.20240307_150152.dbid2.out; recoverytype: incremental
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[ERROR]:-gprecoverseg failed. Please check the output for more details.

This fails because “pg_control” is not available anymore. By default “gprecoverseg” performs an incremental recovery (using pg_rewind), and this cannot work without the control file. As the output above suggests, we either need a differential recovery (gprecoverseg --differential) or a full recovery (gprecoverseg -F). Let’s do a full recovery:

[gpadmin@cdw ~]$ gprecoverseg -F
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -F
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg0 for dbid 4:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 03:01:53 PM CET
Latest checkpoint location:           0/127E6180
Latest checkpoint's REDO location:    0/127E6148
Latest checkpoint's REDO WAL file:    000000020000000000000004
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 03:01:53 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[WARNING]:-cannot access pg_controldata for dbid 2 on host sdw1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Full
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/primary/gpseg0
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 6000
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw2
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw2
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/mirror/gpseg0
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 7000
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw2:7000 to refresh replication connection
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:15:03:22:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:15:03:22:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:15:03:22:004491 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw1 (dbid 2): pg_basebackup: base backup completed
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Segments successfully recovered.
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes

This worked, but now we see the following when we check for failed segments:

[gpadmin@cdw ~]$ gpstate -e
20240307:15:04:35:004552 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   Mirror   Port
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-   sdw2              7000   sdw1     6000

The reason is that not all instances are running in their preferred role anymore:

[gpadmin@cdw ~]$ psql -c "select * from gp_segment_configuration" postgres
 dbid | content | role | preferred_role | mode | status | port | hostname | address |          datadir          
------+---------+------+----------------+------+--------+------+----------+---------+---------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cdw      | cdw     | /data/coordinator/gpseg-1
    3 |       1 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /data/primary/gpseg1
    5 |       1 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /data/mirror/gpseg1
    4 |       0 | p    | m              | s    | u      | 7000 | sdw2     | sdw2    | /data/mirror/gpseg0
    2 |       0 | m    | p              | s    | u      | 6000 | sdw1     | sdw1    | /data/primary/gpseg0
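
A quick way to spot such instances is to filter for segments whose current role differs from their preferred role (output omitted):

[gpadmin@cdw ~]$ psql -c "select dbid, content, role, preferred_role, hostname, port from gp_segment_configuration where role <> preferred_role" postgres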

When you are in such a state you should rebalance the segments so that every instance runs in its preferred role again:

[gpadmin@cdw ~]$ gprecoverseg -r
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -r
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Rebalance
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Unbalanced segment 1 of 2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance host        = sdw2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance address     = sdw2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance directory   = /data/mirror/gpseg0
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance port        = 7000
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Balanced role                   = Mirror
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Current role                    = Primary
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Unbalanced segment 2 of 2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance host        = sdw1
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance address     = sdw1
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance directory   = /data/primary/gpseg0
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance port        = 6000
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Balanced role                   = Primary
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Current role                    = Mirror
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[WARNING]:-This operation will cancel queries that are currently executing.
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[WARNING]:-Connections to the database however will not be interrupted.

Continue with segment rebalance procedure Yy|Nn (default=N):
> y
20240307:15:09:53:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Determining primary and mirror segment pairs to rebalance
20240307:15:09:53:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Allowed replay lag during rebalance is 10 GB
20240307:15:09:53:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Stopping unbalanced primary segments...
.
20240307:15:09:55:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering segment reconfiguration
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Starting segment synchronization
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-=============================START ANOTHER RECOVER=========================================
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/primary/gpseg0 for dbid 2:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 03:10:00 PM CET
Latest checkpoint location:           0/18000280
Latest checkpoint's REDO location:    0/18000210
Latest checkpoint's REDO WAL file:    000000030000000000000006
Latest checkpoint's TimeLineID:       3
Latest checkpoint's PrevTimeLineID:   3
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 03:10:00 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg0 for dbid 4:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               shut down
pg_control last modified:             Thu 07 Mar 2024 03:09:54 PM CET
Latest checkpoint location:           0/18000158
Latest checkpoint's REDO location:    0/18000158
Latest checkpoint's REDO WAL file:    000000020000000000000006
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 03:09:54 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw2
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw2
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/mirror/gpseg0
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 7000
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw1
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw1
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/primary/gpseg0
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 6000
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw1:6000 to refresh replication connection
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw2 (dbid 4): pg_rewind: no rewind required
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Segments successfully recovered.
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-==============================END ANOTHER RECOVER==========================================
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-******************************************************************
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-The rebalance operation has completed successfully.
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-******************************************************************

Once this has completed, we are back to normal operation:

[gpadmin@cdw ~]$ gpstate -e
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally

[gpadmin@cdw ~]$ psql -c "select * from gp_segment_configuration" postgres
 dbid | content | role | preferred_role | mode | status | port | hostname | address |          datadir          
------+---------+------+----------------+------+--------+------+----------+---------+---------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cdw      | cdw     | /data/coordinator/gpseg-1
    3 |       1 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /data/primary/gpseg1
    5 |       1 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /data/mirror/gpseg1
    2 |       0 | p    | p              | s    | u      | 6000 | sdw1     | sdw1    | /data/primary/gpseg0
    4 |       0 | m    | m              | s    | u      | 7000 | sdw2     | sdw2    | /data/mirror/gpseg0
(5 rows)

What we did here was an “in-place” recovery, which means the failed segments were recovered on the same node. Recovery onto another node is possible as well, as long as the new node comes with the same configuration as the current node (Greenplum release, OS version, …).
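
For a recovery to a different host, “gprecoverseg” can generate a sample recovery configuration file with “-o”, which you then adjust with the new host, port and data directory (check the generated sample and the gprecoverseg reference for the exact format) and feed back in with “-i”. A sketch of that workflow, assuming /tmp/recover_config as the file name:

[gpadmin@cdw ~]$ gprecoverseg -o /tmp/recover_config
[gpadmin@cdw ~]$ vi /tmp/recover_config
[gpadmin@cdw ~]$ gprecoverseg -i /tmp/recover_config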

The important point is that you definitely should go with mirror segments. Of course you need double the space per node, but it makes recovery a lot easier.
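
If you initialized your cluster without mirror segments, you do not have to start from scratch: the “gpaddmirrors” utility can add mirrors to an existing system. As a sketch only (the “-p” port offset and the interactive prompts for the mirror data directories are assumptions, check the utility reference for your release):

[gpadmin@cdw ~]$ gpaddmirrors -p 1000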