This was a really interesting customer case and I learned a lot by troubleshooting this. The original request was: We have two databases connected via postgres_fdw and when a merge join is chosen by PostgreSQL we get the wrong result. Setting enable_mergejoin to off causes PostgreSQL to go for a hash join and the result is correct. Well, personally, I’ve never run into a situation where PostgreSQL did not give the correct result, but of course there are bugs. The questions was: Is it a bug, or is this expected behavior?

As I did not have access to the customer’s environment the first step was to rebuild the setup and then to try to reproduce that locally. I went for two Debian 11 VMs and PostgreSQL 13.7 (this is the version which is used at the customer). This already was my first mistake, but more on that later.

The SQL which caused all this is quite simple:

SELECT * 
  FROM "rsu_adm"."data_2d_clb_global_product" f2
  LEFT OUTER JOIN "rsu_adm"."clb_global_product" f1 on f1.cprd=f2.cprd
 WHERE f1.cprd is null;  

A simple join, nothing really special. The first table is this one:

rsup1=# \d "rsu_adm"."data_2d_clb_global_product"
                     Table "rsu_adm.data_2d_clb_global_product"
         Column          |           Type           | Collation | Nullable | Default 
-------------------------+--------------------------+-----------+----------+---------
 cprd                    | character varying        |           | not null | 
 xtc_id                  | numeric                  |           |          | 
 rprd                    | numeric                  |           |          | 
 prdgalsts_id            | integer                  |           |          | 
 dlz_last_transaction_ts | timestamp with time zone |           |          | 
Indexes:
    "data_2d_clb_global_product_pkey" PRIMARY KEY, btree (cprd)
    "data_2d_clb_global_product_idx4" btree (dlz_last_transaction_ts)

The second reference is a view:

rsup1=# \d "rsu_adm"."clb_global_product"
                           View "rsu_adm.clb_global_product"
         Column          |            Type             | Collation | Nullable | Default 
-------------------------+-----------------------------+-----------+----------+---------
 cprd                    | character varying           |           |          | 
 xtc_id                  | numeric(10,0)               |           |          | 
 rprd                    | numeric(10,0)               |           |          | 
 prdgalsts_id            | numeric(10,0)               |           |          | 
 dlz_last_transaction_ts | timestamp(5) with time zone |           |          | 

The view references a foreign table:

rsup1=# select definition 
          from pg_views 
         where viewname = 'clb_global_product';
                   definition                   
------------------------------------------------
  SELECT clb_global_product.cprd,              +
     clb_global_product.xtc_id,                +
     clb_global_product.rprd,                  +
     clb_global_product.prdgalsts_id,          +
     clb_global_product.dlz_last_transaction_ts+
    FROM ro_dlz.clb_global_product;
(1 row)

rsup1=# \d ro_dlz.clb_global_product
                                            Foreign table "ro_dlz.clb_global_product"
         Column          |            Type             | Collation | Nullable | Default |               FDW options               
-------------------------+-----------------------------+-----------+----------+---------+-----------------------------------------
 cprd                    | character varying           |           |          |         | (column_name 'cprd')
 xtc_id                  | numeric(10,0)               |           |          |         | (column_name 'xtc_id')
 rprd                    | numeric(10,0)               |           |          |         | (column_name 'rprd')
 prdgalsts_id            | numeric(10,0)               |           |          |         | (column_name 'prdgalsts_id')
 dlz_last_transaction_ts | timestamp(5) with time zone |           |          |         | (column_name 'dlz_last_transaction_ts')
Server: tgt_srv
FDW options: (schema_name 'ro_rsu', table_name 'clb_global_product')

rsup1=# 

Here is the definition of the foreign server, for completeness:

rsup1=# \des+
                                                                                       List of foreign servers
  Name   |  Owner   | Foreign-data wrapper | Access privileges | Type | Version |                                             FDW options                                              | Description 
---------+----------+----------------------+-------------------+------+---------+------------------------------------------------------------------------------------------------------+-------------
 tgt_srv | postgres | postgres_fdw         |                   |      |         | (host '192.168.100.245', dbname 'dlzp1', port '5432', use_remote_estimate 'true', fetch_size '5000') | 
(1 row)

Having a look at the other instance, the foreign table references a view, here are the definitions of the view and the underlying table:

postgres=# \c dlzp1
You are now connected to database "dlzp1" as user "postgres".
dlzp1=# \d ro_rsu.clb_global_product
                            View "ro_rsu.clb_global_product"
         Column          |            Type             | Collation | Nullable | Default 
-------------------------+-----------------------------+-----------+----------+---------
 cprd                    | character varying           |           |          | 
 xtc_id                  | numeric(10,0)               |           |          | 
 rprd                    | numeric(10,0)               |           |          | 
 prdgalsts_id            | numeric(10,0)               |           |          | 
 dlz_last_transaction_ts | timestamp(5) with time zone |           |          | 

dlzp1=# select definition 
          from pg_views 
         where viewname = 'clb_global_product';
               definition               
----------------------------------------
  SELECT clb_flbgpr.cprd,              +
     clb_flbgpr.xtc_id,                +
     clb_flbgpr.rprd,                  +
     clb_flbgpr.prdgalsts_id,          +
     clb_flbgpr.dlz_last_transaction_ts+
    FROM dlz_clb.clb_flbgpr;
(1 row)

dlzp1=# \d dlz_clb.clb_flbgpr
                               Table "dlz_clb.clb_flbgpr"
         Column          |            Type             | Collation | Nullable | Default 
-------------------------+-----------------------------+-----------+----------+---------
 cprd                    | character varying           |           | not null | 
 fp_lwr_celsius          | numeric(3,0)                |           |          | 
 fp_lwr_farenheit        | numeric(3,0)                |           |          | 
 fp_last_upd             | date                        |           |          | 
 win_indicator           | numeric(6,0)                |           |          | 
 gpr_theme               | character varying(6)        |           |          | 
 comparison_symbol       | character(1)                |           |          | 
 cprd_as_previous        | character varying           |           |          | 
 emp_user_id_as_pcfm     | character varying           |           |          | 
 xtc_id                  | numeric(10,0)               |           |          | 
 gpr_oral_care           | character(1)                |           |          | 
 bunit_id                | numeric(10,0)               |           |          | 
 remark_id               | numeric(10,0)               |           |          | 
 win_them_indicator      | numeric(6,0)                |           |          | 
 plr_id                  | numeric(10,0)               |           |          | 
 stability_status        | character(1)                |           |          | 
 cmr_frm_tested          | character(1)                |           |          | 
 rprd                    | numeric(10,0)               |           |          | 
 like_level_id           | numeric(10,0)               |           |          | 
 lhpr_id                 | numeric(10,0)               |           |          | 
 like_perf_lvl_id        | numeric(10,0)               |           |          | 
 lhdncpr_id              | numeric(10,0)               |           |          | 
 like_hdnc_lvl_id        | numeric(10,0)               |           |          | 
 prdgalsts_id            | numeric(10,0)               |           |          | 
 oil_stock_sts           | character(1)                |           |          | 
 archive_status          | character(2)                |           |          | 
 dlz_last_transaction_ts | timestamp(5) with time zone |           |          | 
Indexes:
    "clb_flbgpr_pk" PRIMARY KEY, btree (cprd)
    "clb_flbgpr_lst_trn" btree (dlz_last_transaction_ts)
Triggers:
    clb_flbgpr_ins BEFORE INSERT ON dlz_clb.clb_flbgpr FOR EACH ROW EXECUTE FUNCTION dlz_adm.fct_trg_setup_transaction_ts()
    clb_flbgpr_upd BEFORE UPDATE ON dlz_clb.clb_flbgpr FOR EACH ROW WHEN (old.* IS DISTINCT FROM new.*) EXECUTE FUNCTION dlz_adm.fct_trg_setup_transaction_ts()

What happened was this: When a merge join was chosen, this was the explain plan:

rsup1=# explain (analyze, verbose) SELECT * FROM "rsu_adm"."data_2d_clb_global_product" f2
LEFT OUTER JOIN "rsu_adm"."clb_global_product" f1 on f1.cprd=f2.cprd
WHERE f1.cprd is null;    
                                                                                                                  QUERY PLAN                                                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Anti Join  (cost=100.84..67203.45 rows=50713 width=122) (actual time=41.769..2452.910 rows=4708 loops=1)
   Output: f2.cprd, f2.xtc_id, f2.rprd, f2.prdgalsts_id, f2.dlz_last_transaction_ts, clb_global_product.cprd, clb_global_product.xtc_id, clb_global_product.rprd, clb_global_product.prdgalsts_id, clb_global_product.dlz_last_t
   Merge Cond: ((f2.cprd)::text = (clb_global_product.cprd)::text)
   ->  Index Scan using data_2d_clb_global_product_pkey on rsu_adm.data_2d_clb_global_product f2  (cost=0.42..2898.56 rows=101426 width=34) (actual time=0.036..29.311 rows=101426 loops=1)
         Output: f2.cprd, f2.xtc_id, f2.rprd, f2.prdgalsts_id, f2.dlz_last_transaction_ts
   ->  Foreign Scan on ro_dlz.clb_global_product  (cost=100.42..52506.16 rows=923613 width=34) (actual time=41.683..2258.170 rows=923613 loops=1)
         Output: clb_global_product.cprd, clb_global_product.xtc_id, clb_global_product.rprd, clb_global_product.prdgalsts_id, clb_global_product.dlz_last_transaction_ts
         Remote SQL: SELECT cprd, xtc_id, rprd, prdgalsts_id, dlz_last_transaction_ts FROM ro_rsu.clb_global_product ORDER BY cprd ASC NULLS LAST
 Planning Time: 7.721 ms
 Execution Time: 2453.489 ms
(10 rows)

There is nothing wrong with that in general, but this plan gave a result when it shouldn’t:

rsup1=# SELECT * 
          FROM "rsu_adm"."data_2d_clb_global_product" f2
          LEFT OUTER JOIN "rsu_adm"."clb_global_product" f1 
            ON f1.cprd=f2.cprd
         WHERE f1.cprd is null;    
     cprd      | xtc_id |  rprd   | prdgalsts_id |   dlz_last_transaction_ts    | cprd | xtc_id | rprd | prdgalsts_id | dlz_last_transaction_ts 
---------------+--------+---------+--------------+------------------------------+------+--------+------+--------------+-------------------------
 0027033       |        |  179722 |            1 | 2023-04-03 06:15:09.45135+02 |      |        |      |              | 
 0112113       |        | 3199208 |            1 | 2023-04-03 06:15:09.45135+02 |      |        |      |              | 
 0116713       |        | 2071012 |            1 | 2023-04-03 06:15:09.45135+02 |      |        |      |              | 
...

Disabling merge joins, which causes PostgreSQL to go for a hash join, gave the correct result:

rsup1=# set enable_mergejoin = off;
SET
rsup1=# explain (analyze, verbose) SELECT * FROM "rsu_adm"."data_2d_clb_global_product" f2
LEFT OUTER JOIN "rsu_adm"."clb_global_product" f1 on f1.cprd=f2.cprd
WHERE f1.cprd is null;    
                                                                                                                  QUERY PLAN                                                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=60274.55..681118.72 rows=50713 width=122) (actual time=2432.850..2432.852 rows=0 loops=1)
   Output: f2.cprd, f2.xtc_id, f2.rprd, f2.prdgalsts_id, f2.dlz_last_transaction_ts, clb_global_product.cprd, clb_global_product.xtc_id, clb_global_product.rprd, clb_global_product.prdgalsts_id, clb_global_product.dlz_last_t
   Hash Cond: ((f2.cprd)::text = (clb_global_product.cprd)::text)
   ->  Seq Scan on rsu_adm.data_2d_clb_global_product f2  (cost=0.00..1768.26 rows=101426 width=34) (actual time=0.034..5.166 rows=101426 loops=1)
         Output: f2.cprd, f2.xtc_id, f2.rprd, f2.prdgalsts_id, f2.dlz_last_transaction_ts
   ->  Hash  (cost=41513.39..41513.39 rows=923613 width=34) (actual time=2297.904..2297.905 rows=923613 loops=1)
         Output: clb_global_product.cprd, clb_global_product.xtc_id, clb_global_product.rprd, clb_global_product.prdgalsts_id, clb_global_product.dlz_last_transaction_ts
         Buckets: 131072  Batches: 16  Memory Usage: 4666kB
         ->  Foreign Scan on ro_dlz.clb_global_product  (cost=100.00..41513.39 rows=923613 width=34) (actual time=33.391..2035.852 rows=923613 loops=1)
               Output: clb_global_product.cprd, clb_global_product.xtc_id, clb_global_product.rprd, clb_global_product.prdgalsts_id, clb_global_product.dlz_last_transaction_ts
               Remote SQL: SELECT cprd, xtc_id, rprd, prdgalsts_id, dlz_last_transaction_ts FROM ro_rsu.clb_global_product
 Planning Time: 6.572 ms
 Execution Time: 2433.502 ms
(13 rows)

rsup1=# SELECT * 
          FROM "rsu_adm"."data_2d_clb_global_product" f2
          LEFT OUTER JOIN "rsu_adm"."clb_global_product" f1 
            ON f1.cprd=f2.cprd
         WHERE f1.cprd is null;    
 cprd | xtc_id | rprd | prdgalsts_id | dlz_last_transaction_ts | cprd | xtc_id | rprd | prdgalsts_id | dlz_last_transaction_ts 
------+--------+------+--------------+-------------------------+------+--------+------+--------------+-------------------------
(0 rows)

rsup1=# 

No matter what I’ve tried, I was not able to reproduce this. On my local setup the merge join gave exactly the same result as the hash join.

After several hours of testing this and that, it became somehow clear that it was related to sorting. A merge join sorts both sides of the joins before returning the results. The sort, in this case, was on a text based column: Merge Cond: ((f2.cprd)::text = (clb_global_product.cprd)::text).

When we talk about sorting text based data in PostgreSQL this is always about collations. But when I’ve looked at the collations used in both the source and the target instance, both were configured in the same way and there is no other collation defined on the column itself:

-- source
rsup1=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 rsup1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

-- target
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 dlzp1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

What else could cause this weird behavior then? Finally I’ve asked the right question: What operating systems are these two PostgreSQL instances running on? It turned out that the source instance is running on RHEL 7.9 and the target instance is running on RHEL 8.7. Re-doing the same setup with those versions of RHEL I was able to re-produce it.

One that was clear it was almost obvious that it some relates to the versions of glibc.

RHEL 7 uses version 2.17:

postgres@rsu1p1:/home/postgres/ [src] cat /etc/os-release 
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.9"
PRETTY_NAME="Red Hat Enterprise Linux"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.9"
postgres@rsu1p1:/home/postgres/ [src] rpm -qa | grep glibc
glibc-headers-2.17-326.el7_9.x86_64
glibc-common-2.17-326.el7_9.x86_64
glibc-2.17-326.el7_9.x86_64
glibc-devel-2.17-326.el7_9.x86_64

RHEL 8 comes with version 2.28:

postgres@dlzp1:/home/postgres/ [tgt] cat /etc/os-release 
NAME="Red Hat Enterprise Linux"
VERSION="8.7 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.7"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.7 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.7
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.7"
13:08:10 postgres@dlzp1:/home/postgres/ [tgt] rpm -qa | grep glibc
glibc-langpack-en-2.28-211.el8.x86_64
glibc-devel-2.28-211.el8.x86_64
glibc-2.28-211.el8.x86_64
glibc-common-2.28-211.el8.x86_64
glibc-gconv-extra-2.28-211.el8.x86_64

This can easily be proven by sorting these simple strings on both versions:

-- source, RHEL 7.8
CREATE TABLE
postgres=# INSERT INTO t1 VALUES ('1-a'), ('1a'), ('1-aa');
INSERT 0 3
postgres=# SELECT * FROM t1 ORDER BY c1;
  c1  
------
 1a
 1-a
 1-aa
(3 rows)

-- target, RHEL 8.7
postgres=# CREATE TABLE t1 (c1 varchar PRIMARY KEY);
CREATE TABLE
postgres=# INSERT INTO t1 VALUES ('1-a'), ('1a'), ('1-aa');
INSERT 0 3
postgres=# SELECT * FROM t1 ORDER BY c1;
  c1  
------
 1-a
 1a
 1-aa
(3 rows)

Lessons learned: If you connect two instances with the foreign data wrapper for PostgreSQL, make sure that both operating systems come with the same version of glibc, otherwise you might not get what you expect.