I’m currently working on a customer project where we have two old Lenovo servers, one ODA X7-2M and one ODA X8-2M. In the new project, we are replacing the two old Lenovo servers with two new ODA X9-2L. The production database, let’s call it DB01, is a Standard Edition database with dbvisit Standby as its disaster recovery solution. With each project, we take the opportunity to upgrade the dbvisit Standby software to the latest available version. In 2022, I upgraded dbvisit from v8 to v10, and recently to v11.

dbvisit Standby upgrade to v10

In 2022, when upgrading dbvisit Standby from v8 to v10, I faced a dbvnet port problem for the standby database running on the old Lenovo server. For the standby databases running on the ODAs, there was no problem at all.

The problem already showed up when trying to upgrade the DDC configuration to v10:

oracle@srv01:/u01/app/dbvisit/standby/ [LX] ./dbvctl -d DB01STD2 -o upgrade
=============================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2) (pid 72540)
dbvctl started on srv01: Tue Sep 13 20:29:20 2022
=============================================================



=========================================================

     Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2)
           http://www.dbvisit.com

=========================================================

=>dbvctl only needs to be run on the primary server.

Is this the primary server?  [Yes]:

>>> DDC file DB01STD2 version: 8.0.26

>>> Failed to copy /u01/app/dbvisit/standby/conf/dbv_DB01STD2.env to standby: Problem with
    /u01/app/dbvisit/dbvnet/dbvnet   /u01/app/dbvisit/standby/conf/dbv_DB01STD2.env
    srv02:/u01/app/dbvisit/standby/conf/dbv_DB01STD2.env. Ensure
    /u01/app/dbvisit/dbvnet/dbvnet is configured correctly without needing a password or
    passphrase, the file exists and the remote directory /u01/app/dbvisit/standby/conf on
    srv02 exists.  time="2022-09-13T20:29:33+02:00" level=info msg="dbvnet: error:
    unable to connect: dial tcp X.X.X.43:7890: connect: connection refused"

>>> DDC file DB01STD2 upgraded to version 10.2.0.

>>> Dbvisit Database repository (DDR) up to date, no upgrade required.

=============================================================
dbvctl ended on srv01: Tue Sep 13 20:29:33 2022
=============================================================

Although the dbvnet process was up and running on port 7890 on the standby server:

[root@srv02 ~]# ps -ef | grep -i dbv | grep -v grep
oracle     2049      1  0 20:57 ?        00:00:00 /u01/app/dbvisit/dbvagent/dbvagent -d start
oracle     2077      1  0 20:57 ?        00:00:00 /u01/app/dbvisit/dbvnet/dbvnet -d start

it was impossible to connect to the standby server on the standard port:

oracle@srv01:/u01/app/dbvisit/standby/conf/ [LX] nc -zv X.X.X.43 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.

And no firewall was in use.
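
As a side note, the same reachability test can be done with bash alone when nc is not installed. The following is only a minimal sketch: the function name port_open is hypothetical, and it assumes a bash built with /dev/tcp support and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper name)
# Returns 0 if a TCP connection can be established, non-zero otherwise.
# Relies on bash's /dev/tcp redirection, so nc is not required.
port_open() {
    local host=$1 port=$2 wait=${3:-3}
    # $0 and $1 inside the child shell are the host and port passed below
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example with the host and port from this post:
# port_open srv02 7890 && echo "dbvnet reachable" || echo "dbvnet NOT reachable"
```

A refused connection and a timeout both give a non-zero return code, so the check cannot tell a closed port from a filtered one.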

At that time, for the configurations involving the Lenovo servers as standby, we decided to switch the communication to SSH on port 22 instead of dbvnet on its standard port 7890. This worked around the problem at the time.

dbvisit Standby upgrade to v11

This year, upgrading dbvisit Standby from v10 to the latest StandbyMP v11 version meant first solving the dbvnet problem with port 7890 on the Lenovo servers. This was mandatory, because the latest StandbyMP version no longer supports connections over port 22.

When upgrading dbvisit, the primary database was running on srv02. srv01 was the other old Lenovo server, running a standby database with DDC configuration DB01STD2. The two ODAs were running standby databases with DDC configurations DB01STD1 and DB01STD3.

As we can see, the DDC configurations for the standby databases running on the ODAs were still using dbvnet port 7890:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] grep -i '^RSH\|^CP\|NETPORT\|NETPORT_DR' dbv_DB01STD3.env
# NETPORT             - Dbvnet port on primary server(default 7890)
NETPORT = 7890
# NETPORT_DR          - Dbvnet port on standby server(default 7890)
NETPORT_DR = 7890
CP =
RSH =

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] grep -i '^RSH\|^CP\|NETPORT\|NETPORT_DR' dbv_DB01STD1.env
# NETPORT             - Dbvnet port on primary server(default 7890)
NETPORT = 7890
# NETPORT_DR          - Dbvnet port on standby server(default 7890)
NETPORT_DR = 7890
CP =
RSH =

Whereas the DDC configuration for the standby database running on the other Lenovo server, srv01, was using SSH on port 22:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] grep -i '^RSH\|^CP\|NETPORT\|NETPORT_DR' dbv_DB01STD2.env
# NETPORT             - Dbvnet port on primary server(default 7890)
NETPORT = 22
# NETPORT_DR          - Dbvnet port on standby server(default 7890)
NETPORT_DR = 22
CP = /usr/bin/scp
RSH = /usr/bin/ssh
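
With more than a handful of DDC files, the grep above can be turned into a small check that flags the configurations still pointing at SSH. This is only a sketch under an assumption taken from the three files shown here: dbvnet-based configurations leave RSH and CP empty and use a port other than 22. The function name ddc_uses_ssh is hypothetical.

```shell
#!/usr/bin/env bash
# ddc_uses_ssh FILE   (hypothetical helper name)
# Returns 0 if the DDC file looks SSH-based, i.e. NETPORT or NETPORT_DR is 22,
# or RSH / CP is set to a non-empty value. Returns 1 otherwise.
ddc_uses_ssh() {
    awk -F'=' '
        /^[ \t]*#/ { next }                       # skip comment lines
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)                # strip blanks around the key
            gsub(/^[ \t]+|[ \t]+$/, "", val)      # trim the value
            if ((key == "NETPORT" || key == "NETPORT_DR") && val == "22") found = 1
            if ((key == "RSH" || key == "CP") && val != "") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example over the conf directory used in this post:
# for f in /u01/app/dbvisit/standby/conf/dbv_*.env; do
#     ddc_uses_ssh "$f" && echo "$f still uses SSH"
# done
```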

I changed the DB01STD2 configuration back to dbvnet on port 7890:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] cp -p dbv_DB01STD2.env dbv_DB01STD2.env.202312291139
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] vi dbv_DB01STD2.env
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] diff dbv_DB01STD2.env dbv_DB01STD2.env.202312291139
61c61
< NETPORT = 7890
---
> NETPORT = 22
103c103
< NETPORT_DR = 7890
---
> NETPORT_DR = 22
419c419
< CP =
---
> CP = /usr/bin/scp
432c432
< RSH =
---
> RSH = /usr/bin/ssh

But of course, synchronising the configuration with the standby server srv01 failed with the same error as in 2022:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] ../dbvctl -d DB01STD2 -C
=============================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2) (pid 224727)
dbvctl started on srv02: Tue Dec 19 11:41:06 2023
=============================================================


<<<< Dbvisit Standby terminated >>>>

PID:224727
TRACEFILE:224727_dbvctl_DB01STD2_202312191141.trc
SERVER:srv02
ERROR_CODE:1
Connection to srv01 failed.
No initial contact can be made or remote command cannot be run
Please check network connection and review the settings:
        RSH=
        DEST_SERVER=srv01
        DEST_NETPORT=7890
        DBVISIT_BASE=/u01/app/dbvisit

time="2023-12-19T11:41:16+01:00" level=info msg="dbvnet: error: unable to connect: dial
tcp X.X.X.42:7890: connect: connection refused"


>>>> Dbvisit Standby terminated <<<<

I confirmed that I could not reach the standby server on port 7890, either by IP address or by hostname:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] nc -zv X.X.X.42 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] nc -zv srv01 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.

After some further troubleshooting, I realized that dbvnet was listening on port 7890 only on the loopback address 127.0.0.1:

[root@srv01 ~]# ss -tulpn | grep 7890
tcp    LISTEN     0      128    127.0.0.1:7890                  *:*                   users:(("dbvnet",pid=1712,fd=0))
[root@srv01 ~]#

This was definitely the problem. But why?

The dbvnet process uses the hostname as its listener address and 7890 as its listener port:

oracle@srv01:/home/oracle/ [rdbms12201] cd /u01/app/dbvisit/dbvnet/conf/
oracle@srv01:/u01/app/dbvisit/dbvnet/conf/ [rdbms12201] cat dbvnetd.conf
[general]
listener_address=srv01
listener_port=7890
passphrase=encrypt


certkey=

debug=3
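
Since the listener address is a hostname, it is worth checking what that name resolves to on the server itself. Here is a minimal sketch, assuming a flat key=value file like the one above and the standard getent command. The helper names conf_value and is_loopback are hypothetical.

```shell
#!/usr/bin/env bash
# conf_value FILE KEY   (hypothetical helper name)
# Prints the value of KEY from a simple key=value file such as dbvnetd.conf.
conf_value() {
    awk -F'=' -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# is_loopback ADDRESS   (hypothetical helper name)
# Returns 0 for 127.x.x.x or ::1, 1 for anything else.
is_loopback() {
    case $1 in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example, run on the standby server:
# name=$(conf_value /u01/app/dbvisit/dbvnet/conf/dbvnetd.conf listener_address)
# addr=$(getent hosts "$name" | awk '{ print $1; exit }')
# is_loopback "$addr" && echo "WARNING: $name resolves to $addr"
```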


Solution

The problem came from a bad /etc/hosts configuration on the old Lenovo servers.

As we can see in /etc/hosts:

[root@srv01 ~]# cat /etc/hosts
127.0.0.1   srv01 srv01.customer.local localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

# Productive DB Servers
X.X.X.56      db01vip           db01vip.customer.local
X.X.X.77      dbvisitconsole01   dbvisitconsole01.customer.local

X.X.X.43      srv02        srv02.customer.local
Y.Y.Y.43  srv02_arch

X.X.X.42      srv01        srv01.customer.local
Y.Y.Y.42  srv01_arch

10.50.7.11      srv03        srv03.customer.local
# Y.Y.Y.  srv03_arch

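the hostname is incorrectly set on the localhost line, so srv01 resolves to 127.0.0.1 and dbvnet, which uses the hostname as its listener address, ends up listening on the loopback interface only.

I updated the /etc/hosts file accordingly on both Lenovo servers, srv01 and srv02, as follows: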
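
This kind of misconfiguration is easy to miss by eye, so it can be detected with a few lines of awk. The sketch below prints every name attached to a loopback address that is not a localhost alias. The function name loopback_names is hypothetical, and the check assumes localhost aliases all start with "localhost", as they do in the file above.

```shell
#!/usr/bin/env bash
# loopback_names HOSTS_FILE   (hypothetical helper name)
# Prints every name mapped to 127.x.x.x or ::1 that is not a localhost alias.
loopback_names() {
    awk '
        { sub(/#.*/, "") }                        # drop comments, fields are re-split
        $1 ~ /^127\./ || $1 == "::1" {
            for (i = 2; i <= NF; i++)
                if ($i !~ /^localhost/) print $i
        }
    ' "$1"
}

# Example:
# loopback_names /etc/hosts
```

Run against the file shown above, it would report srv01 and srv01.customer.local. An empty output means no real hostname is attached to a loopback address.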

[root@srv01 ~]# vi /etc/hosts
[root@srv01 ~]# cat /etc/hosts
#127.0.0.1   srv01 srv01.customer.local localhost localhost.localdomain localhost4 localhost4.localdomain4
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

# Productive DB Servers
X.X.X.56      db01vip           db01vip.customer.local
X.X.X.77      dbvisitconsole01   dbvisitconsole01.customer.local

X.X.X.43      srv02        srv02.customer.local
Y.Y.Y.43  srv02_arch

X.X.X.42      srv01        srv01.customer.local
Y.Y.Y.42  srv01_arch

10.50.7.11      srv03        srv03.customer.local
# Y.Y.Y.  srv03_arch

[root@srv01 ~]#

I restarted the dbvnet service on both Lenovo servers, srv01 and srv02:

[root@srv01 ~]# systemctl stop dbvnet
[root@srv01 ~]# ps -ef | grep [d]bv
[root@srv01 ~]# systemctl start dbvnet
[root@srv01 ~]# ps -ef | grep [d]bv
oracle    57869      1  0 11:58 ?        00:00:00 /u01/app/dbvisit/dbvnet/dbvnet -d start

I could then confirm that the server was now listening on port 7890 on the right IP address, and no longer on the loopback one:

[root@srv01 ~]# ss -tulpn | grep 7890
tcp    LISTEN     0      128    X.X.X.42:7890                  *:*                   users:(("dbvnet",pid=57869,fd=0))
[root@srv01 ~]#
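
The before/after comparison of the ss output can also be scripted, for example as a check after each restart. This is only a sketch: the function name loopback_only is hypothetical, and it assumes ss output in the column layout shown above, with or without the leading Netid column.

```shell
#!/usr/bin/env bash
# loopback_only PORT   (hypothetical helper name)
# Reads `ss -tln` or `ss -tuln` style output on stdin and prints the local
# address of every listener on PORT bound to a loopback address.
loopback_only() {
    awk -v port="$1" '
        $1 == "LISTEN" || $2 == "LISTEN" {
            # the local address is the first field ending in ":<port>"
            for (i = 1; i <= NF; i++) {
                if ($i ~ (":" port "$")) {
                    if ($i ~ /^127\./ || $i ~ /^\[::1\]/) print $i
                    break
                }
            }
        }
    '
}

# Example:
# ss -tuln | loopback_only 7890
```

An empty result after the restart means dbvnet is no longer bound to the loopback address on that port.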

The standby server now accepted connections on port 7890 when addressed by its hostname:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] nc -zv srv01 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to X.X.X.42:7890.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

And I could now synchronise the configuration with the standby server:

oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] ../dbvctl -d DB01STD2 -C
=============================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2) (pid 252908)
dbvctl started on srv02: Tue Dec 19 11:59:00 2023
=============================================================

>>> Dbvisit Standby configurational differences found between srv02 and srv01.
    Synchronised.

=============================================================
dbvctl ended on srv02: Tue Dec 19 11:59:01 2023
=============================================================

With that, everything was in place to successfully complete the upgrade to the latest dbvisit StandbyMP v11 version.