I’m currently working on a customer project with two old Lenovo servers, one ODA X7-2M and one ODA X8-2M. In the new project, we are replacing the two old Lenovo servers with two new ODA X9-2L. The production database, let’s call it DB01, is a Standard Edition database with dbvisit Standby as the disaster recovery solution. With each project, we take the opportunity to upgrade the dbvisit Standby software to the latest available version. In 2022, I upgraded dbvisit from v8 to v10, and recently to v11.
dbvisit Standby upgrade to v10
In 2022, when upgrading dbvisit Standby from v8 to v10, I faced a dbvnet port problem for the standby database running on the old Lenovo Server. For the standby databases running on the ODA there was absolutely no problem.
The problem already appeared when trying to upgrade the DDC configuration to the new v10 version:
oracle@srv01:/u01/app/dbvisit/standby/ [LX] ./dbvctl -d DB01STD2 -o upgrade
=============================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2) (pid 72540)
dbvctl started on srv01: Tue Sep 13 20:29:20 2022
=============================================================
=========================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2)
http://www.dbvisit.com
=========================================================
=>dbvctl only needs to be run on the primary server. Is this the primary server? [Yes]:
>>> DDC file DB01STD2 version: 8.0.26
>>> Failed to copy /u01/app/dbvisit/standby/conf/dbv_DB01STD2.env to standby: Problem with /u01/app/dbvisit/dbvnet/dbvnet /u01/app/dbvisit/standby/conf/dbv_DB01STD2.env srv02:/u01/app/dbvisit/standby/conf/dbv_DB01STD2.env. Ensure /u01/app/dbvisit/dbvnet/dbvnet is configured correctly without needing a password or passphrase, the file exists and the remote directory /u01/app/dbvisit/standby/conf on srv02 exists.
time="2022-09-13T20:29:33+02:00" level=info msg="dbvnet: error: unable to connect: dial tcp X.X.X.43:7890: connect: connection refused"
>>> DDC file DB01STD2 upgraded to version 10.2.0.
>>> Dbvisit Database repository (DDR) up to date, no upgrade required.
=============================================================
dbvctl ended on srv01: Tue Sep 13 20:29:33 2022
=============================================================
The dbvnet process was nevertheless up and running on port 7890 on the standby server:
[root@srv02 ~]# ps -ef | grep -i dbv | grep -v grep
oracle 2049 1 0 20:57 ? 00:00:00 /u01/app/dbvisit/dbvagent/dbvagent -d start
oracle 2077 1 0 20:57 ? 00:00:00 /u01/app/dbvisit/dbvnet/dbvnet -d start
Even so, it was impossible to connect to the standby server on the standard port:
oracle@srv01:/u01/app/dbvisit/standby/conf/ [LX] nc -zv X.X.X.43 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
And no firewall was in use.
At that time, for the configurations involving the Lenovo servers as standby, we decided to switch the communication to ssh instead of the standard dbvnet port, that is port 22 instead of port 7890. This resolved the problem at the time.
dbvisit Standby upgrade to v11
This year, upgrading dbvisit Standby v10 to the latest StandbyMP v11 version meant first solving the dbvnet problem with port 7890 on the Lenovo servers. This was mandatory, since the latest StandbyMP version no longer supports connections over port 22.
When upgrading dbvisit, the primary database was running on srv02. srv01, the other old Lenovo server, was running a standby database with DDC configuration DB01STD2. The two ODAs were running standby databases with DDC configurations DB01STD1 and DB01STD3.
As we can see, the DDC configurations for the standby databases running on the ODAs were still using dbvnet port 7890:
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] grep -i '^RSH\|^CP\|NETPORT\|NETPORT_DR' dbv_DB01STD3.env
# NETPORT - Dbvnet port on primary server(default 7890)
NETPORT = 7890
# NETPORT_DR - Dbvnet port on standby server(default 7890)
NETPORT_DR = 7890
CP =
RSH =
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] grep -i '^RSH\|^CP\|NETPORT\|NETPORT_DR' dbv_DB01STD1.env
# NETPORT - Dbvnet port on primary server(default 7890)
NETPORT = 7890
# NETPORT_DR - Dbvnet port on standby server(default 7890)
NETPORT_DR = 7890
CP =
RSH =
Whereas the DDC configuration for the standby database running on the other Lenovo server, srv01, was using port 22 with ssh and scp:
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] grep -i '^RSH\|^CP\|NETPORT\|NETPORT_DR' dbv_DB01STD2.env
# NETPORT - Dbvnet port on primary server(default 7890)
NETPORT = 22
# NETPORT_DR - Dbvnet port on standby server(default 7890)
NETPORT_DR = 22
CP = /usr/bin/scp
RSH = /usr/bin/ssh
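With several DDC files in the same directory, the check above can be scripted. The following is only a minimal sketch: the function names ddc_get and ddc_uses_ssh are made up for this example and are not part of dbvisit, and it assumes, as in the files shown above, that settings are written as "KEY = value" and that a non-empty RSH or CP marks an ssh-based configuration.

```shell
# Print the value of one setting from a DDC file (lines of the form "KEY = value").
# Comment lines starting with '#' are skipped; nothing is printed if the key is absent.
# Hypothetical helper, not a dbvisit command.
ddc_get() {
    # $1 = path to the DDC file, $2 = exact setting name
    awk -v key="$2" '
        /^[ \t]*#/ { next }
        {
            eq = index($0, "=")
            if (eq == 0) next
            name  = substr($0, 1, eq - 1)
            value = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }
    ' "$1"
}

# Succeed if the DDC file still relies on ssh/scp.
# Assumption: a non-empty RSH or CP means ssh transport, as in dbv_DB01STD2.env above.
ddc_uses_ssh() {
    [ -n "$(ddc_get "$1" RSH)" ] || [ -n "$(ddc_get "$1" CP)" ]
}
```

For example, looping over dbv_*.env in the conf directory and calling ddc_uses_ssh on each file would list the configurations that still need to be moved back to the dbvnet port before a v11 upgrade.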
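I changed the DB01STD2 configuration back to dbvnet on port 7890, keeping a backup of the previous file (the values listed by the diff are the old ones from that backup):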
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] cp -p dbv_DB01STD2.env dbv_DB01STD2.env.202312291139
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] vi dbv_DB01STD2.env
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] diff dbv_DB01STD2.env dbv_DB01STD2.env.202312291139
61c61
NETPORT = 22
103c103
NETPORT_DR = 22
419c419
CP = /usr/bin/scp
432c432
RSH = /usr/bin/ssh
But of course, synchronising the configuration with the srv01 standby server failed with the same error as in 2022:
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] ../dbvctl -d DB01STD2 -C
=============================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2) (pid 224727)
dbvctl started on srv02: Tue Dec 19 11:41:06 2023
=============================================================
<<<>>>
PID:224727
TRACEFILE:224727_dbvctl_DB01STD2_202312191141.trc
SERVER:srv02
ERROR_CODE:1
Connection to srv01 failed. No initial contact can be made or remote command cannot be run
Please check network connection and review the settings:
RSH=
DEST_SERVER=srv01
DEST_NETPORT=7890
DBVISIT_BASE=/u01/app/dbvisit
time="2023-12-19T11:41:16+01:00" level=info msg="dbvnet: error: unable to connect: dial tcp X.X.X.42:7890: connect: connection refused"
>>>> Dbvisit Standby terminated <<<<
I confirmed that I could not reach the standby server on port 7890:
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] nc -zv X.X.X.42 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] nc -zv srv01 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
With some further troubleshooting, I realized that the server was listening on port 7890 only on the loopback address 127.0.0.1:
[root@srv01 ~]# ss -tulpn | grep 7890
tcp LISTEN 0 128 127.0.0.1:7890 *:* users:(("dbvnet",pid=1712,fd=0))
[root@srv01 ~]#
That was definitely the problem. But why?
The dbvnet process uses the hostname as listener address and 7890 as listener port:
oracle@srv01:/home/oracle/ [rdbms12201] cd /u01/app/dbvisit/dbvnet/conf/
oracle@srv01:/u01/app/dbvisit/dbvnet/conf/ [rdbms12201] cat dbvnetd.conf
[general]
listener_address=srv01
listener_port=7890
passphrase=encrypt
certkey=
debug=3
Solution
The problem came from a bad /etc/hosts configuration on the old Lenovo servers.
As we can see in the /etc/hosts:
[root@srv01 ~]# cat /etc/hosts
127.0.0.1 srv01 srv01.customer.local localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# Productive DB Servers
X.X.X.56 db01vip db01vip.customer.local
X.X.X.77 dbvisitconsole01 dbvisitconsole01.customer.local
X.X.X.43 srv02 srv02.customer.local
Y.Y.Y.43 srv02_arch
X.X.X.42 srv01 srv01.customer.local
Y.Y.Y.42 srv01_arch
10.50.7.11 srv03 srv03.customer.local
# Y.Y.Y. srv03_arch
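the hostname is incorrectly set on the localhost line. As a result, srv01 resolves to 127.0.0.1 first, and dbvnet, which listens on the address of its hostname, binds to the loopback interface only.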
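This kind of misconfiguration is easy to detect before an upgrade. The sketch below is an illustration only: hosts_lookup and resolves_to_loopback are hypothetical helper names, and it assumes the resolver returns the first matching line of the hosts file, which is the usual behaviour when /etc/hosts is consulted first.

```shell
# Print the first address that a hosts-format file maps to the given name.
# Hypothetical helper: it mimics a first-match lookup and ignores DNS entirely.
hosts_lookup() {
    # $1 = hosts file, $2 = host name or alias
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments, including commented-out entries
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# Succeed if the name maps to a loopback address in that file.
resolves_to_loopback() {
    case "$(hosts_lookup "$1" "$2")" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}
```

On a server, resolves_to_loopback /etc/hosts "$(hostname -s)" succeeding would point to the situation described here: any service that binds to the address of its own hostname ends up on the loopback interface.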
I updated the /etc/hosts configuration file accordingly on both Lenovo servers, srv01 and srv02, as follows:
[root@srv01 ~]# vi /etc/hosts
[root@srv01 ~]# cat /etc/hosts
#127.0.0.1 srv01 srv01.customer.local localhost localhost.localdomain localhost4 localhost4.localdomain4
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# Productive DB Servers
X.X.X.56 db01vip db01vip.customer.local
X.X.X.77 dbvisitconsole01 dbvisitconsole01.customer.local
X.X.X.43 srv02 srv02.customer.local
Y.Y.Y.43 srv02_arch
X.X.X.42 srv01 srv01.customer.local
Y.Y.Y.42 srv01_arch
10.50.7.11 srv03 srv03.customer.local
# Y.Y.Y. srv03_arch
[root@srv01 ~]#
I restarted the dbvnet service on both Lenovo servers, srv01 and srv02:
[root@srv01 ~]# systemctl stop dbvnet
[root@srv01 ~]# ps -ef | grep [d]bv
[root@srv01 ~]# systemctl start dbvnet
[root@srv01 ~]# ps -ef | grep [d]bv
oracle 57869 1 0 11:58 ? 00:00:00 /u01/app/dbvisit/dbvnet/dbvnet -d start
And I could confirm that the server was now listening on port 7890 on the right IP address (and no longer on the loopback one):
[root@srv01 ~]# ss -tulpn | grep 7890
tcp LISTEN 0 128 X.X.X.42:7890 *:* users:(("dbvnet",pid=57869,fd=0))
[root@srv01 ~]#
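The same verification can be automated, for instance after every dbvnet restart. This is a rough sketch with hypothetical function names (listen_addrs, listens_externally); it assumes the column layout of the ss -tulpn output shown above, where the fifth field is the local address and port.

```shell
# Read `ss -tulpn` output on stdin and print the local address of every
# TCP listener bound to the given port.
# Assumption: field 5 is "address:port", as in the output shown above.
listen_addrs() {
    awk -v port="$1" '
        $1 == "tcp" && $2 == "LISTEN" {
            n = split($5, parts, ":")
            if (parts[n] == port)
                # everything before the last ":" is the address
                print substr($5, 1, length($5) - length(parts[n]) - 1)
        }
    '
}

# Succeed only if at least one listener on the port is bound to something
# other than a loopback address, i.e. is reachable from other hosts.
listens_externally() {
    listen_addrs "$1" | awk '
        $0 !~ /^127\./ && $0 != "::1" && $0 != "[::1]" { found = 1 }
        END { exit !found }
    '
}
```

A call such as ss -tulpn | listens_externally 7890 returns a non-zero status in the situation seen before the /etc/hosts fix (listener on 127.0.0.1 only) and zero once dbvnet is bound to the server address.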
The standby server now accepted connections on the expected port when reached through its hostname:
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] nc -zv srv01 7890
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to X.X.X.42:7890.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
And I could now synchronise the configuration with the standby server:
oracle@srv02:/u01/app/dbvisit/standby/conf/ [DB01] ../dbvctl -d DB01STD2 -C
=============================================================
Dbvisit Standby Database Technology (10.2.0_0_gd69c33d2) (pid 252908)
dbvctl started on srv02: Tue Dec 19 11:59:00 2023
=============================================================
>>> Dbvisit Standby configurational differences found between srv02 and srv01. Synchronised.
=============================================================
dbvctl ended on srv02: Tue Dec 19 11:59:01 2023
=============================================================
Everything was then in place to successfully complete the dbvisit StandbyMP upgrade to the latest v11 version.