I tried to apply a patch to my 18.3.0 GI/ASM two-node cluster on RHEL 7.5.
The first node worked fine, but the second node always got an error…
Environment:
Server Node1: dbserver01
Server Node2: dbserver02
Oracle Version: 18.3.0 with PSU OCT 2018 ==> 28660077
Patch to be installed: 28655784 (RU 18.4.0.0)
First node (dbserver01)
Everything fine:
cd ${ORACLE_HOME}/OPatch
sudo ./opatchauto apply /tmp/28655784/
...
Successful
Second node (dbserver02)
Same command but different output:
cd ${ORACLE_HOME}/OPatch
sudo ./opatchauto apply /tmp/28655784/
...
Remote command execution failed due to No ECDSA host key is known for dbserver01 and you have requested strict checking.
Host key verification failed.
Command output:
OPATCHAUTO-72050: System instance creation failed.
OPATCHAUTO-72050: Failed while retrieving system information.
OPATCHAUTO-72050: Please check log file for more details.
After playing around with the keys I found out that the host keys also had to be exchanged for root.
So I connected as root and made an ssh from dbserver01 to dbserver02 and from dbserver02 to dbserver01.
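For reference, this is roughly what that looks like (the remote command is arbitrary; it only serves to trigger the host key prompt, which you answer with "yes"):

# as root on dbserver01: accept dbserver02's host key into /root/.ssh/known_hosts
ssh dbserver02 hostname
# as root on dbserver02: the same in the other direction
ssh dbserver01 hostname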
After I exchanged the host keys the error message changed:
Remote command execution failed due to Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Command output:
OPATCHAUTO-72050: System instance creation failed.
OPATCHAUTO-72050: Failed while retrieving system information.
OPATCHAUTO-72050: Please check log file for more details.
So I investigated the log file a little further, and the statement with the error was:
/bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o NumberOfPasswordPrompts=0 dbserver01 \
  /bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyChecking=yes -o NumberOfPasswordPrompts=0 dbserver01 \
  /u00/app/oracle/product/18.3.0/dbhome_1//perl/bin/perl /u00/app/oracle/product/18.3.0/dbhome_1/OPatch/auto/database/bin/RemoteHostExecutor.pl \
  -GRID_HOME=/u00/app/oracle/product/18.3.0/grid_1 \
  -OBJECTLOC=/u00/app/oracle/product/18.3.0/dbhome_1//cfgtoollogs/opatchautodb/hostdata.obj \
  -CRS_ACTION=get_all_homes \
  -CLUSTERNODES=dbserver01,dbserver02,dbserver02 \
  -JVM_HANDLER=oracle/dbsysmodel/driver/sdk/productdriver/remote/RemoteOperationHelper
Soooooo: dbserver02 starts an ssh session to dbserver01 and from there an additional session to dbserver01 (itself).
I don't know why, but it is what it is… after I did a key exchange from dbserver01 (root) to dbserver01 (root), the patching worked fine.
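In other words, the missing piece was a single self-directed ssh (again, the remote command is arbitrary and only triggers the host key prompt):

# as root on dbserver01: ssh to the local host's own name once, so that
# dbserver01's host key lands in its own /root/.ssh/known_hosts
ssh dbserver01 hostname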
At the moment I cannot remember ever having had to do a key exchange from the root user to the very same host.
Did you run into the same problem, or do you know a better way to do this? Write me a comment!
Seb
23.04.2024
Hi,
I've encountered this behaviour too, when applying the fix for 36114443: at some point opatchauto runs the following command:
/bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o [...] ls
When /root/.ssh/known_hosts on the other node does not exist or contains no reference to the other cluster node, the command returns failure code 1, but *also and at the same time* updates known_hosts (the message "Warning: Permanently added 'host,IP address' to the list of known hosts" is displayed), so re-running opatchauto right away works...
To anticipate and prevent further opatchauto failures, we've updated all our .ssh/known_hosts files under /root on the machines that are cluster members, and since then we haven't run into this problem any longer.
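One non-interactive way to pre-populate those files would be ssh-keyscan, run as root on every cluster member; a sketch, assuming the two node names from this post (each node scans all nodes, including itself, because of the self-ssh behaviour described above):

# collect the ECDSA host keys of all cluster nodes up front
ssh-keyscan -t ecdsa dbserver01 dbserver02 >> /root/.ssh/known_hosts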
Clemens Bleile
17.05.2024
Thanks Seb, appreciate your info.
Regards
Clemens