By William Sescu
If you are using RMAN to back up your database, you are already doing the right thing. However, RMAN does not take care of everything. e.g. it is not backing up the following things, to mention just a few.
- Oracle Home
- Grid Infrastructure Home
- Data Guard broker files
- Password File
- SQL*Net file like ldap.ora, sqlnet.ora, tnsnames.ora and listener.ora
- /etc/oratab
- OS audit files
- Wallets
- /etc/sysctl.conf and limits.conf
- OLR and OCR
- Voting Disks
- ASM Metadata
- passwd, shadow, group
- RMAN scripts itself ksh/rcv (some exceptions when using the RMAN catalog)
- crontab
There are for sure many more, but in this article I would like to take a closer look at the Data Guard broker configuraiton files. The DataGuard Broker Configuration Files are quite important files. The contain entries that describe the state and the properties of the DataGuard configuration like the sites and databases that are part of the configuration, the roles and properties of each of the databases, and the state of each of the elements of the configuration. The broker configuration files are very small, in my case they have only 12K. Don’t expect them to grow very big. They usually stay at this size.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 09:27 dr1DBIT121_SITE1.dat -rw-r----- 1 oracle oinstall 12K Feb 28 09:34 dr2DBIT121_SITE1.dat
You can have a maximum of two different copies which can be configured via the DG_BROKER_CONFIG_FILEn (where n = 1, 2) parameter. Maybe, maximum is not the correct word in this context, because if you don’t specify the broker file location, Oracle still creates one in the default directory which is OS dependent. On Linux they end up in $ORACLE_HOME/dbs/.
However, not like you might expect, the broker configuration files are not a 1:1 mirror like e.g. redo log members or controlfiles. They are individual copies and they maintain the last known good state of the configuration. You can check it quite easily yourself by editing e.g. the FastStartFailoverThreshold and check the time stamps of the broker files afterwards.
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold='40'; Property "faststartfailoverthreshold" updated -- Primary oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -l dr* -rw-r----- 1 oracle oinstall 12288 Feb 28 09:27 dr1DBIT121_SITE1.dat -rw-r----- 1 oracle oinstall 12288 Feb 28 09:34 dr2DBIT121_SITE1.dat -- Standby oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -l dr* -rw-r----- 1 oracle oinstall 12288 Feb 28 09:34 dr1DBIT121_SITE2.dat -rw-r----- 1 oracle oinstall 12288 Feb 28 09:27 dr2DBIT121_SITE2.dat
As you can see here, the broker configuration files have two different time stamps. One with the last know good state at 09:27 and one with new state and 09:34. When the broker is started for the first time, it is expected that you see only one configuration file. But don’t panic, the other will be created by the next updates done by the broker.
The broker configuration file is a binary file, however the readable contents of that file can be extracted via the strings command. It gives an idea of some contents of the broker file. e.g. it shows you that the broker config file is brought to you by the fine folks from NEDC, whoever they are. 🙂 With 12cR1 is looks like the following.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] file dr2DBIT121_SITE1.dat dr2DBIT121_SITE1.dat: data oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] strings dr2DBIT121_SITE1.dat }|{z DBIT121 DBIT121_SITE1 DBIT121_SITE2 DBIT121_SITE2 Brought to you by the fine folks at NEDC. TRUETRUEdbidg03ALLDBIT121FALSECONTINUE DBIT121_SITE1DBIT121_SITE1FALSEFALSEFALSESYNCoptionalDISABLEON1,1948637,931101276,0*AUTO0,0,00,0,0AUTODBIT121_SITE2 dbidg01DBIT121DBIT121(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dbidg01)(PORT=1521))(CONNECT_DATA=(SERVICE_NA ME=DBIT121_SITE1_DGMGRL)(INSTANCE_NAME=DBIT121)(SERVER=DEDICATED)))USE_DB_RECOVERY_FILE_DEST%t_%s_%r.dbf4 DBIT121_SITE2DBIT121_SITE2FALSEFALSEFALSESYNCoptionalDISABLEON1,1948637,931101276,0*DBIT121AUTO0,0,00,0,0AUTODBIT121_SITE1 dbidg02DBIT121DBIT121(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dbidg02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=DBIT121_SITE2_DGMGRL)(INSTANCE_NAME=DBIT121)(SERVER=DEDICATED)))USE_DB_RECOVERY_FILE_DEST%t_%s_%r.dbf
Things are changing with Oracle 12cR2. The broker config files are still binary files, however the readable content via the strings command is very good now. You can get a lot of very useful information out of it. And even with Oracle 12cR2, it looks like that it is still brought to you by the same fine folks from NEDC. 🙂 Looks like that the Oracle developers have a sort of humor. 🙂
oracle@dbidg01:/u01/app/oracle/admin/DBIT122/pfile/ [DBIT122] file dr2DBIT122_SITE1.dat dr2DBIT122_SITE1.dat: data oracle@dbidg01:/u01/app/oracle/admin/DBIT122/pfile/ [DBIT122] strings dr2DBIT122_SITE1.dat }|{z DBIT122 DBIT122_SITE1 Brought to you by the fine folks at NEDC. <?xml version="1.0" encoding="UTF-8"?> <DRC Version="12.2.0.1.0" Name="DBIT122" CurrentPath="True"> <PlannedState>OFFLINE</PlannedState> <Status> <Severity>Success</Severity> <Error>0</Error> </Status> <DefaultState>ONLINE</DefaultState> <IntendedState>ONLINE</IntendedState> <MIV PropertyID="1">0</MIV> <PRIMARY_SITE_ID PropertyID="26">513</PRIMARY_SITE_ID> <DRC_UNIQUE_ID PropertyID="4">152596437</DRC_UNIQUE_ID> <DRC_UNIQUE_ID_SEQUENCE PropertyID="5">18</DRC_UNIQUE_ID_SEQUENCE> <EXT_COND PropertyID="29">7</EXT_COND> <OVERALL_PROTECTION_MODE PropertyID="30">2</OVERALL_PROTECTION_MODE> <FastStartFailoverMode PropertyID="32">0</FastStartFailoverMode> <FSFO_MIV PropertyID="33">11</FSFO_MIV> <FastStartFailoverOBID1 PropertyID="119">470173189</FastStartFailoverOBID1> <FastStartFailoverOBID2 PropertyID="120">470173183</FastStartFailoverOBID2> <FastStartFailoverOBID3 PropertyID="121">470173184</FastStartFailoverOBID3> <ObserverVersion1 PropertyID="133">0</ObserverVersion1> <Configuration_Name PropertyID="31">DBIT122</Configuration_Name> <ObserverName1 PropertyID="129">dbidg03</ObserverName1> <ConfigurationWideServiceName PropertyID="132">DBIT122_CFG</ConfigurationWideServiceName> <RoleChangeHistory PropertyID="106"> <RoleChangeRecord> <Event>PhysicalFailover</Event> <OldPrimary>DBIT122_SITE1</OldPrimary> <NewPrimary>DBIT122_SITE2</NewPrimary> <Status>0</Status> <Timestamp>931098450</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalFailover</Event> <OldPrimary>DBIT122_SITE2</OldPrimary> <NewPrimary>DBIT122_SITE1</NewPrimary> <Status>0</Status> <Timestamp>931098812</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalFailover</Event> <OldPrimary>DBIT122_SITE1</OldPrimary> <NewPrimary>DBIT122_SITE2</NewPrimary> <Status>0</Status> <Timestamp>932306689</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE2</OldPrimary> <NewPrimary>DBIT122_SITE1</NewPrimary> <Status>0</Status> <Timestamp>932307856</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE1</OldPrimary> <NewPrimary>DBIT122_SITE2</NewPrimary> <Status>0</Status> <Timestamp>932377455</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE2</OldPrimary> <NewPrimary>DBIT122_SITE1</NewPrimary> <Status>0</Status> <Timestamp>932381717</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE1</OldPrimary> <NewPrimary>DBIT122_SITE2</NewPrimary> <Status>0</Status> <Timestamp>932382294</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE2</OldPrimary> <NewPrimary>DBIT122_SITE1</NewPrimary> <Status>0</Status> <Timestamp>932383387</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE1</OldPrimary> <NewPrimary>DBIT122_SITE2</NewPrimary> <Status>0</Status> <Timestamp>934017954</Timestamp> </RoleChangeRecord> <RoleChangeRecord> <Event>PhysicalSwitchover</Event> <OldPrimary>DBIT122_SITE2</OldPrimary> <NewPrimary>DBIT122_SITE1</New Primary> <Status>0</Status> <Timestamp>934018548</Timestamp> </RoleChangeRecord> </RoleChangeHistory> <Member MemberID="1" Name="DBIT122_SITE1" CurrentPath="True" Enabled="True" MultiInstanced="True"> <PlannedState>STANDBY</PlannedState> <StandbyRole>PhysicalStandby</StandbyRole> <Status> <Severity>Success</Severity> <Error>0</Error> </Status> <DefaultState>PRIMARY</DefaultState> <IntendedState>PRIMARY</IntendedState> <ResourceType>Database</ResourceType> <CurrentState>PRIMARY</CurrentState> <Role> <ConditionState>PRIMARY</ConditionState> <DefaultState>READ-WRITE-XPTON</DefaultState> <IntendedState>READ-WRITE-XPTON</IntendedState> </Role> <Role> <ConditionState>STANDBY</ConditionState> <DefaultState>PHYSICAL-APPLY-ON</DefaultState> <IntendedState>OFFLINE</IntendedState> </Role> <DB_Unique_Name PropertyID="23">DBIT122_SITE1</DB_Unique_Name> <DGConnectIdentifier PropertyID="6">DBIT122_SITE1</DGConnectIdentifier> <DbDomain PropertyID="37"/> <ClusterDatabase PropertyID="42">FALSE</ClusterDatabase> <DbChangeCritical PropertyID="8">FALSE</DbChangeCritical> <DbIsCritical PropertyID="9">FALSE</DbIsCritical> <LogXptMode PropertyID="40">SYNC</LogXptMode> <IncarnationTable PropertyID="57">6,2568637,932306696,5*5,2514031,931098817,4#4,2513489,931098453,2#2,1396169,929894741,1#1,1,924281211,0#</IncarnationTable> <SRLStatus PropertyID="58">0</SRLStatus> <ActualApplyInstance PropertyID="7"/> <StandbyFileManagement PropertyID="72">AUTO</StandbyFileManagement> <ArchiveLagTarget PropertyID="73">0</ArchiveLagTarget> <LogArchiveMaxProcesses PropertyID="74">4</LogArchiveMaxProcesses> <LogArchiveMinSucceedDest PropertyID="75">1</LogArchiveMinSucceedDest> <DataGuardSyncLatency PropertyID="138">0</DataGuardSyncLatency> <DbFileNameConvert PropertyID="76"/> <LogFileNameConvert PropertyID="77"/> <FastStartFailoverTarget PropertyID="38">DBIT122_SITE2</FastStartFailoverTarget> <ReinstateContextArray PropertyID="39"/> <Instance InstanceID="1" Name="DBIT122" CurrentPath="True" Enabled="True" MultiInstanced="True" DefaultWriteOnce="True"> <PlannedState>OFFLINE</PlannedState> <HostName PropertyID="2" Default="True">dbidg01</HostName> <SidName PropertyID="3">DBIT122</SidName> <InstanceName PropertyID="36">DBIT122</InstanceName> <StaticConnectIdentifier PropertyID="25">(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=DBIT122_SITE1_DGMGRL)(UR=A)(INSTANCE_NAME=DBIT122)(SERVER=DEDICATED)))</StaticConnectIdentifier> <StandbyArchiveLocation PropertyID="96" Default="True">USE_DB_RECOVERY_FILE_DEST</StandbyArchiveLocation> <LogArchiveTrace PropertyID="98">0</LogArchiveTrace> <LogArchiveFormat PropertyID="99">%t_%s_%r.dbf</LogArchiveFormat> </Instance> </Member> <Member MemberID="2" Name="DBIT122_SITE2" CurrentPath="True" Enabled="True" MultiInstanced="True"> <PlannedState>STANDBY</PlannedState> <StandbyRole>PhysicalStandby</StandbyRole> <Status> <Severity>Success</Severity> <Error>0</Error> </Status> <DefaultState>STANDBY</DefaultState> <IntendedState>STANDBY</IntendedState> <ResourceType>Database</ResourceType> <CurrentState>STANDBY</CurrentState> <Role> <ConditionState>PRIMARY</ConditionState> <DefaultState>READ-WRITE-XPTON</DefaultState> <IntendedState>OFFLINE</IntendedState> </Role> <Role> <ConditionState>STANDBY</ConditionState> <DefaultState>PHYSICAL-APPLY-ON</DefaultState> <IntendedState>PHYSICAL-APPLY-ON</IntendedState> </Role> <DB_Unique_Name PropertyID="23">DBIT122_SITE2</DB_Unique_Name> <DGConnectIdentifier PropertyID="6">DBIT122_SITE2</DGConnectIdentifier> <DbDomain PropertyID="37"/> <ClusterDatabase PropertyID="42">FALSE</ClusterDatabase> <DbChangeCritical PropertyID="8">FALSE</DbChangeCritical> <DbIsCritical PropertyID="9">FALSE</DbIsC ritical> <LogXptMode PropertyID="40">SYNC</LogXptMode> <IncarnationTable PropertyID="57">8,2568637,932306696,4*4,2514031,931098817,3#3,2513489,931098453,2#2,1396169,929894741,1#1,1,924281211,0#</IncarnationTable> <SRLStatus PropertyID="58">0</SRLStatus> <ActualApplyInstance PropertyID="7">DBIT122</ActualApplyInstance> <StandbyFileManagement PropertyID="72">AUTO</StandbyFileManagement> <ArchiveLagTarget PropertyID="73">0</ArchiveLagTarget> <LogArchiveMaxProcesses PropertyID="74">4</LogArchiveMaxProcesses> <LogArchiveMinSucceedDest PropertyID="75">1</LogArchiveMinSucceedDest> <DataGuardSyncLatency PropertyID="138">0</DataGuardSyncLatency> <DbFileNameConvert PropertyID="76"/> <LogFileNameConvert PropertyID="77"/> <FastStartFailoverTarget PropertyID="38">DBIT122_SITE1</FastStartFailoverTarget> <ReinstateContextArray PropertyID="39"/> <Instance InstanceID="1" Name="DBIT122" CurrentPath="True" Enabled="True" MultiInstanced="True" DefaultWriteOnce="True"> <PlannedState>OFFLINE</PlannedState> <HostName PropertyID="2" Default="True">dbidg02</HostName> <SidName PropertyID="3">DBIT122</SidName> <InstanceName PropertyID="36">DBIT122</InstanceName> <StaticConnectIdentifier PropertyID="25">(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dbidg02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=DBIT122_SITE2_DGMGRL)(UR=A)(INSTANCE_NAME=DBIT122)(SERVER=DEDICATED)))</StaticConnectIdentifier> <StandbyArchiveLocation PropertyID="96" Default="True">USE_DB_RECOVERY_FILE_DEST</StandbyArchiveLocation> <LogArchiveTrace PropertyID="98">0</LogArchiveTrace> <LogArchiveFormat PropertyID="99">%t_%s_%r.dbf</LogArchiveFormat> </Instance> </Member> </DRC>
In 12cR2, Oracle put more information into the broker files and that’s why they are getting bigger. With a standard config they have now 16k instead of 12k like they had before with 12cR1. Ok .. it looks like I am drifting away. Let’s get back to the original question, how do I recover the broker configuration files, in case they get lost? Like always … it depends. You could lose the old copy on the standby, you could lose the new copy on the standby, or you could lose both copies on the standby, and you could lose them while the standby is up and running or while it is shutdown. The same applies to the primary. It might get even more complicated in case you have a far sync database in between, or more standby’s.
Not making it too complex, in the end, we might end up with 6 different recovery scenarios.
1.) We lose the old copy of the broker config file on either the primary or the standby
2.) We lose both copies of the broker config file on the standby while it is up and running
3.) We lose both copies of the broker config file on the standby while it is shut down
4.) We lose both copies of the broker config file on the primary while it is up and running
5.) We lose both copies of the broker config file on the primary while it is shut down
6.) We lose both copies of the broker contfig files on the primary and the standby (quite unlikely to happen)
My demos are built on a simple primary/standby setup with 12cR1. The primary is on host dbidg01 and the standby is currently on host dbidg02.
DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 2 seconds ago) oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 11:03 dr1DBIT121_SITE1.dat -rw-r----- 1 oracle oinstall 12K Feb 28 11:04 dr2DBIT121_SITE1.dat oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 11:04 dr1DBIT121_SITE2.dat -rw-r----- 1 oracle oinstall 12K Feb 28 11:03 dr2DBIT121_SITE2.dat
Scenario 1.) We lose the old copy of the broker config file on either the primary or the standby
There is absolutely not need to panic in this case. I am deleting the old copies on the primary and standby at the same time.
-- primary oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr1DBIT121_SITE1.dat oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] -- standby oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr2DBIT121_SITE2.dat oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121]
The broker does not care at all.
DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 57 seconds ago)
The second copy will be created automatically as soon as I am changing a property. e.g. I am setting the FastStartFailoverThreshold to the same value as it had before.
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold='40'; Property "faststartfailoverthreshold" updated
The broker log has no information about the new created file and also the alert.log does not say anything, but automagically we end up with two broker config files after the update was done. So .. nothing to do here. Oracle handles this itself.
oracle@dbidg01:/u01/app/oracle/diag/rdbms/dbit121_site1/DBIT121/trace/ [DBIT121] tail -20f drcDBIT121.log ... 02/28/2017 11:10:25 EDIT CONFIGURATION SET PROPERTY faststartfailoverthreshold = 40 FSFO threshold value set to 40 seconds oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -l dr* -rw-r----- 1 oracle oinstall 12288 Feb 28 11:10 dr1DBIT121_SITE1.dat -rw-r----- 1 oracle oinstall 12288 Feb 28 11:04 dr2DBIT121_SITE1.dat oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 11:04 dr1DBIT121_SITE2.dat -rw-r----- 1 oracle oinstall 12K Feb 28 11:10 dr2DBIT121_SITE2.dat
2.) We lose both copies of the broker config file on the standby while it is up and running
But what happens if I lose both copies on the standby while it is up and running.
oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr*.dat oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* ls: cannot access dr*: No such file or directory
I am logging in into the standby database and check the configuration. The broker does not care at all if the files are there or not. It seems like nothing has happend.
oracle@dbidg02:/home/oracle/ [DBIT121] dgmgrl -debug -xml DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production Copyright (c) 2000, 2013, Oracle. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/manager@DBIT121_SITE2 [W000 02/28 11:17:00.64] Connecting to database using DBIT121_SITE2. [W000 02/28 11:17:01.72] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 11:17:01.72] Oracle database version is '12.1.0.2.0' Connected as SYSDBA. DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 8 seconds ago)
The broker is reading the information from memory, maybe because of performance reasons. The current state is also reflected in the x$drc.
SQL> desc x$drc Name Null? Type ----------------------------------------- -------- ---------------------------- ADDR RAW(8) INDX NUMBER INST_ID NUMBER CON_ID NUMBER OBJECT_ID NUMBER ATTRIBUTE VARCHAR2(30) VALUE VARCHAR2(512) PARENT_ID VARCHAR2(15) STATUS VARCHAR2(30) MESSAGE VARCHAR2(256) ERRNUM NUMBER VALUE_RAW RAW(512) ERRTIME NUMBER
So … how do we get the broker files back? Â By simply editing any property. You can choose any property you want.
oracle@dbidg02:/home/oracle/ [DBIT121] dgmgrl -debug -xml DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production Copyright (c) 2000, 2013, Oracle. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/manager@DBIT121_SITE2 [W000 02/28 11:27:58.23] Connecting to database using DBIT121_SITE2. [W000 02/28 11:27:59.33] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 11:27:59.33] Oracle database version is '12.1.0.2.0' Connected as SYSDBA. DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold='40'; <DO_CONFIGURE_DRC version="12.1"><EDIT_DRC><PROPERTY name="faststartfailoverthreshold" value="40" context="0"/></EDIT_DRC></DO_CONFIGURE_DRC> <RESULT ><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE></RESULT> <RESULT ><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE></RESULT> Property "faststartfailoverthreshold" updated
As soon as you have edited your configuration, a new broker file appears on the standby.
oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 11:28 dr1DBIT121_SITE2.dat
If you run the same command again, the second one appears as well.
oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 11:28 dr1DBIT121_SITE2.dat -rw-r----- 1 oracle oinstall 12K Feb 28 11:30 dr2DBIT121_SITE2.dat
3.) We lose both copies of the broker config file on the standby while it is shut down
What happens if we lose both copies while the standby is shut down. Does the standby come up correctly? Is there something in the drc.log or alert.log?
SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr*.dat oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] SQL> startup mount ORACLE instance started. Total System Global Area 1325400064 bytes Fixed Size 2924112 bytes Variable Size 352321968 bytes Database Buffers 956301312 bytes Redo Buffers 13852672 bytes Database mounted.
The standby comes up and no entries in the alert.log, however the drc.log shows that the broker files are missing.
02/28/2017 11:35:26 >> Starting Data Guard Broker bootstrap << Broker Configuration File Locations: dg_broker_config_file1 = "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE2.dat" dg_broker_config_file2 = "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE2.dat" 2017-02-28 11:35:26.313 DMON: Attach state object 2017-02-28 11:35:26.314 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE2.dat", retrying 2017-02-28 11:35:27.334 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE2.dat" 2017-02-28 11:35:27.334 ORA-27037: unable to obtain file status 2017-02-28 11:35:27.335 Linux-x86_64 Error: 2: No such file or directory 2017-02-28 11:35:27.335 Additional information: 3 2017-02-28 11:35:27.335 DMON: Error opening "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE2.dat", error = ORA-16572 2017-02-28 11:35:27.335 DMON: Establishing "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE2.dat" as the more current file 2017-02-28 11:35:27.335 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE2.dat", retrying 2017-02-28 11:35:28.355 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE2.dat" ... 2017-02-28 11:35:42.893 Configuration does not exist, Data Guard broker ready 2017-02-28 11:35:42.894 7fffffff 0 DMON: Entered rfm_release_chief_lock() for CTL_BOOTSTRAP DGMGRL> connect sys/manager@DBIT121_SITE2 [W000 02/28 11:38:50.58] Connecting to database using DBIT121_SITE2. [W000 02/28 11:38:51.67] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 11:38:51.68] Oracle database version is '12.1.0.2.0' Connected as SYSDBA. DGMGRL> show configuration; ORA-16532: Oracle Data Guard broker configuration does not exist Configuration details cannot be determined by DGMGRL SQL> select attribute, value from x$drc; select attribute, value from x$drc * ERROR at line 1: ORA-16532: Oracle Data Guard broker configuration does not exist
If we logon to the primary, we can see that the standby DBIT121_SITE2 was disabled.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] dgmgrl -debug -xml DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production Copyright (c) 2000, 2013, Oracle. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/manager@DBIT121_SITE1 [W000 02/28 11:45:40.56] Connecting to database using DBIT121_SITE1. [W000 02/28 11:45:41.68] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 11:45:41.68] Oracle database version is '12.1.0.2.0' Connected as SYSDBA. DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database Warning: ORA-16629: database reports a different protection level from the protection mode DBIT121_SITE2 - Physical standby database (disabled) Fast-Start Failover: DISABLED Configuration Status: WARNING (status updated 7 seconds ago)
However, after we enable the standby, we can see in the drc.log that the standby receives the metadata from the primary and creates the broker config file.
DGMGRL> enable database 'DBIT121_SITE2'; <DO_CONTROL version="12.1"><DO_COMMAND type="Enable" object_id="33554432"/></DO_CONTROL> <RESULT ><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE></RESULT> Enabled. DGMGRL> drc.log 2017-02-28 11:46:03.352 DRCX: Start receiving metadata file: "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE2.dat" 2017-02-28 11:46:03.356 DRCX: Receiving block #1 (containing Seq.MIV = 2.20), 2 blocks 2017-02-28 11:46:03.358 DRCX: End receiving metadata file: opcode CTL_ENABLE (1.1.631286030) 2017-02-28 11:46:03.360 DMON: Entered rfm_get_chief_lock() for CTL_ENABLE, reason 1 2017-02-28 11:46:03.360 02001000 631286030 DMON: start task execution: for metadata resynchronization 2017-02-28 11:46:03.360 02001000 631286030 DMON: status from posting standby instances for RESYNCH = ORA-00000 2017-02-28 11:46:03.360 INSV: Received message for inter-instance publication 2017-02-28 11:46:03.361 req ID 1.1.631286030, opcode CTL_ENABLE, phase RESYNCH, flags 8005 2017-02-28 11:46:03.361 02001000 631286030 DMON: Metadata available (1.1.631286030), loading from "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE2.dat" 2017-02-28 11:46:03.361 02001000 631286030 Opcode = CTL_ENABLE, Chief Instance I_ID = 1 2017-02-28 11:46:03.364 DMON Registering service DBIT121_SITE2_DGB with listener(s) 2017-02-28 11:46:03.364 DMON: Executing SQL [ALTER SYSTEM REGISTER] 2017-02-28 11:46:03.365 SQL [ALTER SYSTEM REGISTER] Executed successfully 02/28/2017 11:46:06 Creating process RSM0 oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 11:46 dr1DBIT121_SITE2.dat
Again, if you do any property change again, the second broker file will be created.
4.) We lose both copies of the broker config file on the primary while it is up and running
There is not much difference with the scenario we have seen with the standby. The broker just reads from memory and as soon as any update to the config is done, the broker file is created again. There will be no entries in the drc.log or the alert.log. The broker file is just silently recreated.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr* oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* ls: cannot access dr*: No such file or directory oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] dgmgrl -debug -xml DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production Copyright (c) 2000, 2013, Oracle. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/manager [W000 02/28 12:28:45.07] Connecting to database using . [W000 02/28 12:28:45.10] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 12:28:45.10] Oracle database version is '12.1.0.2.0' Connected as SYSDG. DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 9 seconds ago) DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold='40'; <DO_CONFIGURE_DRC version="12.1"><EDIT_DRC><PROPERTY name="faststartfailoverthreshold" value="40" context="0"/></EDIT_DRC></DO_CONFIGURE_DRC> <RESULT ><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE></RESULT> <RESULT ><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE><MESSAGE ><SUCCESS ></SUCCESS></MESSAGE></RESULT> Property "faststartfailoverthreshold" updated oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* -rw-r----- 1 oracle oinstall 12K Feb 28 12:30 dr2DBIT121_SITE1.dat
5.) We lose both copies of the broker config file on the primary while it is shut down
Let’s do the whole thing again while the primary DB is shutdown.
SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL> oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr* oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -lh dr* ls: cannot access dr*: No such file or directory
In the drc.log log file you can immediately see that there is something wrong.
>> Starting Data Guard Broker bootstrap << Broker Configuration File Locations: dg_broker_config_file1 = "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE1.dat" dg_broker_config_file2 = "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE1.dat" 2017-02-28 12:34:45.866 DMON: Attach state object 2017-02-28 12:34:45.866 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE1.dat", retrying 2017-02-28 12:34:46.876 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE1.dat" 2017-02-28 12:34:46.880 ORA-27037: unable to obtain file status 2017-02-28 12:34:46.881 Linux-x86_64 Error: 2: No such file or directory 2017-02-28 12:34:46.881 Additional information: 3 2017-02-28 12:34:46.881 DMON: Error opening "/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE1.dat", error = ORA-16572 2017-02-28 12:34:46.881 DMON: Establishing "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE1.dat" as the more current file 2017-02-28 12:34:46.882 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE1.dat", retrying 2017-02-28 12:34:47.899 DMON: cannot open configuration file "/u01/app/oracle/admin/DBIT121/pfile/dr2DBIT121_SITE1.dat" ... 2017-02-28 12:35:02.058 7fffffff 0 DMON: Entered rfm_release_chief_lock() for CTL_BOOTSTRAP 2017-02-28 12:35:02.424 Fore: Continuing with primary evaluation, rfmsp.drc_status_rfmp = ORA-16532 2017-02-28 12:35:03.507 Fore: Initiating post-open tasks 2017-02-28 12:35:09.192 DMON: Initiating post-open tasks 2017-02-28 12:35:22.242 00000000 1934847279 DMON: GET_FSFO will be retired 2017-02-28 12:35:22.242 00000000 1934847279 severity = ORA-16501, status = ORA-16532 2017-02-28 12:35:22.242 00000000 1934847279 DMON: GET_FSFO operation completed 2017-02-28 12:35:52.250 00000000 1934847280 DMON: GET_FSFO will be retired 2017-02-28 12:35:52.250 00000000 1934847280 severity = ORA-16501, status = ORA-16532 2017-02-28 12:35:52.251 00000000 1934847280 DMON: GET_FSFO operation completed
If you take a look at the alert.log, everything is fine.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] dgmgrl -debug -xml DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production Copyright (c) 2000, 2013, Oracle. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/manager@DBIT121_SITE1 [W000 02/28 12:39:01.18] Connecting to database using DBIT121_SITE1. [W000 02/28 12:39:02.28] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 12:39:02.28] Oracle database version is '12.1.0.2.0' Connected as SYSDBA. DGMGRL> show configuration; ORA-16532: Oracle Data Guard broker configuration does not exist Configuration details cannot be determined by DGMGRL
But if connected to the standby, you will see a clear error message: ORA-16532: Oracle Data Guard broker configuration does not exist.
DGMGRL> connect sys/manager@DBIT121_SITE2 [W000 02/28 12:39:18.81] Connecting to database using DBIT121_SITE2. [W000 02/28 12:39:19.90] Checking broker version [BEGIN :version := dbms_drs.dg_broker_info('VERSION'); END;]. [W000 02/28 12:39:19.90] Oracle database version is '12.1.0.2.0' Connected as SYSDBA. DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database Error: ORA-16532: Oracle Data Guard broker configuration does not exist DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: ERROR (status updated 0 seconds ago)
Taking a close look at the error message, it does not give you a hint how to correct the issue.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] oerr ora 16532 16532, 00000, "Oracle Data Guard broker configuration does not exist" // *Cause: A broker operation was requested that required a broker // configuration to already be created. // *Action: Create a Data Guard broker configuration prior to performing // other broker operations. If only one instance of a RAC // database is reporting this error, ensure that the // DG_BROKER_CONFIG_FILE[1|2] initialization parameters are // set to file locations that are shared by all instances of // the RAC database.
Let’s try the same trick, as we have done on the standby by simply enabling the database again.
DGMGRL> enable database 'DBIT121_SITE1'; <DO_CONTROL version="12.1"><DO_COMMAND type="Enable" object_id="16777216"/></DO_CONTROL> <RESULT ><MESSAGE ><FAILURE error_num="16532" error_prefix="ORA"><ERROR_TEXT >ORA-16532: Oracle Data Guard broker configuration does not exist </ERROR_TEXT></FAILURE></MESSAGE><MESSAGE ><FAILURE error_num="16625" error_prefix="ORA" error_tag1="DBIT121_SITE1"><ERROR_TEXT >ORA-16625: cannot reach database "DBIT121_SITE1" </ERROR_TEXT></FAILURE></MESSAGE></RESULT> <RESULT ><MESSAGE ><FAILURE error_num="16532" error_prefix="ORA"><ERROR_TEXT >ORA-16532: Oracle Data Guard broker configuration does not exist </ERROR_TEXT></FAILURE></MESSAGE><MESSAGE ><FAILURE error_num="16625" error_prefix="ORA" error_tag1="DBIT121_SITE1"><ERROR_TEXT >ORA-16625: cannot reach database "DBIT121_SITE1" </ERROR_TEXT></FAILURE></MESSAGE></RESULT> Failed.
Hmmmm … does not look good. To recovery from that situation, we need to figure out the latest broker file version on the standby, which is dr2DBIT121_SITE2.dat is my case, and copy it over to the primary. Before doing that, we need to stop the broker on the primary.
SQL> alter system set dg_broker_start=false; System altered.
Now we can copy the latest version over.
-- standby oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] ls -rlth dr* -rw-r----- 1 oracle oinstall 12K Feb 28 12:30 dr1DBIT121_SITE2.dat -rw-r----- 1 oracle oinstall 12K Feb 28 12:33 dr2DBIT121_SITE2.dat oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] scp -p dr2DBIT121_SITE2.dat oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/dr1DBIT121_SITE1.dat dr2DBIT121_SITE2.dat 100% 12KB 12.0KB/s 00:00
And optionally create the second broker file on the primary.
-- primary oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] cp -p dr1DBIT121_SITE1.dat dr2DBIT121_SITE1.dat oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121]
Last but not least, we need to enable the broker again.
SQL> alter system set dg_broker_start=true; System altered. DGMGRL> connect sys/manager@DBIT121_SITE1 Connected as SYSDBA. DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 7 seconds ago)
6.) We lose both copies of the broker config files on the primary and the standby (quite unlikely to happen)
This scenario is quite unlikely to happen, but if it happens it is good to have your Data Guard configuration as a script in place.
-- shutdown primary SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. -- shutdown standby SQL> shutdown immediate ORA-01109: database not open Database dismounted. ORACLE instance shut down.
Remove the broker files on the primary and the standby
-- primary oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr* oracle@dbidg01:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] -- standby oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121] rm dr* oracle@dbidg02:/u01/app/oracle/admin/DBIT121/pfile/ [DBIT121]
Now, after we have lost everything, meaning all broker config files, the only chance that we have is to create it from scratch. It is quite cool, if you have the configuration is place, because in a blink of an eye you recreate the whole configuration.
oracle@dbidg01:/u01/app/oracle/admin/DBIT121/create/ [DBIT121] cat broker.cfg CONNECT SYS/manager@DBIT121_SITE1 REMOVE CONFIGURATION; sql "alter system set log_archive_dest_2=''"; CREATE CONFIGURATION 'DBIT121' AS PRIMARY DATABASE IS 'DBIT121_SITE1' CONNECT IDENTIFIER IS 'DBIT121_SITE1'; ADD DATABASE 'DBIT121_SITE2' AS CONNECT IDENTIFIER IS 'DBIT121_SITE2'; EDIT DATABASE 'DBIT121_SITE1' SET PROPERTY StandbyFileManagement='AUTO'; EDIT DATABASE 'DBIT121_SITE2' SET PROPERTY StandbyFileManagement='AUTO'; EDIT DATABASE 'DBIT121_SITE1' SET PROPERTY LogXptMode='SYNC'; EDIT DATABASE 'DBIT121_SITE2' SET PROPERTY LogXptMode='SYNC'; ENABLE CONFIGURATION; EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability; EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold='40'; EDIT DATABASE 'DBIT121_SITE1' SET PROPERTY FastStartFailoverTarget='DBIT121_SITE2'; EDIT DATABASE 'DBIT121_SITE2' SET PROPERTY FastStartFailoverTarget='DBIT121_SITE1'; sql "alter system archive log current"; sql "alter system register"; SHOW CONFIGURATION; oracle@dbidg01:/u01/app/oracle/admin/DBIT121/create/ [DBIT121] dgmgrl -debug -xml < broker.cfg > broker.log 2>&1 oracle@dbidg01:/u01/app/oracle/admin/DBIT121/create/ [DBIT121] DGMGRL> show configuration; Configuration - DBIT121 Protection Mode: MaxAvailability Members: DBIT121_SITE1 - Primary database DBIT121_SITE2 - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS (status updated 2 seconds ago)
Conclusion
Loosing broker configuration files is not the end of the world. As you have seen in the blog, it makes quite a huge difference if you loose the broker file while the DB is up and running or it is shut down. In case you haven’t lost all of them you can recover them in most of the situations. However, the alert.log is not a good candidate to check if the broker files are missing. So better scan the alert.log and the drc.log with your preferred monitoring tool to get the whole picture. Besides that, it is good practice to adjust your broker.cfg file after you have done some changes because it makes it very easy to rebuild the whole broker config from scratch.
Ed
05.11.2024Most of the dbi Blogs are really, really good. The only downside is the page title only show "dbi Blog" only instead maybe " dbi Blog + [blog-subject-matter]. Is this by design somehow as intended :-) ?
Morgan Patou
06.11.2024Hi Ed,
Thanks for the feedback. I think it used to contain the title of the blog, but maybe we lost that in our last blog platform migration. I will have a quick talk with the team that handles that, thanks again.
Cheers,
Morgan