Oracle GoldenGate allows to migrate Oracle database with Near Zero Downtime and with failback capability.
Near Zero Downtime migration means very minimal application switchover downtime.
The failback consists to rollback the migration from 19c to 12c.
The goal of this blog is to describe how to migrate an oracle database from 12c to 19c via Oracle GoldenGate with Near Zero Downtime and with failback capability.
STEP 1 – CHECK GOLDENGATE SYNCHRONIZATION
The Oracle GoldenGate processes must be up and running (EXTRACT and PUMP on source, REPLICAT on target) and the source and target databases must be synchronized (same data between source and target).
Since GoldenGate is a logical replication, comparison of data must be done regularly via Oracle Veridata (under license) or manually (my preferred method):
- Compare the number of rows between Source and Target via the sql function COUNT(*) –> must return the same number of rows
- Compare the data between Source and Target via the sql function MINUS –> must return “No rows selected”
- MINUS does not support CLOB/BLOB columns
- Compare the data for CLOB/BLOB between Source and Target via dbms_lob.compare –> must return “0”
--COMPARE NB OF ROWS SQL> select count(*) from [email protected]_source union all select count(*) from TABLE_A; COUNT(*) ---------- 3754 3754 --COMPARE DATA BETWEEN SOURCE AND TARGET SQL> select * from [email protected]_source minus select * from TABLE_A; no rows selected --COMPARE DATA BETWEEN TARGET AND SOURCE SQL> select * TABLE_A minus select * from [email protected]_source; no rows selected --COMPARE DATA FOR CLOB/BLOB COLUMNS CREATE OR REPLACE FUNCTION compare_clob(clob_src IN clob, clob_trg IN clob) RETURN NUMBER IS v_result number; clob_src1 long; clob_trg1 long; BEGIN clob_src1 := dbms_lob.substr( clob_src, 32000, 1 ); clob_trg1 := dbms_lob.substr( clob_trg, 32000, 1 ); select dbms_lob.compare(clob_src1,clob_trg1) into v_result from dual; RETURN(v_result); END; /
STEP 2 – STOP THE APPLICATION ON SOURCE
At this step, the downtime starts…
- Stop the Application on the source database (12c)
- If the application cannot be stopped :
- Lock the applicatives schemas
- Check there is no more transaction int v$transaction/gv$transaction
- Kill session
- Stop the listener
- Save the job_queue_processes value (needed later when we will start the application on 19c)
- Disable the job_queue_processes (“alter system set job_queue_processes = 0 scope = both)
To be sure there is no transaction running, create a dummy table on 12c database and insert 1 row, later we will check into the GoldenGate trail files that this transaction is the last one, that will ensure us that no “real” transaction are lost from Source database before the switchover.
SQL> create table source_schema.zz_dummy2 (SCN varchar2(100), d date, comment_txt varchar2(50)); Table created. SQL> --on target SQL> desc source_schema.zz_dummy2 Name Null? Type ----------------------------------------- -------- ---------------------------- SCN VARCHAR2(100) D DATE COMMENT_TXT VARCHAR2(50) SQL> alter session set nls_date_format='dd.mm.yyyy hh24:mi:ss'; insert into source_schema.zz_dummy2 select (select to_char(current_scn) from v$database),sysdate, 'RECORD TEST' from dual; 1 row created. SQL> commit; Commit complete. SQL> select * from source_schema.zz_dummy2; SCN -------------------------------------------------------------------------------- D ------------------- COMMENT_TXT -------------------------------------------------------------------------------- 3386436290352 13.04.2022 14:36:41 RECORD TEST SQL>
Check the table is synchronized by GoldenGate on the target database (19c).
STEP 3 – CREATE RESTORE POINT
On the source database, create a restore point :
SQL> CREATE RESTORE POINT rp_before_switch GUARANTEE FLASHBACK DATABASE; Restore point created.
If we want to rollback the migration, we will replicate the 12c database from the 19c database via GoldenGate (with new transaction existing since the start of the Application on 19c). If the rollback with GoldenGate fails (for any reason), this restore point will allow to restore the database in point in time.
STEP 4 – CHECK EXTRACT HAS NO MORE TRANSACTION TO PROCESS
Check the EXTRACT process has no record to process.
GGSCI (source_server as [email protected]_source) 42> send extract e1 logend Sending LOGEND request to EXTRACT E1 ... YES GGSCI (source_server) 8> send extract e1 showtrans Sending SHOWTRANS request to EXTRACT E1... No transactions found. GGSCI (source_server) 16> send extract e1 logend Sending LOGEND request to EXTRACT E1 ... YES
The command “send extract e1 logend” goes to the end of the current trailfile and check the last transaction.
If the command “send extract e1 logend” send YES, that means the last record has been handled.
If the command “send extract e1 logend” send NO, that means there is always record to handle by the EXTRACT process.
If there is no more transaction to manage by GoldenGate, we are ready to stop the EXTRACT process:
GGSCI (source_server) 9> stop extract e1 Sending STOP request to EXTRACT E1... Request processed. GGSCI (source_server) 9> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED E1 00:00:02 00:00:02
STEP 5 – CHECK REPLICAT HAS NO MORE TRANSACTION TO PROCESS
Check the REPLICAT process has no record to process:
GGSCI (target_server) 32> send replicat R1 logend Sending LOGEND request to REPLICAT R1... YES
The command “send replicat R1 logend” must return “YES”
STEP 6 – CHECK THE LAST TRANSACTION INTO TRAIL FILE
From logdump utility, check into the current trailfile (last trail file used by the EXTRACT or the REPLICAT) if the last transaction concerns the Insert into the dummy table (remember the step2) :
--got the trail file name with the command “view report E1” Logdump 25 >ghdr on detail data ggstoken detailLogdump 26 >Logdump 27 > Logdump 28 > Logdump 28 > Logdump 28 >open /u01/xs/ogg/xs000000000 Current LogTrail is /u01/xs/ogg/xs000000000 Logdump 29 >pos last Reading forward from RBA 348547 Logdump 30 >pos rev Reading in reverse from RBA 348547 Logdump 31 >n ___________________________________________________________________ Hdr-Ind : E (x45) Partition : . (x0c) UndoFlag : . (x00) BeforeAfter: A (x41) RecLength : 65 (x0041) IO Time : 2022/04/13 14:36:44.000.000 IOType : 5 (x05) OrigNode : 255 (xff) TransInd : . (x03) FormatType : R (x52) SyskeyLen : 0 (x00) Incomplete : . (x00) AuditRBA : 61367 AuditPos : 522820 Continued : N (x00) RecCount : 1 (x01) 2022/04/13 14:36:44.000.000 Insert Len 65 RBA 348360 Name: SOURCE_SCHEMA.ZZ_DUMMY2 (TDR Index: 3) After Image: Partition x0c G s 0000 1100 0000 0d00 3333 3836 3433 3632 3930 3335 | ........338643629035 3201 0015 0000 0032 3032 322d 3034 2d31 333a 3134 | 2......2022-04-13:14 3a33 363a 3431 0200 0f00 0000 0b00 5245 434f 5244 | :36:41........RECORD 2054 4553 54 | TEST Column 0 (x0000), Len 17 (x0011) 0000 0d00 3333 3836 3433 3632 3930 3335 32 | ....3386436290352 Column 1 (x0001), Len 21 (x0015) 0000 3230 3232 2d30 342d 3133 3a31 343a 3336 3a34 | ..2022-04-13:14:36:4 31 | 1 Column 2 (x0002), Len 15 (x000f) 0000 0b00 5245 434f 5244 2054 4553 54 | ....RECORD TEST GGS tokens: TokenID x52 'R' ORAROWID Info x00 Length 20 4141 4347 5159 4141 4541 4141 4154 4d41 4141 0001 | AACGQYAAEAAAATMAAA.. TokenID x4c 'L' LOGCSN Info x00 Length 13 3333 3836 3433 3632 3930 3336 36 | 3386436290366 TokenID x36 '6' TRANID Info x00 Length 15 302e 3130 2e31 392e 3437 3631 3931 39 | 0.10.19.4761919 TokenID x69 'i' ORATHREADID Info x01 Length 2 0001 | .. Logdump 32 >
The last transaction must be the INSERT INTO SOURCE_SCHEMA.ZZ_DUMMY2. If this is the case, stop the REPLICAT PROCESS:
GGSCI (target_server) 2> stop replicat R1 Sending STOP request to REPLICAT R1... Request processed. GGSCI (target_server) 3> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING REPLICAT STOPPED R1 00:00:00 00:00:11
STEP 7 – DISABLE AUTOSTART/AUTORESTART INTO MANAGER PARAMETER FILE
Remove the AUTOSTART/AUTORESTART parameters into the MANAGER parameter file on source and target:
--FOR EXTRACT AUTORESTART EXTRACT E1, RETRIES 3, WAITMINUTES 1, RESETMINUTES 60 AUTOSTART EXTRACT E1 --FOR REPLICAT AUTORESTART EXTRACT R1, RETRIES 3, WAITMINUTES 1, RESETMINUTES 60 AUTOSTART EXTRACT R1
STEP 8 – CREATE THE REVERSE EXTRACT (NEEDED IN CASE OF FAILBACK)
This step is needed in case we want to rollback the migration after the switchover. Rollback means go-back to 12c database without lose data.
Because new transactions has created new rows into the target database after the switchover, we must re-synchronize the 12c database from the first transaction on 19c database. If we rollback the migration and if the application go back to 12c, no transaction will be lost.
Configure the 19c database to be ready to capture data via the REVERSE EXTRACT:
- Add schematrandata on the schema to be replicated
Register the REVERSE EXTRACT:
GGSCI (target_server) 10> alter credentialstore add user [email protected]_target alias target Password: Credential store altered. GGSCI (target_server) 11> dblogin useridalias target Successfully logged into database db_target. GGSCI (target_server as [email protected]_target) 2> GGSCI (target_server as [email protected]_target) 2> register extract E2 database 2022-04-13 18:31:19 INFO OGG-02003 Extract E2 successfully registered with database at SCN 3386436450080
Save the SCN related the “REGISTER EXTRACT” command and create the REGISTER EXTRACT :
[[email protected]_server:/u01/app/oracle/product/22.214.171.124.4/gg_1]$ mkdir -p /u01/gs_x/ogg/ GGSCI (target_server) 7> add extract E2, integrated tranlog, begin now EXTRACT (Integrated) added. GGSCI (target_server) 8> INFO ALL Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED E2 00:00:02 00:00:08
The extract E2 captures data from 19c database and the parameter file must be configured to extract data from the schema on 19c database.
GGSCI (target_server as [email protected]_target) 13> edit param e2 Extract E2 useridalias target Exttrail /u01/gs_x/ogg/gz LOGALLSUPCOLS UPDATERECORDFORMAT COMPACT DDL & INCLUDE MAPPED OBJNAME target_schema.* Sequence target_schema.*; Table target_schema.* ;
Start the EXTRACT with the SCN got after the REGISTER EXTRACT step:
GGSCI (target_server as [email protected]_target) 12> START EXTRACT E2 atcsn 3386436450080 Sending START request to MANAGER ... EXTRACT E2 starting GGSCI (TARGET_SERVER as [email protected]_db) 15> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT RUNNING E2 00:00:05 00:00:03
STEP 9 – START THE APPLICATION ON 19C DATABASE
Change the parameter job_queue_processes to the value we had in 12c database (step 2).
Start the application on 19c database by changing the DNS alias.
At this step, the downtime ends…
STEP 10 – GOT THE FIRST TRANSACTION ON 19C DATABASE
View the report file related to the REVERSE EXTRACT:
GGSCI (target_server as [email protected]_target) 71> view report e2 . . . 2022-04-13 18:38:40 INFO OGG-01872 Transaction delivery commencing with Transaction ID 4825175126.96.36.1993, CSN 3386436508051, 0 transaction(s) skipped. . . SCN --> CSN 3386436508051
The SCN related to the first transaction after the switchover is 3386436508051
STEP 11 – CREATE THE REVERSE REPLICAT – (NEEDED IN CASE OF FAILBACK)
The REVERSE REPLICAT replicates the data from 19c database to 12c database from the first transaction occuring on 19c database.
The SCN value is very important because :
- Before the SCN 3386436508051, we will have duplicate data on 12c database
- After the SCN 3386436508051, we risk to have missing data on 12c database
GGSCI (source_server as [email protected]_db) 7> edit param r2 Replicat R2 DBOPTIONS INTEGRATEDPARAMS ( parallelism 6 ) DISCARDFILE /u01/app/oracle/product/188.8.131.52.4/gg_1/dirrpt/R2_discard.txt, append, megabytes 10 USERIDALIAS ggsource MAP target_schema.*, TARGET source_schema.*; add replicat R2 integrated exttrail /u01/gs_x/ogg/gz GGSCI (source_server as [email protected]_db) 6> add replicat R2 integrated exttrail /u01/gs_x/ogg/gz REPLICAT (Integrated) added. GGSCI (source_server as [email protected]_db) 7> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED E1 00:00:02 00:00:02 EXTRACT STOPPED P1 00:00:02 00:00:02 EXTRACT STOPPED R2 00:00:02 00:00:02
Start the REVERSE REPLICAT:
GGSCI (source_server as [email protected]_source) 52> start replicat r2 atcsn 3386436508051 GGSCI (source_server as [email protected]_source) 57> info all Program Status Group Lag at Chkpt Time Since Chkpt MANAGER RUNNING EXTRACT STOPPED E1 00:00:02 00:00:00 EXTRACT STOPPED P1 00:00:02 00:00:00 REPLICAT RUNNING R2 00:00:02 00:00:00
From now the 12c database is syncrhonized from 19c database
STEP 12 – CHECK THE SYNCHRONIZATION – 19C –> 12C
Compare the rows like we do in the step 1.
STEP 13 – REMOVE OR NOT ALL GOLDENGATE PROCESS
If the migration has been validated (application is working as expected on 19c), remove all GoldenGate process on source and target server.
If the migration has not been validated (application is not working as expected), and you want to rollback the migration, replay all the step from step1 but in the direction 19c –> 12c, and start the application on 12c.
Near Zero Downtime migration means very minimal application switchover downtime, so the step from 1 to 9 must be done as faster as possible in order to minimize the application downtime.
It’s difficult to give a downtime estimation because it depends of the complexity of the application and there are a lot of components outside the database and outside GoldenGate which can increase the downtime (jdbc/odbc driver compatibility, dependent application, LDAP connection configuration,…)
Migrate an oracle database in near zero downtime with GoldenGate implies several steps where some are very important :
- Check no more transaction is running before to switchover, otherwise we will lose data in 19c databse
- In case of rollback, choose the correct SCN when we create the REVERSE EXTRACT (SCN got after the REGISTER REVERSE EXTRACT on 19c) and the REVERSE REPLICAT (SCN of the 1st transaction on 19c)
- If we want a minimum of downtime, of course this procedure must be tested on a non production environment as close as possible of PROD in term of activity and architecture to be ready when you will execute it on PROD.