The database Capture/Replay feature of the Real Application Testing (RAT) suite enables you to capture the workload from a production database and replay it on a test or development environment. This allows for realistic testing of database changes, such as upgrades, patches, or hardware migrations.

In this blog, we’ll explore how to accomplish this when both your production and test databases are configured in a two-node RAC environment.

Restrictions :

You should be aware that some workload operations are not supported, such as flashback queries and distributed transactions.

In most cases, Database Replay refrains from capturing these operations, or does not replay them. However you should refer to the oracle documentation bellow for the full list of non-supported operations and related considerations :

Oracle Database Documentation

The Workload Capture

The capture procedure on a RAC database is the same as for a single instance database, you just need to run it on one of the instances, without adding an instance filter :

The path you use to store the raw workload files must be defined as a database directory, and must be empty.

1
SQL> CREATE DIRECTORY capture_dir AS '/u90/db_capture/';

It is recommended, but NOT mandatory, to restart your database before the capture, so that all current processes are finished, and the capture would catch absolutely all the changes from the timestamp of its start.

Let’s take for instance a 4 hours workload capture (14400 seconds) :

1
2
3
4
5
6
BEGIN
  DBMS_WORKLOAD_CAPTURE.start_capture (name     => 'capture_4h',
                                       dir      => 'CAPTURE_DIR',
                                       duration => 14400);
END;
/

This package will capture the workload on both RAC instances for 4 hours and will automatically stop. If you need to stop it earlier, you can do it manually using:

1
SQL>  exec DBMS_WORKLOAD_CAPTURE.finish_capture;

The capture process does not significantly impact performance (it adds about 3-5% CPU overhead). However, it may require substantial storage space. A common reason to stop the capture prematurely is running out of space in the designated directory.

Oracle Support Note Doc ID 1920275.1 provides a method to estimate the space required. For a detailed explanation, please refer to the note:

Doc ID 1920275.1

The general formula for the space needed provided by this note is:

 
2 * Bytes received via SQL*Net from client (as seen in AWR or StatsPack).

Alternatively, you can perform a short capture, such as for 10 minutes, and extrapolate the estimate based on the size captured during that period.

At the end of the capture, you should generate an html summary, and export related AWR data, for that, you’ll need the ID of you capture :

1
2
3
4
5
6
7
8
SELECT id, name FROM dba_workload_captures;
SQL>
        ID NAME
---------- ------------------------------
        11 capture_4h
         2 test_capture_1
 
SQL>

So the capture id in my case is : 11

– Generating The html report (It is generated in the directory of the capture) :

1
2
3
4
5
6
SQL> DECLARE
  l_report  CLOB;
BEGIN
  l_report := DBMS_WORKLOAD_CAPTURE.report(capture_id => 11,
                                           format     => DBMS_WORKLOAD_CAPTURE.TYPE_HTML);
END;

– Exporting the AWR data (It generates a datapump dump file contaiting all awr data of the period of the capture) :

1
SQL> exec DBMS_WORKLOAD_CAPTURE.export_awr (capture_id => 11);

The Replay

To prepare the replay you need two things :

– To copy all the files from the directory of the capture to the test system (As it is two nodes RAC it should be on a shared location).

– Prepare a test database : The html report of the capture contains the SCN of the start, it’s best practice for instance to duplicate your database up to that specific SCN, in the test environment, before applying the changes you want to evaluate (for instance an upgrade, or parameters changes).

Once your test database is ready, and the captured files are available on a shared location :

– We create the directory of the replay :

1
SQL> CREATE OR REPLACE DIRECTORY replay_dir AS '/u90/replay/';

The captured raw files must be processed by the database, before the capture could be replayed, this step could take some time depending of the size of the workload raw files :

1
SQL> exec DBMS_WORKLOAD_REPLAY.process_capture('REPLAY_DIR');

Then we initiate the replay (just an initiation, the replay is not yet started) :

1
SQL> exec DBMS_WORKLOAD_REPLAY.initialize_replay(replay_name => 'replay_capture_4h', replay_dir  => 'REPLAY_DIR');

This step generated an entry in the DBA_WORKLOAD_REPLAYS view, we must retrieve the associated id :

1
2
3
4
5
6
7
SQL> select id,status from dba_workload_replays;
 
        ID STATUS
---------- ----------------------------------------
         1 COMPLETED
        11 COMPLETED
        21 INITIALIZED

So, here, the ID of the replay is 21.

Next step is to remap the workload connections. To be able to apply the replay on both instances, we must use a TNS entry to a scan host and scan service :

1
2
3
4
5
6
SQL>
set linesize 250
set pages 9999
spool remap_connections.sql
select 'exec dbms_workload_replay.remap_connection ( connection_id => '||conn_id||', replay_connection => ''(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=a_scan_host)(PORT=your_port))(CONNECT_DATA=(SERVICE_NAME=a_scan_service)))'');' from dba_workload_connection_map where replay_id=21;
spool off

Then we clean and execute the script :

1
2
3
4
5
6
7
8
9
10
sqlplus / as sysdba
SQL>
@remap_connections.sql
…………………
PL/SQL procedure successfully completed.
 
 
PL/SQL procedure successfully completed.
 
SQL>

Next we prepare the replay :

1
2
3
4
5
SQL> exec DBMS_WORKLOAD_REPLAY.prepare_replay (synchronization => TRUE);
 
PL/SQL procedure successfully completed.
 
SQL>

synchronization => TRUE : ensures that the replay sessions are synchronized, maintaining the order and dependency between transactions as they occurred during the workload capture.

Checking the status of the replay :

1
2
3
4
5
6
7
SQL> select id,status from dba_workload_replays;
 
        ID STATUS
---------- ----------------------------------------
         1 COMPLETED
        11 COMPLETED
        21 PREPARE

Next we need to calibrate the replay clients, we use the wrc program in calibrate mode. wrc is The Workload Replay Client tool, used to replay database workload captured by Oracle Database.

The calibration mode purpose is to ensures that the workload replay environment has enough replay clients to handle the workload in the capture. After calibration, the wrc tool will output the minimum number of replay clients required.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
[oracle@LinuxHost~]$ wrc mode=calibrate replaydir=/u90/replay
 
Workload Replay Client: Release 19.23.0.0.0 - Production on Wed Dec 11 10:58:06 2024
 
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
 
 
 
Workload Report
-----------------------
Recommendation:
Consider using at least 52 clients divided among 14 CPU(s)
You will need at least 152 MB of memory per client process.
If your machine(s) cannot match that number, consider using more clients.
 
Workload Characteristics:
- max concurrency: 3984 sessions
- total number of sessions: 56136
 
Assumptions:
- 100 concurrent sessions per client process
- 4 client processes per CPU
- 256 KB of memory cache per concurrent session
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE
 
[oracle@LinuxHost ~]$

Based on this calibration, and after checking the cpu and memory available, I will use 56 clients (56 x 152 MB = 8.5G of minimal memory needed for the clients : Ok for me).

And this means using 28 clients per node, as I want to balance the overhead on the two nodes.

The command to start a single client is :

1
[oracle@LinuxHost]$ wrc system/password@test mode=replay replaydir=/u90/replay

But I prepared a script to start 28 clients at once in background mode, for instance in the first node :

 
[oracle@LinuxHost]$ cat replay_node1.sh
#!/bin/bash
 
export ORACLE_HOME=/u01/app/oracle/product/19.0.0.0/dbhome_1
export PATH=/home/oracle/.local/bin:/home/oracle/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/u01/app/oracle/product/19.0.0.0/dbhome_1/bin
 
# Configuration
USER="system"
PASSWORD="Oracle$"
DB_CONNECT="Instance1"
MODE="replay"
REPLAY_DIR="/u90/replay"
LOG_DIR="./wrc_logs"  # Directory to store log files
NUM_CLIENTS=28        # Number of clients to start
 
# Ensure the log directory exists
mkdir -p $LOG_DIR

# Start WRC clients in a loop
for ((i=1; i<=NUM_CLIENTS; i++))
do
nohup wrc ${USER}/${PASSWORD}@${DB_CONNECT} mode=${MODE} replaydir=${REPLAY_DIR} > ${LOG_DIR}/wrc_client_$i.log 2>&1 &
echo "Started WRC client $i, log: ${LOG_DIR}/wrc_client_$i.log"
done
 
echo "All $NUM_CLIENTS clients started. Logs are in $LOG_DIR."
 
[oracle@LinuxHost]
 

starting the clients on the first node :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
[oracle@LinuxHost]$ ./replay_node1.sh
Started WRC client 1, log: ./wrc_logs/wrc_client_1.log
Started WRC client 2, log: ./wrc_logs/wrc_client_2.log
Started WRC client 3, log: ./wrc_logs/wrc_client_3.log
Started WRC client 4, log: ./wrc_logs/wrc_client_4.log
Started WRC client 5, log: ./wrc_logs/wrc_client_5.log
Started WRC client 6, log: ./wrc_logs/wrc_client_6.log
Started WRC client 7, log: ./wrc_logs/wrc_client_7.log
Started WRC client 8, log: ./wrc_logs/wrc_client_8.log
Started WRC client 9, log: ./wrc_logs/wrc_client_9.log
Started WRC client 10, log: ./wrc_logs/wrc_client_10.log
Started WRC client 11, log: ./wrc_logs/wrc_client_11.log
Started WRC client 12, log: ./wrc_logs/wrc_client_12.log
Started WRC client 13, log: ./wrc_logs/wrc_client_13.log
Started WRC client 14, log: ./wrc_logs/wrc_client_14.log
Started WRC client 15, log: ./wrc_logs/wrc_client_15.log
Started WRC client 16, log: ./wrc_logs/wrc_client_16.log
Started WRC client 17, log: ./wrc_logs/wrc_client_17.log
Started WRC client 18, log: ./wrc_logs/wrc_client_18.log
Started WRC client 19, log: ./wrc_logs/wrc_client_19.log
Started WRC client 20, log: ./wrc_logs/wrc_client_20.log
Started WRC client 21, log: ./wrc_logs/wrc_client_21.log
Started WRC client 22, log: ./wrc_logs/wrc_client_22.log
Started WRC client 23, log: ./wrc_logs/wrc_client_23.log
Started WRC client 24, log: ./wrc_logs/wrc_client_24.log
Started WRC client 25, log: ./wrc_logs/wrc_client_25.log
Started WRC client 26, log: ./wrc_logs/wrc_client_26.log
Started WRC client 27, log: ./wrc_logs/wrc_client_27.log
Started WRC client 28, log: ./wrc_logs/wrc_client_28.log
All 28 clients started. Logs are in ./wrc_logs.
[oracle@LinuxHost]$

We do the same thing on the second node, we just need to change the DB_CONNECT variable in the script.

Finally, we can start the replay :

1
2
3
4
BEGIN
  DBMS_WORKLOAD_REPLAY.start_replay;
END;
/

The replay is finally started, it can be monitored by :

– Querying the views : dba_workload_replays and v$workload_replay_thread.
– Checking the clients log files.

Hope it helps !