By Franck Pachot
.
Do you use ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY for your Oracle databases in Data Guard? Maybe you also use the Fast Recovery Area as archive log destination. That’s good practice! But did you ever check that it works as expected?
What I mean is this:
- The archived logs that you don’t need are reclaimable by the FRA when space is needed
- And the archived logs that are required for availability (standby or backup) are not deleted.
It’s not an easy thing to check because Oracle doesn’t show which archive log is reclaimable. Only the total reclaimable space is shown in v$recovery_area_usage. But that is not sufficient to validate which archivelog sequence is concerned. I’ll show you below a query that returns the reclaimable status from the archived logs. And you will see that until 12c the APPLIED ON ALL STANDBY does not work as expected. You’ve probably seen a FRA full at standby site and solved it by deleting archived logs. But this is not the right solution because the FRA is supposed to do that.
Let’s look at an example I encountered recently. The archivelog deletion policy is set correctly:
RMAN> show archivelog deletion policy;
RMAN configuration parameters for database with db_unique_name DATABASE_SITE2 are:
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
This configuration, an 11g feature, allows to delete an archive log as soon as it is applied to all standby destinations. Note that it works if I manually do a ‘delete archivelog all;’ but I expect that the archivelogs in the FRA becomes reclaimable automatically.
Unfortunately, this is not the case and the FRA is growing:
SQL> select * from v$recovery_area_usage where file_type='ARCHIVED LOG';
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
ARCHIVED LOG 61.11 43.02 467
Let’s check everything. We are on the standby database:
SQL> select open_mode,database_role from v$database;
OPEN_MODE DATABASE_ROLE
-------------------- ----------------
MOUNTED PHYSICAL STANDBY
The archivelogs are going to the Fast Recovery Area:
SQL> show parameter log_archive_dest_1
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_1 string location=USE_DB_RECOVERY_FILE_
DEST, valid_for=(ALL_LOGFILES,
ALL_ROLES)
All archived logs are applied (we are in SYNC AFFIRM):
DGMGRL> show database 'DATABASE_SITE2';
Database - DATABASE_SITE2
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
DATABASE
Database Status:
SUCCESS
Well, with that configuration, I expect that all archivelogs are reclaimable – except the current one.
Let’s investigate. V$RECOVERY_AREA_USAGE is an aggregate view. If we check its definition, we see that the reclaimable size comes from x$kccagf.rectype.
So I’ll use it in in conjunction with v$archived_log in order to give the detail about which archived logs are reclaimable:
SQL> select applied,deleted,decode(rectype,11,'YES','NO') reclaimable
,count(*),min(sequence#),max(sequence#)
from v$archived_log left outer join sys.x$kccagf using(recid)
where is_recovery_dest_file='YES' and name is not null
group by applied,deleted,decode(rectype,11,'YES','NO') order by 5
/
APPLIED DELETED RECLAIMABLE COUNT(*) MIN(SEQUENCE#) MAX(SEQUENCE#)
--------- ------- ----------- ---------- -------------- --------------
YES NO YES 429 5938 6366
YES NO NO 37 6367 6403
IN-MEMORY NO NO 1 6404 6404
The problem is there: Because of a bug (Bug 14227959 : STANDBY DID NOT RELEASE SPACE IN FRA) the archivelogs are not marked as reclaimable when the database is in mount mode.
The workaround is to execute dbms_backup_restore.refreshagedfiles. This is what must be scheduled (maybe daily) on the standby. It can be a good idea to do it at the same time as a daily ‘delete obsolete’, so here is the way to call it from RMAN:
RMAN> sql "begin dbms_backup_restore.refreshagedfiles; end;";
But I’ve found another workaround: just run the CONFIGURE ARCHIVELOG POLICY from RMAN as it refreshes the reclaimable flag – even when there is no change.
Then, you can run a daily job that does CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY on your database, whatever the role is.
It’s different for the database where you do the backup, because you want to be sure that the backup is done before an archivelog is deleted: CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK;
This is a good way to prevent anyone from changing the configuration and keep it close to the backup scripts. At dbi services, we advise to keep the same configuration on all Data Guard sites for the same database so that a switchover can be done without any problem. For this reason, having a script that depends on the place where the backups are done is a better alternative than a configuration that depends on the database role.
Finally, here is the state of our reclaimable archivelogs after any of these solutions:
APPLIED DELETED RECLAIMABLE COUNT(*) MIN(SEQUENCE#) MAX(SEQUENCE#)
--------- ------- ----------- ---------- -------------- --------------
YES NO YES 466 5938 6403
IN-MEMORY NO NO 1 6404 6404
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
ARCHIVED LOG 61.11 61.09 467
All applied archived logs are reclaimable and the FRA will never be full.
You can check the primary as well. Are you sure that Oracle will never delete an archived log that has not been backed up ? Check your deletion policy.
Here is the full query I use for that:
column deleted format a7
column reclaimable format a11
set linesize 120
select applied,deleted,backup_count
,decode(rectype,11,'YES','NO') reclaimable,count(*)
,to_char(min(completion_time),'dd-mon hh24:mi') first_time
,to_char(max(completion_time),'dd-mon hh24:mi') last_time
,min(sequence#) first_seq,max(sequence#) last_seq
from v$archived_log left outer join sys.x$kccagf using(recid)
where is_recovery_dest_file='YES'
group by applied,deleted,backup_count,decode(rectype,11,'YES','NO') order by min(sequence#)
/
This is the result on primary where the last archivelog backup has run around 21:00
APPLIED DELETED BACKUP_COUNT RECLAIMABLE COUNT(*) FIRST_TIME LAST_TIME FIRST_SEQ LAST_SEQ
--------- ------- ------------ ----------- -------- ------------ ------------ --------- --------
NO YES 1 NO 277 15-jan 17:56 19-jan 09:49 5936 6212
NO NO 1 YES 339 19-jan 10:09 22-jan 21:07 6213 6516
NO NO 0 NO 33 22-jan 21:27 23-jan 07:57 6517 6549
That is fine according to my policy APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK
And here is the result on standby where the workaround job has run around 06:00 and redo apply is in SYNC
APPLIED DELETED BACKUP_COUNT RECLAIMABLE COUNT(*) FIRST_TIME LAST_TIME FIRST_SEQ LAST_SEQ
--------- ------- ------------ ----------- -------- ------------ ------------ --------- --------
YES YES 0 NO 746 07-jan 13:27 17-jan 11:17 5320 6065
YES NO 0 YES 477 17-jan 11:37 23-jan 05:37 6066 6542
YES NO 0 NO 8 23-jan 05:57 23-jan 08:14 6543 6550
IN-MEMORY NO 0 NO 1 23-jan 08:15 23-jan 08:15 6551 6551
This is good for my policy APPLIED ON ALL STANDBY – except that because of the bug mentioned above, redo applied since 06:00 are not yet reclaimable.
Update SEP-17
When troubleshooting archivelog deletion policy issue, here is a better query which counts the number of backups for each sequence:
set linesize 200 pagesize 1000
column is_recovery_dest_file format a21
select
deleted,status,is_recovery_dest_file,thread#,min(sequence#),max(sequence#),min(first_time),max(next_time),count(distinct sequence#),archived,applied,backup_count,count("x$kccagf")
from (
select deleted,thread#,sequence#,status,name ,first_time, next_time,case x$kccagf.rectype when 11 then recid end "x$kccagf"
,count(case archived when 'YES' then 'YES' end)over(partition by thread#,sequence#) archived
,count(case applied when 'YES' then 'YES' end)over(partition by thread#,sequence#) applied
,sum(backup_count)over(partition by thread#,sequence#) backup_count
,listagg(is_recovery_dest_file||':'||dest_id,',')within group(order by dest_id)over(partition by thread#,sequence#) is_recovery_dest_file
from v$archived_log left outer join sys.x$kccagf using(recid)
) group by deleted,status,is_recovery_dest_file,thread#,archived,applied,backup_count
order by max(sequence#),min(sequence#),thread#,deleted desc,status;
With the following output:
DEL S IS_RECOVERY_DEST_FILE THREAD# MIN(SEQUENCE#) MAX(SEQUENCE#) MIN(FIRST MAX(NEXT_ COUNT(DISTINCTSEQUENCE#) ARCHIVED APPLIED BACKUP_COUNT COUNT("X$KCCAGF")
--- - --------------------- ---------- -------------- -------------- --------- --------- ------------------------ ---------- ---------- ------------ -----------------
NO A YES:1 1 3233 3233 23-JUN-17 23-JUN-17 1 1 0 1 1
NO A YES:1,NO:2 1 3234 5387 23-JUN-17 21-JUL-17 2154 2 1 1 2154
NO A YES:1,NO:2 1 5388 11596 21-JUL-17 10-OCT-17 6209 2 1 0 6208
NO A YES:1,NO:2 1 11597 11597 10-OCT-17 10-OCT-17 1 2 0 0 0