By Franck Pachot
It don’t want to rub salt in the wound about 12c upgrades and Adaptive Dynamic Sampling being overzealous, but here is a case with an unexpected consequence: you see nothing in the ‘Database Feature Usage’ since migration.
The photo novel starts like this:
It’s quite embarrassing as this is the only way to ensure that I’m not using features that are not available with the license options I have.
Well, I have Tuning Pack so I can use SQL Monitoring, and this is what I remark coincidentally:
You see ‘DBMS_FEATURE’ functions ending in ORA-12751 error after 20 minutes of heavy CPU usage.
ORA-12751 is “cpu time or run time policy violation”, it’s an internal limit for queries run by MMON, such as the one that collects the features usage.
This can explain why the feature usage is not collected. Let’s check the history about it:
Thanks to AWR that stores SQL Monitoring statistics, we can see that the function always fail. So why is it so long to run?
Here is a pattern that we start to see a lot when upgrading to 12c.
If you have attended to my ‘AWR straight to the goal’ session then you know the path I follow: time model, top events, statements, etc.
(If you didn’t, then come to Birmingham in December at tech15 or in January at Oracle Midlands)
lot of parsing time here…
Lot of time in CPU and a small result cache contention…
One SQL query that takes all the time of the dbms feature call. Note that /* DS_SVC */ is dynamic sampling…
Let’s put it together:
- long hard parse
- result cache contention
- long dynamic sampling query
This is Adaptive Dynamic Sampling. The feature usage function run a lot of queries on big tables (AWR history, RMAN history, etc) and the 12c ADS queries on those big tables are very long. Fortunately there is a timeout after 20 minutes. But the result is that we have no result: I cannot know which features are used.
The goal of this post is not to tune those internal queries. There is enough to do with production queries.
The goal is to explain how it is possible to see nothing in DBA_FEATURE_USAGE_STATISTICS for past months.
Now, if you have an LMS auditor that find it suspicious, then you can prove that you did nothing bad… except upgrading to 12c 😉
Well, you should adapt the method presented here of course: don’t show an AWR report if you don’t have Diagnostic Pack!
Added 1st Oct. 2015
Of course, when you see a problem you check if the problem occurs elsewhere.
I was on a server with 50 12c instances and checked the CPU usage:
What happens around 14:30 is that a lot of instances are running that feature usage query. I’m not sure about that, but I think that the time when it run is determined by the instance startup time and on that server all instances have been started after a server maintenance.
The query is: auyf8px9ywc6j
WITH SNAP_RANGES AS (SELECT /*+ FULL(ST) */ SN.DBID ,SN.INSTANCE_NUMBER ,SN.STARTUP_TIME ,ST.STAT_ID ,ST.STAT_NAME ,MIN(SN.SNAP_ID) AS MIN_SNAP ,MAX(SN.SNAP_ID) AS MAX_SNAP ,MIN(CAST(BEGIN_INTERVAL_TIME AS DATE)) AS MIN_DATE ,MAX(CAST(END_INTERVAL_TIME AS DATE)) AS MAX_DATE FROM DBA_HIST_SNAPSHOT SN ,WRH$_STAT_NAME ST WHERE SN.BEGIN_INTERVAL_TIME > TRUNC(SYSDATE) - 7 AND SN.END_INTERVAL_TIME < TRUNC(SYSDATE) AND SN.DBID = ST.DBID AND ST.STAT_NAME IN ('DB time', 'DB CPU') GROUP BY SN.DBID,SN.INSTANCE_NUMBER,SN.STARTUP_TIME,ST.STAT_ID,ST.STAT_NAME ) ,DELTA_DATA AS (SELECT SR.DBID ,SR.INSTANCE_NUMBER ,SR.STAT_NAME ,CASE WHEN SR.STARTUP_TIME BETWEEN SR.MIN_DATE AND SR.MAX_DATE THEN TM1.VALUE + (TM2.VALUE - TM1.VALUE) ELSE (TM2.VALUE - TM1.VALUE) END AS DELTA_TIME FROM WRH$_SYS_TIME_MODEL TM1 ,WRH$_SYS_TIME_MODEL TM2 ,SNAP_RANGES SR WHERE TM1.DBID = SR.DBID AND TM1.INSTANCE_NUMBER = SR.INSTANCE_NUMBER AND TM1.SNAP_ID = SR.MIN_SNAP AND TM1.STAT_ID = SR.STAT_ID AND TM2.DBID = SR.DBID AND TM2.INSTANCE_NUMBER = SR.INSTANCE_NUMBER AND TM2.SNAP_ID = SR.MAX_SNAP AND TM2.STAT_ID = SR.STAT_ID ) SELECT STAT_NAME ,ROUND(SUM(DELTA_TIME/1000000),2) AS SECS FROM DELTA_DATA GROUP BY STAT_NAME
It’s a query that gathers the following:
STAT_NAME SECS ---------- ---------- DB CPU 76451.04 DB time 147707.98
but the problem is the Adaptive Dynamic Sampling that is behind.
I’ve no solution yet, but Not-a-Bug number 14510010″ suggests that the query is not run when _swrf_mmon_dbfus”=false but of course your usage of Diagnostic Pack will not be monitored.
Update June 6th, 2016
The best solution to the issue (following the Rule #1 of troubleshooting: the scope of the solution should match the scope of the problem) is on Ludovico Caldara blog post: http://www.ludovicocaldara.net/dba/cpu-usage-12c-dbms_feature_awr/