During the last year, I faced many performance issues at customer sites with their applications. To solve the issues, I had to implement some workarounds because in most cases, the application code couldn’t be changed.

As Oracle 12c is now available since more than 2 month, I decided to focus on how this new release will help our customers to solve their performance issues. I will present you 3 real customer cases reproduced in lab where the Oracle 12c Optimizer will solve the issues in a serie of 3 articles:

  • Adaptive execution plans
  • Histogram precision
  • Histogram on varchar2 column

These 3 real customer cases will be presented during my session “Solving critical customer issues with the Oracle 12c Optimizer” at the Oracle Open World 2013.

This first blog will focus on a common issue, when Oracle is not able to get a good estimation of the number of rows and the chosen execution plan is not the most efficient.

Context

The application is a custom development to manage the production in the factories. A specific team at customer site is responsible for maintenance and development.
It is linked to other applications to deal with stocks for example and there are interface tables for a kind of communication.

In a part of the application, all data to be processed is inserted in an interface table with the flag ‘I’ in a field named MSG_STATUS.
When data has been processed, the flag is updated and several values are possible depending of the results and its status: ‘C’, ‘E’ and ‘Z’ for example.

Most of the data is processed normally without errors so most of the rows have the same value ‘Z’ for the column MSG_STATUS.

Let’s see the data distribution in the table INTERFACE:

11g_AdaptivePlans_2013-09-17_a_09.18.02

11g_AdaptivePlans_2013-09-17_a_09.18.14

And then in the table FACTORY:

11g_AdaptivePlans_2013-09-17_a_09.18.29

A job is responsible for purging the interface table and to keep a few days of data.
During the night, there is no data to process so when the automatic job gathers the statistics there is no data with the flag ‘I’.

What does the histogram look like on the column MSG_STATUS?

11g_AdaptivePlans_2013-09-17_a_09.19.55

The histogram is describing the state of the table during the night, the value ‘I’ is missing and Oracle has no ways to detect that.

Behavior in Oracle 11

What happens if we look at the estimations for the following statement joining the two tables:

11g_AdaptivePlans_2013-09-17_a_09.21.01

Using explain plan, we can have a look at Oracle estimations:

11g_AdaptivePlans_2013-09-17_a_09.20.10

11g_AdaptivePlans_2013-09-17_a_09.20.29

We can see that Oracle estimated that a big number of rows will be returned from both tables and the optimizer chooses an execution plan using a hash join between two full table scans.

But in fact, there are no rows or few rows so Oracle will spend unnecessary time on I/Os reading the tables, let’s see the execution plan after an execution.

11g_AdaptivePlans_2013-09-17_a_09.21.18

Oracle didn’t changed its mind, the statement run with HASH JOIN and read fully the two tables.

Behavior in Oracle 12c

Here, what are the estimations from the optimizer?

12c_AdaptivePlans_2013-09-17_a_09.38.34

We do have the same bad estimations in Oracle 12c but the plan is adaptive so we can look at the alternative plan.

12c_AdaptivePlans_2013-09-17_a_09.38.46

So default plan uses hash join between two full table scans and the alternative plan uses a nested loop over a full table scan and an index scan.
What happens if the statement is started?

12c_AdaptivePlans_2013-09-17_a_09.39.20

Oracle changes its mind, during the execution the plan changed from hash join to nested loop. Why?
Because of the number of rows returned mostly. We can see in the estimations that this plan was designed for 6 rows returned from FACTORY table.

You can download the scripts to replay the demo just following the link: Scripts_Adaptive_Plans.zip

Conclusion

In Oracle 12c, the root cause is still present because the estimations are still wrong. And Oracle has no ways to figure out because it’s due to application design.

But being able to change the execution plan during execution solved the issue before it happens, that’s a powerful feature!