By Franck Pachot

.
In a previous post I’ve shown that from 12.1.0.2 when you have an index on trunc(date) you don’t need additional index. If you need the column with full precision, then you can add it to the index on trunc(). A comment from Rainer Stenzel asked if that optimization is available for other functions. And Mohamed Houri has linked to his post where he shows that it’s the same with a trunc() on a number.

Besides that, there is the same kind of optimization with SUBSTR(string,1,n) so here is the demo, with a little warning at the end.

I start with the same testcase as the previous post.

SQL> create table DEMO as select prod_id,prod_name,prod_eff_from +rownum/0.3 prod_date from sh.products,(select * from dual connect by level>=1000);
Table created.

SQL> create index PROD_NAME on DEMO(prod_name);
Index created.

SQL> create index PROD_DATE on DEMO(prod_date);
Index created.

 

string>Z

I’ve an index on the PROD_NAME and I can use it with equality or inequality predicates:

SQL> set autotrace on explain
SQL> select distinct prod_name from DEMO where prod_name > 'Z';
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 72593368

--------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |
--------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |    27 |
|   1 |  SORT UNIQUE NOSORT|           |     1 |    27 |
|*  2 |   INDEX RANGE SCAN | PROD_NAME |     1 |    27 |
--------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PROD_NAME">'Z')

 
And I also can use it with a LIKE when there is no starting joker:

SQL> select distinct prod_name from DEMO where prod_name like 'Z%';
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 72593368

--------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |
--------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |    27 |
|   1 |  SORT UNIQUE NOSORT|           |     1 |    27 |
|*  2 |   INDEX RANGE SCAN | PROD_NAME |     1 |    27 |
--------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PROD_NAME" LIKE 'Z%')
       filter("PROD_NAME" LIKE 'Z%')

 
That optimization is available for several releases (9.2 if I remember well but I didn’ check).

substr(string,1,n)

But sometimes, when we want to check if a column starts with a string, the application uses SUBSTR instead of LIKE:

SQL> select distinct prod_name from DEMO where substr(prod_name,1,1) = 'Z';
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1665545956

--------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |
--------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |    27 |
|   1 |  SORT UNIQUE NOSORT|           |     1 |    27 |
|*  2 |   INDEX FULL SCAN  | PROD_NAME |     1 |    27 |
--------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(SUBSTR("PROD_NAME",1,1)='Z')

 
But – as you see – there is no access predicate here. The whole index has to be read.
Of course, I can use a function based index for that:

SQL> create index PROD_NAME_SUBSTR on DEMO( substr(prod_name,1,1) );
Index created.

SQL> select distinct prod_name from DEMO where substr(prod_name,1,1) = 'Z';
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4209586087

-------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    31 |
|   1 |  HASH UNIQUE                 |                  |     1 |    31 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEMO             |     1 |    31 |
|*  3 |    INDEX RANGE SCAN          | PROD_NAME_SUBSTR |     1 |       |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(SUBSTR("PROD_NAME",1,1)='Z')

 

One index only?

Then, as in the previous post about TRUNC I’ll check if that new index is sufficient. Let’s fdrop the first one.

SQL> drop index PROD_NAME;
Index dropped.

The previous index is dropped. Let’s see if the index on SUBSTR can be used with an equality predicate:

SQL> select distinct prod_name from DEMO where prod_name = 'Zero';
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 953445334

-------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    27 |
|   1 |  SORT UNIQUE NOSORT          |                  |     1 |    27 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| DEMO             |     1 |    27 |
|*  3 |    INDEX RANGE SCAN          | PROD_NAME_SUBSTR |     1 |       |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("PROD_NAME"='Zero')
   3 - access(SUBSTR("PROD_NAME",1,1)='Z')

 
Good. The index on substring is used for index range scan on the prefix, and then the filter occurs on the result. This is fine as long as the prefix is selective enough.
It is also available with inequality:

SQL> select distinct prod_name from DEMO where prod_name > 'Z';
no rows selected

...
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("PROD_NAME">'Z')
   3 - access(SUBSTR("PROD_NAME",1,1)>='Z')

And we can use it even when using a substring with a different number of characters:
SQL> select distinct prod_name from DEMO where substr(prod_name,1,4) = 'Zero';
no rows selected
...

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(SUBSTR("PROD_NAME",1,4)='Zero')
   3 - access(SUBSTR("PROD_NAME",1,1)='Z')

 
However, if we use the LIKE syntax:

SQL> select distinct prod_name from DEMO where prod_name like 'Z%';
no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 51067428

---------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |
---------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    27 |
|   1 |  HASH UNIQUE       |      |     1 |    27 |
|*  2 |   TABLE ACCESS FULL| DEMO |     1 |    27 |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("PROD_NAME" LIKE 'Z%')

 
The LIKE snytax does not allow to filter from the index on SUBSTR. So there are cases where we have to keep all indexes. Index on full column for LIKE predicates, and index on substring for SUBSTR predicates.
Note that indexes on SUBSTR are mandatory when you have columns larger than your block size, which is probably the case if you allow extended datatypes (VARCHAR2 up to 32k)