By Franck Pachot

The default block size in Oracle is 8k, and it is by far the most common. Sometimes people create the database with a 16k block size for data warehouses. You can also find some 32k block sizes, but this is less common, which also means more bugs. In our tuning workshop we recommend the default, 8k, for most cases. I’ll show here three tests that demonstrate what a different block size changes for full table scans and index access.

Test case

I have defined a cache size for the non-default block sizes I want to use:

SQL> show parameter db%_cache_size
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_cache_size                        big integer 0
db_2k_cache_size                     big integer 0
db_4k_cache_size                     big integer 0
db_8k_cache_size                     big integer 0
db_16k_cache_size                    big integer 112M
db_32k_cache_size                    big integer 112M
db_flash_cache_size                  big integer 0
db_keep_cache_size                   big integer 0
db_recycle_cache_size                big integer 0
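
For reference, I had allocated those two caches beforehand with something like this (a sketch; 112M is simply the value I chose for this demo):

SQL> alter system set db_16k_cache_size=112M;
SQL> alter system set db_32k_cache_size=112M;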

I’m creating three tablespaces with 8k, 16k and 32k block sizes. I create them with uniform 1M extents because we are supposed to use a large block size for large tables, and I don’t want the side effect of the smaller first extents you get with autoallocate extent management.

SQL> create tablespace DEMO08K datafile '/oracle/u01/oradata/DEMO08K.dbf' size 1024M extent management local uniform size 1M blocksize 8k;
Tablespace created.
 
SQL> create tablespace DEMO16K datafile '/oracle/u01/oradata/DEMO16K.dbf' size 1024M extent management local uniform size 1M blocksize 16k;
Tablespace created.
 
SQL> create tablespace DEMO32K datafile '/oracle/u01/oradata/DEMO32K.dbf' size 1024M extent management local uniform size 1M blocksize 32k;
Tablespace created.

and then create three identical tables, one in each tablespace:

SQL> create table TAB08K (id constraint PK_08K primary key,n,x) tablespace DEMO08K as
     select rownum id , ora_hash(rownum,10) n , lpad(ora_hash(rownum,100),20,'0') x from xmltable('1 to 10000000');
Table created.
 
SQL> create table TAB16K (id constraint PK_16K primary key,n,x) tablespace DEMO16K as
     select rownum id , ora_hash(rownum,10) n , lpad(ora_hash(rownum,100),20,'0') x from xmltable('1 to 10000000');
Table created.
 
SQL> create table TAB32K (id constraint PK_32K primary key,n,x) tablespace DEMO32K as
     select rownum id , ora_hash(rownum,10) n , lpad(ora_hash(rownum,100),20,'0') x from xmltable('1 to 10000000');
Table created.

My tables have 10 million rows, two number columns and one larger varchar2 column:

SQL> select table_name,num_rows,avg_row_len,blocks from user_tables where table_name like 'TAB__K';
 
TABLE_NAME   NUM_ROWS AVG_ROW_LEN     BLOCKS
---------- ---------- ----------- ----------
TAB08K       10000000          30      48459
TAB16K       10000000          30      23997
TAB32K       10000000          30      11933

Of course, a larger block size needs fewer blocks, but the total size is roughly the same. Here I have small rows, so this is where the fixed size of the block header can make the most difference.
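
A quick back-of-the-envelope check from the numbers above:

48459 blocks x  8k ≈ 379 MB
23997 blocks x 16k ≈ 375 MB
11933 blocks x 32k ≈ 373 MB

The small gain with larger blocks comes from having fewer block headers in total.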

Full Table Scan

So, the common idea is that a larger block size allows larger i/o calls when doing a full table scan…

SQL> set timing on arraysize 5000 autotrace trace
 
SQL> select * from TAB08K;
10000000 rows selected.
 
Elapsed: 00:00:32.53
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1209268626
 
-------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |    10M|   286M|  8462   (1)|
|   1 |  TABLE ACCESS STORAGE FULL| TAB08K |    10M|   286M|  8462   (1)|
-------------------------------------------------------------------------
 
Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      50174  consistent gets
      48175  physical reads
          0  redo size
  348174002  bytes sent via SQL*Net to client
      22489  bytes received via SQL*Net from client
       2001  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
   10000000  rows processed
 
SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         403
physical read total multi block requests                                379
physical read total bytes                                         394821632
physical reads                                                        48196
physical reads cache                                                     23
physical reads direct                                                 48173
physical read IO requests                                               403
physical read bytes                                               394821632
physical reads direct temporary tablespace                                1

I’ve read 48175 8k blocks with 403 i/o calls.

Now doing the same from the table stored in the 16k blocksize tablespace:

SQL> select * from TAB16K;
 
10000000 rows selected.
 
Elapsed: 00:00:31.04
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2288178481
 
-------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |    10M|   286M|  4378   (2)|
|   1 |  TABLE ACCESS STORAGE FULL| TAB16K |    10M|   286M|  4378   (2)|
-------------------------------------------------------------------------
 
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         397
physical read total multi block requests                                375
physical read total bytes                                         391012352
physical reads                                                        23876
physical reads cache                                                     21
physical reads direct                                                 23855
physical read IO requests                                               397
physical read bytes                                               391012352
physical reads direct temporary tablespace                                1

I’ve read 23855 16k blocks with 397 i/o calls. It’s not a lot better.

SQL> select * from TAB32K;
 
10000000 rows selected.
 
Elapsed: 00:00:29.61
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1240330363
 
-------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |    10M|   286M|  2364   (3)|
|   1 |  TABLE ACCESS STORAGE FULL| TAB32K |    10M|   286M|  2364   (3)|
-------------------------------------------------------------------------
 
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         398
physical read total multi block requests                                373
physical read total bytes                                         388890624
physical reads                                                        11886
physical reads cache                                                     24
physical reads direct                                                 11862
physical read IO requests                                               398
physical read bytes                                               388890624
physical reads direct temporary tablespace                                1

I’ve read 11862 32k blocks with 398 i/o calls.

Conclusion: we do roughly the same amount of i/o when doing a full table scan, whatever the block size, because Oracle reads several blocks per i/o call. Note that db_file_multiblock_read_count is defined in number of blocks of the default block size, but the i/o size is adapted for tablespaces with a non-default block size. For example, db_file_multiblock_read_count=128 with an 8k default block size will always read 1MB, whatever the tablespace block size.
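
You can verify it from the statistics above, dividing the direct-path blocks read by the number of multiblock requests (my rounding):

 8k: 48173 / 379 ≈ 127 blocks per call ≈ 1MB
16k: 23855 / 375 ≈  64 blocks per call ≈ 1MB
32k: 11862 / 373 ≈  32 blocks per call ≈ 1MB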

Indexes

I already have an index on the primary key. Let’s add some more indexes:

SQL> create index ID_08K on TAB08K(x) tablespace DEMO08K ;
Index created.
SQL> create index ID_16K on TAB16K(x) tablespace DEMO16K ;
Index created.
SQL> create index ID_32K on TAB32K(x) tablespace DEMO32K ;
Index created.
SQL> create bitmap index BI_08K on TAB08K(n) tablespace DEMO08K ;
Index created.
SQL> create bitmap index BI_16K on TAB16K(n) tablespace DEMO16K ;
Index created.
SQL> create bitmap index BI_32K on TAB32K(n) tablespace DEMO32K ;
Index created.

and check their size:

SQL> select index_name,num_rows,blevel,leaf_blocks from user_indexes where table_name like 'TAB__K' order by 1;
 
INDEX_NAME   NUM_ROWS     BLEVEL LEAF_BLOCKS
---------- ---------- ---------- -----------
BI_08K           3211          2        1606
BI_16K           1562          1         781
BI_32K            759          1         380
ID_08K       10000000          2       44643
ID_16K       10000000          2       22027
ID_32K       10000000          2       10929
PK_08K       10000000          2       22132
PK_16K       10000000          2       10921
PK_32K       10000000          2        5425

Of course the number of blocks is lower when the block size is bigger. And because branch blocks hold more entries, you may get a smaller depth. But look: on my 10,000,000-row table the depth is the same for the regular indexes: two branch levels (BLEVEL=2). Only the bitmap indexes, because they are very small, need one branch level less here.
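
A rough estimate shows why (assuming branch blocks have about the same fanout as leaf blocks, which is my simplification):

ID_32K: 10000000 rows / 10929 leaf blocks ≈ 915 entries per block
        one branch level could address only ~915 of its 10929 leaves
        => a second branch level is still needed, exactly as with 8k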

But think about it. Index depth mostly matters for OLTP, where you get rows by their primary key. But people say that smaller blocks are better for OLTP… Data warehouses often have bitmap indexes, but do you care to have smaller bitmap indexes?

Index access (lots of rows, good clustering factor)

Anyway, let’s test a large range scan:

SQL> select * from TAB08K where id between 1 and 100000;
 
100000 rows selected.
 
Elapsed: 00:00:00.44
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2790916815
 
-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|  2929K|   707   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB08K |   100K|  2929K|   707   (1)|
|*  2 |   INDEX RANGE SCAN                  | PK_08K |   100K|       |   225   (1)|
-----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID">=1 AND "ID"<=100000)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        878  consistent gets
        679  physical reads
          0  redo size
    3389860  bytes sent via SQL*Net to client
       1589  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed
 
SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         705
physical read total bytes                                           5775360
physical reads                                                          705
physical reads cache                                                    705
physical read IO requests                                               705
physical read bytes                                                 5775360

We have read 100000 rows through the index. The index is very well clustered. I’ve done 705 i/o calls to get those rows from 8k blocks.
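
As a rough sanity check (my own estimate, assuming each block is read once): the range covers 100000 of the 10 million rows, so about 1% of the table and of the index.

1% of 48459 table blocks   ≈ 485 single-block reads
1% of 22132 PK leaf blocks ≈ 221 single-block reads
total                      ≈ 706, in line with the 705 i/o calls measured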

Now with 16k blocks:

SQL> select * from TAB16K where id between 1 and 100000;
 
100000 rows selected.
 
Elapsed: 00:00:00.37
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1432239150
 
-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|  2929K|   352   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB16K |   100K|  2929K|   352   (1)|
|*  2 |   INDEX RANGE SCAN                  | PK_16K |   100K|       |   113   (1)|
-----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID">=1 AND "ID"<=100000)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        537  consistent gets
        337  physical reads
          0  redo size
    3389860  bytes sent via SQL*Net to client
       1589  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed
 
SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         363
physical read total bytes                                           5734400
physical reads                                                          363
physical reads cache                                                    363
physical read IO requests                                               363
physical read bytes                                                 5734400

The number of i/o calls has been divided by two.

SQL> select * from TAB32K where id between 1 and 100000;
 
100000 rows selected.
 
Elapsed: 00:00:00.35
 
Execution Plan
----------------------------------------------------------
Plan hash value: 3074346038
 
-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|  2929K|   177   (1)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB32K |   100K|  2929K|   177   (1)|
|*  2 |   INDEX RANGE SCAN                  | PK_32K |   100K|       |    58   (2)|
-----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID">=1 AND "ID"<=100000)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        369  consistent gets
        169  physical reads
          0  redo size
    3389860  bytes sent via SQL*Net to client
       1589  bytes received via SQL*Net from client
        101  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed
 
SQL> set autotrace off
SQL> select name,value from v$mystat join v$statname using(statistic#) where name like 'physical read%' and value>0;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                         195
physical read total bytes                                           5750784
physical reads                                                          195
physical reads cache                                                    195
physical read IO requests                                               195
physical read bytes                                                 5750784

With 32k blocks, it’s once again divided by two.

Conclusion: when doing single-block reads driven by a well clustered index, we do fewer i/o calls with larger blocks. The reason is that we are fetching contiguous rows (that’s what a well clustered index gives us), and larger blocks make more of those rows physically contiguous, so each block we read brings back more of the rows we need.
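
The arithmetic from the table statistics above makes it concrete:

10000000 rows / 48459 blocks ≈ 206 rows per  8k block
10000000 rows / 11933 blocks ≈ 838 rows per 32k block

Each single-block read brings back about four times more of the contiguous rows we want.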

Index access (few rows, bad clustering factor)

Here is a query with WHERE X='00000000000000000000'. The index on X – which I’ve populated with a hash value of rownum – has a bad clustering factor. I fetch only 30 rows.
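
I did not paste the statement itself, but given the COUNT STOPKEY step in the plan it was something like this (the rownum predicate is my reconstruction):

SQL> select * from TAB08K where x='00000000000000000000' and rownum<=30;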

----------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    30 |   900 |    16   (0)|
|   1 |  COUNT STOPKEY               |        |       |       |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB08K |    30 |   900 |    16   (0)|
|*  3 |    INDEX RANGE SCAN          | ID_08K | 99010 |       |     3   (0)|
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("X"='00000000000000000000')
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                          16
physical read total bytes                                            131072
physical reads                                                           16
physical reads cache                                                     16
physical read IO requests                                                16
physical read bytes                                                  131072
 

The query returned about 30 rows. Because of the bad clustering factor, we had to read about one table block for every two rows on average. Let’s see the same with 32k blocks:

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |    30 |   900 |     7   (0)| 00:00:01 |
|   1 |  COUNT STOPKEY               |        |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| TAB32K |    30 |   900 |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | ID_32K | 99010 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("X"='00000000000000000000')
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total IO requests                                          33
physical read total bytes                                            442368
physical reads                                                           33
physical reads cache                                                     33
physical read IO requests                                                33
physical read bytes                                                  442368

More i/o calls here, and each i/o is larger.
Conclusion: a larger block size is bad when we need only a few rows, especially from a badly clustered index: more i/o calls, larger i/o size, and the large blocks take more space in the buffer cache.

So, which block size suits your workload?

The question is: do you get better performance in a data warehouse with a larger block size? For full table scans, no: it’s the same, and it has always been the same. For index range scans retrieving a large number of rows, then yes, the access by index may be faster. But think about it. Retrieving lots of rows by index is always bad, whatever the block size is. If you want to do something about it, look at the design: define better indexes (covering all selected columns), partition the table, compress the table, use Parallel Query… There is a lot of tuning to do before thinking about block size.
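
For example, a covering index for the bad-clustering test query above would be a one-liner (a sketch; TAB08K_COVERING is a hypothetical name, and it assumes the query selects only id, n and x):

SQL> create index TAB08K_COVERING on TAB08K(x, id, n) tablespace DEMO08K;

With all selected columns in the index, the TABLE ACCESS BY INDEX ROWID step disappears from the plan, and the clustering factor does not matter anymore.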

With a smaller block size the optimizer will favor a full table scan for those cases, and today the gap between full table scan and index access is not so large. There is Exadata SmartScan, there is In-Memory. Even without those options, multiblock read has been faster since 11g when doing serial direct read. All that makes full table scans faster. And index access has also been improved: you have the batched access by rowid to compensate for a bad clustering factor.

And if you think about having a non-default block size for some tablespaces, are you sure that the gain you expect will not be ruined by a bad sizing of the buffer cache? With non-default blocksize tablespaces, you have to size their buffer caches manually.
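
Note that even with Automatic Shared Memory Management, the non-default caches are not auto-tuned: db_16k_cache_size and db_32k_cache_size stay exactly where you set them. A quick way to see what is allocated (a sketch; v$buffer_pool reports one row per pool and block size):

SQL> select name, block_size, current_size from v$buffer_pool;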

Other considerations

A larger block size may have been recommended for very large tablespaces in order to avoid having too many datafiles (their maximum size is defined in number of blocks), but today you can create bigfile tablespaces for them, so this is not a reason anymore.

There is one reason to have a larger block size: when you have very large rows, you can avoid row chaining. But once again, are you sure it is a problem (i.e. do you often select the columns at the end of the row)? And maybe you should review the design first.
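
Before changing the block size for that, I would first check whether chaining actually hurts. The statistic to watch is 'table fetch continued row' (the query is a sketch; how much is too much is your call):

SQL> select name, value from v$sysstat where name = 'table fetch continued row';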

There was another reason to have a larger block size for tablespaces containing large LOBs (I know the ‘L’ already stands for ‘Large’, but I mean LOBs larger than the default block size). Today you should use SecureFiles, and we get better performance with them. But that’s for another blog post.

Conclusion

When people come with those kinds of rules of thumb, I usually try to see if it is something they have thought about, or just a blind idea. For example, when they want to rebuild indexes, I ask them which PCTFREE they set for that, because rebuilding without knowing the PCTFREE you want to achieve is pointless.
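
To make it concrete, here is the kind of rebuild statement I mean, where the PCTFREE is a deliberate choice (the value 30 is arbitrary, just for illustration):

SQL> alter index PK_08K rebuild pctfree 30;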
And when they want to create a tablespace with a non-default block size, I ask them how they have calculated the buffer cache to allocate for that block size. The whole size of the table? Then why do you want fewer i/o calls if everything is in cache anyway? A small size? Then are you sure that the lower number of i/o calls will compensate for the cache you could have had in the default buffer pool?

In most cases, just keep the default block size; you probably have a lot of other things to tune first. I’ll now tell you a short story. I was a junior not-yet-DBA in 1998, implementing a banking software (COBOL, Tuxedo, HP-UX, Oracle 7). The application design had a lot to review: things like comments stored in CHAR(2000), for example. Then I saw a senior consultant recommending ‘The’ solution: increase the block size. The customer accepted, we did it, and everything was immediately better. Of course, in order to change the block size you have to export/import the database, so everything was reorganized and a lot of design problems were hidden for a few days. After a while, the performance issues came back, and we had to continue the optimization work. Being on a larger block size did not change anything about that. This is where I learned exactly which kind of consultant I don’t want to be.