By Franck Pachot

.
In a previous post I tried to compare a single thread workload between Exadata X5 Bare Metal and Virtualized. The conclusions were that there is no huge differences, and that this kind of comparison is not easy.

About the comparison is not easy, some reasons have been nicely detailed by Lonny Niederstadt in this twitter thread

Besides the single thread tests, I did a test with 50 sessions doing updates on a small data set. It’s a 50 session SLOB with SCALE=100M WORK_UNIT=64 UPDATE_PCT=100 and a small buffer cache.

Here are the load profiles, side by side:

                           Bare Metal                           Virtualized
 
Load Profile                    Per Second   Per Transaction         Per Second   Per Transaction
~~~~~~~~~~~~~~~            ---------------   ---------------    ---------------   --------------- -
             DB Time(s):              42.9               0.0               48.6               0.1
              DB CPU(s):               5.6               0.0                6.8               0.0
      Background CPU(s):               3.4               0.0                3.8               0.0
      Redo size (bytes):      58,364,739.1          52,693.1       52,350,760.2          52,555.7
  Logical read (blocks):          83,306.4              75.2           73,826.5              74.1
          Block changes:         145,360.7             131.2          130,416.6             130.9
 Physical read (blocks):          66,038.6              59.6           60,512.7              60.8
Physical write (blocks):          69,121.8              62.4           62,962.0              63.2
       Read IO requests:          65,944.8              59.5           60,448.0              60.7
      Write IO requests:          59,618.4              53.8           55,883.5              56.1
           Read IO (MB):             515.9               0.5              472.8               0.5
          Write IO (MB):             540.0               0.5              491.9               0.5
         Executes (SQL):           1,170.7               1.1            1,045.6               1.1
              Rollbacks:               0.0               0.0                0.0               0.0
           Transactions:           1,107.6                                996.1

and I/O profile:

                           Bare Metal                                           Virtualized
 
IO Profile                  Read+Write/Second     Read/Second    Write/Second   Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------   ----------------- --------------- ---------------
            Total Requests:         126,471.0        65,950.0        60,521.0           117,014.4        60,452.8        56,561.6
         Database Requests:         125,563.2        65,944.8        59,618.4           116,331.5        60,448.0        55,883.5
        Optimized Requests:         125,543.0        65,941.1        59,601.9           116,130.7        60,439.9        55,690.7
             Redo Requests:             902.2             0.1           902.1               677.1             0.1           677.0
                Total (MB):           1,114.0           516.0           598.0             1,016.5           472.8           543.6
             Database (MB):           1,055.9           515.9           540.0               964.7           472.8           491.9
      Optimized Total (MB):           1,043.1           515.9           527.2               942.6           472.7           469.8
                 Redo (MB):              57.7             0.0            57.7                51.7             0.0            51.7
         Database (blocks):         135,160.4        66,038.6        69,121.8           123,474.7        60,512.7        62,962.0
 Via Buffer Cache (blocks):         135,159.8        66,038.5        69,121.3           123,474.0        60,512.6        62,961.4
           Direct (blocks):               0.6             0.2             0.5                 0.7             0.1             0.5

This is roughly what you can expect from OLTP workload: small data set that fits in flash cache, high redo rate. Of course in OLTP you will have a higher buffer cache, but this is not what I wanted to measure here. It seems that the I/O performance is slightly better in bare metal. This is what we also see on averages:

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Wait   % DB Wait
Event                                Waits Time (sec)    Avg(ms)   time Class
------------------------------ ----------- ---------- ---------- ------ --------
cell single block physical rea   9,272,548     4182.4       0.45   69.3 User I/O
free buffer waits                  152,043     1511.5       9.94   25.1 Configur
DB CPU                                          791.4              13.1
 
Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Wait   % DB Wait
Event                                Waits Time (sec)    Avg(ms)   time Class
------------------------------ ----------- ---------- ---------- ------ --------
cell single block physical rea   7,486,727     3836.2       0.51   63.7 User I/O
free buffer waits                  208,658     1840.9       8.82   30.6 Configur
DB CPU                                          845.9              14.1

It’s interesting to see that even when on I/O bound system there are no significant waits on log file sync.

I’ll focus on ‘log file paralle writes’:

Bare Metal

EVENT                                    WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
log file parallel write                                1          0 1 microsecond
log file parallel write                                2          0 2 microseconds
log file parallel write                                4          0 4 microseconds
log file parallel write                                8          0 8 microseconds
log file parallel write                               16          0 16 microseconds
log file parallel write                               32          0 32 microseconds
log file parallel write                               64          0 64 microseconds
log file parallel write                              128          0 128 microseconds
log file parallel write                              256       8244 256 microseconds
log file parallel write                              512     102771 512 microseconds
log file parallel write                             1024      14812 1 millisecond
log file parallel write                             2048        444 2 milliseconds
log file parallel write                             4096         42 4 milliseconds
log file parallel write                             8192         11 8 milliseconds
log file parallel write                            16384          3 16 milliseconds
log file parallel write                            32768          1 32 milliseconds

Virtualized

EVENT                                    WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
log file parallel write                                1          0 1 microsecond
log file parallel write                                2          0 2 microseconds
log file parallel write                                4          0 4 microseconds
log file parallel write                                8          0 8 microseconds
log file parallel write                               16          0 16 microseconds
log file parallel write                               32          0 32 microseconds
log file parallel write                               64          0 64 microseconds
log file parallel write                              128          0 128 microseconds
log file parallel write                              256        723 256 microseconds
log file parallel write                              512      33847 512 microseconds
log file parallel write                             1024      41262 1 millisecond
log file parallel write                             2048       6483 2 milliseconds
log file parallel write                             4096        805 4 milliseconds
log file parallel write                             8192        341 8 milliseconds
log file parallel write                            16384         70 16 milliseconds
log file parallel write                            32768         10 32 milliseconds

As I’ve seen in previous tests, most of the writes where below 512 microseconds in bare metal, and above in virtualized.

And here are the histograms for the single block reads:

Bare Metal

EVENT                                    WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
cell single block physical read                        1          0 1 microsecond
cell single block physical read                        2          0 2 microseconds
cell single block physical read                        4          0 4 microseconds
cell single block physical read                        8          0 8 microseconds
cell single block physical read                       16          0 16 microseconds
cell single block physical read                       32          0 32 microseconds
cell single block physical read                       64          0 64 microseconds
cell single block physical read                      128        432 128 microseconds
cell single block physical read                      256    2569835 256 microseconds
cell single block physical read                      512    5275814 512 microseconds
cell single block physical read                     1024     837402 1 millisecond
cell single block physical read                     2048     275112 2 milliseconds
cell single block physical read                     4096     297320 4 milliseconds
cell single block physical read                     8192       4550 8 milliseconds
cell single block physical read                    16384       1485 16 milliseconds
cell single block physical read                    32768         99 32 milliseconds
cell single block physical read                    65536         24 65 milliseconds
cell single block physical read                   131072         11 131 milliseconds
cell single block physical read                   262144         14 262 milliseconds
cell single block physical read                   524288          7 524 milliseconds
cell single block physical read                  1048576          4 1 second
cell single block physical read                  2097152          1 2 seconds

Virtualized


EVENT                                    WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
---------------------------------------- --------------- ---------- ------------------------------
cell single block physical read                        1          0 1 microsecond
cell single block physical read                        2          0 2 microseconds
cell single block physical read                        4          0 4 microseconds
cell single block physical read                        8          0 8 microseconds
cell single block physical read                       16          0 16 microseconds
cell single block physical read                       32          0 32 microseconds
cell single block physical read                       64          0 64 microseconds
cell single block physical read                      128          0 128 microseconds
cell single block physical read                      256     518447 256 microseconds
cell single block physical read                      512    5371496 512 microseconds
cell single block physical read                     1024    1063689 1 millisecond
cell single block physical read                     2048     284640 2 milliseconds
cell single block physical read                     4096     226581 4 milliseconds
cell single block physical read                     8192      16292 8 milliseconds
cell single block physical read                    16384       3191 16 milliseconds
cell single block physical read                    32768        474 32 milliseconds
cell single block physical read                    65536         62 65 milliseconds
cell single block physical read                   131072          2 131 milliseconds

Same conclusions here: the ‘less than 256 microseconds’ occurs more frequently in bare metal than virtualized.

For the reference, those tests wer done on similar configuration (except virtualization): X5-2L High Capacity with 3 storage cells, version cell-12.1.2.3.1_LINUX.X64_160411-1.x86_64, flashcache in writeback. Whole system started before the test. This test is the only thing running on the whole database machine.