Introduction
Oracle Database Appliances rely on ASM to manage disk redundancy. And ASM is brilliant. Unlike RAID, redundancy is managed at the block level rather than at the disk level. For NORMAL redundancy, which is similar to RAID1, you need at least 2 disks, but it also works with 3 disks, 4 disks, 5 disks and so on. There is no need for parity at the disk level. HIGH redundancy, which does not exist in RAID technology, is basically triple mirroring: each block is written to 3 different disks. For this kind of redundancy, you need at least 3 disks, but you can also use 4 disks, 5 disks, 6 disks and so on. You can add and remove disks online, without any downtime, using various degrees of parallelism to speed up the rebalancing operations or to lower their CPU usage.
RAW space vs usable space
As there is no RAID controller in your ODA, what you see from the system, and more precisely from the ASM instance, is the RAW space available. For example, on an ODA X8-2M with 4 disks, RAW capacity is 25.6TB. This is the free space you would see on this kind of ODA if there were no databases configured on it. This is not a problem as long as you understand that you don’t really have these 25.6TB. There is also the notion of usable space. One might think it is the space available once redundancy has been taken into account, but it’s not exactly that. It can actually be quite different depending on your ODA.
Real world example
For my example, I will use an ODA X8-2M with 4 disks running on 19.6. Redundancy has been set to NORMAL, and the DATA/RECO ratio to 90/10. Several databases are running on this ODA. According to the spec sheet of this server, the ODA X8-2M comes with 2x 6.4TB disks as standard, and you can add up to 5 expansions, each expansion being a bundle of 2x 6.4TB disks. RAW capacity starts from 12.8TB and goes up to 76.8TB. As you probably know, a 6.4TB disk does not have 6.4TB of real usable capacity, so don’t expect to store more than 5.8TB on each disk. But this is not related to ODA. Disk manufacturers have been printing optimistic sizes on their disks for years.
I’m using the V$ASM_DISKGROUP dynamic view from the +ASM1 instance to check available space and free space.
desc v$asm_diskgroup

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 GROUP_NUMBER                                       NUMBER
 NAME                                               VARCHAR2(30)
 SECTOR_SIZE                                        NUMBER
 LOGICAL_SECTOR_SIZE                                NUMBER
 BLOCK_SIZE                                         NUMBER
 ALLOCATION_UNIT_SIZE                               NUMBER
 STATE                                              VARCHAR2(11)
 TYPE                                               VARCHAR2(6)
 TOTAL_MB                                           NUMBER
 FREE_MB                                            NUMBER
 HOT_USED_MB                                        NUMBER
 COLD_USED_MB                                       NUMBER
 REQUIRED_MIRROR_FREE_MB                            NUMBER
 USABLE_FILE_MB                                     NUMBER
 OFFLINE_DISKS                                      NUMBER
 COMPATIBILITY                                      VARCHAR2(60)
 DATABASE_COMPATIBILITY                             VARCHAR2(60)
 VOTING_FILES                                       VARCHAR2(1)
 CON_ID                                             NUMBER
One would expect the real free percentage of a diskgroup to be FREE_MB/TOTAL_MB:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             21977088    9851876        2178802 NORMAL
           2 RECO                              2441216    1421004         405350 NORMAL

select round(9851876/21977088*100, 1) "% Free" from dual;

    % Free
----------
      44.8
Free space is more than 44% on my ODA. Not bad.
And when I use USABLE_FILE_MB to get another metric for the same thing:
SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             21977088    9851876        2178802 NORMAL
           2 RECO                              2441216    1421004         405350 NORMAL

select round(2178802/21977088*100, 1) "% Free" from dual;

    % Free
----------
       9.9
This is bad. According to this metric, I have less than 10% free in that diskgroup. I’m getting anxious… I thought I was fine but I’m now critical?
What is USABLE_FILE_MB really?
When you look into the documentation, it’s quite clear:
- USABLE_FILE_MB is the free MB once diskgroup redundancy is taken into account. Of the 9’851’876 MB free, only half, 4’925’938 MB, can actually hold data in that diskgroup. This is for NORMAL redundancy (each block exists on 2 different disks), and it is consistent with what has been said before
- USABLE_FILE_MB also assumes that one disk can be lost and redundancy still be restored on the remaining disks. On this ODA with 4 disks, ¼ of the total disk space should therefore not be considered available, unlike on a RAID system (where the loss of one disk is not visible to the system). Of a total of 21’977’088 MB, only 16’482’816 MB should be considered usable for DATA
- Finally, USABLE_FILE_MB combines these 2 facts. For NORMAL redundancy, the formula is USABLE_FILE_MB = (FREE_MB – TOTAL_MB/nb_disks) / 2 = (9’851’876 MB – 5’494’272 MB) / 2 = 2’178’802 MB
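The formula from the list above can be checked outside the database with a few lines of Python. This is only a sketch of the arithmetic: the function name `usable_file_mb` and its parameters are illustrative and not part of any Oracle tooling, and it assumes NORMAL redundancy with disks of identical size.

```python
def usable_file_mb(free_mb: int, total_mb: int, nb_disks: int) -> float:
    """Estimate USABLE_FILE_MB for a NORMAL redundancy diskgroup.

    Assumes all disks have the same size, so one disk's worth of
    space is total_mb / nb_disks.
    """
    reserved_mb = total_mb / nb_disks   # kept aside to re-mirror after losing one disk
    return (free_mb - reserved_mb) / 2  # every MB of data is stored twice


# Values taken from the 4-disk ODA shown earlier
print(usable_file_mb(free_mb=9_851_876, total_mb=21_977_088, nb_disks=4))  # 2178802.0 (DATA)
print(usable_file_mb(free_mb=1_421_004, total_mb=2_441_216, nb_disks=4))   # 405350.0 (RECO)
```

Both results match the USABLE_FILE_MB column returned by V$ASM_DISKGROUP for DATA and RECO.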
Let’s take another example to be sure. This time it’s an ODA X8-2M with 6 disks in NORMAL redundancy. Let’s do the math:
SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             32965632   25028756        9767242 NORMAL
           2 RECO                              3661824    2549852         969774 NORMAL

select round((25028756-32965632/6)/2, 1) "DATA_USABLE_FILE_MB" from v$asm_diskgroup where name='DATA';

DATA_USABLE_FILE_MB
-------------------
            9767242
The formula is correct.
Should I use USABLE_FILE_MB for monitoring?
That’s a good question. Monitoring on USABLE_FILE_MB means considering the worst case. Monitoring on FREE_MB/TOTAL_MB means considering the best case. Using FREE_MB seems to be the recommended approach, but with lower thresholds than for a normal filesystem: WARNING should be triggered when 65/70% of the space is used, and CRITICAL when 80/85% is used. There are 2 reasons for this: the volume fills 2 times faster than it would when seen through a RAID system (3 times faster with HIGH redundancy), and when your disks are nearly full, the only way to extend the volume is to buy new disks from Oracle (if you have not reached the limit).
Remember that the real resilience guarantee for an ODA is not having enough space in the diskgroups to survive losing one disk, but having a functional Data Guard configuration. That’s why I never configure HIGH redundancy on ODA: it’s a waste of disk space and it does not give me much higher failure tolerance (I still have “only” 2 power supplies and 2 network interfaces).
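To illustrate the thresholds, here is a minimal Python sketch of such a check. It is not an existing monitoring plugin: the function name `diskgroup_status` is hypothetical, and the defaults of 65% and 80% used are simply the lower bounds of the ranges mentioned above, to be adjusted to your own policy.

```python
def diskgroup_status(total_mb: int, free_mb: int,
                     warning_pct: float = 65.0,
                     critical_pct: float = 80.0) -> tuple:
    """Return (used %, status) based on FREE_MB/TOTAL_MB from V$ASM_DISKGROUP."""
    used_pct = (total_mb - free_mb) / total_mb * 100
    if used_pct >= critical_pct:
        status = "CRITICAL"
    elif used_pct >= warning_pct:
        status = "WARNING"
    else:
        status = "OK"
    return round(used_pct, 1), status


# DATA and RECO from the 4-disk ODA shown earlier
print(diskgroup_status(21_977_088, 9_851_876))  # (55.2, 'OK')
print(diskgroup_status(2_441_216, 1_421_004))   # (41.8, 'OK')
```

With these thresholds, the DATA diskgroup that looked critical through USABLE_FILE_MB is still below the WARNING level.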
To make it crystal clear, let’s compare again to a RAID system. Imagine you have a RAID1 system with 4x 6TB disks. These 4 disks have a RAW capacity of 24TB, but only 12TB are usable. If you lose one disk, 12TB are still usable, but you’ve lost redundancy for half of the data. With ASM in NORMAL redundancy, you can see a total of 24TB, but only 12TB is really available for your databases. And if you look at USABLE_FILE_MB, you will find that only 9TB is usable, because that is how much you can store while your redundancy can still survive a disk crash. The RAID is simply not able to do that.
Furthermore, if you want to do the same with RAID1 you could, but it means that you will need 5 disks instead of 4. The fifth one being the spare disk to rebuild redundancy in case of disk failure of one of the four disks.
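The numbers in this comparison can be reproduced with a short Python sketch. The helper names are illustrative, and the ASM figure assumes an empty NORMAL redundancy diskgroup to which the USABLE_FILE_MB formula is applied.

```python
DISK_TB = 6  # size of each disk in the example


def raid1_usable_tb(nb_disks: int, spares: int = 0) -> float:
    """Usable capacity of mirrored pairs; spare disks hold no data."""
    return (nb_disks - spares) * DISK_TB / 2


def asm_normal_usable_tb(nb_disks: int) -> float:
    """Usable capacity of an empty NORMAL diskgroup, keeping one
    disk's worth of space free to rebuild redundancy."""
    raw_tb = nb_disks * DISK_TB
    return (raw_tb - DISK_TB) / 2


print(raid1_usable_tb(4))            # 12.0, but no room to rebuild a mirror
print(asm_normal_usable_tb(4))       # 9.0, redundancy can be rebuilt after a disk loss
print(raid1_usable_tb(5, spares=1))  # 12.0, the fifth disk is the spare
print(asm_normal_usable_tb(5))       # 12.0
```

Under these assumptions, 5 disks give the same 12TB in both cases: the RAID1 spare sits idle, while ASM spreads the equivalent reserve across all disks.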
Can I still use storage when USABLE_FILE_MB is 0?
Yes, you can. But you have to know that if you lose a disk, redundancy can no longer be guaranteed, just as on a RAID system. You can also see negative values in USABLE_FILE_MB.
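A small Python sketch shows where the zero point sits and how a negative value appears. It reuses the NORMAL redundancy formula with the DATA diskgroup of the 4-disk ODA; the function names and the FREE_MB value of 5’000’000 MB are hypothetical, chosen only for illustration.

```python
def min_free_mb_for_remirror(total_mb: int, nb_disks: int) -> float:
    """FREE_MB below which USABLE_FILE_MB turns negative
    (one disk's worth of space, assuming identical disks)."""
    return total_mb / nb_disks


def usable_file_mb(free_mb: int, total_mb: int, nb_disks: int) -> float:
    """NORMAL redundancy formula: (FREE_MB - TOTAL_MB/nb_disks) / 2."""
    return (free_mb - total_mb / nb_disks) / 2


TOTAL_MB, NB_DISKS = 21_977_088, 4

print(min_free_mb_for_remirror(TOTAL_MB, NB_DISKS))   # 5494272.0, i.e. 25% of TOTAL_MB
print(usable_file_mb(5_000_000, TOTAL_MB, NB_DISKS))  # -247136.0
```

In other words, on a 4-disk ODA USABLE_FILE_MB reaches 0 while FREE_MB still shows 25% of the diskgroup as free.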
And what about the number of disks?
For sure, the more disks you have, the less space you will “lose” from the USABLE_FILE_MB point of view. An ODA with 3 or 4 disks with NORMAL redundancy is definitely not very comfortable, but starting from 6 disks, USABLE_FILE_MB becomes much more convenient.
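To put numbers on this, the sketch below applies the NORMAL redundancy formula to an empty diskgroup (FREE_MB = TOTAL_MB), which gives a usable share of (1 - 1/nb_disks) / 2 of the RAW capacity. The function name is illustrative, and restricting it to 3 disks or more is an assumption: the formula presumes redundancy can be rebuilt on the remaining disks.

```python
def usable_fraction(nb_disks: int) -> float:
    """Share of RAW capacity reported as usable for an empty
    NORMAL redundancy diskgroup with identical disks."""
    if nb_disks < 3:
        # Assumption: with fewer than 3 disks there is nowhere to rebuild a mirror
        raise ValueError("formula assumes at least 3 disks")
    return (1 - 1 / nb_disks) / 2


for n in (3, 4, 6, 8, 10, 12):
    print(f"{n:2d} disks: {usable_fraction(n):.1%} of RAW capacity")
```

The share goes from 37.5% with 4 disks to almost 46% with 12 disks, getting closer to the 50% ceiling imposed by mirroring.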
On a 2-disk ODA with NORMAL redundancy, there is no way of keeping redundancy after losing a disk. That’s quite obvious. X8-2S and X8-2M with the base disk configuration are not that nice for this reason.
The number of disks is not only a matter of the storage size you need; it also brings an increased level of security for your databases. The more disks you have, the more disk failures you can survive while keeping redundancy (provided the disks do not fail simultaneously, of course).
Conclusion
ODA storage is commonly misunderstood, because it does not use classic RAID. ASM is very powerful and more secure than a RAID system. Don’t hesitate to order more disks than needed on your ODAs. Yes, it’s expensive, but it is a good investment for the next 5 years. And it’s usually cheaper to order additional disks with the ODA than to order them later.
Wolfgang Rauchenstein
15.11.2023 @Jimm: ODA/Oracle never uses RAID. ODA keeps 1 or more copies of files. Something like RAID5 (striping and parity bits/bytes) is not possible.
Even the comparison with RAID1 is not completely correct if you have 4 or more disks installed. With RAID1, in the best case you can have half of the disks damaged, as long as you don't hit both the original and the mirror of the same pair. Example: with 24 disks installed you can lose at most 12 disks without data loss.
With ODA you will lose data with 'NORMAL' redundancy if both the original and the copy OF A FILE are affected, independent of how many disks you have installed. A very, very theoretical view with 24 disks is that you can lose 23 disks if your data is stored on 2 disks only (which happens if you have 1 file only). RAID will store the data 'striped' over the disks, and all disks are used when the file is big enough to use at least 1 stripe on each disk.
The only thing you can compare is the guaranteed security: 1 disk can fail without data loss when you use 'NORMAL' redundancy or RAID1. With both storage technologies you can hit original and copy if 2 disks fail.
But:
If 1 disk fails, ODA will rearrange the files over the remaining disks and, if there is enough time, more disks can fail one after the other until all of your disk space is used.
RAID needs a replacement of the mirror - no chance if both the original and the mirror disks fail.