When you install SQL Server, you have to consider how to place the database files. At this point you will probably meet the well-known best practices and guidelines provided by Microsoft but are you really aware of the storage vendor guidelines? Let’s talk about it in this blog post.

A couple of weeks ago, I had the opportunity to implement a new SQL Server architecture over on a Netapp storage model FAS 3140. The particularity here is that my customer wants to manage his backups with NetApp SnapManager for SQL Server. This detail is very important because in this case we have to meet some requirements about storage layout. After reading the Netapp storage documents I was ready to begin my database files.

First of all, we should know how SnapManager works. In fact SnapManager performs SQL Server backups by using either a snapshot of database files or by issuing streaming backups. Streaming backups concern only the system database files and the transaction log files. To achieve this task, SnapManager will coordinate several components like SnapDrive for Windows to control the storage LUNS and SQL Server to freeze IO by using the VSS Framework. It is important to precise that without SnapManager, performing snapshots from a pure storage perspective is still possible but unfortunately in this case the storage layer is not aware of the database files placement. We may find in such situation that a database is not consistent. Here a simplified scheme of what’s happen during the SnapManager backup process:

SnapManager (1) –> VDI (2) – – backup database .. with snapshot –> VSS (3) – database files IO freeze  –> SnapDrive (3) – snapshot creation

So why do we have to take care about database file placements with Netapp Storage? Well, according to our first explanation above, if we perform snapshot backups the unit of work is in fact the volume (FlexVol is probably the better term here. FlexVol is a kind of virtual storage pool to manage physical storage efficiently and can include one or more LUNs). In others words, when a snapshot is performed by SnapManager all related LUNs are considered together and all concerned database files in single volume are frozen when snapshot is created. Notice that if a database is spread across several volumes, IO are frozen for all concerned volumes at the same time but snapshots are taken serially.

This concept is very important and as a database administrator, we also now have to deal with these additional constraints (that I will describe below) to place our database files regarding the storage layout and the RPO / RTO needs.

1- System database files must be placed on a dedicated volume including their respective transaction log files. Only streaming backups are performed for system databases
2- Tempdb database files must also be placed on a dedicated volume that will be excluded from SnapManager backup operations.
3- Otherwise, placement of user databases depends on several factors as their size or the storage layout architecture in-place. For example, we may encounter cases with small databases that will share all their database files on the same volume. We may also encounter cases where we have to deal with placement of large database files. Indeed, large databases often include different Filegroups with several database files. In this case, spreading database files on different LUNs may be a performance best practice and we have to design the storage with different LUNs that share the same dedicated volume. Also note that the database transaction log files have to be on separated volume (remember that transaction log files are concerned by streaming backups).
4- In consolidated scenario, we may have several SQL Server instances on the same server with database files on the same volume or we may have dedicated volumes for each instance.

Regarding these constraints, keep in mind that your placement strategy may presents some advantages and but also some drawbacks:

For example, restoring a database on a dedicated volume from a snapshot will be quicker than using streaming backups but in the same time we may reach quickly the maximum number of drives we may use if we are in the situation where we dedicate one or several volumes per database. Imagine that you have 5 databases spreaded on 4 volumes (system, tempdb, data and logs)… I’ll let you do the math. A possible solution is to replace the letter-oriented volume identification by mount points.

In the same time, sharing volumes between several databases makes more difficult the restore in-place process of only one database because in this case we will have to copy data from a mounted snapshot back into the active file system. But increasing the number of shared files on a single volume may cause timeout issues during the snapshot operation. According to Netapp Knowledgebase here the maximum recommended number of databases on a single volume is 35.  If you want to implement a policy that verify that the number of database files is lower than this threshold please see the blog post of my colleague Stéphane Haby. The number of files concerned by a snapshot creation, the size of the volumes or the storage performance are all factors to consider during the database files placement.

After digesting all these considerations, let me present briefly my context: 3 SQL Server instances on a consolidated server with the following storage configuration. We designed the storage layout as follows

blog_30_1_storage_layout_architecture

One volume for system databases, one volume for tempdb, one volume for user database data files, one volume for user database log files and finally one volume dedicated for the snapinfo. The snapinfo volume is used by SnapManager to store streaming backups and some metadata

But according to what I said earlier my configuration seems to have some caveats. Indeed, to restore only one database in-place in this case is possible but because we’re actually sharing database files between several databases and several SQL Server instances, we will force the snapmanager to copy data from a mounted snapshot back into the file system. Otherwise, we may also face the timeout threshold issue (10s hard-coded into the VSS framework) during the freezing database IO step. Finally we may not exclude issues caused by the limit of 35 databases on a single volume because I’m in a consolidated environment scenario. Fortunately my customer is aware of all the limitations and he is waiting for the upgrade of its physical storage layout. We will plan to redesign the final storage layout at this time. Until then, the number of databases should not be a problem.

See you!

By David Barbarin