During the last DevOps Day in Geneva, I presented a sample of our CI implementation for our MSSQL DMK maintenance product. It was a very good experience for me and a good opportunity to get feedback from the DevOps community.
During the session I explained that our CI pipeline includes SQL Server containers both on AKS (K8s as a managed service in Azure) and on Windocks. Some attendees asked me why, in our specific context, we are using Windocks as the container solution for SQL Server on the Windows side. As promised, here are some explanations in this blog post. The quick answer first: we are using Windocks to address challenges that exist with SQL Server containers on the Windows side. The long answer follows, but let’s first set the context with a high-level overview of our continuous integration pipeline architecture:
We are using a hybrid scenario where the development tools (SSDT and GitLab) are located in the on-premises dbi internal infrastructure, whereas the CI pipeline runs entirely on Azure. The pipeline breaks down into two main areas: CI testing performed on different SQL Server containers that run on Windows through Windocks, and on Microsoft SQL Server containers that run on Linux and AKS. AKS hosts SQL Server availability groups (in a beta release), and Windocks (surrounded in green in the above picture) is also part of this Azure architecture, in IaaS mode, within a Standard D4s v3 Azure virtual machine (4 vCPUs, 16GB of memory and 512GB of disk space). As an aside, we chose this machine size because nested virtualization is required by Windocks and its cloned database feature, which uses Hyper-V differencing disk capabilities in the background.
- Maintaining Docker SQL Server images on Windows may be cumbersome
The DMK maintenance tool performs database maintenance tasks, basically the tasks you usually find in SQL Server environments: database backups, database integrity checks, and maintenance of indexes and statistics. We obviously brought our added value and best practices to the tool, and we provide it to customers who want to use it. The main challenge here consists in supporting a wide range of versions, from 2008R2 to 2017 at the time of this write-up (both on Windows and Linux, obviously), and most of the issues we encountered came from SQL Server Docker images on Windows. First, if you refer to Docker Hub (and the new Microsoft Container Registry), there are no real official images for SQL Server versions prior to 2016. Maintaining such images is therefore at your own responsibility and risk, and we were not confident about going this way.
However, I stayed motivated and decided to perform further tests to check feasibility with Docker images. I quickly figured out that going through a Docker-native solution would lead to some tedious challenges. With no official images from Microsoft for older versions of SQL Server, I had to build my own, and I was disappointed by the image size, which was far larger than the official images available for Linux: more than 10GB for a SQL Server Docker image on Windows versus ~1.4GB on Linux.
SQL Server Docker image size on Windows after building the custom image
The total size includes the SQL Server binaries, but even if we exclude them from the calculation the final size leads to the same conclusion.
SQL Server image size on Linux
In addition, building a basic image of SQL Server on Windows remains tedious and can be time consuming, to be honest: you need to write code to install the optional prerequisites and then SQL Server itself, which means first copying the binaries (CUs and/or SPs, according to the version) and then running the command file to install them. That is a lot of work with no real added value (and no warranties) at the end. It is definitely the opposite of what I expect from a DevOps process, where I want to be fast and simply use a SQL Server Docker image. In this case, I would like to just pick the right Docker image version and corresponding tag, and then focus on my work.
Windocks fills the gap that exists with older versions (and probably new ones) of SQL Server on Windows by providing a different way to create base images compared to the Docker-native solution. The first step consists in installing SQL Server instances as we would in a traditional approach. The interesting point is that these instances then serve as base images when spinning up containers. This approach provides several advantages, but here I would like to point out the ability to apply configuration settings directly at the SQL Server instance level, settings that are propagated automatically to newly created containers. From my point of view, it is an interesting way to apply segregation of duties without compromising the architecture’s agility. DBAs (Ops) can still work on providing a well-configured template from an architecture point of view while developers focus on their work, and both interact with the same tool.
- Storage concerns may exist even in DEV environments
Storage in DEV environments is usually not a big concern for DBAs. From my experience, they usually provide developers with a shared environment that hosts different SQL Server instances and application databases. Most of the time developers get high privileges on those environments (db_owner or sysadmin, depending on the context) because it is a DEV environment after all, and DBAs often apply this dirty fix to make these environments more “agile”. But this approach implies installing a static environment that is in fact not as flexible for developers as we may think. For instance, how do you reinitialize an environment for a specific developer without impacting the work of the others? Ideally, each developer would be able to quickly create an isolated and ephemeral environment on demand.
But this new approach comes with its own challenges: how do we deal with the total disk space consumption? If each developer spins up a new SQL Server container environment, the total storage footprint includes the SQL Server Docker image and the space consumed by the user databases, once per developer. Let’s take a real example of a customer who wants to provide developers with fresh data from production databases every week (with or without applying data sanitization scripts first). This is a common scenario, by the way. Let’s say the final size of the databases is roughly 500GB for this customer. With ~20 developers in the game, I’m pretty sure you have already guessed the storage concern that arises if all developers want to spin up their own environment at the same time. Let’s do some quick math: 20 [developers] x (10GB [Docker image size] + 500GB [user databases]) ~= 10 TB.
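For readers who want to replay the quick math above with their own numbers, here is a minimal Python sketch. The function and variable names are mine and purely illustrative; the figures are the ones quoted in the customer example.

```python
def total_storage_gb(developers, image_gb, databases_gb):
    """Footprint when every developer holds a full private copy
    of the Docker image and of the user databases."""
    return developers * (image_gb + databases_gb)


# Figures from the customer example: 20 developers,
# a 10GB Docker image and 500GB of user databases.
customer_total_gb = total_storage_gb(developers=20, image_gb=10, databases_gb=500)

print(f"{customer_total_gb} GB")              # 10200 GB
print(f"{customer_total_gb / 1000:.1f} TB")   # 10.2 TB
```

The result, 10,200GB, is the ~10 TB mentioned above.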
Going back to my specific context (our DMK maintenance tool), the storage footprint is less dramatic: we could have up to 7 developers working at the same time, for a total storage footprint of 770GB (7 x (10GB for the Docker image + 100GB of user databases)). That remains too much for us, even though we have provisioned 512GB of premium SSD and can easily increase it … Storage also has a cost on Azure, right? Furthermore, we know that for each developer the ratio between the payload disk space and the disk space actually consumed is low for most of the features we develop. We needed a way to improve this ratio, and Windocks provides a simple way to address it with Hyper-V differencing disk capabilities directly integrated with containerization.
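To make the benefit of differencing disks more concrete, here is a rough Python sketch that compares full private copies with one shared read-only parent plus a small per-developer delta. This is a simplified model and not a measurement of Windocks: the 5GB delta per developer is a purely hypothetical value chosen for illustration, and the function names are mine.

```python
def full_copy_footprint_gb(developers, image_gb, databases_gb):
    """Every developer holds a full private copy of image and databases."""
    return developers * (image_gb + databases_gb)


def differencing_footprint_gb(developers, image_gb, databases_gb, delta_gb):
    """Simplified differencing-disk model: one shared parent copy,
    plus a child disk per developer that stores only changed blocks."""
    return image_gb + databases_gb + developers * delta_gb


# Figures from our DMK context: 7 developers, 10GB image, 100GB of databases.
full_gb = full_copy_footprint_gb(7, 10, 100)

# ASSUMPTION: 5GB of changed data per developer (hypothetical, not measured).
ASSUMED_DELTA_GB = 5
shared_gb = differencing_footprint_gb(7, 10, 100, ASSUMED_DELTA_GB)

print(f"Full copies:        {full_gb} GB")    # 770 GB
print(f"Differencing disks: {shared_gb} GB")  # 145 GB
```

Under this assumption the footprint would fit comfortably within the 512GB we provisioned, whereas full copies would not. The real saving depends entirely on how much data each developer actually changes.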
How to secure our environment was a question that came up at the end of our CI implementation. As in many DevOps projects, security was not among our first concerns, but moving to the cloud pushed us to treat it as an important topic in our architecture.
First, we need to ensure that the images used by our team are secure. Insecure images are one of the new issues that come with container environments, and an image checking process requires a more complex infrastructure, often with EE capabilities and extra components on the container side (at least if you don’t want to put your images on a public repository). Using a private registry on Azure is another option, but after some investigation we favored Windocks capabilities in our context. Windocks takes a different approach to creating SQL Server images: it uses a native SQL Server instance installation as the base template rather than relying on Docker-native images and possibly a Docker registry. This built-in way of preventing the container infrastructure from being compromised by potentially malicious code, without adding further complexity to the architecture, was a good argument for us because it helps DBAs keep security concerns under control.
Windocks also provides other features that help us secure the container environment easily, such as basic authentication to prevent an unauthorized user from spinning up a Windocks container. The native support of Windows authentication was another good argument because it simplified the security management of admin users. We are using a mix of Windows sysadmin accounts and SQL Server logins for applications.
The bottom line: as a small DEV team, we are really satisfied with Windocks, which was able to address the challenges we faced on the operational side. It is worth noting that our needs and challenges are close to what we see with some of our customers, at a different order of magnitude, when SQL Server is introduced into their CI/CD pipeline. In our context we are running the standard edition of Windocks, but EE capabilities are also available and are more suitable for enterprise-class environments.
By David Barbarin