dbi services Blog
High availability and disaster recovery on Windows Azure VMs
To be honest, I am not a Windows Azure specialist. To learn a little more about the subject, I decided to follow a TechEd Europe 2013 session about High Availability and Disaster Recovery on Windows Azure Virtual Machines.
Overview of Windows Azure
Windows Azure is Microsoft's application platform for the public cloud. You can use this platform in many different ways:
- build a web application that runs and stores its data on Windows Azure data centers
- store only your data on Windows Azure while running your application on-premises (outside the public cloud)
- create VMs for development or test
Windows Azure offers multiple solutions.
Windows Azure Principles
It is economical and usage-based:
- pay for what you use
- pay by the minute
- MSDN usage is free in VMs
It is automated and elastic:
- you can use PowerShell automation
- easy to scale out
- easy to scale up
It is managed, hybrid, and supports AlwaysOn:
- simple load-balancing possible
- managed availability
- easy hybrid (Windows Azure and on-premises)
Infrastructure services on Windows Azure
What does Windows Azure offer in terms of infrastructure services?
- Experience of IT professionals
  - multiple ways to get started: Management Portal, scripting...
  - application images available for quick installation
    - SQL Server (2008 R2 Web/Standard/Enterprise, 2012 Express/Web/Standard/Enterprise), SharePoint 2010/2013...
- Storage Manageability and Mobility
  - use your own storage and/or Windows Azure storage
- High Availability Features
  - Power Unit Rack Switch: load balancing between racks containing VMs (Availability SLA: 99.95%)
- Advanced Hybrid Networking
  - Virtual Network Site-to-Site: VPN between the on-premises datacenter and Windows Azure
  - Virtual Network Point-to-Site: individual on-premises computers behind a firewall and remote workers connect via the Windows Azure gateway
- IaaS, PaaS, and Agility
  - pay by the minute: when a VM stops, billing stops; no rounding up, no minimum
- MSDN Usage Improvements
  - MSDN products can be used on VMs
  - a single monetary credit instead of multiple credits
  - focus on test/dev usage
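To make the pay-by-the-minute point concrete, here is a minimal Python sketch comparing per-minute charging with a scheme that rounds up to the full hour. The hourly rate of 0.12 is a made-up figure for illustration, not an actual Windows Azure price, and the function names are hypothetical.

```python
import math


def cost_per_minute_billing(minutes_running, hourly_rate):
    """Charge only for the minutes the VM actually ran."""
    return minutes_running * hourly_rate / 60


def cost_hourly_rounded_billing(minutes_running, hourly_rate):
    """Charge for every started hour (rounding up), for comparison only."""
    return math.ceil(minutes_running / 60) * hourly_rate


# Hypothetical rate, not a real Azure price.
HOURLY_RATE = 0.12

# A VM that ran for 61 minutes and was then stopped.
per_minute = cost_per_minute_billing(61, HOURLY_RATE)
rounded = cost_hourly_rounded_billing(61, HOURLY_RATE)
print(f"per-minute billing: {per_minute:.3f}")
print(f"hourly rounding:    {rounded:.3f}")
```

With hourly rounding, the 61st minute would cost as much as a whole extra hour, which matters for test/dev VMs that are started and stopped often.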
Three "infrastructure as a service" scenarios
There are three main "infrastructure as a service" scenarios for SQL Server high availability and disaster recovery:
- HA within Azure
  - Availability of SQL Server in an Azure VM
  - Protection from issues impacting SQL Server or the VM
  - Using another SQL Server VM in the same Azure DC
- DR between on-premises and Azure
  - Ensure availability of on-premises SQL Server (physical or virtual)
  - Protection from issues impacting the on-premises DC
  - Using a SQL Server VM in Azure
- DR across Azure DCs
  - Availability of SQL Server in an Azure VM
  - Protection from issues impacting the Azure DC
  - Using another SQL Server VM in a different Azure DC
SQL Server High Availability with Azure
What are the reasons for achieving high availability with Azure?
- Azure’s failure detection covers the VM (not SQL Server)
  - the SQL Server service could be down or hung
- Servicing of the guest OS can cause downtime
- Servicing of SQL Server can cause downtime
- Azure's service healing involves restarting the VM on a different host
  - around 12 minutes of downtime each time
- Azure's upgrade involves servicing the host OS and restarting the VM on the host
  - around 15 minutes of downtime each time
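To put these restart times in perspective, the following Python sketch compares them with the downtime a 99.95% availability figure allows. Applying the 99.95% SLA quoted earlier to a single VM over a 30-day window is an assumption made purely for illustration; the function name is hypothetical.

```python
def downtime_budget_minutes(sla_percent, period_days):
    """Minutes of downtime allowed by an availability SLA over a period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100 - sla_percent) / 100


# Figures quoted in this post.
SLA_PERCENT = 99.95
HEALING_MINUTES = 12  # service healing restart
UPGRADE_MINUTES = 15  # host OS upgrade restart

# Assumption: a 30-day window, used only to get a feel for the numbers.
budget = downtime_budget_minutes(SLA_PERCENT, period_days=30)
print(f"30-day downtime budget: {budget:.1f} minutes")

for name, minutes in [("service healing", HEALING_MINUTES),
                      ("host upgrade", UPGRADE_MINUTES)]:
    share = minutes / budget
    print(f"one {name} restart uses {share:.0%} of that budget")
```

The budget works out to about 21.6 minutes, so a single restart of either kind consumes more than half of it, and one of each exceeds it. That is the motivation for adding a SQL Server level high availability solution on top of what Azure provides.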
Have a look at this example:
There are some limitations in the current version of Windows Azure, e.g. mirroring is only possible with one secondary.
You can also use SQL Server technologies such as Availability Groups in Windows Azure. They offer the following advantages:
- Provides many other capabilities
  - Flexible Failover Policy
  - Automatic Page Repair
  - Backups on Secondaries
  - Improved Manageability
  - FileStream & FileTable support
- But requires
  - Windows Cluster
    - though no shared storage
  - Same Windows Domain
    - needs an Active Directory Domain Controller
For the moment, Availability Group Listeners are not supported by Windows Azure. Support is expected in the next couple of months.
In the meantime it is possible to use Failover Partner as in database mirroring, but only with two replicas.
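As a rough illustration of the Failover Partner approach, the Python sketch below only assembles an ODBC-style connection string that names both replicas. The server names, database name, and driver name are hypothetical, and `Failover_Partner` is the keyword as I understand it from the SQL Server ODBC driver documentation; check the documentation of the driver you actually use.

```python
def build_odbc_connection_string(primary, failover_partner, database):
    """Assemble an ODBC connection string that names a failover partner."""
    parts = [
        ("Driver", "{SQL Server Native Client 11.0}"),  # assumed driver name
        ("Server", primary),                     # replica tried first
        ("Failover_Partner", failover_partner),  # tried if the first is unavailable
        ("Database", database),
        ("Trusted_Connection", "yes"),           # Windows Authentication
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


# Hypothetical VM and database names.
connection_string = build_odbc_connection_string("sqlvm1", "sqlvm2", "SalesDb")
print(connection_string)
```

Because the string can name only one partner, this approach fits the two-replica limitation mentioned above.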
How to configure SQL Server Availability Group?
You will have to set up an Active Directory Domain Controller, add the VMs to this domain, and create a Windows Cluster.
Take care: Azure’s DHCP assigns a duplicate IP address to the cluster network name (CNN), which can cause cluster creation to fail. Note that Availability Groups themselves do not use the CNN.
As a workaround, you can use this script.
The rest of the process is the same as on-premises.
If you need Windows Authentication, you will have to set up an Active Directory Domain Controller and add the VMs to this domain. The rest of the process is the same as on-premises.
SQL Server Disaster Recovery between On-Premises and Azure
Why would we need that?
- An event can cause on-premises SQL Server to become unavailable temporarily (gateway failure) or permanently (flooding).
- A disaster recovery site is expensive
- site rent + maintenance
- operations (maintenance)
How to do it?
- Deploy one or more secondary replicas for on-premises SQL Server
- Replicas continuously synchronize
- Best regions: Western US, Eastern US, East Asia, Southeast Asia, Northern Europe, Western Europe
- Political considerations
- Low TCO (Total Cost of Ownership)
- VM and storage
It should look like this:
There are some limitations in terms of supported technologies:
SQL Server Disaster Recovery across Azure Datacenters
Why could it be interesting to use this configuration?
- If you use multiple disks:
  - Azure’s Geo-Replication doesn’t guarantee write order across disks
  - This can break SQL Server’s recovery requirement (log always more up-to-date than data)
- If Azure’s DR doesn’t satisfy your requirements:
  - no SLA
  - Based on Azure tests:
    - VM recovery: less than twenty-four hours
    - Data loss: less than thirty minutes
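The write-order problem with multiple disks can be illustrated with a small Python model. It is a deliberately simplified sketch, not a description of how Azure Geo-Replication or SQL Server recovery actually work internally: each write is reduced to a (disk, LSN) pair, and all names are hypothetical.

```python
def highest_lsn(writes, disk):
    """Highest log sequence number (LSN) written to `disk`, or 0 if none."""
    return max((lsn for target, lsn in writes if target == disk), default=0)


def is_recoverable(replicated_writes):
    """Recovery needs the log to be at least as up-to-date as the data."""
    return (highest_lsn(replicated_writes, "log")
            >= highest_lsn(replicated_writes, "data"))


# On the primary, each change reaches the log disk before the data disk.
primary_writes = [("log", 1), ("data", 1), ("log", 2),
                  ("data", 2), ("log", 3), ("data", 3)]

# Ordered replication: the replica always holds a prefix of the primary's writes.
ordered_ok = all(is_recoverable(primary_writes[:n])
                 for n in range(len(primary_writes) + 1))

# Per-disk replication without cross-disk ordering: the data disk may run ahead.
log_replica = [w for w in primary_writes if w[0] == "log"][:1]    # only LSN 1 arrived
data_replica = [w for w in primary_writes if w[0] == "data"][:3]  # LSN 1 to 3 arrived
unordered_ok = is_recoverable(log_replica + data_replica)

print("ordered replication recoverable:", ordered_ok)
print("unordered replication recoverable:", unordered_ok)
```

In this model every ordered prefix is recoverable, while the unordered case leaves the data disk at LSN 3 with a log that only reaches LSN 1. That is the situation the recovery requirement forbids.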
These are the supported technologies:
For the moment, Availability Groups are not supported in this configuration because they require the same Windows Domain. They will, however, be supported later this year.
For the moment it is possible to use Database Mirroring or Availability Group with an on-premises Disaster Recovery replica.
I have tried to describe different ways to achieve high availability and disaster recovery with Windows Azure, based on a TechEd Europe 2013 session. Hope it helps!