A couple of days ago, I had the opportunity to work with AlwaysOn 2014 infrastructure that included a report server (SSRS) in native and scale-out mode. As you know, dealing with SSRS and availability groups is not an easy task depending on the used features. This is particularly true when scheduling and subscriptions are used by customers.

Here’s a brief description of the concerned architecture:

blog_57-_0_-_infra_aag_ssrs

2 SSRS servers in scale-out mode and an availability group with 2 replicas. The corresponding availability group configuration is shown below:

blog_57-_01_-_config_aagjpg

First of all, schedules are tied to SQL Server agent jobs and according to the BOL, there are some precautions to take in this case:

  • A schedule is created on the primary replica and is not automatically replicated to the other secondaries. In the same manner, a deleted schedule will not be replicated to other secondaries
  • After the database failover completes and after the report server service is re-started, SQL Server Agent jobs will be re-created automatically. Until the SQL agent jobs are recreated, any background executions associated with SQL Server Agent jobs will not be processed. This includes Reporting Services subscriptions, schedules, and snapshots

These two points introduce some challenges in the management stuff for the following reasons:

We must detect an availability group failover event in order to take the corresponding actions:

  • Disable the concerned SQL Server agent jobs on the secondaries to avoid monitoring false errors
  • Enable the SSRS agents jobs on the primary
  • Ensure that all of the SSRS jobs are synchronized between replicas when they are used as primary (schedules can be created or deleted between failovers)

 

How can we process in this case?

In fact there is a different wayto meet these requirements and you can probably find one of them on the internet. From my part, I decided to choose a solution based on our DMK monitoring tool for AlwaysOn that consists of:

  • A stored procedure that detects the availability group failover events. It will run on each replica on scheduled basis and will return the availability groups related to a failover event and also their new primary replica.
  • A SQL Server agent job that gets information from the above stored procedure and perform the necessary actions described below:
  • Disable the SSRS scheduling jobs if the replica’s new role is secondary
  • Drop the SSRS scheduling jobs and then restart one of the SSRS Server in the farm in order to resynchronize all the schedule jobs on the new replica (PowerShell script)

At this point, I just want to point out that the executed scheduling jobs may fail in a transitioning state according to BOL. Take a look at the following points:

  • The execution of background processing may occur more than once due to retry logic and the inability of the report server to mark scheduled work as completed during the failover period
  • The execution of background processing that would have normally been triggered to run during the period of the failover will not occur because SQL Server Agent will not be able to write data into the report server database and this data will not be synchronized to the new primary replica.

 

The DMK stored procedure

This script is part of our DMK AlwaysOn monitoring script. When a failover is triggered for an availability group, the stored procedure gives as output the concerned availability group and the new replica

:connect sql161
 
exec dbi_tools.[dbo].[dbi_alwayson_failover_detect]
go
 
:connect sql162
 
exec dbi_tools.[dbo].[dbi_alwayson_failover_detect]
go
 
:connect sql163
 
exec dbi_tools.[dbo].[dbi_alwayson_failover_detect]

blog_57-_1_-_dmk_monitoring_aag_script

After the first detection of the availability group failover is done, the next time no output is provided by the stored procedure to avoid a false positive:

blog_57-_2_-_dmk_monitoring_aag_script_2

PowerShell script:

The PowerShell script gets automatically the availability group where the ReportServer database is member of. Then it detects if a failover has occurred for the concerned availability group and perform the extra steps in order to update the reporting environment to use correctly the new availability replica.

Moreover, note that we get the report server names in the scale-out infrastructure from the dbo.keys table in the ReportServer database. Then we try to restart the first report server in the farm and we stop if it is successfully (we need only to restart one of the report servers in order to recreate all of the SSRS jobs). If the restart fails, we move on the next server and try to restart it and so on…

     .SYNOPSIS
   Restart a SSRS report server in the farm if a failover is detected for the availability that concerns the ReportServer database
 
   .DESCRIPTION
       You must configure two parameters
 
   $replica : replica name where the script is running
  
   .EXAMPLE
#>
 
$replica = "replica-name";
 
# Get the availability group that includes the ReportServer database
$aag_ssrs = Invoke-Sqlcmd "SELECT DISTINCT
                                  g.name as aag_name
                           FROM sys.availability_groups as g
                           JOIN sys.dm_hadr_database_replica_states as rs
                                  ON g.group_id = rs.group_id
                           JOIN sys.databases AS d
                                  ON rs.database_id = d.database_id
                           WHERE d.name = N''ReportServer'';" -ServerInstance $replica;
 
$restart_state = 0;
 
# Detect if a failover has occured
$aags_failover = Invoke-Sqlcmd "EXEC dbi_tools.dbo.dbi_alwayson_failover_detect" -ServerInstance $replica;
 
Foreach ($aag in $aags_failover)
{
   If (![string]::IsNullOrEmpty($aag.group_name) -and ![string]::IsNullOrEmpty($aag_ssrs.aag_name))
   {
       If ($aag_ssrs.aag_name -eq $aag.group_name)
       {
           #Write-Host "SSRS availability group failover detected";
 
           # If the concerned replica is the new primary --> Get SSRS reports servers in the farm (Scale-Out)
            $ssrs_servers = Invoke-Sqlcmd "If (SELECT sys.fn_hadr_is_primary_replica(N''ReportServer'')) = 1 SELECT MachineName FROM [ReportServer].[dbo].[Keys] WHERE MachineName IS NOT NULL" -ServerInstance $replica;
 
           If ($ssrs_servers.ItemArray.Count -gt 0)
           {
               # Delete all SSRS agent jobs (deleted subscriptions are not automatically replicated on the secondaries)
               Invoke-Sqlcmd "DECLARE @sql NVARCHAR(MAX) = N'''';
 
                               SELECT
                                      @sql = @sql + ''EXEC msdb.dbo.sp_delete_job @job_name = N'''''' + j.name + '''''''' + CHAR(13)
                               FROM msdb.dbo.sysjobs AS j
                               JOIN msdb.dbo.syscategories AS c
                                       ON j.category_id = c.category_id
                               WHERE c.name = N''Report Server'';
 
                               EXEC sp_executesql @sql;" -ServerInstance $replica;
 
               Foreach ($ssrs in $ssrs_servers)
               {
                   try
                   {
                  
                       #Write-Host "Restart ssrs server : " $ssrs.MachineName;
 
                       # Restart one SSRS report in the farm (scale-out) in order to recreate subscription jobs
                       Get-Service -Computer $ssrs.MachineName -Name "ReportServer" | Restart-Service;
                  
                       Break;
                   }
                   catch [Microsoft.PowerShell.Commands.ServiceCommandException]
                   {
                          $pass = 0;
                       #Write-Host "Error during the service restart of the report server " $ssrs.MachineName;
                   }
               }
           }
       }
   }
}

The T-SQL script for disabling / enabling SSRS subscription jobs

This T-SQL script code will be used to either enable or disable the SSRS jobs depending on the replica context. It will use the new sys.fn_hadr_is_primary_replica() DMF to determine if the current replica is primary or not for a concerned database.

DECLARE @sql NVARCHAR(MAX) = '';
DECLARE @state BIT = COALESCE((SELECT sys.fn_hadr_is_primary_replica(N'ReportServer')), 0);
 
SELECT @sql = @sql + 'IF NOT EXISTS (SELECT 1 FROM msdb.dbo.sysjobs WHERE name = N''' + j.name + ''' AND enabled = ' + CAST(@state AS CHAR(1)) + ')
                                EXEC sp_update_job @job_name = N''' + j.name + ''', @enabled = ' + CAST(@state AS CHAR(1)) + '' + CHAR(13)
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.syscategories AS c
       ON j.category_id = c.category_id
WHERE c.name = N'Report Server'
 
--PRINT @sql;
 
EXEC sp_executesql@sql;

 

Let me show you the scenario:

1- First, we have two schedules on the SSRS report server as follows …

blog_57-_30_-_ssrs_schedules

… with 2 associated SQL Server agent jobs on the primary replica (SQL161). You may also notice that the jobs have not been replicated on the secondary (SQL162) yet.

From the SQL Server side, I created the ReportServerSubscriptionControl job that includes two steps with the above scripts. It’s up to you to configure the schedule.

blog_57-_4_-_config_sql_job

We begin with an initial configuration where SQL161 is the primary replica and SQL162 the secondary

blog_57-_31_-_aag_case_1

2- A failover has occurred and the availability group has switched from SQL161 to SQL162 which is the new primary. After that ReportServerSubscriptionControl  job has been executed the configuration has changed as shown in the next picture:

blog_57-_32_-_aag_case_2

SQL162 is the new primary replica and all of the SSRS jobs are enabled and synchronized on it. Likewise, SQL161 becomes the new secondary and all of the SRRS jobs are now disabled.

Note that in this blog post, I voluntary bypassed the security configuration because it depends mainly on the customer context with the security rules in place. For instance, in my scenario, I played with a proxy account for executing some steps in the ReportServerSubscriptionControl job and the PowerShell subsystem. This proxy account is tied to a Windows credential that has minimum privileges configured in order to deal with both the information from the SQL Server instance itself and restart remotely and safely the SSRS service.

Hope this helps and please feel free to contact me if you have any questions.

By David Barbarin