The client’s environment includes two SQL Server instances (I1 and I2). One of the databases, D1 (on I1), is replicated to I2 via transactional replication. Additionally, the tables in this database are partitioned. The largest table (which is the one of interest) is approximately 350 GB in size and contains around 800 million rows. This table represents transactions.

The client has submitted the following requirements:

  • They want to archive data with minimal impact on data availability (during the archiving process)
  • They want to be able to retain the archived data
  • They want to be able to make the archived data accessible or make it visible again

The following elements also need to be taken into account:

  • Transactional replication is enabled on this table
  • This table is partitioned

What does the client’s environment look like?

The client’s environment consists of the following elements:

  • A SQL Server 2019 Standard Edition instance (Server 1) – Publisher
  • A SQL Server 2019 Standard Edition instance (Server 2) – Subscriber
  • A database composed of several tables, including one table that represents transactions
  • Transactional replication is used to replicate certain tables between the two SQL Server instances

The table we are interested in consists of several columns and is partitioned by quarter (based on the Transaction Date column):

What are the possible solutions?

One possible solution for archiving data (while keeping it easily accessible) is partition switching. It is indeed possible to “move” data from one partition to another. To do this, an identical table must be created — in our case, it must have the same number of partitions.

We will therefore need to:

  • Create a “copy” or “archive” table that will contain the archived data from the transactions table
  • Since we are using transactional replication, the ALTER SWITCH command must be accounted for and replicated in such a way that the subscriber also executes the ALTER SWITCH command

The command looks like this:

ALTER TABLE transactions SWITCH PARTITION 2 TO transactions_archive PARTITION 2;

There are many advantages:

  • Data “movement” using the ALTER SWITCH command is almost instantaneous, even when millions of rows are involved
  • The command is simple to implement
  • A rollback is possible — meaning data can be moved from one partition to another, or from one table to another. The “movement” can occur in both directions:
    • From the transactions table to the archive_transactions table
    • From the archive_transactions table back to the transactions table
  • Data remains easily accessible since it is only moved to another table

What does it look like?

The data from the selected partition is “migrated” to the same partition in the destination table.

Problems that arise:

By default, it is not possible to perform partition switches when transactional replication is being used. You can verify this using the following command:

exec sp_helppublication @publication = N'Publication_MyTransactions';

It is therefore necessary to allow the publication to handle “ALTER SWITCH” commands so that the partition switch is replicated to the subscriber. This is done using the following command:

exec sp_changepublication
	@publication = N'Publication_MyTransactions',
	@property = N'allow_partition_switch',
	@value = N'true'
go

exec sp_changepublication
	@publication = N'Publication_MyTransactions',
	@property = N'replicate_partition_switch',
	@value = N'true'
go

In our client’s case, the indexes were not aligned. In other words, they were not partitioned. When this happens, it is not possible to perform a partition switch. More generally, the source and destination tables must have exactly the same structure.

From there, we have two issues:

  • The unaligned indexes need to be rebuilt (and partitioned)
  • The new table (transactions_archive) needs to be added to the publication

Once these two actions are completed, a new snapshot must be generated. When this occurs, all articles are taken into account, and the snapshot captures all of them. The issue is that our client’s publication includes several dozen tables with a total volume exceeding 1 TB.

To include only the modified articles (in our specific case), the transactions table must be removed from the publication, the necessary changes must be made, and then the table should be added back (along with the transactions_archive table). Some publication properties must also be modified. This is done using the commands below:

exec sp_helppublication @publication = N'Publication_MyTransactions'
go

exec sp_changepublication 
	@publication = N'Publication_MyTransactions', 
	@property = N'allow_anonymous', 
	@value = 'false'
go

exec sp_changepublication 
	@publication = N'Publication_MyTransactions', 
	@property = N'immediate_sync', 
	@value = 'false'
go

Here are the different steps:

  • Enable partition switching since we are using transactional replication
  • Generate a snapshot only for the modified articles
  • Remove the transactions article from replication
  • Rebuild the non-partitioned indexes on the transactions table
  • Create the transactions_archive table with the same structure as the transactions table
  • Add the transactions and transactions_archive articles to the publication
    • When adding them, consider the clustered and nonclustered indexes as well as the partitioning
  • Start the snapshot agent
  • Execute a partition switch and verify the result
  • Plan for a rollback of the allow_anonymous and immediate_sync properties

Implementation

We previously mentioned that the transactions table already existed and was being replicated (via transactional replication).

We therefore have our database (MyTransactions) with several filegroups (to simplify, we are only creating a few filegroups that will contain the data for 2024):

use master
go

declare @NumPart int = 1, @NumPartStr varchar(2), @SQLCmd nvarchar(max) = ''
while (@NumPart <= 4)
begin
select @NumPart
set @NumPartStr = +RIGHT('0'+CAST(@NumPart as varchar(2)), 2)
set @SQLCmd = @SQLCmd+'alter database MyTransactions add filegroup FG2024Q'+@NumPartStr+'; '
set @SQLCmd = @SQLCmd+'alter database MyTransactions add file (name = F2024Q'+@NumPartStr+', filename = ''D:\Data\F2024Q'+@NumPartStr+'.mdf'') to filegroup FG2024Q'+@NumPartStr+';'
set @NumPart = @NumPart+1
end
print @SQLCMD
exec(@SQLCMD)

We create our partition function:

use MyTransactions 
go

create partition function FunctionPartition_Transactions(DATE)
as range right for values ('20240101','20240401','20240701','20241001')
go

We create our partition scheme:

use MyTransactions 
go

create partition scheme SchemePartition_Transactions as partition FunctionPartition_Transactions
to ([Primary], FG2024Q01, FG2024Q02, FG2024Q03, FG2024Q04)
go

We then create our transactions table:

use MyTransactions 
go

create table transactions
(
	transactions_id int identity(1,1) not null,
	transactions_label nvarchar(50) not null,
	transactions_amount int not null,
	transactions_date date not null
) on SchemePartition_Transactions(transactions_date)

alter table transactions add constraint pk_transactions primary key clustered(transactions_id, transactions_date)
go

create nonclustered index ix_transactions_label_date on transactions(transactions_label) include (transactions_date) on [PRIMARY]
go

We notice that the clustered index is partitioned:

However, the nonclustered index ix_transactions_label_date is not partitioned. Indeed, it was not created using the clause: ON SchemePartition_Transactions(transactions_date).

The table looks something like this:

We also mentioned that this table was replicated:

We now need to create our archive table, which will be an identical copy of the source table (transactions):

create table transactions_archive
(
	transactions_id int identity(1,1) not null,
	transactions_label nvarchar(50) not null,
	transactions_amount int not null,
	transactions_date date not null
) on SchemePartition_Transactions(transactions_date)

alter table transactions_archive add constraint pk_transactions_archive primary key clustered(transactions_id, transactions_date)
go

create nonclustered index ix_transactions_label_date on transactions_archive(transactions_label) include (transactions_date) on SchemePartition_Transactions(transactions_date)
go

This table is partitioned, as are its indexes (clustered and nonclustered). However, this is not the case for the transactions table, and this will cause issues when executing ALTER SWITCH commands to perform partition switches.

We then check what the partitioning of our tables looks like:

select 
	OBJECT_NAME(p.object_id) as obj_name, 
	f.name, 
	p.partition_number, 
	p.rows
		from sys.system_internals_allocation_units a
			join sys.partitions p
				on p.partition_id = a.container_id
			join sys.filegroups f on a.filegroup_id = f.data_space_id
	
	where p.object_id in (object_id('transactions'), object_id('transactions_archive'))
	order by obj_name, p.index_id, p.partition_number

We now need to perform the following operations:

  • Ensure that the nonclustered index ix_transactions_label_date on the transactions table is partitioned. To do this, it must be recreated. However, this table is replicated, and explicit index creation and deletion commands are not replicated. This is why we need to modify certain publication properties (allow_anonymous and immediate_sync), remove the article from the publication, partition the nonclustered index, and then add the article back to the publication.

Reference: https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2012/ms151870(v=sql.110)?redirectedfrom=MSDN

We have the following steps:

  • Remove the transactions article from the publication
  • Modify the following properties: allow_anonymous and immediate_sync
  • (Re)create and partition the nonclustered index
  • Modify certain replication properties to allow SQL commands like ALTER SWITCH
  • Add the transactions and transactions_article articles to the publication

First, we remove the transactions article from the publication (note that in our client’s case, several dozen tables were replicated).

Second, we only want to generate a snapshot for the modified objects (transactions and transactions_archive).

exec sp_changepublication 
	@publication = N'Publication_MyTransactions', 
	@property = N'allow_anonymous', 
	@value = 'false'
go

exec sp_changepublication 
	@publication = N'Publication_MyTransactions', 
	@property = N'immediate_sync', 
	@value = 'false'
go

We (re)create the nonclustered index on the transactions table in order to partition it:

drop index ix_transactions_label_date on [dbo].[transactions]
go

create nonclustered index ix_transactions_label_date on transactions(transactions_label) on SchemePartition_Transactions(transactions_date)
go

Our index is now partitioned:

We then need to modify certain publication properties to allow ALTER SWITCH commands:

exec sp_helppublication @publication = N'Publication_MyTransactions'
go

exec sp_changepublication
	@publication = N'Publication_MyTransactions',
	@property = N'allow_partition_switch',
	@value = N'true'
go

exec sp_changepublication
	@publication = N'Publication_MyTransactions',
	@property = N'replicate_partition_switch',
	@value = N'true'
go

exec sp_helppublication @publication = N'Publication_MyTransactions'
go

The article can be added easily via (for example) the graphical interface:

We take partitioning into account in the following way:

It is very important to modify the properties of only these two articles and not those of already published articles (if any). Modifying the properties of already published articles would invalidate the existing snapshot and force SQL Server to generate a snapshot of all articles.

The above modification was made to the two articles (transactions and transactions_archive).

Once these steps are completed, we need to generate a snapshot via the corresponding job:

The corresponding objects are taken into account:

Once these operations are completed, we can see that our objects have been replicated and that the elements below have been taken into account:

  • Partitioned nonclustered index
  • Partition scheme
  • Partition function

Replicated partition switch

We now want to perform partition switches and verify that these commands are replicated.

Here is the corresponding code to run on the publisher (I am moving my data from partition 2 of my transactions table to partition 2 of my transactions_archive table):

alter table transactions switch partition 2 to transactions_archive partition 2

Before (from the publisher):

After (from the publisher):

From the subscriber:

Here is the related code :

:connect TR-Pub-1\INSTANCE2019

use MyTransactions
go

alter table transactions switch partition 2 to transactions_archive partition 2
go

select 
	OBJECT_NAME(p.object_id) as obj_name, 
	f.name, 
	p.partition_number, 
	p.rows
		from sys.system_internals_allocation_units a
			join sys.partitions p
				on p.partition_id = a.container_id
			join sys.filegroups f on a.filegroup_id = f.data_space_id
	
	where p.object_id in (object_id('transactions'), object_id('transactions_archive'))
	order by obj_name, p.index_id, p.partition_number
go

:connect TR-Sub-1\INSTANCE2019

use MyTransactions
go

select 
	OBJECT_NAME(p.object_id) as obj_name, 
	f.name, 
	p.partition_number, 
	p.rows
		from sys.system_internals_allocation_units a
			join sys.partitions p
				on p.partition_id = a.container_id
			join sys.filegroups f on a.filegroup_id = f.data_space_id
	
	where p.object_id in (object_id('transactions'), object_id('transactions_archive'))
	order by obj_name, p.index_id, p.partition_number
go

We then need to roll back the modification of the following properties: allow_anonymous et immediate_sync.

exec sp_changepublication 
	@publication = N'Publication_MyTransactions', 
	@property = N'immediate_sync', 
	@value = 'true'
go

exec sp_changepublication 
	@publication = N'Publication_MyTransactions', 
	@property = N'allow_anonymous', 
	@value = 'true'
go

exec sp_helppublication @publication = N'Publication_MyTransactions'
go

Thank you. Amine Haloui.