SQL Server 2014 is definitively designed for performance and I will try to demonstrate it during this blog post. I like to talk about hidden performance features because generally it does not require any changes for applications unlike in-memory tables for instance (aka hekaton tables).

Since SQL Server 2005 version some improvements have been made for tempdb. Tempdb caching is one of them and allows to reduce the page allocation contention. Basically to create a table SQL Server must first build the system catalog entries related to system pages. Then, SQL Server has to allocate an IAM page and find a mixed extent in an SGAM page to store data and mark it in the PFS page (as a reminder, by default mixed extent is chosen unless to force uniform extent by using the traceflag 1118). Finally the allocation process must be recorded to system pages. When a table is dropped SQL Server has to revert all it have done for creating the table. It implies some normal locks and latches during all the allocation process, same ones used for creating and dropping a temporary table. However in tempdb tables are created and dropped very quickly and it can generate page allocation contention especially for PFS, SGAM and GAM system pages (the famous PAGELATCH_UP wait type against the concerned pages). The bottom line is that SQL Server can cache some of the metadata and page allocations from temporary objects for easier and faster reuse with less contention.

In addition, to allow a temporary table to be cached it must be first used into a stored procedure but some situations can prevent this caching like:

  • Using named constraints
  • Using DDL after the temporary table creation
  • Create a table in a different scope
  • Using a stored procedure with recompile option

By executing the following T-SQL script with SQLQueryStress we can easily show that temporary tables are not reused by SQL Server.

use [AdventureWorks2012]
go
 
create table #test
(
       TransactionID bigint,
       ProductID    int,
       TransactionDate     datetime,
       Quantity     int,
       ActualCost   money
)
 
insert #test
select top 10000 *
from AdventureWorks2012.dbo.bigTransactionHistory
 
select
       ProductID,
       sum(Quantity * ActualCost) as total_cost
from #test
where ProductID = ‘16004’
group by ProductID
 
drop table #test;

I used 8 concurrent threads with 100 iterations during this test.

Blog_9_-sqlstress_test1

In the same time, I enabled the following perfmon counters:

Counter name

Min value

avg value

Max value

Average latch wait time (ms)

1,043

3,327

7,493

Latch wait / sec

110,014

242,468

965,508

Temp tables creation rate / sec

4,001

16

21,146

Cache objects in Use

0

0

0

Now if I rewrite the same ad-hoc T-SQL statement into a stored procedure and then I perform the same test we can notice some speed improvements:

use [AdventureWorks2012]
go
 
create procedure [dbo].[sp_test_tempdb]
as
 
create table #test
(
       TransactionID bigint,
       ProductID    int,
       TransactionDate     datetime,
       Quantity     int,
       ActualCost   money
)
 
insert #test
select top 10000 *
from AdventureWorks2012.dbo.bigTransactionHistory
 
select
       ProductID,
       sum(Quantity * ActualCost) as total_cost
from #test
where ProductID = ‘16004’
group by ProductID
 
drop table #test;

 

blog_9_-_sqlstress_test2

 

counter name

min value

avg value

Max value

Average latch wait time (ms)

0

0,855

1,295

Latch wait / sec

0

4405,145

5910,304

Temp tables creation rate / sec

0

0

0

Cache objects in Use

0

7,048

8

As expected, this improvement is due to the tempdb caching mechanism. We can notice here that SQL Server reuses caching objects (“Cache objects in Use” counter > 0) that are in fact the temporary table into the stored procedure. Using caching objects decrease drastically the temporary table creation rate (Temp Tables creation rate / sec is equal to 0 here).

The cached objects themselves are visible by using the system table sys.tables in the tempdb context. For example during the first test we can easily observe that SQL Server does not deallocate completely a temporary table used into a stored procedure. The relationship can be made with the object id column value with a negative number. When SQL Server uses a temporary table the name of table is #test and when SQL Server doesn’t use it without deallocating the associated pages the name is composed of a 8-character hexadecimal string that maps in fact to the object id value. #AF42A2AE is the hexadecimal representation of the #test temporary table with the object id equal to -1354587474.

blog_9_-_tempdb_caching

blog_9_-_tempdb_caching_2

Furthermore we can notice several records in the above results because I used SQLQueryStress with 8 concurrent threads that imply concurrent executions of the stored procedure with separate cached objects in tempdb. We can see 4 records (I didn’t show completely the entire result here) but in fact we retrieved 8 records.

As I said earlier, DDL statements after the creation of the temporary table inhibits the ability to cache the temporary objects by SQL Server and can decrease the global performance of the stored procedure (we can ask here what is a DDL statement .. because DROP TABLE #table is apparently not considered as such because tempdb caching mechanism is not impacted). In my sample, SQL Server proposes to create the following index on the ProductID column to improve the query statement:

create nonclustered index idx_test_transaction_product_id
on #test
(
       ProductID
)

 

Go ahead, we trust SQL Server and we will add the creation of the index after the creation of the temporary table into the stored procedure:

use [AdventureWorks2012]
go
 
create procedure [dbo].[sp_test_tempdb]
as
 
create table #test
(
       TransactionID bigint,
       ProductID    int,
       TransactionDate     datetime,
       Quantity     int,
       ActualCost   money
)
 
–create index for ProductID predicate
create nonclustered index idx_test_transaction_product_id
on #test
(
       ProductID
)
 
insert #test
select top 10000 *
from AdventureWorks2012.dbo.bigTransactionHistory
 
select
       ProductID,
       sum(Quantity * ActualCost) as total_cost
from #test
where ProductID = ‘16004’
group by ProductID
 
drop table #test;

However, the result is not as good as we would expect …

blog_9_-_sqlstress_test3

If we take a look at the perfmon counters values:

Counter name

min value

avg value

Max value

Average latch wait time (ms)

0,259

0,567

0,821

Latch wait / sec

0

2900

4342

Temp tables creation rate / sec

3,969

5,09

8,063

temp tables for destruction

0

27,02

58

Cache objects in Use

6

7,9

8

 

For this test I added a new perfmon counter: temp tables for destruction that indicates clearly that the temporary tables will be destroyed by SQL Server because they cannot be used in this case: the index creation DDL prevents the tempdb caching mechanism.

Here comes a new SQL Server 2014 feature that introduces a new way for declaring nonclustered indexes directly into the table creation DDL which can be a good workaround to the preceding test.

alter procedure [dbo].[sp_test_tempdb]
as
 
create table #test
(
       TransactionID bigint,
       ProductID int index idx_test_transaction_product_id, –< index created “on the fly”
       TransactionDate     datetime,
       Quantity     int,
       ActualCost   money
)
 
insert #test
select top 1000000 *
from AdventureWorks2012.dbo.bigTransactionHistory
 
select
       ProductID,
       sum(Quantity * ActualCost) as total_cost
from #test
where ProductID = ‘16004’
group by ProductID
 
drop table #test;

After running the test we can notice that the temp tables creation rate and temp tables for destruction counters value are again equal to zero. SQL Server used the temporary table during the testing as showing the “Cache objects in User” counter.

Counter name

min value

avg value

Max value

Average latch wait time (ms)

0

0,262

0,568

Latch wait / sec

0

1369

3489

Temp tables creation rate / sec

0

5,09

8,063

temp tables for destruction

0

0

0

Cache objects in Use

6

7,9

8

However if we can still use the tempdb caching mechanism with SQL Server 2014 and this new tips the above result is contrasted with the total duration of execution as showed by the following picture:

blog_9_-_sqlstress_test5

The global execution time is larger than the test first with the stored procedure, the temporary table without any nonclustered index (02:44 vs 00:21) in my case. This is because inserting data into a table with a nonclustered index can take more time than a table without any indexes but in a real production environment we will probably encounter situations where the cost for inserting data into a table with an index would be substantial compared to the gain made for the following readings. If you have some examples please feel free to share with us 😀

Another interesting feature since many versions is the concept of eager writes that prevent flooding the buffer pool with pages that are newly created, from bulk activities, and need to be written to disk. Eager write is another background process that helps to reduce the pressure of the well-known lazy writes and checkpoint background processes as well as increasing the IO performance by gathering pages before writing to disk. Basically, SQL Server tracks these pages into a circular list in memory. When the list is full old entries are removed by writing them to disk if still dirty.

Let me show you with the following T-SQL script on the SQL Server 2012 instance. I used the traceflag 3917 to show eager writes activity (thanks to Bod Dorr for this tip).

use AdventureWorks2012;
go
 
— create procedure sp_test_tempdb_2
— bulk activity by using select into #table
CREATE procedure sp_test_tempdb_2
as
 
select
       bth.*,
       p.Name AS ProductName,
       p.Color
into #test
from AdventureWorks2012.dbo.bigTransactionHistory as bth
       join AdventureWorks2012.dbo.bigProduct as p
             on bth.ProductID = p.ProductID
where p.Color in(‘White’)
       and p.Size = ‘M’
option (maxdop 1);
 
select
       TransactionDate,
       ProductID,
       ProductName
       Quantity
       –Quantity * ActualCost AS total_individual_sale
from (
       select
             ROW_NUMBER() OVER (PARTITION BY TransactionDate ORDER BY Quantity DESC) AS num,
             *
       from #test
)
transaction_production_sales_top_ten
where num
option (maxdop 1);
 
drop table #test
go
 
— using of traceflag 3917 to show eager write activity (be carefull the ouput may be verbose)
dbcc traceon(3917);
dbcc traceon(3605);
go
 
— cycle errorlog for next easy read
exec sp_cycle_errorlog;
go
 
— execution of the stored procedure dbo.sp_test_tempdb_2;
exec dbo.sp_test_tempdb_2;
go
 
— Reading the error log file
exec xp_readerrorlog;

 

Below a sample of the SQL Server error log:

blog_9_-_sql12_eager_writes

We can notice that SQL Server writes up contiguous 32 dirty pages to disk in my test.

Even if this process is optimized to write pages efficiently to disk, we have still IO activity. SQL Server 2014 enhances this process by relaxing the need to flush these pages to disk as quickly as the older versions. SQL Server recognizes the bulk activity and the concerned pages are loaded, queried and released without any flushing disk activity.

The same test performed on the SQL Server 2014 environment gives the following result:

blog_9_-_test_select_into_eager_write_sql14

The eager write process was not triggered this time. So let’s compare with a simulating workload by using ostress this time. Ostress is a stress tool provided by the RML utilities. This time I used ostress with 4 threads and 1000 iterations each. SQLQueryStress generated a bunch of ASYNC_IO_NETWORK during my tests which potentially distorts the final result.

So, I used the following script for the both environment (SQL Server 2012 and SQL Server 2014):

“C:Program FilesMicrosoft CorporationRMLUtilsostress.exe” -Slocalhost -dAdventureWorks2012 -Q”exec dbo.sp_test_tempdb_2″ -n4 -r1000 -N –q

 

SQL Server 2012

blog_9_-_ostress_sql12

… the corresponding io file stats:

SELECT
       d.name AS database_name,
       f.name AS [file_name],
       f.physical_name,
       f.type_desc,
       vf.num_of_reads,
       vf.num_of_writes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vf
INNER JOIN sys.databases AS d
ON d.database_id = vf.database_id
INNER JOIN sys.master_files AS f
ON f.file_id = vf.file_id
   AND f.database_id = vf.database_id
where f.database_id = db_id(‘tempdb’)

 

blog_9_-_ostress_sql12_tempdb_io

… and the corresponding wait types:

Wait type

Total wait

ms

Total wait count

Avg wait time

ms

PAGEIOLATCH_UP

452737834

3333841

135

PAGEIOLATCH_EX

343071451

4696853

73

PREEMPTIVE_OS_ENCRYPTMESSAGE

929

29527

0

PAGELATCH_SH

603

201

3

 

SQL Server 2014

blog_9_-_ostress_sql14

blog_9_-_ostress_sql14_tempdb_io

Wait type
Total wait
ms
Total wait count

By David Barbarin