A customer of mine had a recurring issue with one of his business critical applications where everything was frozen and stuck for the users. The issue was caused due to a head blocking transaction which increased blocking in the database exponentially. Depending on your technical background, you may know that locking is a very important mechanism in relational database systems to ensure transactional isolation, thus fulfilling the “I” in the theoretical ACID concept. However, when locks cause extensive blocking, it can lead to performance issues.
But what is now a head blocker?
A head blocker is a transaction that is either very slow or never commits, holding locks on database resources. Other transactions that must access the resources locked by the head blocker are also holding locks while waiting for those resources.
On the next level, there are more transactions that need access to resources locked by the transactions blocked by the head blocker, and they too are holding locks. This process continues on multiple levels, creating a “lock chain” where all the transactions are waiting for the first transaction (the head blocker) to commit or roll back.
For the non-technical blog reader, let me give you an example from a daily life situation:
You are in the supermarket and you see a long queue behind the checkout. At the front of the queue you can see person A who is talking with the cashier about the weather, the next neighborhood event and about the cat of the neighbor. This person is the head blocker. Behind that person is person B and person C waiting. This is the beginning of the lock chain and the checkout is the database resource on which the head blocker is holding a lock.
To better understand the complexity of the lock mechanism in relational database systems with this metaphor imagine that only one person is allowed to have a specific item in their shopping cart at any given time in the supermarket.
Person C is waiting for person A at the checkout and has a liter of milk in their shopping cart. Person D, who already has bread and butter in their shopping cart, but wants some milk too, must wait until person C can pay and leave the supermarket. Person E and F, who also already have some items in their shopping cart but still have butter or bread on their shopping list, must wait too. The whole process continues until everyone in the supermarket is waiting for person A to complete the story about the neighbor’s cat.
In the case of my customer, the IT-staff there terminated the head blocking transaction manually as far as they recognized it.
The major problem on this approach was, that the IT-staff remarked the problem through user feedback. So way too late.
Therefore I implemented a custom developed alert solution for the case when blocking is increasing within the applications database. This enabled the IT-staff to terminate the head blocking transaction before the application users were disturbed.
How to determine blocking in a database?
SQL-Server offers some very useful dynamic management views (DMV) to query information’s about blocking. For example the DMV “sys.dm_exec_requests”. When you query this DMV and you are interested in blocking, the most interesting column is the “blocking_session_id” column. This column shows you which session is currently blocking the particular request. If the request isn’t blocked, there will be a “0”.
There are also other pragmatic ways to gather information about currently running processes and their blocking. One very user-friendly method that provides a wealth of useful information is the stored procedure sp_whoisactive, developed by Adam Mechanic. This stored procedure displays currently running processes in SQL Server in a user-friendly manner and can be managed through multiple input parameters. It also returns a blocking_session_id column and many other useful details, such as the SQL text, the number of reads and writes, CPU time, and more. The stored procedure is not available by default in SQL Server but can be downloaded and installed from GitHub.
Another valuable solution is the stored procedure sp_HumanEventsBlockViewer, created by Erik Darling. This procedure is used in conjunction with an extended event session, which collects data from the Blocked Process Report. The stored procedure reads from the extended event session and presents the information in a very useful format, including valuable details such as the SQL text, the isolation level of the transactions, the wait time, and more.
I used th DMV “sys.dm_exec_requests” in the stored procedure that I developed to monitor blocking in the customer’s database.
Please note that I’m just sharing some key snippets of the stored procedure and not the entire procedure, as sharing the whole one would go beyond the planed scope of this blog post.
The stored procedure queries first of all the DMV with the “count” aggregation function to get the total number of blocked requests:
After that the stored procedure checks if the amount of blocked requests exceeds the defined sensitivity threshold:
If this is true, the stored procedure starts with the collection of the information’s which we were interested in. The first information is the session that is holding the most locks in the database. Therefore I wrote the query below which is storing the particular session ID in a variable:
After that, the stored procedure additionally gathers the number of blocked requests by this session and stores it in a second variable:
After that I used the session ID from the session, which is blocking the most requests, as a starting point to loop the lock chain up through the DMV:
After that I formatted everything in HTML and sent the information’s through the database mail. This resulted in the mail below which warned the recipients about the exceeding of the blocking threshold and provided some valuable information’s:
Summary:
This blog post presents an approach to monitor blocking within a SQL-Server database. Locking is an important mechanism in SQL-Server and other relational database systems to ensure transactional isolation and it is usual that locking leads to blocking in an acceptable extent. However, extensive blocking can slow down the performance, which is considered a problem. There are multiple of potential root causes of blocking problems, including:
- Poor query performance, leading to long running transactions
- Lock escalation
- High concurrency on particular database resources
- Bad transaction design
- High transaction isolation level
If you are facing similar problems or if you have any questions don’t hesitate to share it in the comment section!
GWagner
23.10.2024Dear Mr. Mechara
I know it cannot be, but it sounds as we are having the same customer :-)
I am in the process of setting up something similar and I like the parts of your solution I can see!
Would it be possible to get the rest, too?
Best regards
Guenter Wagner
P.S. dbi Services and Aspectra do in fact have one shared customer
Hocine Mechara
28.10.2024Dear Mr. Wagner
I'm glad to hear that my blog helped you already as an inspiration ;)
I will send you the whole stored procedure by Mail. If you find any improvements, feel free to share it back with me!
Best regards, Hocine