The Truth about "Cursor Busting" in SQL
Let's say you are called in to troubleshoot a stored procedure that is performing poorly.
You dive in to investigate and this is what you find:
create procedure ProcessProducts
as
declare @Products cursor, @ProductID int
set @Products = cursor for select ProductID from Products order by ProductID
open @Products
fetch next from @Products into @ProductID
while (@@FETCH_STATUS=0)
begin
exec DoSomething @ProductID
fetch next from @Products into @ProductID
end
deallocate @Products
as
declare @Products cursor, @ProductID int
set @Products = cursor for select ProductID from Products order by ProductID
open @Products
fetch next from @Products into @ProductID
while (@@FETCH_STATUS=0)
begin
exec DoSomething @ProductID
fetch next from @Products into @ProductID
end
deallocate @Products
Ah ha! A cursor! It seems we have identified the bottleneck: Clearly, the performance problems are because the code is not doing things in a set-based manner, but rather by processing rows one at a time using a dreaded cursor. This cursor is opening up the Products table, looping through the rows one at a time, and calling the "DoSomething" stored procedure for each ProductID. As we all know, cursors are not the way to go when writing SQL code; this cursor should eliminated and replaced with a cleaner, more efficient (and more socially acceptable!) solution.
So, how we do optimize this? Well, a commonly suggested approach is to eliminate the CURSOR by replacing it with a WHILE loop:
declare @ProductID int
set @ProductID = -99999
while (@ProductID is not null)
begin
set @ProductID = (select top 1 ProductID
from Products
where ProductID > @ProductID
order by ProductID asc)
exec DoSomething @ProductID
end
Instead of declaring a CURSOR to loop through the table, we now are using "set-based" code and our problems seem to be solved. The cursor is gone, our code looks much cleaner, we've tested it and it works properly, so off to production it goes. Another cursor has been busted!
Right?
Actually ... no.
You see, eliminating cursors is not about syntax. It is not about searching for the word "cursor" in your code and just replacing it with a WHILE loop that does the same thing. Optimizing and replacing cursors involves much more. We can never optimize any cursor code until we look deeper into what exactly is happening when we "process" each of those rows. In this case, we need to find out what that "DoSomething" procedure is actually doing.
Suppose the DoSomething procedure is generating a report and sending an email to the "Product Manager" for each product that contains status information, and then logging this email message into a table somewhere.
If that is the case, what have we just gained by replacing our CURSOR?
Honestly -- not much, if anything at all. Because of the task at hand, we may very well need to process rows in the Product table one-by-one to send our emails and generate the report, and the bottleneck here is not the cursor code at all, but rather the report generation and maybe sending the email. Eliminating the cursor code probably gains us nothing here. If you need to process rows one at a time, go ahead and use a cursor -- that's what they are there for! Replacing a perfectly fine, simple cursor with a WHILE loop might even make your code longer, or more confusing, or even less efficient depending on circumstances.
For example, what if we need to process the Products ordered by Region, then Product Name, for whatever reason. Our cursor code is simple:
set @Products = cursor for
select ProductID
from Products
order by Region, ProductName
select ProductID
from Products
order by Region, ProductName
All that we needed to change was our ORDER BY clause. Now, how would we write this as a WHILE loop? Is it possible? Sure. Will it be as simple and clean as using a cursor? No, it won't. (Though ROW_COUNT() makes this much easier than it used to be)
Now, I am not here to say that cursors are "good", but if you really need to process rows one by one, go ahead and proudly use a cursor. Replacing cursors isn't about processing rows one-by-one in a different way (i.e., using a WHILE loop instead), it is about not processing rows one-by-one at all!
Let's consider another scenario: What if the DoSomething stored procedure is checking to see if the Product's ExpireDate is greater than today's date, and if so, it is updates the Status column for that Product to 'X'.
In that situation, what have we gained by rewriting ProcessProducts without a cursor, and using a WHILE loop instead? The answer is, once again: nothing! In fact, we potentially have once again made our code more confusing or even less efficient than a cursor might be! Remember, the bottleneck isn't the cursor syntax -- it is the fact that we are processing rows one at a time. Replacing the cursor with the WHILE loop didn't solve this problem, did it?
So, looking now at both of the scenarios I presented for the DoSomething stored procedure, it should be clear that we did not fix anything by replacing the cursor in either case simply by writing a WHILE loop. If that's all you are doing, don't bother replacing the cursor at all. You haven't optimized anything.
As I said before, the art of replacing a cursor is not a find-and-replace syntax change operation -- it is a fundamental change in how you process your data. As in the Product report generation and email example, it may be that we simply need to process rows one by one, and thus no further optimization is possible from a SQL point of view. In situations like updating the Product table, however, we do not need to process the rows individually -- we can do everything in one single UPDATE statement. Thus, in order to determine how to optimize the ProcessProducts stored procedure, we needed to dig deeper into entire process as a whole, which included examining the DoSomething stored procedure and determining the full scope of exactly what this "ProcessProducts" stored procedure is doing.
So, if "DoSomething" is updating the Products table as specified, we now know that a good replacement for our cursor code doesn't result in a WHILE loop and calling a separate stored procedure over and over at all -- it results a true, set-based solution:
create procedure ProcessProducts
as
Update Products set Status='X' where ExpireDate > getdate()
And THAT is how you optimize a cursor! No loops, no calling of another stored procedure for each row in a table, no "find-and-replace" cursor code removal. We examined the entire process, and rewrote the entire process, to get it done quicker and shorter and faster without cursors or loops. as
Update Products set Status='X' where ExpireDate > getdate()
Always remember: Replacing a cursor isn't about rewriting your syntax, it is about redesigning your algorithm.
Legacy Comments
jeff maass
2008-06-05 |
re: The Truth about "Cursor Busting" in SQL nIce. Thanks, Jeff! |
Ian Gibson
2008-06-06 |
re: The Truth about "Cursor Busting" in SQL I couldn't agree more. So many people don't understand why cursors should be avoided, but just have a knee jerk reaction that they are automatically bad. You've explained the truth very well here. |
Zack Jones
2008-06-06 |
re: The Truth about "Cursor Busting" in SQL What a great post! I've seen over and over how bad cursors are but your post really clears things up as to when it's OK to use them. |
Mackin
2008-06-06 |
re: The Truth about "Cursor Busting" in SQL Yes, but the way this cursor is replaced is not the best way. One of the biggest problems with Cursors is that the source table needs to be queried each time it fetches the next value. In the exampole above, probably not a big deal if you're only looking at one table, but most queries/cursors are more complex than that. One big advantage of a while loop is to dump the values into either a temp table or a table variable and then use that for your looping. This point is overlooked in this article. |
Jeff
2008-06-06 |
re: The Truth about "Cursor Busting" in SQL Mackin -- if dump those values into a temp table, that's great. And guess what? You can then use a CURSOR to iterate through the temp table just fine, right? No need for a WHILE loop. Again, what does the WHILE loop give you that a cursor doesn't if you indeed process each row one by one? Nothing. The solution is not to replace cursors with while loops, it is to rewrite your code. |
Manie Verster
2008-06-06 |
re: The Truth about "Cursor Busting" in SQL This is a very simple example of doing a cursor. But, as Mackin said, you get very complex queries where the best option is to use a cursor. I don't understand what the problem is with cursors. I will give you an example of a cursor query in our database. The select in the cursor reads three to four different tables with joins. the number of rows ranges between 2000 and 3000. As the rows are read inserts is done into other +- 5 tables and a whole accounting procedure happens. This query runs once a month to print cheques for a client. I would like to see that a temporary table will do it better than a cursor. I agree that with simple queries like Jeff's a cursor is way overkill. I would appreciate it if someone can explain another much better way but I don't think so. |
Jeff
2008-06-06 |
re: The Truth about "Cursor Busting" in SQL Manie -- I think you should re-read my blog post very carefully. My entire point is that: a) sometimes cursors are fine, if you truly need to process things row by row and b) Determining if you really need to process things row-by-row is to key to understanding if you can and/or should replace a cursor; it is not just for the sake of altering syntax or removing cursors because they aren't "cool." In your example, it sounds like it can easily be re-written without a cursor. But, as I said in my article, we have no way of knowing without examining the big picture and understanding exactly what your code is doing. I'd write more in my comment here, but I would just be further repeating what I already wrote in the article, so maybe just try reading it again instead? You also should probably re-read Mackin's comment carefully because he also wrote the opposite of what you seemed to think he wrote. |
Addai
2008-06-07 |
re: The Truth about "Cursor Busting" in SQL Jeff, I think what you are sayng is correct because if a process has to deal with records one at a time, using a while loop or a cursor will still result in one-by-one processing, and possibly the same cost. In that case, as you stated, just replacing a cursor with a while loop will not yield any benefit. I have realised of late that people tend to condem cursor use just arbitrarily without looking deep into the underlining processes, which is awkward to me. I went to an interview one time and the interviewer insisted it is never good to use a cursor without exception, and I said it depends. I lost the job because I disagreed with him. Clearing things the way you did in your post will clear the misconception some people have about cursor usage. Thank you. |
Jim Jaggers
2008-06-08 |
re: The Truth about "Cursor Busting" in SQL I'm more of a power user than a dba, but I do work with very large data sets. Isn't one of the differences between a cursor and a while loop that the cursor will store the entire result set in memry while the list is traversed while the while loop only keeps the current member of the list in memory? |
Jeff
2008-06-10 |
re: The Truth about "Cursor Busting" in SQL Hi Jim -- It depends on the type of cursor that you use. In fact, some cursors automatically create a temp table for you to traverse the data, you don't even need to do it manually, as you would with the WHILE-SELECT rewrite. see: http://msdn.microsoft.com/en-us/library/ms180169.aspx for some of the options available. Depending on what you need, you should be able to declare a very efficient cursor. In fact, in my article, I should have mentioned that if indeed row-by-row processing is required, one thing you can do to optimize is to simply check the cursor declaration to be sure that you have selected READONLY and FORWARDONLY and all that, depending on what you need. To be honest, I am not a pro when it comes to cursors! I've yet to really use one. If I need to do row-by-row processing to do things like generate reports or send emails, I write it as a client application in .NET, not in T-SQL. Of course, even when you write your app in .NET and use a DataReader to loop through the rows, guess what happens behind the scenes? That's right -- a CURSOR is used! :) |
anil
2008-08-31 |
re: The Truth about "Cursor Busting" in SQL cant i just insert this into a temp table with rownumber column and do row_number() over (partition by region order by productname) for getting the row number and then looping thro for each rownumber? also probably reset rownum variable in loop when region changes? i find that temp tables and looping thro row numbers always gives better performance than cursors. |
Aravamudan
2009-01-12 |
re: The Truth about "Cursor Busting" in SQL Hi, I read the articles as well as comments, but none of you not talked about the locks when the cursors are running? my questions is, is there any difference in the LOCK when we use cursor OR While loop. With Best Regards Aravamudan |
Luís Quedas
2010-01-27 |
re: The Truth about "Cursor Busting" in SQL Hi Jeff well done. Using SQL Server 2008 and the RowNumber() function you can create an identical script to replace cursors on tables that don't have integer key columns, one exemple: use AdventureWorks2008 declare @Rowid int , @RegionName varchar(100), @RegionCode varchar(4) set @Rowid = -99999 WHILE(@Rowid is not null) BEGIN select top 1 @Rowid=A.rowId,@RegionCode=CountryRegionCode,@RegionName=A.Name from (select ROW_NUMBER() over(order by CountryRegionCode) as rowId,CountryRegionCode,Name from Person.CountryRegion) as A where A.rowId> @Rowid order by A.rowid asc if @@ROWCOUNT<>0 -- Do something with the variables select @Rowid,@RegionCode,@RegionName else set @Rowid =null END Best REgards, Luís |
Jeff
2010-01-27 |
re: The Truth about "Cursor Busting" in SQL Luis -- I appreciate your comment, but you seem to be missing the point of the post. If you are going to write code like that, go ahead and just use a cursor ! |