Does order matter in a JOIN clause?
A very astute SQL Server professional and reader of this blog posed the following question in a comment to a prior posting entitled "Does the order of criteria the WHERE clause matter?" The comment, submitted through Plaxo Pulse, is not available outside the Pulse community so I've included it below since it's the genesis of today's post.
"What if you were using join statements? Does it matter what order your columns are used in the ON statement? ~ Ariel M."
There are several derivatives of the original question:
- Does the order of tables referenced in the ON clause of the JOIN matter?
- Does the order of tables referenced in the FROM clause matter?
- Does the order of columns referenced in the ON clause of the JOIN matter when multiple columns are used in the JOIN?
I'll address questions 2 and 3 in another post; for now let's consider the first question. To paraphrase and exemplify the question: Will the following two queries produce differing execution plans and thus varying performance? Or does the Query Optimizer take this into account as it analyzes a statement before choosing an execution plan?
--Query #1
SELECT
c.Customer_ID
,c.Last_Name
,c.First_Name
,s.Order_ID
,s.Order_Date
FROM
dbo.Customers AS c JOIN
dbo.Sales_Orders AS s
ON s.Customer_ID = c.Customer_ID
--Query #2
SELECT
c.Customer_ID
,c.Last_Name
,c.First_Name
,s.Order_ID
,s.Order_Date
FROM
dbo.Customers AS c JOIN
dbo.Sales_Orders AS s
ON c.Customer_ID = s.Customer_ID
Notice that the only difference in the two queries lies in the ON clause of the JOIN. The first query references Sales then Customers while the second reverses the order.
To answer this question, let's look at the execution plan as provided by SQL Server Management Studio. The first query produces the following execution plan.
In this query, SQL Server has chosen a Clustered Index Scan of the Sales_Orders table and an Index Scan for the Customers table. The two tables are joined using a Hash Match Inner Join.
Now, let's look at the execution plan for the second query.
Query #2 produced the exact same execution plan! So, we can conclude from this simple example that the order of tables referenced in the ON clause of a JOIN doesn't affect the performance of a query.
As an aside, though, both execution plans use a Hash Match Inner Join. Generally speaking this is not the most efficient join type for SQL Server; Loop Join is much more efficient. When you see a Hash Join in a query execution plan, consider your indexes. A Hash Join is frequently an indicator of inefficient indexing. I'll delve more deeply into this in another post.
Cheers!
Joe
Legacy Comments
dmose
2008-02-29 |
re: Does order matter in a JOIN clause? did you clear buffers before running each query? |
Joe Webb
2008-02-29 |
re: Does order matter in a JOIN clause? Good question! Yes, and I should have shown that. You can actually see the results in either the estimated execution plan, or the actual execution. If you use the latter, you can issue the following DBCC commands to ensure that you're comparing apples to apples. DBCC dropcleanbuffers DBCC freeproccache Of course, do not do this on a production server! You'll clear your procedure cache and buffers. Joe |
Peso
2008-03-01 |
re: Does order matter in a JOIN clause? It doesn't matter, because query engine will reorganize JOIN order depending on statistisc for index and other things. Add OPTION (FORCE ORDER) to the query and you will see that it matters. AS a general rule of thumb, JOIN order should be with table of least records on top, and most records last. |
Bart Czernicki
2008-03-02 |
re: Does order matter in a JOIN clause? Joe, "Generally speaking this is not the most efficient join type for SQL Server" Ur comment about hash joins couldn't be MORE further from the truth. While true on small tables hash joins are not ur best option. Furthermore, hash joins are the default join when you have moderate tables without proper column indexing. However, hash joins are a critical when joining two larger tables. The hashing algorithm can work in parallel (creating the hash tables on different threads) and then doing the join. It scales a lot better than a nested loop or a merge join. You might want to check out "Inside MS SQL Server 2005 Query Tuning And Optimization" (page 134) http://www.amazon.com/Inside-Microsoft%C2%AE-SQL-Server-2005/dp/0735621969 If your into .NET/LINQ there are some great extensions of LINQ with Hash joins. It gives you a good idea of what SQL Server does as the flow of the logic is the same. I would say check it out on a google blog to confirm, but you get incorrect information A LOT of the time :) |
Joe Webb
2008-03-02 |
re: Does order matter in a JOIN clause? Hi Bart - Thanks for the comments. I am very much aware of and agree with you regarding the use of Hash Joins. I would have done well to further qualify the sentence you called out in your comment. Under specific circumstances, Hash Joins are absolutely the best join type and can produce the best performance. Usually this is the case for larger tables residing on servers with enough processing capabilities to handle the hashing algorithms. However, I don't think I would go so far as to say that this is the norm. As you said, for small to moderately sized tables (which is where I was going when I said "generally speaking") there may very well be a better performing join type, if an appropriate index was in place. My point was one should examine the join type to see if it is optimal for the situation. Again, I should have been more clear. Thanks, also for the link to the book. I have read it and found it to be an unsurpassed resource for its topic. Like you, I'd highly recommend it to others. Thanks for reading! Joe |
Scott Swank
2008-03-03 |
re: Does order matter in a JOIN clause? Oh crikey. A hash join is the best choice when there are no significant predicates to eliminate data from a table. Consider a block/page size of 8K, 16K or maybe even 32K. For the sake of argument that is going to hold 50 or 100 rows. If we're joining to the entire table we are going to read every row in order by the index. Unless this is a clustered index that means that we are going to be reading the blocks in roughly random order, only getting a row or two from each block before moving on to another. But to get all of the rows we have to revisit each block 50 to 100 times. That entails increasing our IO by 50x or 100x. Ok, but what if we are joining on a clustered index? Then we are reading the rows in the same order as they exist in the pages so we only need to read each page once. But we're still reading the index (and remember that a b-tree takes a few pages itself) only to then go ahead an read EVERY block. I.e. we're read the entire index & table instead of just reading the entire table, which is a waste of IO on reading the index no? Now for fun consider a small table that fits onto a single page. We ALWAYS want to see a hash join to that table because we never want to read 1 page of index and then 1 page of data when we could have just read the 1 (and only) page of data. In fact we don't want to use an index until our table is at least 3 pages, because then we could potentially read 1 page of index and 1 page of data. So, if you have 50 rows per page (for example) then you only want to see a nested loop join when you are using an index to access less than 1/50th of the data in your table. Read Jonathan P Lewis' book about Oracle's cost-based optimizer sometime. (Not that I'm implying that SQL Server has as good an optimizer as Oracle does...) |
Joe Webb
2008-03-03 |
re: Does order matter in a JOIN clause? I'm enjoying this discuss. Keep it going. For those interested, here are some good articles that dig deeper into the different types of joins. http://blogs.msdn.com/craigfr/archive/2006/08/16/702828.aspx http://www.sqlservercentral.com/articles/Advanced+Querying/optimizerjoinmethods/1708/ http://blogs.msdn.com/craigfr/archive/2006/08/10/687630.aspx Joe |
iyinet webmaster forumu 2008 seo
2008-03-06 |
re: Does order matter in a JOIN clause? Good job!Good articles! |
dna
2008-03-10 |
re: Does order matter in a JOIN clause? Thanks... |
srowland
2008-03-25 |
re: Does order matter in a JOIN clause? what makes Oracle's optimizer so much better? |
David
2008-04-24 |
re: Does order matter in a JOIN clause? Thanks, Joe, both for the information (valuable) and the cheerful attitude and helpful manner of your replies to comments that one so inclined could respond to fairly nastily (nonpareil). It's a truly refreshing change from many of the conversations one finds on blogs. |
Joe Webb
2008-04-24 |
re: Does order matter in a JOIN clause? Thanks, David! As a SQL Server MVP, I try to stay very active in the SQL Server community by reading and writing blogs, responding to forum postings, speaking at conferences, etc. And like you, I've seen quite a few rather nasty exchanges among various participants. In my mind, that serves no purpose. It doesn't serve to edify or foster the sense of community that I'd like to see around SQL Server. Rather it tends to alienate other participants and even discourage new people from joining in. So, I simply don't engage in that. Let me be clear though, I'm not ascribing any of that to anyone that has posted comments here. I'm just offering my perspective on the community as a whole and how I'd like to see it move toward a more constructive and educational environment rather than confrontational. That's not to say that its some kind of utopia. When someone offers errant advice, they do need to be made aware of their errors to be sure. But it can be done in a constructive, uplifting manner. Cheers! Joe |
mohan
2008-04-30 |
re: Does order matter in a JOIN clause? Joe, my friend, where is the link for the second/third questions ? I'm particularly interested in knowing about the FROM clause table order. |
havalandırma
2008-07-15 |
re: Does order matter in a JOIN clause? I'm particularly interested in knowing about the FROM clause table order. |
Joe Webb
2008-07-15 |
re: Does order matter in a JOIN clause? I've recently received several emails and comments (that for some reason are not showing here on the blog) about the second and third questions. Thanks for the interest! And I'll turn my attention back to those questions as soon as possible. Cheers! Joe |
Parul
2008-11-02 |
re: Does order matter in a JOIN clause? hey Joe, Its great to read through your knowledgeable blog. Like others even I am interested in knowing the answer for 2nd and 3rd question.Pls tell abt the links to those questions. Thanks |
Joe Webb
2008-11-05 |
re: Does order matter in a JOIN clause? Thanks! I'm glad you've found these postings useful! I got sidetracked and never got back to answering questions 2 and 3, but I will soon. Stay tuned. Cheers! Joe |
snow boots for women
2010-10-20 |
re: Does order matter in a JOIN clause? To paraphrase and exemplify the question: Will the following two queries produce differing execution plans and thus varying performance? Or does the Query Optimizer take this into account as it analyzes a statement before choosing an execution plan? snow boots | columbia jackets | mac makeup | the north face outlet |
costume jewellery uk
2010-11-03 |
cheap wholesale jewelry Court. Its dolce and gabbana ring Its largest creditors |
Gian Maria
2011-03-10 |
re: Does order matter in a JOIN clause? I've just faced a complex query that in sql server 2005 and 2008 had no problem. In sqlserver 2000 it took about 2 minutes to be executed. By inverting the order of 2 join the performance of sql server 2000 has become the same of 2005 and 2008!!!! That's....it depends on the intelligence of the query optimizer I guess... |
turquoise jewelry
2011-07-08 |
re: Does order matter in a JOIN clause? This article helps me more. Thanks for your sharing, I will pay more attentions to your blog. Looking forward to your better and better articles. See you next time. http://www.stagecoachgifts.biz/ provides you a beautiful and unique combination of genuine turquoise jewelry set in sterling silver. They offer the finest selection of authentic turquoise jewelry with great jewelry design. |