A very astute SQL Server professional and reader of this blog posed the following question in a comment to a prior posting entitled "Does the order of criteria the WHERE clause matter?" The comment, submitted through Plaxo Pulse, is not available outside the Pulse community so I've included it below since it's the genesis of today's post.
"What if you were using join statements? Does it matter what order your columns are used in the ON statement? ~ Ariel M."
There are several derivatives of the original question:
- Does the order of tables referenced in the ON clause of the JOIN matter?
- Does the order of tables referenced in the FROM clause matter?
- Does the order of columns referenced in the ON clause of the JOIN matter when multiple columns are used in the JOIN?
I'll address questions 2 and 3 in another post; for now let's consider the first question. To paraphrase and exemplify the question: Will the following two queries produce differing execution plans and thus varying performance? Or does the Query Optimizer take this into account as it analyzes a statement before choosing an execution plan?
--Query #1
SELECT
c.Customer_ID
,c.Last_Name
,c.First_Name
,s.Order_ID
,s.Order_Date
FROM
dbo.Customers AS c JOIN
dbo.Sales_Orders AS s
ON s.Customer_ID = c.Customer_ID
--Query #2
SELECT
c.Customer_ID
,c.Last_Name
,c.First_Name
,s.Order_ID
,s.Order_Date
FROM
dbo.Customers AS c JOIN
dbo.Sales_Orders AS s
ON c.Customer_ID = s.Customer_ID
Notice that the only difference in the two queries lies in the ON clause of the JOIN. The first query references Sales then Customers while the second reverses the order.
To answer this question, let's look at the execution plan as provided by SQL Server Management Studio. The first query produces the following execution plan.
In this query, SQL Server has chosen a Clustered Index Scan of the Sales_Orders table and an Index Scan for the Customers table. The two tables are joined using a Hash Match Inner Join.
Now, let's look at the execution plan for the second query.
Query #2 produced the exact same execution plan! So, we can conclude from this simple example that the order of tables referenced in the ON clause of a JOIN doesn't affect the performance of a query.
As an aside, though, both execution plans use a Hash Match Inner Join. Generally speaking this is not the most efficient join type for SQL Server; Loop Join is much more efficient. When you see a Hash Join in a query execution plan, consider your indexes. A Hash Join is frequently an indicator of inefficient indexing. I'll delve more deeply into this in another post.
Cheers!
Joe