Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

On RIGHT OUTER JOINS …

Because I feel pretty strongly about this and the entire focus of my blog is writing clear, clean and efficient SQL, I thought I'd repeat my response from a SQLTeam forum question here.

smithje asks this, regarding OUTER JOINS:

Left/Right, does it matter. Is one better than the other? A few years ago consultant on a project for our company advised me to always write my queries to use Left joins. He had worked on the project to convert the original database application to MS SQL when Microsoft took it over. He claimed the design of the query engine handled Left joins more effeciently than right. I converted several queries that processed large datasets to Left join only and got quicker results. I have used Left exclusively since then. Has this concept ever been tested or written about?

My response:

Hi -- they are technically the same, but it is always clearer to use LEFT OUTER JOINS. I strongly recommend to never use RIGHT OUTER JOINS.

When you write a SQL statement, you should express your "base" in your FROM clause, and from there join to auxiliary tables. There is no SELECT statement that cannot be written in that manner, and it is a nice clear, clean way to organize your code. So, if you want ALL customers and ANY orders that match, I think we can all agree that it makes logical sense to express this as:

SELECT ...
FROM customers
OUTER JOIN TO orders

Clearly, we are primarily selecting customers as our "base", and including any Orders that may or may not exist.

As a RIGHT OUTER JOIN, it becomes:

SELECT
FROM Orders
OUTER JOIN TO Customers

which doesn't make sense -- why are we selecting FROM Orders and joining TO Customers, when potentially we want to return Customers that don't have ANY orders?

Anyway, it is rare to get good advice from a consultant, but it appears that you actually did! Avoid RIGHT JOINS, and stick with LEFT JOINS. If a right outer join seems required to make your query work, you should re-write it and change your FROM clause to make it cleaner, simpler and clearer.

(by the same token, I strongly recommend to avoid FULL OUTER JOINS as well.)

Legacy Comments


Adam Machanic
2008-02-13
re: On RIGHT OUTER JOINS ...
Agreed that LEFT OUTER JOINs are clearer to read and understand than RIGHT OUTER JOINs... But why do you say that it's rare to get good advice from a consultant? Kind of a strange thing to say; perhaps you're not hiring the right consultants?

Jeff
2008-02-13
re: On RIGHT OUTER JOINS ...
That was a bit of a joke, I actually do quite a bit of consulting myself. (more than I should be handling right now, probably, but that's a good problem to have, I guess).

However, it is fairly well-known that many consultants are very overpaid and end up leaving a bigger mess when they leave than was there originally. I have quite a few horror stories about consultants who spend months at my company whose entire output I needed to re-write or re-do from scratch -- usually in a day or so.

Of course, there are many great consultants out there... Sorry if I offended!

Jeff
2008-02-13
re: On RIGHT OUTER JOINS ...
Oh yeah -- also incompetent consultants are a bit of a running joke over at TheDailyWTF...

Adam Machanic
2008-02-13
re: On RIGHT OUTER JOINS ...
No offense taken; I get a lot of consulting business cleaning up the messes that other "consultants" left. I don't refer to them as consultants at that point, since the whole point of consulting is to provide quality, authoritative service to your clients, not to waste their money and trash their systems... But hey, I can't complain when it gets me janitorial business ;-)

Hugo Kornelis
2008-02-14
re: On RIGHT OUTER JOINS ...
Hi Jeff,

There is one situation where I would not immediately dismiss a RIGHT OUTER JOIN. The easiest way to explain is to take your example and now assume that this company has found some weird way of doing business that involves a many to many realtionship between customers and orders. So now the tables involved are Orders, Customers, and CustomerOrders. The goal is still to list all customers, along with their orders if they have any (and yes, multi-customer orders will be listed more than once this way; that is acceptable).

With using only LEFT JOIN, we have to write this with a nested join, which is hard to understand and maintain (use fixed font to see it with proper indenting):

SELECT c.CustomerName, o.OrderDate
FROM Customers AS c
LEFT JOIN CustomerOrders AS co
INNER JOIN Orders AS o
ON o.OrderID = co.OrderID
ON co.CustomerID = c.CustomerID;

Readability can be slightly improved by indenting the nested join a bit more and/or adding parentheses, but it remains hard to grok. So maybe you are wondering why I didn't use:

SELECT c.CustomerName, o.OrderDate
FROM Customers AS c
LEFT JOIN CustomerOrders AS co
ON co.CustomerID = c.CustomerID
some JOIN Orders AS o
ON o.OrderID = co.OrderID;

With "some" replaced by INNER, the customers with no orders are removed, so you defy the use of the outer join in the first query. With "some" replaced by LEFT (I see that a lot, unfortunately), we are now also including customers who have CustomerOrder rows with no matching Order - this stretches the imagination beyond the believable for this specific case, but there are other cases where this dooes apply - and even if it doesn't, we're still asking SQL Server to perform extra work for no extran gain but the opportunity to stare at a sandglass pointer even a bit longer...

Now here is how it looks if we just use a RIGHT OUTER JOIN:

SELECT c.CustomerName, o.OrderDate
FROM Orders AS o
INNER JOIN CustomerOrders AS co
ON co.OrderID = o.OrderID
RIGHT JOIN Customers AS c
ON co.CustomerID = c.CustomerID;

Easy, huh?

Jeff
2008-02-14
re: On RIGHT OUTER JOINS ...
Hi Hugo -- Well, all I can say is I completely disagree. First off, regarding just using two left outer joins, you are correct and I wrote about that here. (be sure to read the comment there as well, ROJ's are discussed quite a bit).

OK, on to your point ... Using the RIGHT JOIN in your example still makes things more confusing and tougher to work with. Most importantly, it makes things tougher to maintain. When we see:

SELECT c.CustomerName, o.OrderDate
FROM Orders AS o
INNER JOIN CustomerOrders AS co
ON co.OrderID = o.OrderID
RIGHT JOIN Customers AS c
ON co.CustomerID = c.CustomerID;

and we read that, it goes like this: "ok, we want all orders, joined to the customerOrders, joined to .. wait. never mind. we don't want all orders. We REALLY want all customers, outer joined backwards to the customer orders, and from there joined to orders. "

Now, what if we want to add an inner join from Customers to the Customer Type? Or the customer Address? Where does that go? Can we just tack it to the end? (the answer is no, it will then negate your RIGHT OUTER JOIN). So, how do we handle simple things like that working from this RIGHT OUTER JOIN? The answer is -- we can't, and we don't, we simply rewrite the query more logically.

So, instead we use either nested joins, a CTE, or a derived table to write a clean, clear and precise SQL statement that completely explains EXACTLY what we are doing, and it is easily maintained because it is logically expressed:

select
from Customer
LEFT OUTER JOIN
(select ... from Orders INNER JOIN CustomerOrders on ... ) o
on ...

Now it is clear that our primary, driving source is Customers, and we are outer joining to our auxiliary data, which is Orders. Orders inner joined to CustomerOrders is self-contained and clear and something that can be tested and verified independently, helping to keep our SQL clean and organized. And, of course, adding an inner join from Customers to CustomerTypes is now clear and easy to do with the query well-structured like this, and it still reads perfectly after making that addition.

Using a RIGHT OUTER JOIN to avoid nesting, or derived tables, or CTEs, is like using nothing but VARCHARS to avoid the need for data types. Maybe data types, like nested joins, derived table, and CTES, require a little more work and learning or using a new concept, but you certainly are better off rather than avoiding them to make things "easier" or "simpler" in the short term.

TT
2008-06-12
re: On RIGHT OUTER JOINS ...
I usually avoid RIGHT JOIN at all cost, for all the reasons already specified.

So when have I used RIGHT JOIN? Maintenance, in large (legacy) queries that are badly formatted and that I do not want to rewrite. Sometimes a RIGHT JOIN might do the trick.

Roy Harvey
2008-12-05
re: On RIGHT OUTER JOINS ...
I know I am coming along REALLY late....

Like you I only write LEFT OUTER, avoiding RIGHT OUTER. But I know that might not always be possible.

You said "they are technically the same", I thought that for years, but eventually ran into a query on the newsgroups where they were NOT the same, and the required results could only be achieved using RIGHT OUTER JOIN - there was no LEFT OUTER JOIN equivalent. The whole thing was quite confusing, as you can imagine (and no, I do not have an example to show). The explanation for this came down to the fact that in both cases the JOIN process has to proceed from left to right, at least logically. That is part of the SQL standard definition of how JOIN works.

Jeff S
2008-12-05
re: On RIGHT OUTER JOINS ...
Roy -- a link if you could find one would be great. I highly doubt there is anything that can be written only with a right outer join, anything can be rewritten more clearly and logically using derived tables and/or nested join expressions if necessary. The key is restructure the query, not try to swap tables A and B in a join expression to turn a RIGHT into a LEFT.

Please, if you could, track down some links or references to the newsgroup discussion you saw! Thank you in advance!
- Jeff

Mirko
2009-01-26
re: On RIGHT OUTER JOINS ...
I only can say that I completely agree with Jeff's way and explanations. I've been a programmer and consultant now for about ten years, and I never ever got into a situation which would justify a right outer join, maybe with the only exception when doing quick data analysis and debugging of applications, where it's just simpler to change the word "left" to "right" in your SQL editor to see different results...

Originally, I'm coming from an OOA/OOP background, and with that, the logic of a right outer join completely defies what an object oriented application really should do. Additionally, general expressions are evaluated from left to right, and that holds true for information systems all over the world, even in countries where the general reading direction is right-to-left.