Most of these are really basic. Some are my take on established ideas and standards, and may be controversial. Agree or disagree? All feedback is welcome! (well ... mostly the "I agree, you're a genius" feedback is welcome ... but I'll accept all of it, I suppose)
Fully qualify your columns
If your SQL statement involves more than 1 table, ALWAYS fully qualify your columns with the table they come from. If a column name exists only in 1 table, sometimes people tend to feel that a table reference is not necessary. After all, the statement compiles, right? But when reviewing the SQL statement later on, it is impossible to know which column a table is coming from unless you have the schema in front of you when you omit the table name. It's simple to do, and even easier if you follow my next piece of advice.
Use table aliases liberally and intelligently
Table aliases are great for making your SQL leaner and more readable. But try to alias "CustomerTransactions" as "CustTrans" or "CT", and not "X" or some other meaningless abbreviation. Aliasing derived tables in a meaningful way, even if it ends up being a little long, is a great idea. For example, joining to a derived table aliased as "TotalSalesPerCustomer" or "SalesTotals" really makes the SQL clear and readable.
Avoid RIGHT OUTER JOINS
I've talked about this many times in the SQLTeam forums. There is not one single reason to ever, ever use a RIGHT OUTER JOIN in a SQL statement. Never, ever, ever. Always select FROM your primary table (or derived table) and LEFT OUTER JOIN to auxillary tables. What's that? You say sometimes you need a RIGHT JOIN here and there to get the results you need? Then you have written your SQL wrong; start over. The most important part of a SELECT is also the simpliest -- the FROM clause. Your starting point needs to be correct before you can starting joining to other tables.
Derived tables are your friend
Use them as often as necessary to keep things organized and short. For example, if you have a long expression in your SELECT that needs to be referenced several times, do the calculation in a derived table and give it a meaningful alias. Then, you can simply refer to the alias in other parts of your SELECT.
Too often, people keep stuffing related tables into 1 big SELECT until they have every possible piece of data they might need, then they start GROUP'ing and DISTINCT'ing almost randomly until the results start "looking OK". Avoid this. Break your problem down into logical parts; usually that means each part ends up being it's own derived table.
Don't forget UNION ALL
UNION ALL is often forgotten about, when most times it is really what should have been used. Remember -- UNION removes all duplicates in the result set, while UNION ALL just returns everything. This means that UNION is less efficient and may even have unintended side effects if you forget about how its behavoir works. Unless you need to remove duplicates, always use UNION ALL. And if you *do* need to remove duplicates with a regular UNION, be sure to add a comment to indicate this.
DISTINCT is *usually* bad
A good rule of thumb -- if you need a distinct list of values in a single column or two, DISTINCT is the way to go. But if your results "don't look right" or you see some duplicate rows and can't figure out why, do NOT just add DISTINCT to your SELECT to "fix it" !! Step back, look at your joins, and re-write your query properly. Even worse, I've seen people simply add DISTINCT to all their SELECT's right from the start, to preemptively "avoid duplicates". Not good. (Side note: If you are a DISTINCT abuser, try adding meaningful primary keys to your tables).
GROUP BY only what you need
See here for more information on this one.
State your intentions clearly with Parenthesis
Never mix AND's and OR's without parenthesis in boolean expressions. I don't care if you know exactly how SQL will evaluate it -- use parenthesis. They key is to be sure that your intended expression is clearly stated.
Indent!
I don't care how you do it. Just be consistent about it.
Learn when to use * and when not to
Mostly old news here. When you're selecting from a table, don't use it -- list the columns out explicitly, for readablity, performance, and to avoid re-compile bugs that can be tough to track down.
When you are doing a COUNT() or an EXISTS(), no problem, go ahead and use *. Also (and this is my opinion here), when you are selecting from a derived table, go ahead and use * if you need all the columns. The columns are already explicitly defined in the derived table, and it keeps things shorter and cleaner.
Remember your Job
The SQL Server's job is to return data. Your client application or Reporting tool's job is to present that data. Always remember this. If you try to format a resultset using T-SQL, your nice DateTime and Money values come to the client as VARCHAR's. Which means that if the client wants to do anything meaningful with these values (sort, compare, do math, format, etc) it must immediately convert those strings back to the original datatype! So many people try to format their results in T-SQL thinking this makes things easier for their clients, when it makes the T-SQL longer, more complicated, and less efficient. Let SQL do what it does best, and let your presentation layer do what it does best. It really kind of almost makes sense when you think about it.
Always Use Parameters!
OK, not everyone likes stored procedures. I can accept that; after all, while I can be pretty strict on some things (i.e., RIGHT JOINS), I definitely recognize that there are multiple ways to do things. But if you are going to write SQL statements directly at the client, there is still no excuse for not using parameters. And likewise with dynamic SQL generated at the server. Always, always use parameters -- never parse and concatenate user input into SQL strings and execute them directly. Not only is it more work, less secure, harder to maintain, and less readable, but you now have to deal with the format of your data at the SQL Server layer. If a parameter is a DateTime parameter and you set it and reference it in a SQL statement, you are done. But if you are taking a string that *looks* like a DateTime, and you are concatenating it into a SQL command, suddenly the format of that string becomes very important. Just don't do it. Use parameters, whether you are using stored procedures or not. ADO, ADO.NET and even DAO all support parameterized commands for your clients, and the sp_ExecuteSQL stored procedure supports parameters for dynamic SQL on the server.
see also: