<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:copyright="http://blogs.law.harvard.edu/tech/rss" xmlns:image="http://purl.org/rss/1.0/modules/image/">
    <channel>
        <title>Efficiency</title>
        <link>http://weblogs.sqlteam.com/jeffs/category/163.aspx</link>
        <description>Tips and tricks on getting things done more efficiently in SQL or other programming languages.</description>
        <language>en-US</language>
        <copyright>Jeff Smith</copyright>
        <managingEditor>smith_jeffreyt@yahoo.com</managingEditor>
        <generator>Subtext Version 1.9.4.0</generator>
        <item>
            <title>Does SQL Server Short-Circuit?</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2008/02/22/sql-server-short-circuit.aspx</link>
            <description>I got an email recently regarding one of my early blog posts from the olden days:&lt;br /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;Steve  Kass  wrote  about  &lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2003/11/14/513.aspx"&gt;your  post:&lt;/a&gt;&lt;/span&gt;&lt;span style="font-style: italic;"&gt;"there  is  no  guarantee  that  WHERE  &amp;lt;filter  1&amp;gt;  OR  &amp;lt;filter  2&amp;gt;  will  be  optimized  so  that  the  filters  are  evaluated  in  the  order  typed".&lt;/span&gt;&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;I  am  not  certain  that  optimization  changes  the  priority  of  the  expressions,  but  I  do  not  think  so.  We  can  force  evaluation  so  that  it  is  done  in  a  certain  order  by  enclosing  the  first  expression  to  evaluate  in  parentesis,  since  enclosed  in  parentesis  expressions  are  evaluated  first,   like  this:&lt;/span&gt;&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;((&amp;lt;filter  1&amp;gt;)  OR  &amp;lt;filter  2&amp;gt;)  AND&lt;/span&gt;&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;((&amp;lt;filter  3&amp;gt;)  OR  &amp;lt;filter  4&amp;gt;) &lt;/span&gt;&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;In  this  case  filter  1  is  evaluated  before  than  filter  2  and  filter  3  is  evaluated  before  than  filter  4. &lt;/span&gt;&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;br style="font-family: Courier New; font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;Thanks  in  advance&lt;br /&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;br /&gt;
In that original post, I claimed that using efficient OR logic will not only make your code simpler and cleaner, but also more efficient because in theory SQL Server can "&lt;a href="http://en.wikipedia.org/wiki/Short-circuit_evaluation" target="_blank"&gt;short-circuit&lt;/a&gt;" on the first part of an OR if it is true, without the need to evaluate the second part.  &lt;br /&gt;
&lt;br /&gt;
However, in one of the comments, it was correctly pointed out that this is not true; you really have no control over how SQL Server will evaluate something -- it will generally do what it thinks is best regardless of the order in which you specify your conditions.  &lt;br /&gt;
&lt;br /&gt;
After getting that email I did some research to determine the truth behind this, and came up with a couple of good resources:&lt;br /&gt;
&lt;ul&gt;
    &lt;li&gt;This &lt;a href="http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=83016&amp;amp;whichpage=2" target="_blank"&gt;SQLTeam forum thread&lt;/a&gt; from last year explores the topic and is a good read.&lt;br /&gt;
    &lt;br /&gt;
    &lt;/li&gt;
    &lt;li&gt;This &lt;a href="http://beingmarkcohen.com/?p=62" target="_blank"&gt;blog post from Mark Cohen&lt;/a&gt; discusses the topic as well, but the best thing to read here is the very last comment from "sc0rp10n" which explains things pretty well. (Plus, with a "l33t" hacker name like that, you know he must know what he is talking about!  These are assigned by some committee somewhere after a vigorous review process, right?)&lt;br /&gt;
    &lt;br /&gt;
    &lt;/li&gt;
    &lt;li&gt;In &lt;a href="http://www.microsoft.com/technet/community/chats/trans/sql/sql1119.mspx" target="_blank"&gt;this TechNet chat,&lt;/a&gt; Nigel Ellis, the development manager for the SQL Server Query Processor team, says about halfway into it that SQL Server &lt;span style="font-style: italic;"&gt;does&lt;/span&gt; indeed short-circuit, but it is not specified if this is something users can control based on how they write their boolean expressions.&lt;br /&gt;
    &lt;/li&gt;
&lt;/ul&gt;
The final verdict?  Well, I don't really have one yet, but it is probably safe to say that the &lt;span style="font-weight: bold;"&gt;only time you can ensure a specific short-circuit is when you express multiple WHEN conditions in a CASE expression&lt;/span&gt;.  With standard boolean expressions, the optimizer will move things around as it sees fit based on the tables, indexes and data you are querying.&lt;br /&gt;
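&lt;br /&gt;
As a minimal sketch of what that looks like (the Orders table and its OrderID, Total and Quantity columns are made up for illustration), the WHEN conditions below are checked in the order written, so the division is only reached for rows where Quantity is not zero:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;-- hypothetical table and columns, for illustration only&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select OrderID,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  case&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    when Quantity = 0 then null   -- checked first, so zero never reaches the division&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    else Total / Quantity&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  end as UnitPrice&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from Orders&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
By contrast, a filter such as &lt;span style="font-family: Courier New;"&gt;Quantity &amp;lt;&amp;gt; 0 and Total / Quantity &amp;gt; 10&lt;/span&gt; carries no such promise.  Even with CASE, as far as I can tell the ordering applies to plain scalar expressions; aggregates inside a WHEN may be computed before the CASE ever sees them, so treat this as a sketch rather than a guarantee for every query.&lt;br /&gt;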
&lt;br /&gt;
Anyone have any thoughts, feedback, further info or links regarding this topic?&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60526.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2008/02/22/sql-server-short-circuit.aspx</guid>
            <pubDate>Fri, 22 Feb 2008 16:04:10 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60526.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2008/02/22/sql-server-short-circuit.aspx#feedback</comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60526.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60526.aspx</trackback:ping>
        </item>
        <item>
            <title>On RIGHT OUTER JOINS ...</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2008/02/13/on-right-outer-joins.aspx</link>
            <description>Because I feel pretty strongly about this and the entire focus of my blog is writing clear, clean and efficient SQL, I thought I'd repeat my response from a &lt;a href="http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=97137" target="_blank"&gt;SQLTeam forum question&lt;/a&gt; here.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;smithje &lt;/span&gt;asks this, regarding OUTER JOINS:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-style: italic;"&gt;Left/Right, does it matter. Is one better than the other? A few years ago consultant on a project for our company advised me to always write my queries to use Left joins. He had worked on the project to convert the original database application to MS SQL when Microsoft took it over. He claimed the design of the query engine handled Left joins more effeciently than right. I converted several queries that processed large datasets to Left join only and got quicker results. I have used Left exclusively since then. Has this concept ever been tested or written about?&lt;/div&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;My response:&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Hi -- they are technically the same, but it is always clearer to use LEFT OUTER JOINS. I strongly recommend never using RIGHT OUTER JOINS.&lt;br /&gt;
&lt;br /&gt;
When you write a SQL statement, you should express your "base" in your FROM clause, and from there join to auxiliary tables. There is no SELECT statement that cannot be written in that manner, and it is a nice clear, clean way to organize your code. So, if you want ALL customers and ANY orders that match, I think we can all agree that it makes logical sense to express this as:&lt;br /&gt;
&lt;br /&gt;
SELECT ...&lt;br /&gt;
FROM customers&lt;br /&gt;
OUTER JOIN TO orders&lt;br /&gt;
&lt;br /&gt;
Clearly, we are primarily selecting customers as our "base", and including any Orders that may or may not exist.&lt;br /&gt;
&lt;br /&gt;
As a RIGHT OUTER JOIN, it becomes:&lt;br /&gt;
&lt;br /&gt;
SELECT ...&lt;br /&gt;
FROM Orders&lt;br /&gt;
OUTER JOIN TO Customers&lt;br /&gt;
&lt;br /&gt;
which doesn't make sense -- why are we selecting FROM Orders and joining TO Customers, when potentially we want to return Customers that don't have ANY orders?&lt;br /&gt;
&lt;br /&gt;
Anyway, it is rare to get good advice from a consultant, but it appears that you actually did! Avoid RIGHT JOINS, and stick with LEFT JOINS. If a right outer join seems required to make your query work, you should re-write it and change your FROM clause to make it cleaner, simpler and clearer.&lt;br /&gt;
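&lt;br /&gt;
In actual syntax, and assuming hypothetical Customers and Orders tables that share a CustomerID column, the two statements below should return the same rows; the first simply reads in the order we think about the problem:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;-- preferred: Customers is the base, Orders is optional&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select c.CustomerID, o.OrderID&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from Customers c&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;left outer join Orders o on o.CustomerID = c.CustomerID&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;-- same results, but the base table is buried in the join&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select c.CustomerID, o.OrderID&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from Orders o&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;right outer join Customers c on o.CustomerID = c.CustomerID&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Customers without any orders come back from both statements with a NULL OrderID.&lt;br /&gt;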
&lt;br /&gt;
(By the same token, I strongly recommend &lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2007/04/19/Full-Outer-Joins.aspx"&gt;avoiding FULL OUTER JOINS&lt;/a&gt; as well.)&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60508.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2008/02/13/on-right-outer-joins.aspx</guid>
            <pubDate>Wed, 13 Feb 2008 16:22:25 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60508.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2008/02/13/on-right-outer-joins.aspx#feedback</comments>
            <slash:comments>6</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60508.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60508.aspx</trackback:ping>
        </item>
        <item>
            <title>Rewriting correlated sub-queries with CASE expressions</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2008/01/09/rewrite-correlated-sub-query-with-case-sql.aspx</link>
            <description>Here's a very common situation that is very easy to optimize and simplify, submitted via the &lt;a target="_blank" href="http://weblogs.sqlteam.com/jeffs/contact.aspx"&gt;mailbag&lt;/a&gt;:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;Nate writes:&lt;span style="font-style: italic;"&gt;&lt;br /&gt;
&lt;br /&gt;
Hey, I have a read a bunch of your stuff on your blog and you seem to&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; be right on the money. I thought maybe you would be able to point me in&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; the right direction and possibly address this issue on your blog so&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; others could benefit from your understanding. &lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;I have been searching for the best way to do what I think should be a&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; simple task in SQL. I have a table full of call history events and I&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; want to get a summary of some events for each calling party that matches&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; the form "username(extension)". Using correlated subqueries I do the&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; following.&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;SELECT calling_party,&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;(SELECT SUM(end_time-start_time) AS total_time FROM `event` ei WHERE&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; event_type=4 AND ei.calling_party = eo.calling_party) AS&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; total_talking_time, &lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;(SELECT SUM(end_time-start_time) AS total_time FROM `event` ei WHERE&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; event_type=7 AND ei.calling_party = eo.calling_party) AS&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; total_ringing_time,&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;(SELECT MAX(end_time-start_time) AS total_time FROM `event` ei WHERE&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; event_type=4 AND ei.calling_party = eo.calling_party) AS max_talking_time&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;FROM `event` eo WHERE calling_party LIKE '%(%)'&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;GROUP BY calling_party&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;This works, but ends up taking a really long time to run. I figured&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; that each subquery was getting executed multiple times. I made a quick php&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; script that returns the same information but uses multiple queries. I&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; just query all of the "groups" (SELECT DISTINCT calling_party FROM&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; `event` WHERE calling_party LIKE '%(%)') and then iterate over that&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; recordset plugging the actual calling_party value into the where clause of the&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; aggregate query. The output is identical, but this script runs in well&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; under a second.&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt;So the question is what is the correct way to do this using SQL? Any&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; help would be really appreciated.&lt;/span&gt;&lt;br style="font-style: italic;" /&gt;
&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
Hi Nate --&lt;br /&gt;
&lt;br /&gt;
The script as written is making 4 different passes over the event table (the outer query plus three correlated subqueries), and we can reduce this to one pass by using CASE expressions to only aggregate the data we need for each column.  This is pretty much the standard "static cross-tab" technique to summarize data into multiple columns in a single SELECT.&lt;br /&gt;
&lt;br /&gt;
So, all we need to do is re-write your SELECT like this:&lt;br /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  calling_party,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  sum(case when event_type=4 then end_time-start_time else 0 end) as total_talking_time,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  sum(case when event_type=7 then end_time-start_time else 0 end) as total_ringing_time,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  max(case when event_type=4 then end_time-start_time else 0 end) as max_talking_time&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  event&lt;br /&gt;
where&lt;br /&gt;
&lt;/span&gt;    &lt;span style="font-family: Courier New;"&gt;calling_party LIKE '%(%)'&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  calling_party&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
That will return the same results as what you had written, only it should be much more efficient.  It is also much shorter.  Notice that the CASE expressions ensure that only the data with the specified event_type is included in each aggregate calculation; otherwise we aggregate a value of 0, which has no effect on the totals.  The one difference is that a calling_party with no rows for a given event_type now shows 0 where the original subquery returned NULL; use a default value of NULL instead of 0 if you want to match the original exactly.&lt;br /&gt;
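&lt;br /&gt;
As a sketch of the NULL variant mentioned above, simply leave out the ELSE; a CASE expression with no ELSE returns NULL, and SUM and MAX skip NULLs, so a calling_party with no rows of a given event_type gets NULL in that column instead of 0:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  calling_party,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  -- no ELSE: rows of other event_types contribute NULL, which the aggregate ignores&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  sum(case when event_type=4 then end_time-start_time end) as total_talking_time,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  sum(case when event_type=7 then end_time-start_time end) as total_ringing_time,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  max(case when event_type=4 then end_time-start_time end) as max_talking_time&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  event&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  calling_party LIKE '%(%)'&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  calling_party&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;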
&lt;br /&gt;
If you have more data in that table for event_types other than 4 and 7, and you only want to return calling_party rows that contain data with those event_types, then be sure to add&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;and event_type in (4,7)&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
to your WHERE clause, which will increase the efficiency even further.   However, adding that will return different results from your original SELECT (which returns every matching calling_party regardless of event_type), so be sure that what you do meets the specifications you require.&lt;br /&gt;
&lt;br /&gt;
As always, be sure that you have proper indexes on all of your tables, and in this case it seems that event_type should certainly be indexed. &lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-style: italic;"&gt;see also:&lt;br /&gt;
&lt;/span&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl06_HyperLink1" title="View Entry" href="../../../../jeffs/archive/2007/11/13/sql-aggregate-totals.aspx"&gt;Some SELECTs will never return 0 rows -- regardless of the criteria&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a title="View Entry" href="../../../../jeffs/archive/2007/10/18/sql-server-cross-apply.aspx"&gt;Taking a look at CROSS APPLY&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a title="View Entry" href="../../../../jeffs/archive/2007/09/18/sql-conditional-where-clauses.aspx"&gt;Optimizing Conditional WHERE Clauses: Avoiding ORs and CASE Expressions&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl10_HyperLink1" title="View Entry" href="../../../../jeffs/archive/2007/06/12/60230.aspx"&gt;Using GROUP BY to avoid self-joins&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a title="View Entry" href="../../../../jeffs/archive/2007/05/14/60205.aspx"&gt;Criteria on Outer Joined Tables&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a title="View Entry" href="../../../../jeffs/archive/2007/04/19/Full-Outer-Joins.aspx"&gt;Better Alternatives to a FULL OUTER JOIN&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl02_HyperLink1" title="View Entry" href="../../../../jeffs/archive/2007/05/03/60195.aspx"&gt;In SQL, it's a Case Expression, *not* a Case Statement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60452.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2008/01/09/rewrite-correlated-sub-query-with-case-sql.aspx</guid>
            <pubDate>Wed, 09 Jan 2008 13:42:16 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60452.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2008/01/09/rewrite-correlated-sub-query-with-case-sql.aspx#feedback</comments>
            <slash:comments>2</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60452.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60452.aspx</trackback:ping>
        </item>
        <item>
            <title>Simplify Your SQL with Variables and Derived Tables (or Common Table Expressions)</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/12/20/simplify-sql-with-variables-and-derived-tables.aspx</link>
            <description>As with any programming language, it is important in SQL to keep your code short, clear and concise.   Here are two quick tips that I find are very helpful in achieving this goal.  &lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Tip 1:  For any relatively complicated constant expression, always declare a variable&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Consider the following:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select *&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from Transactions&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where TranDate &amp;gt;= &lt;span style="font-style: italic;"&gt;{some long, complicated expression to determine the start date}&lt;/span&gt; and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;      &lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;TranDate &lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;&amp;lt;  &lt;span style="font-style: italic;"&gt;{some long, complicated expression to determine the end date}&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
      &lt;br /&gt;
This is one of the most common things that I see in the &lt;a target="_blank" href="http://www.sqlteam.com/forums"&gt;SQLTeam forums&lt;/a&gt;, from both the people asking questions and the people giving answers.  If the starting date and ending date range are &lt;span style="font-style: italic;"&gt;constants for the entire SELECT&lt;/span&gt; (i.e., they don't vary by row, they are based on the current date or some parameters passed to the stored procedure), then simply declare them as variables, set the values once, and reference those variables in your WHERE clause:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;declare @start datetime, @end datetime&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;set @start = &lt;span style="font-style: italic;"&gt;{some long, complicated expression to determine the start date}&lt;/span&gt;&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;set @end = &lt;span style="font-style: italic;"&gt;{some long, complicated expression to determine the end date}&lt;/span&gt;&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select * from Transactions&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where TranDate &amp;gt;= @start and TranDate &amp;lt; @end&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;/div&gt;
&lt;br /&gt;
Now, you could argue that we just technically made the code longer.  But, we have made our code clearer and much easier to read and debug! We can now print and/or analyze the @start and @end dates to make sure that our formulas are working, without running the entire SELECT over and over, glancing at the results and trying to determine if they "look right".  &lt;br /&gt;
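&lt;br /&gt;
To make that concrete, here is a small sketch in which the "long, complicated expressions" are assumed, purely for illustration, to be the first day of the current month and the first day of the next month.  The variables can be checked on their own before the real query is ever run:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;declare @start datetime, @end datetime&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;-- illustrative formulas only: first day of this month, first day of next month&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;set @start = dateadd(month, datediff(month, 0, getdate()), 0)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;set @end = dateadd(month, datediff(month, 0, getdate()) + 1, 0)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;-- inspect the values without touching the Transactions table&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select @start as StartDate, @end as EndDate&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;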
&lt;br /&gt;
We can also potentially use these variables to make these formulas (or others) much simpler.   For example, if the end date is always one day later than the start date, we can write:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;declare @start datetime, @end datetime&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;set @start = &lt;/span&gt;&lt;span style="font-style: italic; font-family: Courier New;"&gt;{some long, complicated expression to determine the start date}&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;set @end = &lt;span style="font-weight: bold;"&gt;dateadd(day, 1, @start)&lt;/span&gt;&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select * from Transactions&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where TranDate &amp;gt;= @start and TranDate &amp;lt; @end&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
That is much, much easier to read, write and debug than if we repeated the long complicated expression twice, and it makes it very clear what the relationship is between the start date and the ending date.  If we need to tweak our starting date formula, we can do it in one place and again it is easier to debug since we can just print out the value of the variable. We also have the ability to name our variables intelligently to help with the readability of our code. &lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Tip 2: To avoid repeating non-constant expressions, use a Derived Table or a Common Table Expression (CTE)&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Another common thing found in lots of SQL code is to repeat the non-constant expressions over and over.  For example, if we have a DateTime column called TranDate, this expression will round that date to the first day of the month:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;dateadd(m, datediff(m,0, TranDate),0)&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
    &lt;br /&gt;
Using that knowledge, if we need to summarize transactions by month, the result often ends up looking something like this:&lt;br /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  dateadd(m, datediff(m,0, TranDate),0) as [Month], sum(value)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  Transactions&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  dateadd(m, datediff(m,0, TranDate),0) &amp;gt;= @min and &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  dateadd(m, datediff(m,0, TranDate),0) &amp;lt;= @max&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;   dateadd(m, datediff(m,0, TranDate),0)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;order by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;   dateadd(m, datediff(m,0, TranDate),0)&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Notice that the date expression is repeated &lt;span style="font-style: italic;"&gt;five times &lt;/span&gt;in the SQL statement!   Looks kind of silly, but I see it over and over again from both beginner and experienced programmers alike.  The solution to simplifying this is to calculate the formula &lt;span style="font-style: italic;"&gt;once&lt;/span&gt; and assign it a meaningful alias&lt;span style="font-style: italic;"&gt; &lt;/span&gt;using a derived table:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  [Month], sum(value)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  (&lt;br /&gt;
   select &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     dateadd(m, datediff(m,0, TranDate),0) as [Month], value&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;   from &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     Transactions&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  ) x&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    x.[Month] &amp;gt;= @min and x.[Month] &amp;lt;= @max&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    x.[Month]&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;order by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    x.[Month]&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Now the SELECT is much easier to read and maintain, and we can access the result of the calculation in the outer SELECT as many times as we need by referencing the "month" column returned by the derived table.   Also, if our "month" formula is wrong, or needs tweaking or testing, it can all be done in one place and it doesn't need to be repeated over and over.  &lt;br /&gt;
&lt;br /&gt;
If you are using SQL Server version 2005 or greater, you can also use a Common Table Expression (CTE) in the same way.  In fact, in many ways CTEs are even more readable and easier to work with than derived tables since the code doesn't require as much nesting and is more linear.   Here's the previous example as a CTE:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;with MonthSummary as&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;(&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;   select&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     dateadd(m, datediff(m,0, TranDate),0) as [Month], value&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;   from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     table&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  [Month], sum(value)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  MonthSummary&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  [Month] &amp;gt;= @min and [Month] &amp;lt;= @max&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  [Month]&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;order by&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  [Month]&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;/div&gt;
&lt;br /&gt;
Repeating expressions is common for mathematical formulas as well, such as using CASE to avoid dreaded "divide by zero" errors when the denominator is an expression:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select &lt;br /&gt;
  c as Numerator, a+b as Denominator, &lt;br /&gt;
  case when a+b = 0 then null else c / (a+b) end as Result&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from &lt;br /&gt;
  tbl&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Notice that a+b is repeated &lt;span style="font-style: italic;"&gt;three times&lt;/span&gt; in the previous SELECT statement.  Again, using a derived table, we can easily change this to:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select &lt;br /&gt;
  Numerator, &lt;br /&gt;
  Denominator, &lt;br /&gt;
  case when Denominator = 0 then null else Numerator / Denominator end as Result&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  (&lt;br /&gt;
  select&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     c as Numerator, a+b as Denominator&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     tbl&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  ) x&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
... and once again, we can now clearly label our expression and see exactly what it is and our code is much clearer and easier to maintain.  This example is quite simple and the difference doesn't seem like much, but if that expression is long and complicated and used in other places as well, the improvement in the code can be tremendous.&lt;br /&gt;
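&lt;br /&gt;
As a sketch, the same query can also be written as a CTE if you are on SQL Server 2005 or later.  The CTE name Ratios is only an illustrative choice; tbl, a, b and c are the placeholders from the example above:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-family: Courier New;"&gt;with Ratios as&lt;br /&gt;
(&lt;br /&gt;
  -- name each expression once, here&lt;br /&gt;
  select c as Numerator, a+b as Denominator&lt;br /&gt;
  from tbl&lt;br /&gt;
)&lt;br /&gt;
select&lt;br /&gt;
  Numerator,&lt;br /&gt;
  Denominator,&lt;br /&gt;
  case when Denominator = 0 then null else Numerator / Denominator end as Result&lt;br /&gt;
from&lt;br /&gt;
  Ratios&lt;br /&gt;
&lt;/div&gt;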
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Conclusion&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Readability is not our only concern; we also want to find and eliminate bugs and to optimize our code.  Repeating expressions in more than one place, or embedding them deep into SQL statements where we can never really be sure they are working, can result in bugs that are tough to track down.  It can also force the optimizer to calculate these expressions over and over, when ideally they should be calculated only once.&lt;br /&gt;
&lt;br /&gt;
So, remember: for constant expressions that don't change, just declare and set variables, and examine the variables' contents before using them anywhere to ensure that your expressions are working as you intend.  For repeated expressions that are based on values in the tables you are querying, consider calculating them once within a Derived Table or a Common Table Expression.  In addition, be sure to use meaningful variable names and aliases which will help to greatly improve the clarity and intentions of your code.&lt;br /&gt;
&lt;br /&gt;
It all comes back to the "golden rule" of programming: If you need to cut and paste, you're probably doing it wrong!</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/12/20/simplify-sql-with-variables-and-derived-tables.aspx</guid>
            <pubDate>Thu, 20 Dec 2007 17:20:25 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60440.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/12/20/simplify-sql-with-variables-and-derived-tables.aspx#feedback</comments>
            <slash:comments>6</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60440.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60440.aspx</trackback:ping>
        </item>
        <item>
            <title>Creating CSV strings in SQL: Should Concatenation and Formatting Be Done at the Database Layer?</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/10/09/csv-strings-database-or-presentation.aspx</link>
            <description>A question I see very often in the SQLTeam forums is how to return data in a summarized form by concatenating a list of values into a single CSV column.  This can be done fairly easily in T-SQL, but as the formatting and concatenation requirements become more elaborate, be sure to ask yourself: Am I forcing presentation code into the database layer?  &lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2007/10/09/csv-strings-database-or-presentation.aspx"&gt;read more...&lt;/a&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/10/09/csv-strings-database-or-presentation.aspx</guid>
            <pubDate>Tue, 09 Oct 2007 19:02:57 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60359.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/10/09/csv-strings-database-or-presentation.aspx#feedback</comments>
            <slash:comments>18</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60359.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60359.aspx</trackback:ping>
        </item>
        <item>
            <title>Optimizing Conditional WHERE Clauses: Avoiding ORs and CASE Expressions</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/09/18/sql-conditional-where-clauses.aspx</link>
            <description>Often, we need to create a flexible stored procedure that returns data that is optionally filtered by some parameters. If you wish to apply a filter, you set the parameter to the necessary value; if not, you leave it null.  This is pretty standard stuff, of course, that we can write fairly easily, without the need for dynamic SQL, like this:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;create procedure GetData &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    @MinDate int = null,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    @MaxDate int = null,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    @MinAmount money = null,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    @MaxAmount money = null,&lt;br /&gt;
&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;    @ProductCode varchar(200) = null,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    @CompanyID int&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;as&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    select *&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    from Data&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        (@MinDate is null OR @MinDate &amp;lt;= Date) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        (@MaxDate is null OR @MaxDate &amp;gt;= Date) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        (@MinAmount is null OR @MinAmount &amp;gt;= Amount) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        (@MaxAmount is null OR @MaxAmount &amp;lt;= Amount) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        (@ProductCode  is null OR ProductCode = @ProductCode) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        (@CompanyID is null OR CompanyID = @CompanyID)&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Note that we are using good boolean algebra as discussed &lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2003/11/14/513.aspx"&gt;here&lt;/a&gt;, and not using inefficient CASE or COALESCE() expressions around the columns in our WHERE clause.  It is pretty easy to read and quite logical, and certainly it is easy to edit and to extend with more filtering options as necessary.  We also made sure to use parentheses in our WHERE clause to organize it and to enforce a clear and correct order of operations regarding our AND and OR clauses.&lt;br /&gt;
&lt;br /&gt;
However, if we manipulate our parameters a little and re-organize our criteria, we can potentially write this much more efficiently.&lt;br /&gt;
&lt;br /&gt;
First, note that we are allowing the user to specify an optional date range via the @MinDate and @MaxDate parameters.  If either is NULL, we simply do not use a minimum or maximum date in our filter.  However, we can simplify things by writing our criteria like this:&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;&lt;br /&gt;
&lt;/span&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select *&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;from Data&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;where&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: Courier New; font-weight: bold;"&gt;Date between coalesce(@MinDate, '1/1/1900') and coalesce&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;&lt;span style="font-weight: bold;"&gt;(@MaxDate, '12/31/2999')&lt;/span&gt; and &lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;... etc &lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
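All we need to do is use COALESCE() to transform our NULL values to dates that will encompass the entire range of values, thus "eliminating" that filter from the results. This greatly simplifies our WHERE clause and allows indexes on our Date column to be used.  One difference to keep in mind: rows where Date is NULL never satisfy BETWEEN, so unlike the original version, they are not returned when the parameters are left NULL.&lt;br /&gt;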
&lt;br /&gt;
We can do the exact same thing for our Amount range as well:&lt;br /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    select *&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    from Data&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        &lt;span style="font-weight: bold;"&gt;Amount between coalesce(@MinAmount,-99999999) and coalesce(@MaxAmount,99999999)&lt;/span&gt; and ...&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
For numeric values, this can be trickier because you really want to make sure that you have your entire range of values covered.  When in doubt, just use the smallest and largest possible values that your data type will allow.&lt;br /&gt;
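&lt;br /&gt;
As a sketch of that advice, assuming Amount is a MONEY column: the limits below are the documented minimum and maximum of the MONEY type, assigned to variables so the result is stored as MONEY.  The DECLARE only stands in for the procedure parameters; substitute the limits of whatever data type your column actually uses:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-family: Courier New;"&gt;-- stand-ins for the procedure parameters (both start out NULL)&lt;br /&gt;
declare @MinAmount money, @MaxAmount money&lt;br /&gt;
&lt;br /&gt;
-- money ranges from -922,337,203,685,477.5808 to 922,337,203,685,477.5807&lt;br /&gt;
set @MinAmount = coalesce(@MinAmount, -922337203685477.5808)&lt;br /&gt;
set @MaxAmount = coalesce(@MaxAmount, 922337203685477.5807)&lt;br /&gt;
&lt;br /&gt;
-- examine the values before using them in a WHERE clause&lt;br /&gt;
select @MinAmount as MinAmount, @MaxAmount as MaxAmount&lt;br /&gt;
&lt;/div&gt;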
&lt;br /&gt;
Next, we have a string comparison based on product code.  These also can be very efficiently rewritten by using LIKE.  Normally, LIKE is much slower than using equals, but if you do not use any wild cards and it allows you to eliminate an "OR" in your WHERE clause, it may actually be more efficient:&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;    select *&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    from Data&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Amount between coalesce(@MinAmount,-999999999) and coalesce(@MaxAmount,999999999) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        &lt;span style="font-weight: bold;"&gt;ProductCode like coalesce(@ProductCode,'%')&lt;/span&gt; and ...&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Note that if your product codes contain symbols like % or _, this will not work for you.  But in general, using LIKE in this case can be really handy.  I also sometimes see this:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;where (@Name is null OR Name like '%' + @Name + '%')&lt;br /&gt;
&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;
.. which can of course be written much more simply and efficiently, since we are using LIKE already, as:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;where (Name like '%' + coalesce(@Name,'') + '%')&lt;br /&gt;
&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;
The idea is to eliminate those OR operators and simplify your WHERE condition wherever possible.&lt;br /&gt;
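&lt;br /&gt;
Going back to the ProductCode filter: if your product codes can contain %, _ or [, one possible workaround (a sketch only, so test it against your own data) is to wrap those characters in square brackets before the parameter is used with LIKE, so that they are matched literally.  The [ character has to be replaced first, so that the brackets added by the other two replacements are left alone.  The sample value is made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-family: Courier New;"&gt;-- stand-in for the procedure parameter, with a made-up value&lt;br /&gt;
declare @ProductCode varchar(200)&lt;br /&gt;
set @ProductCode = '50%_OFF'&lt;br /&gt;
&lt;br /&gt;
-- escape [ first, then % and _&lt;br /&gt;
set @ProductCode = replace(replace(replace(@ProductCode, '[', '[[]'), '%', '[%]'), '_', '[_]')&lt;br /&gt;
&lt;br /&gt;
select @ProductCode  -- 50[%][_]OFF&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
A NULL parameter stays NULL through the REPLACE() calls, so the coalesce(@ProductCode,'%') in the WHERE clause still turns it into a match-everything pattern.&lt;br /&gt;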
&lt;br /&gt;
Finally, we have an optional filter for a specific @CompanyID integer parameter.  We don't want to use a LIKE wildcard, since this will require that all of the integer values in the table must be converted to strings for the comparison -- thus, no indexes will be used.  What we can do, even though it might not seem intuitive, is use a range!  For example:&lt;br /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    select *&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    from Data&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Amount between coalesce(@MinAmount,-99999999) and coalesce(@MaxAmount,99999999) and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        ProductCode like coalesce(@ProductCode,'%') and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        &lt;span style="font-weight: bold;"&gt;CustomerID between coalesce(@CustomerID,0) and coalesce(@CustomerID,999999999)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Now, we have no ORs in our WHERE clause at all, and we have not wrapped any columns in our table in any expressions, so all indexes can potentially be used.  Our SQL is still relatively clear and easy to read and work with, and we can easily add more conditions or criteria as necessary.&lt;br /&gt;
&lt;br /&gt;
You may also find that it is more efficient and/or easier to maintain if you first set your parameter values as necessary before the SELECT, like this:&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;&lt;br /&gt;
    declare @MinID int, @MaxID int&lt;/span&gt;&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;span style="font-family: Courier New;"&gt; &lt;br style="font-family: Courier New;" /&gt;
&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;    set @MinDate = coalesce(@MinDate, '1/1/1900') &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    set @MaxDate = coalesce(@MaxDate,'12/31/2999')&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    set @MinAmount = coalesce(@MinAmount,-99999999) &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    set @MaxAmount = coalesce(@MaxAmount,99999999)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    set @ProductCode = coalesce(@ProductCode,'%')&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    set @MinID = coalesce(@CustomerID,-99999999) &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    set @MaxID = coalesce(@&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;CustomerID&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;,99999999)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    select *&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    from Data&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    where&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Date between @MinDate and @MaxDate and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        Amount between @MinAmount and @MaxAmount and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        ProductCode like @ProductCode and&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;        CustomerID between @MinID and @MaxID&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
This will ensure that the coalesce() expressions are evaluated only once, and it does make things a little easier to read.&lt;br /&gt;
&lt;br /&gt;
The overall trick is to think to yourself: Can I take a parameter value and either alter it or use it to create new values, and then efficiently use those new values in my WHERE clause &lt;span style="font-style: italic;"&gt;instead&lt;/span&gt; of simply using that raw parameter value?  Sometimes, it takes some "outside-of-the-box" thinking, but it often can be done and the resulting performance gains can be tremendous.&lt;br /&gt;
&lt;br /&gt;
Ultimately, you always should test the different ways of writing your WHERE clauses and always double check the indexing on your tables to get the maximum efficiency.  Writing things a particular way in one situation may work great, but then applying it to another may not work quite as well. &lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-style: italic;"&gt; See also: &lt;/span&gt;&lt;br /&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2003/11/14/513.aspx"&gt;SQL WHERE clauses: Avoid CASE, use Boolean logic&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a target="_blank" href="http://www.sqlteam.com/article/avoid-enclosing-indexed-columns-in-a-function-in-the-where-clause"&gt;Avoid enclosing Indexed Columns in a Function in the WHERE clause&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60328.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/09/18/sql-conditional-where-clauses.aspx</guid>
            <pubDate>Tue, 18 Sep 2007 20:25:21 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60328.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/09/18/sql-conditional-where-clauses.aspx#feedback</comments>
            <slash:comments>23</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60328.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60328.aspx</trackback:ping>
        </item>
        <item>
            <title>Filter by month (plus other time periods)</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/09/14/sql-filter-by-month.aspx</link>
            <description>&lt;span style="font-weight: bold;"&gt;Introduction&lt;br /&gt;
&lt;br /&gt;
&lt;/span&gt;Previously, &lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2007/09/10/group-by-month-sql.aspx"&gt;I wrote about grouping transactions by month&lt;/a&gt;.  Another common area of difficulty or confusion for SQL beginners is how to efficiently retrieve data just for a single month.&lt;br /&gt;
&lt;br /&gt;
There are two parts to this equation: First, what is the best way to declare parameters that will be used to indicate which month you are looking for?  Second, how can we efficiently and easily make use of those parameters to get back the data we need?&lt;br /&gt;
&lt;br /&gt;
Let's take a look at some approaches, both &lt;span style="color: rgb(0, 102, 0);"&gt;recommended &lt;/span&gt;and &lt;span style="color: rgb(128, 0, 0);"&gt;not&lt;/span&gt;.  &lt;br /&gt;
&lt;br /&gt;
And, please: if you are not using the DATETIME data type to store dates in your tables, don't even bother reading further -- fix your design first!&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Month Parameter Options &lt;/span&gt;&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt; &lt;br /&gt;
&lt;br /&gt;
@YearMonth as char(6)&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
If you know me at all, you know that I will advise not to use this approach.  We’ll need to parse and validate the values passed in to ensure that they are valid, and we need to somehow communicate exactly which format to use when setting the parameter values. For example, is it “MM-YY”, or “MMYYYY”, or “MM/YY”, or “YYYYMM”, etc?  It simply makes no sense to introduce string formatting conventions and parsing and validating into the equation when you simply don’t need to.  Don’t do it.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-family: Courier New; color: rgb(0, 102, 0);"&gt; @Year as int, @Month as int&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
This approach works fine, since now we have no string parsing or ambiguity to deal with. It is very clear that Year and Month are both simply numeric values.  We can do a quick check to ensure that our @Month value is between 1 and 12, and that our @Year value is within our desired range as well.&lt;br /&gt;
&lt;br /&gt;
As we will see later, an optimal solution will require finding the first day of the month requested.  How can we do that given a @Year and a @Month value?  Simple, you can either &lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2007/01/02/56079.aspx"&gt;use the Date function here&lt;/a&gt; and write:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;Declare @FirstDayOfMonth datetime&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt; Set @FirstDayOfMonth = dbo.Date(@Year,@Month,1)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;/div&gt;
&lt;br /&gt;
… or you can apply the same logic using DateAdd() like this:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-family: Courier New;"&gt;Declare @FirstDayOfMonth datetime&lt;br /&gt;
Set @FirstDayOfMonth = dateadd(month,((@Year-1900)*12)+@Month-1,0)&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Either way, the result is the same: for a given @Year and @Month, the first day of that month (at midnight) is returned.  If you aren’t sure how this formula works (or even if it will work at all), just test it out.&lt;br /&gt;
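&lt;br /&gt;
For example, a quick test along these lines (the sample values 2007 and 9 are arbitrary) adds 1292 months to the base date of January 1, 1900:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-family: Courier New;"&gt;Declare @Year int, @Month int&lt;br /&gt;
Select @Year = 2007, @Month = 9&lt;br /&gt;
&lt;br /&gt;
-- (2007-1900)*12 + 9 - 1 = 1292 months after 1900-01-01&lt;br /&gt;
Select dateadd(month,((@Year-1900)*12)+@Month-1,0) as FirstDayOfMonth&lt;br /&gt;
-- returns 2007-09-01 00:00:00.000&lt;br /&gt;
&lt;/div&gt;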
&lt;br style="color: rgb(51, 153, 102);" /&gt;
&lt;span style="color: rgb(0, 102, 0); font-family: Courier New;"&gt; @DateInMonth as datetime&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Here, we can simply accept any datetime value, and use the year and month of that date to indicate which month should be returned.  A user can pass in the first day of the month, the last day, or any day in between – it doesn’t matter.  This works well because we know the parameter will be a valid date, and we can quickly and simply get the year/month of the @DateInMonth parameter using the Year() and Month() functions:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;Declare @Year int, @Month int&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt; Select @Year = Year(@DateInMonth), @Month = Month(@DateInMonth)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;/div&gt;
&lt;br /&gt;
We can also get the first day of the month for a given @DateInMonth quite easily, using another DateAdd() trick:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;Declare @FirstDayOfMonth&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt; Set @FirstDayOfMonth = DateAdd(month, DateDiff(month,0, @DateInMonth), 0)&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
So, accepting a datetime parameter to indicate which month you are after is also a good technique to use.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;WHERE Clause Options &lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Now that we have the @Year, @Month and @FirstDayOfMonth values ready to go, we must decide how to use these values to write our query.   Here are some options (again, both &lt;span style="color: rgb(0, 102, 0);"&gt;good &lt;/span&gt;and &lt;span style="color: rgb(128, 0, 0);"&gt;bad&lt;/span&gt;):&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt; where Year(TranDate) =@Year and Month(TranDate) = @Month&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
This approach will work fine, but it will not be able to use any indexes that might exist on your TranDate column.  Thus, it will not be as efficient as possible.  Every single TranDate value must be retrieved from your table, the Year() and Month() formulas must be applied, and then the results must be tested to determine if that row should be returned or not.    Overall, avoid wrapping indexable columns in functions such as these in your criteria.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt; where convert(varchar(6), TranDate, 112) = convert(varchar(6), @FirstDayOfMonth) &lt;/span&gt;&lt;br /&gt;
-- or --&lt;br /&gt;
&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt;where convert(varchar(6), TranDate, 112) = left(@Year,4)+right('0' + left(@Month,2),2)&lt;br /&gt;
&lt;/span&gt; -- or --&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt;where convert(varchar(6), TranDate, 112) = @YearMonth&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
These WHERE clauses all convert the TranDate to a VARCHAR in the format of YYYYMMDD, but truncated at 6 characters, resulting in a YYYYMM formatted string.  Then, using different methods, that string is compared to a YYYYMM representation of the month we'd like to return.  This is a bad approach to take because it requires that every TranDate be converted to a string in order to filter the rows, thus no indexes can be used.  Using any conversions or string manipulations to filter data when better options exist is never the way to go.  It is also very confusing to work with, since it can be unclear exactly what some of these convert() functions and string expressions are doing.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-family: Courier New; color: rgb(128, 0, 0);"&gt; where TranDate between @FirstDayOfMonth and @LastDayOfMonth&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
What if we calculate, for a given @Year and @Month, the first and last day of the month, store them in variables, and then use that range for our criteria?  This will let us use indexes, it will be clear and concise, and it makes good sense.&lt;br /&gt;
&lt;br /&gt;
However, there is a minor flaw in this logic:  Suppose @Year is 2006, and @Month is 1. If we set @LastDayOfMonth to ‘2006-01-31 12:00:00 AM’, any transactions after 12:00:00 AM on the last day of the month will not be included!  We could calculate the very last 1/300th of a second on the last day of the month (the smallest step a DATETIME value can store) to use as our upper limit, but there is an even easier way ...&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-family: Courier New; color: rgb(0, 102, 0);"&gt; where TranDate &amp;gt;= @FirstDayOfMonth and TranDate &amp;lt; @FirstDayOfNextMonth&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Here, we won’t have any issues with times other than midnight leading to missed transactions.&lt;br /&gt;
&lt;br /&gt;
We already have the @FirstDayOfMonth; all we need is the @FirstDayOfNextMonth.  That is easily obtained by using DateAdd() to add one month to the @FirstDayOfMonth, as shown:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;Declare @FirstDayOfNextMonth datetime&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;Set @FirstDayOfNextMonth = DateAdd(month, 1, @FirstDayOfMonth)&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
With that, we are all set!  Indexes can be used and any datetime value within our range will be properly selected.  I highly recommend using this approach whenever you need to select dates for a single month; there is no need to force date computations upon all values in your table when a simple range is all you need.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;&lt;span style="font-weight: bold;"&gt;Other Time Periods&lt;br /&gt;
&lt;br /&gt;
&lt;/span&gt;&lt;/span&gt;The same approach can be applied if you need to allow for a range of months to be specified, not just a single month.  Simply calculate the first day of the&lt;span style="font-style: italic;"&gt; starting &lt;/span&gt;month, and then calculate the first day of the month &lt;span style="font-style: italic;"&gt;after &lt;/span&gt;the &lt;span style="font-style: italic;"&gt;ending &lt;/span&gt;month.  Then, you can use that date range to filter your data as shown above.  &lt;br /&gt;
&lt;br /&gt;
You should also follow these same guidelines if you'd like to filter data by a specified year, quarter, or week.  "Year-to-date" filtering will also work the exact same way.&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;br /&gt;
Conclusion &lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
At the end of the day, accomplishing our task is very simple:&lt;br /&gt;
&lt;br /&gt;
1.    Accept either @Year and @Month parameters or a single @DateInMonth parameter to indicate which month should be returned&lt;br /&gt;
2.    Calculate the @FirstDayOfMonth and the @FirstDayOfNextMonth using those parameters&lt;br /&gt;
3.    Select your rows with a WHERE clause that returns transaction dates equal to or greater than @FirstDayOfMonth, and less than @FirstDayOfNextMonth, to ensure that all dates at the end of your range are captured and that indexes can be used.&lt;br /&gt;
&lt;br /&gt;
Good luck, and remember: Declare your parameters to be clear and precise, and write your WHERE clauses to be as efficient and accurate as possible.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;see also:&lt;br /&gt;
&lt;/span&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/10/15/time-spans-durations-sql-server.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl08_HyperLink1"&gt; Working with Time Spans and Durations in SQL Server&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/09/10/group-by-month-sql.aspx" title="View Entry"&gt;Group by Month (and other time periods)&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/08/29/SQL-Dates-and-Times.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl08_HyperLink1"&gt;Working with Date and/or Time values in SQL Server: Don't Format, Don't Convert -- just use DATETIME&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/07/03/60248.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl10_HyperLink1"&gt;Data Types -- The Easiest Part of Database Design&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/04/13/format-date-sql-server.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl06_HyperLink1"&gt;How to format a Date or DateTime in SQL Server&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2004/12/02/2954.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl06_HyperLink1"&gt;Breaking apart the DateTime datatype -- Separating Dates from Times in your Tables&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/10/31/sql-server-2005-date-time-only-data-types.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl04_HyperLink1"&gt;Date Only and Time Only data types in SQL Server 2005 (without the CLR)&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/01/02/56079.aspx" title="View Entry"&gt;Essential SQL Server Date, Time and DateTime Functions&lt;/a&gt; 				 				&lt;/li&gt;
&lt;/ul&gt;
&lt;/span&gt;&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60326.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/09/14/sql-filter-by-month.aspx</guid>
            <pubDate>Fri, 14 Sep 2007 16:21:07 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60326.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/09/14/sql-filter-by-month.aspx#feedback</comments>
            <slash:comments>5</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60326.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60326.aspx</trackback:ping>
        </item>
        <item>
            <title>Working with Date and/or Time values in SQL Server: Don't Format, Don't Convert -- just use DATETIME</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/08/29/SQL-Dates-and-Times.aspx</link>
            <description>&lt;span style="font-weight: bold;"&gt;The Importance of Data Types&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Imagine that SQL Server only provided two data types:  the MONEY data type to store numeric values, and VARCHAR to store text. &lt;br /&gt;
&lt;br /&gt;
If you are designing a database in this scenario and you need to store or return integer values, which data type -- MONEY or VARCHAR -- would you use?&lt;br /&gt;
&lt;br /&gt;
Suppose I were to argue that MONEY is too complicated, and a waste of space, since all MONEY values always have 4 digits after the decimal which we don't need and it might get confusing.  Therefore, we should use VARCHAR, which is much simpler.  After all, if we use MONEY, every time we execute a SELECT, our integer data will be returned with 4 decimal places like "45.0000", which is certainly not what we want and doesn’t &lt;span style="font-style: italic;"&gt;look &lt;/span&gt;like an integer; but if we use VARCHAR we will get "45" which looks much better and is really what we want to see.  While eventually we will notice that we cannot add these VARCHAR numbers together, or sort them, or compare them, or do any kind of math with them, we can always temporarily convert those values to MONEY if we need to do these things, right?   And as for ensuring that we only store valid numbers in our VARCHAR columns, we can handle that with some complicated CHECK expressions that parse the string to ensure that it stores only valid digits.  We also need to make sure that everyone is aware of the rules for the format of integer data in our VARCHAR columns, to avoid problems as well: after all, we may wish to store values like 34934 as "34,934" so that it looks just right when the values are returned and is easier to work with.  &lt;br /&gt;
&lt;br /&gt;
Now, if I make this argument to you, what would you say?  Would you think it's a good idea?&lt;br /&gt;
&lt;br /&gt;
I think we can all agree that a much better approach would be to use MONEY. Now we can add, sort, compare, do math, and all kinds of things on our values which are stored as numeric data, not text.  We can add a quick, simple CHECK constraint to ensure that our MONEY values have nothing stored after the decimal (e.g., &lt;span style="font-family: Courier New;"&gt;CHECK (round(value,0) = value)&lt;/span&gt;), and we simply let our front end format those numbers without any decimal places.  In this scenario, it's probably fair to say that no reasonable programmer would argue that VARCHARs are better for working with integers than a true, numeric data type like MONEY. &lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Manipulate and Return Data -- not Strings&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
If we agree to use the MONEY data type, wouldn't it make sense to keep those values as MONEY throughout our T-SQL code, and only return MONEY values to the clients?  We wouldn't want to try to convert our MONEY values to VARCHAR in our database code just so that things "look good" and we "don't see" that pesky little ".0000" after each value, right?  It means nothing to us, it doesn’t affect our data, so we simply store and work with and return MONEY in &lt;span style="font-style: italic;"&gt;all &lt;/span&gt;of our database code and we don't worry about formatting or how things "look" or the extra decimal places that we aren't using.  This keeps our code short, clear, and easy to read, since we are not converting things back and forth between VARCHAR and MONEY over and over to make things "look good" in one spot but to make them “work correctly” in another.&lt;br /&gt;
&lt;br /&gt;
Along those same lines, suppose we have a MONEY value like this:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt; 1846.4069&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
... and we need to return that value as an integer.  What should we return, a VARCHAR value or a MONEY value?  The VARCHAR value of "1846" will &lt;span style="font-style: italic;"&gt;look &lt;/span&gt;nice, but again: we cannot sort, compare or do anything with it -- it is just a picture of data, not actual data.  If we return a MONEY value of 1846.0000, we can and should ignore the decimal portion since it does not affect anything and enjoy the benefit of returning an actual numeric value that both the database and the client can sort, compare, calculate, and so on.&lt;br /&gt;
&lt;br /&gt;
If you don't agree with any of that, stop reading now and let me know in the comments.  Otherwise, let's continue on ...&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Another Obvious Example&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
What about if we want to store only values that are decimals, such as .0234 or .963?  These values will never have any digits before the decimal, but will always have one to four digits after the decimal (we only need 4 digits of precision).  That is, we are storing values from -.9999 to .9999 only.   With only MONEY and VARCHAR available, what data type should we use to store and work with these values?&lt;br /&gt;
&lt;br /&gt;
Once again, suppose I argue that VARCHAR is the way to go.  After all, if we use MONEY, every value will be preceded by "0.", which is not what we want and again a waste of space.  We don't need the integer portion, and having it returned and seeing it everywhere will be confusing.  If we use VARCHAR it will always look perfect, and we again can use CHECK constraints that parse the string to ensure we have valid data.  Things still don’t sort or compare or calculate, but we can CONVERT back and forth as needed. &lt;br /&gt;
&lt;br /&gt;
Yet once again, I would hope that you will agree that the suggestion to use VARCHAR is not a good one.  We should use a numeric data type, and therefore MONEY is the best choice.  It handles our data perfectly with the accuracy we need, we can do math, sort, and compare.  We know that it will be valid and accurate, and a simple CHECK constraint ensures that we only have values between -.9999 and .9999 as opposed to using string parsing to determine this.  And, again, even though in the T-SQL world all of our money values have an integer portion of “0” attached to them, it should make sense that we should not worry about this since our data is accurate and correct, it doesn't affect calculations, and the client can output these decimals without the preceding “0” very easily.  &lt;br /&gt;
&lt;br /&gt;
The same applies when doing math on a value like 139.592 to return only the decimal portion; for all of the reasons stated, we should all agree that the value to return is a MONEY value of 0.5920 and &lt;span style="font-style: italic;"&gt;not&lt;/span&gt; a VARCHAR value of ".5920".&lt;br /&gt;
&lt;br /&gt;
Finally, if we always store and return MONEY for&lt;span style="font-style: italic;"&gt; both&lt;/span&gt; our integer values and our decimal values, we have another huge benefit:  we can do math on both of those values &lt;span style="font-style: italic;"&gt;together &lt;/span&gt;and get accurate, consistent results.  No conversions, no string manipulation or parsing or concatenation, no worrying about how things are formatted or stored or how they look; we can do simple, efficient and accurate math on all of our data with standard mathematical operations and it just works perfectly and intuitively.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Getting to the Point (Finally!)&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Are we all agreed on the above?  Good.  So, then, what’s my point?&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-style: italic;"&gt;Shouldn't we follow the exact same logic and reasoning when &lt;/span&gt;&lt;span style="font-style: italic;"&gt;storing &lt;/span&gt;&lt;span style="font-style: italic;"&gt;&lt;span style="font-weight: bold;"&gt;date &lt;/span&gt;and &lt;span style="font-weight: bold;"&gt;time &lt;/span&gt;data?&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;
&lt;/span&gt;The exact same scenario happens every day, only programmers are deciding how to store or return dates without times, or times without dates.  Yet, very often we see beginners choosing to store their values as VARCHARs and/or return them and manipulate them as VARCHARs, instead of using the easiest, most obvious and appropriate data type: DATETIME.  &lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;
Working with Date-Only Data&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
If you need to store just a date, use a DATETIME.  Don’t worry about the “extra” time that is stored, just enforce that it is always midnight and it won’t affect anything – just like forcing a decimal to be “.000” and ignoring it!  It is the &lt;span style="font-style: italic;"&gt;exact &lt;/span&gt;same logic.  And, by that same token, &lt;span style="font-style: italic;"&gt;don’t worry about that time being there and feel you need to convert everything to VARCHAR to make it “look” right. &lt;/span&gt; Just use data in the proper type throughout your SQL code, ignore the time portion, and return the DATETIME value – in the correct type – to your clients.  Don’t try to “help” them by converting those nice, accurate, and clean DATETIME values to a string to “hide” the time part – just return the value, just as you would not try to “hide” a decimal portion of “.000” in a numeric by converting it to a string.&lt;br /&gt;
&lt;br /&gt;
Luckily, many programmers do indeed use DateTime when all they want to store is dates, but of course many do obsess about “hiding” that time portion.  Don’t do it.  Leave your data in the proper type, enforce a time of only midnight with a simple check constraint (e.g., &lt;span style="font-family: Courier New;"&gt;CHECK (dateAdd(dd,datediff(dd,0,yourdate),0) = yourdate)&lt;/span&gt;) and now you can compare, sort, do math, and everything else you ever need on those values, and your client can do the same.  Just as you would use a MONEY data type to store numeric data, if you had to, as opposed to a VARCHAR.  It’s easier, shorter, and more efficient – it just makes sense. &lt;br /&gt;
&lt;br /&gt;
The same concept applies if you have a DATETIME value like "5/6/2007 5:00 PM" and you'd like to return just the date portion:  Should you return a VARCHAR or a DATETIME?  For all of the reasons stated, it should be clear that you simply return a DATETIME value at midnight -- "5/6/2007 12:00AM" -- knowing that the time portion is now "zero" and even though it is there and we see it, it does not affect any calculations and can easily be excluded when the client presents the data.&lt;br /&gt;
&lt;br /&gt;
To "round" a DATETIME so that the time at midnight is returned:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select dateadd(dd,0, datediff(dd,0, &lt;span style="font-style: italic;"&gt;datetimeval&lt;/span&gt;)) as date_at_midnight&lt;/span&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Working with Time-Only Data&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Now, what about storing only a Time?  Here’s where people &lt;span style="font-style: italic;"&gt;really &lt;/span&gt;get confused, but, again, &lt;span style="font-style: italic;"&gt;it is exactly the same as storing a value between -.9999 and .9999 in either MONEY or a VARCHAR&lt;/span&gt;.   We agreed to use MONEY, and simply accept the fact that our integer portion of “0” is &lt;span style="font-style: italic;"&gt;always &lt;/span&gt;there, but it doesn’t affect anything; we can simply ignore it and let our client worry about not displaying it.  Remember, when storing a true decimal value, we had no choice – that zero was always there, whether we displayed it or not, right?  Well, DATETIME values have the &lt;span style="font-style: italic;"&gt;exact &lt;/span&gt;same concept!  The decimal portion of a DATETIME is the time component, and even though we just want to store time values, a DATETIME works perfectly for us.  We simply ensure that we store a 0 value as the date portion and we &lt;span style="font-style: italic;"&gt;ignore it&lt;/span&gt;!  &lt;br /&gt;
&lt;br /&gt;
For DATETIME data, a date of “0” that doesn’t affect any calculations is “1/1/1900”, just as a time of “0” that doesn’t affect any calculations is “12:00:00 AM”.  Thus, “1/1/1900 12:00:00AM” is equivalent to a decimal value of 0.0 -- adding that value to, or subtracting it from, other values will not have any effect. &lt;br /&gt;
&lt;br /&gt;
In fact, take a look at this:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select cast(0.0 as datetime) as ZeroDateTime&lt;br /&gt;
&lt;br /&gt;
ZeroDateTime&lt;br /&gt;
-----------------------&lt;br /&gt;
1900-01-01 00:00:00.000&lt;br /&gt;
&lt;br /&gt;
(1 row(s) affected)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
select getdate() as Now, getdate() + '1/1/1900 12:00:00 AM' as NowPlusZeroDateTime&lt;br /&gt;
                        &lt;br /&gt;
Now                     NowPlusZeroDateTime&lt;br /&gt;
----------------------- -----------------------&lt;br /&gt;
2007-08-30 20:55:48.513 2007-08-30 20:55:48.513&lt;br /&gt;
&lt;br /&gt;
(1 row(s) affected)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;/span&gt;&lt;/div&gt;
Notice that the numeric value of 0.0 is converted nicely to '1/1/1900 12:00:00 AM', and that adding '1/1/1900 12:00:00 AM' to a DATETIME value has no effect.&lt;br /&gt;
&lt;br /&gt;
So, if we want to store &lt;span style="font-style: italic;"&gt;only &lt;/span&gt;a time, without a date, we just:&lt;br /&gt;
&lt;ul&gt;
    &lt;li&gt;Use a DATETIME data type&lt;/li&gt;
    &lt;li&gt;Ensure that the date value of that time is 0 (that is, "1/1/1900") with a check constraint (e.g., &lt;span style="font-family: Courier New;"&gt;CHECK (datediff(dd,0,TransTime) = 0)&lt;/span&gt;)&lt;/li&gt;
    &lt;li&gt;Ignore the date in our SQL code even though we might “see it” here and there&lt;/li&gt;
    &lt;li&gt;Keep our data in the correct data type throughout our code without worrying how it “looks” and trying to “hide” the date part&lt;/li&gt;
    &lt;li&gt;Return that DATETIME value to our client and let it worry about hiding the “1/1/1900”, just as we let the client worry about hiding the preceding “0.” in a decimal value such as 0.2394.&lt;/li&gt;
&lt;/ul&gt;
Once again, let's consider what to do if we have a DATETIME value such as "6/1/2007 5:00 PM" and we'd like to just return the time portion: Do we return a VARCHAR or a DATETIME?  By now, the answer should be clear:  we return data in the correct type, DATETIME, and we simply use the "0" date of 1/1/1900 by returning "1/1/1900 5:00 PM" knowing that the date portion will not affect the time value just as an integer portion of 0 will not affect a decimal value.&lt;br /&gt;
&lt;br /&gt;
To return just the time portion of a DATETIME (i.e., at the base date of 1/1/1900):&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select &lt;span style="font-style: italic;"&gt;datetimeval &lt;/span&gt;- dateadd(dd,0, datediff(dd,0, &lt;span style="font-style: italic;"&gt;datetimeval&lt;/span&gt;)) as time_at_base_date&lt;/span&gt;&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
Finally, just as in our previous MONEY example,  if we exclusively work with dates and times using the correct DATETIME data type throughout our schema and our SQL code, we can simply add values together to combine a date and a time using simple math:&lt;br /&gt;
&lt;br /&gt;
  &lt;span style="font-family: Courier New;"&gt;    select &lt;span style="font-style: italic;"&gt;somedate &lt;/span&gt;+ &lt;span style="font-style: italic;"&gt;sometime &lt;/span&gt;as &lt;span style="font-style: italic;"&gt;somedatetime&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
No converting, no string manipulations or parsing, no worrying about AM or PM or time formats or anything.  If we keep our data in the correct types, and constrain it properly, everything works as it should – quickly, accurately, and intuitively.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;"&gt;Conclusion&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
The next time you are working with dates and times, please remember: how would you handle things if you were working with integers and decimals?  The same logic and reasoning applies.  Be smart, let SQL do the work for you and use the right data types for the job, even if things don't always "look" right.  It's not about how good your data &lt;span style="font-style: italic;"&gt;looks&lt;/span&gt;, it's about how accurate it actually &lt;span style="font-style: italic;"&gt;is&lt;/span&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-style: italic;"&gt;see also:&lt;br /&gt;
&lt;/span&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/10/15/time-spans-durations-sql-server.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl08_HyperLink1"&gt; Working with Time Spans and Durations in SQL Server&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/09/10/group-by-month-sql.aspx" title="View Entry"&gt;Group by Month (and other time periods)&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/08/29/SQL-Dates-and-Times.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl08_HyperLink1"&gt;Working with Date and/or Time values in SQL Server: Don't Format, Don't Convert -- just use DATETIME&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/07/03/60248.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl10_HyperLink1"&gt;Data Types -- The Easiest Part of Database Design&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/04/13/format-date-sql-server.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl06_HyperLink1"&gt;How to format a Date or DateTime in SQL Server&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2004/12/02/2954.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl06_HyperLink1"&gt;Breaking apart the DateTime datatype -- Separating Dates from Times in your Tables&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/10/31/sql-server-2005-date-time-only-data-types.aspx" title="View Entry" id="ctl00_pageContent_Editor_Results_rprSelectionList_ctl04_HyperLink1"&gt;Date Only and Time Only data types in SQL Server 2005 (without the CLR)&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href="../../../../jeffs/archive/2007/01/02/56079.aspx" title="View Entry"&gt;Essential SQL Server Date, Time and DateTime Functions&lt;/a&gt; 				 				&lt;/li&gt;
&lt;/ul&gt;&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60309.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/08/29/SQL-Dates-and-Times.aspx</guid>
            <pubDate>Wed, 29 Aug 2007 14:04:02 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60309.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/08/29/SQL-Dates-and-Times.aspx#feedback</comments>
            <slash:comments>18</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60309.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60309.aspx</trackback:ping>
        </item>
        <item>
            <title>Using GROUP BY to avoid self-joins</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/06/12/60230.aspx</link>
            <description>Sometimes, it appears that a necessary solution to common SQL problems is to join a table to itself.   While self-joins do indeed have their place, and can be very powerful and useful, often times there is a much easier and more efficient way to get the results you need when querying a single table.
&lt;br&gt;&lt;br&gt;&lt;a href="http://weblogs.sqlteam.com/jeffs/archive/2007/06/12/60230.aspx"&gt;read more...&lt;/a&gt;&lt;img src="http://weblogs.sqlteam.com/jeffs/aggbug/60230.aspx" width="1" height="1" /&gt;</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/06/12/60230.aspx</guid>
            <pubDate>Tue, 12 Jun 2007 15:35:33 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60230.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/06/12/60230.aspx#feedback</comments>
            <slash:comments>3</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60230.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60230.aspx</trackback:ping>
        </item>
        <item>
            <title>More on Runs and Streaks in SQL</title>
            <link>http://weblogs.sqlteam.com/jeffs/archive/2007/05/30/60218.aspx</link>
            <description>That's right boys and girls, it's what you've been waiting for all weekend:  Another edition of the &lt;a href="http://weblogs.sqlteam.com/Jeffs/Contact.aspx"&gt;mailbag&lt;/a&gt;!&lt;br /&gt;
&lt;br /&gt;
Damian writes:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-style: italic;"&gt;Hi&lt;br /&gt;
I have a tricky SQL question that I have been trawling the net and &lt;br /&gt;
workmates to find an answer. We are accessing a real time proprietary &lt;br /&gt;
database that approximates classic SQL in syntax. The GUI we are using does &lt;br /&gt;
not allow anything to be done effectively on the front end. It all has &lt;br /&gt;
to happen in the query.&lt;br /&gt;
Trades flow in and we would like to sum them on the fly with only a &lt;br /&gt;
query. We need to sum based on the change in price. &lt;br /&gt;
eg&lt;br /&gt;
for this series of trades (ID is ordered but non-contiguous)&lt;br /&gt;
Trade ID   Price   Volume&lt;br /&gt;
1              4       10&lt;br /&gt;
3              4       20&lt;br /&gt;
6              5       15&lt;br /&gt;
7              4       20&lt;br /&gt;
should produce&lt;br /&gt;
&lt;br /&gt;
price  volume&lt;br /&gt;
4      30&lt;br /&gt;
5      15&lt;br /&gt;
4      20&lt;br /&gt;
&lt;br /&gt;
so it sums the volume on a change in price. but note if the same price &lt;br /&gt;
appears again this is not included in the total for the first instance &lt;br /&gt;
of that price but in the total for a new instance of that price.&lt;br /&gt;
&lt;br /&gt;
It is so simple but the concensus is it is impossible in pure SQL. At &lt;br /&gt;
first I did not think it was but now I tend to agree. we cannot change &lt;br /&gt;
the tables in the database as it is all locked away and front end &lt;br /&gt;
processing is unworkable.&lt;br /&gt;
would you agree?&lt;br /&gt;
thanks&lt;br /&gt;
Damian&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
It is a little tricky, and it is not terribly efficient, but it can be done in pure set-based SQL if I am understanding your requirements correctly.&lt;br /&gt;
&lt;br /&gt;
All you need is the technique shown &lt;a href="http://www.sqlteam.com/item.asp?ItemID=12654"&gt;here&lt;/a&gt;.  &lt;br /&gt;
&lt;br /&gt;
(As far as I know, until I hear otherwise, I'll take credit for coming up with this method! )&lt;br /&gt;
&lt;br /&gt;
Here is the code:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;-- set up sample table:&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;create table Trades (TradeID int primary key, Price int, Volume int)&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;insert into Trades&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select 1,4,10 union all&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select 3,4,20 union all&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select 6,5,15 union all&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select 7,4,20&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;-- your solution:&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;select &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  min(tradeID) as StartID, max(TradeID) as EndID, Price, sum(Volume) as TotalVolume&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;(&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  select t1.TradeID, t1.Price, t1.Volume,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    (select count(*) from Trades t2 &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     where t2.TradeID &amp;lt; t1.TradeID and t2.Price &amp;lt;&amp;gt; t1.Price) as RunGroup&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  from Trades t1&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;) x&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  Price, RunGroup&lt;br /&gt;
order by &lt;br /&gt;
  min(tradeID)&lt;br /&gt;
&lt;br /&gt;
-- results:&lt;br /&gt;
&lt;br /&gt;
StartID     EndID       Price       TotalVolume &lt;br /&gt;
----------- ----------- ----------- ----------- &lt;br /&gt;
1           3           4           30&lt;br /&gt;
&lt;/span&gt;&lt;span style="font-family: Courier New;"&gt;6           6           5           15&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: Courier New;"&gt;7           7           4           20&lt;br /&gt;
&lt;br /&gt;
(3 row(s) affected)&lt;br /&gt;
&lt;/span&gt; &lt;/div&gt;
&lt;br /&gt;
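To see what the derived table contributes, it can help to run the inner query by itself. The following is only a sketch against the sample table above; the comparison is written with the operands flipped and with != in place of the original operators, which should be equivalent:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style="margin-left: 40px; font-family: Courier New;"&gt;
```sql
-- Sketch: the inner query on its own, run against the sample Trades table.
-- RunGroup counts the earlier trades that have a different price, so it
-- stays constant for as long as consecutive trades share the same price.
select t1.TradeID, t1.Price, t1.Volume,
  (select count(*) from Trades t2
   where t1.TradeID > t2.TradeID and t2.Price != t1.Price) as RunGroup
from Trades t1
order by t1.TradeID

-- TradeID  Price  Volume  RunGroup
-- 1        4      10      0
-- 3        4      20      0
-- 6        5      15      2
-- 7        4      20      1
```
&lt;/pre&gt;
Trades 1 and 3 share the same Price and RunGroup, so the outer GROUP BY collapses them into one row. Trade 7 has the same price but a different RunGroup, so it starts a new group.&lt;br /&gt;
&lt;br /&gt;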
As I mentioned, this works, but it is not very efficient; depending on how much data you have, it might run very slowly.  The culprit is the correlated subquery that calculates the "RunGroup": logically, it runs once for every row in Trades and has to count all of the earlier trades each time.  You can also calculate the RunGroup using either of these techniques, which may be more efficient for your data:&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px; font-family: Courier New;"&gt;-- option 2&lt;br /&gt;
&lt;br /&gt;
select *, (select min(tradeID) from  Trades t2 &lt;br /&gt;
           where t2.TradeID &amp;gt; t1.TradeID and t2.Price != t1.Price) as RunGroup&lt;br /&gt;
from Trades t1&lt;br /&gt;
&lt;br /&gt;
-- option 3: RunGroup is the TradeID of the previous trade at a different price&lt;br /&gt;
&lt;br /&gt;
select *, (select max(TradeID) from Trades t2 &lt;br /&gt;
           where t2.TradeID &amp;lt; t1.TradeID and t2.Price &amp;lt;&amp;gt; t1.Price) as RunGroup&lt;br /&gt;
from Trades t1&lt;br /&gt;
&lt;/div&gt;
&lt;br /&gt;
For this small sample table, the execution plan is the same for all three, so test them against your own data to see which performs best.&lt;br /&gt;
&lt;br /&gt;
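If you are on SQL Server 2005 or later, another option worth testing is to drop the correlated subquery and derive the RunGroup from two ROW_NUMBER() calls. This is a different technique from the three above, sketched here against the same sample table; whether it is actually faster depends on your data and indexes:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style="margin-left: 40px; font-family: Courier New;"&gt;
```sql
-- Sketch (assumes SQL Server 2005 or later for ROW_NUMBER).
-- Within a run of equal prices, the overall row number and the per-price
-- row number both go up by 1, so their difference is constant for the run
-- and changes whenever a trade at another price comes in between.
select
  min(TradeID) as StartID, max(TradeID) as EndID, Price, sum(Volume) as TotalVolume
from
(
  select TradeID, Price, Volume,
    row_number() over (order by TradeID)
      - row_number() over (partition by Price order by TradeID) as RunGroup
  from Trades
) x
group by
  Price, RunGroup
order by
  min(TradeID)
```
&lt;/pre&gt;
For the sample data, the difference works out to 0, 0, 2 and 1 for trades 1, 3, 6 and 7, so the query returns the same three rows as the original solution.&lt;br /&gt;
&lt;br /&gt;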
Also, note that I am calculating the "RunGroups" across all of the data in your table.  If the data can be partitioned -- for example, by Customer or Client -- the subquery only has to look at rows in the same partition, which should make it much more efficient.  To add a partition on, say, Client, you would alter the code like this (changes in bold):&lt;br /&gt;
&lt;br /&gt;
&lt;div style="margin-left: 40px;"&gt;&lt;span style="font-family: Courier New;"&gt;select &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  &lt;span style="font-weight: bold;"&gt;Client, &lt;/span&gt;min(tradeID) as StartID, max(TradeID) as EndID, Price, sum(Volume) as TotalVolume&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;from&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;(&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  select &lt;span style="font-weight: bold;"&gt;t1.Client,&lt;/span&gt; t1.TradeID, t1.Price, t1.Volume,&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;    (select count(*) from Trades t2 &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;     where t2.TradeID &amp;lt; t1.TradeID and t2.Price &amp;lt;&amp;gt; t1.Price &lt;br /&gt;
           and &lt;span style="font-weight: bold;"&gt;t1.Client = t2.Client&lt;/span&gt;) as RunGroup&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  from Trades t1&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;) x&lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;group by &lt;/span&gt;&lt;br style="font-family: Courier New;" /&gt;
&lt;span style="font-family: Courier New;"&gt;  &lt;span style="font-weight: bold;"&gt;Client, &lt;/span&gt;Price, RunGroup&lt;br /&gt;
order by&lt;br /&gt;
  &lt;span style="font-weight: bold;"&gt;Client, &lt;/span&gt;min(tradeID)&lt;/span&gt;&lt;/div&gt;
&lt;br /&gt;
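The sample table from earlier has no Client column, so the partitioned query will not run against it as is. One way to try it out is to recreate the table as sketched below. The Client values 'A' and 'B' are made up for illustration, and the "go" batch separators assume a tool such as Management Studio or sqlcmd:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style="margin-left: 40px; font-family: Courier New;"&gt;
```sql
-- Sketch: recreate the sample table with a Client column.
-- The Client values 'A' and 'B' are hypothetical test data.
drop table Trades
go

create table Trades (TradeID int primary key, Client varchar(10), Price int, Volume int)
go

insert into Trades (TradeID, Client, Price, Volume)
select 1,'A',4,10 union all
select 3,'B',4,20 union all
select 6,'A',5,15 union all
select 7,'A',4,20
```
&lt;/pre&gt;
With this data, the partitioned query should return four single-trade rows: trade 3 now belongs to a different client, so it no longer extends the run at price 4 that starts with trade 1.&lt;br /&gt;
&lt;br /&gt;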
I hope this helps!  If not, let me know.</description>
            <dc:creator>Jeff Smith</dc:creator>
            <guid>http://weblogs.sqlteam.com/jeffs/archive/2007/05/30/60218.aspx</guid>
            <pubDate>Wed, 30 May 2007 12:43:01 GMT</pubDate>
            <wfw:comment>http://weblogs.sqlteam.com/jeffs/comments/60218.aspx</wfw:comment>
            <comments>http://weblogs.sqlteam.com/jeffs/archive/2007/05/30/60218.aspx#feedback</comments>
            <slash:comments>11</slash:comments>
            <wfw:commentRss>http://weblogs.sqlteam.com/jeffs/comments/commentRss/60218.aspx</wfw:commentRss>
            <trackback:ping>http://weblogs.sqlteam.com/jeffs/services/trackbacks/60218.aspx</trackback:ping>
        </item>
    </channel>
</rss>