Filter by month (plus other time periods)
Introduction
Previously, I wrote about grouping transactions by month. Another common area of difficulty or confusion for SQL beginners is how to efficiently retrieve data just for a single month.
There are two parts to this equation: First, what is the best way to declare parameters that will be used to indicate which month you are looking for? Second, how can we efficiently and easily make use of those parameters to get back the data we need?
Let's take a look at some approaches, both recommended and not.
And, please: if you are not using the DATETIME data type to store dates in your tables, don't even bother reading further – fix your design first!
Month Parameter Options
@YearMonth as char(6)
If you know me at all, you know that I will advise not to use this approach. We’ll need to parse and validate the values passed into ensure that they are valid, and we need to somehow communicate exactly which format to use when setting the parameter values. For example, is it “MM-YY”, or “MMYYYY”, or “MM/YY”, or “YYYYMM”, etc? It simply makes no sense to introduce string formatting conventions and parsing and validating into the equation when you simply don’t need to. Don’t do it.
@Year as int, @Month as int
This approach works fine, since now we have no string parsing or ambiguity to deal with. It is very clear that Year and Month are both simply numeric values. We can do a quick check to ensure that are @Month value is between 1 and 12, and that our @Year value is within our desired range as well.
As we will see later, an optimal solution will require finding the first day of the month requested. How can we do that given a @Year and a @Month value? Simple, you can either use the Date function here and write:
Set @FirstDayOfMonth = dbo.Date(@Year,@Month,1)
… or you can apply the same logic using DateAdd() like this:
Set @FirstDayOfMonth = dateadd(month,((@Year-1900)*12)+@Month-1,0)
Either way, the result is the same: for a given @Year and @Month, the first day of that month (at midnight) is returned. If you aren’t sure how this formula works (or even if it will work at all), just test it out.
@DateInMonth as datetime
Here, we can simply accept any datetime value, and use the year and month of that date to indicate which month should be returned. A user can pass in the first day of the month, the last day, or any day in between – it doesn’t matter. This works well because we know the parameter will be a valid date, and we can quickly and simply get the year/month of the @DateInMonth parameter using the Year() and Month() functions:
Select @Year = Year(@DateInMonth), @Month = Month(@DateInMonth)
We can also get the first day of the month for a given @DateInMonth quite easily, using another DateAdd() trick:
Set @FirstDayOfMonth = DateAdd(month, DateDiff(month,0, @DateInMonth), 0)
So, accepting a dateTime parameter to indicate which month you are after is also a good technique to use.
WHERE Clause Options
Now that we have the @Year, @Month and @FirstDayOfMonth values ready to go, we must decide how to use these values to write our query. Here are some options (again, both good and bad):
where Year(TranDate) =@Year and Month(TranDate) = @Month
This approach will work fine, but it will not be able to use any indexes that might exist on your TranDate column. Thus, it will not be as efficient as possible. Every single TranDate value must be retrieved from your table, the Year() and Month() formulas must be applied, and then the results must be tested to determine if that row should be returned or not. Overall, avoid wrapping indexable columns in functions such as these in your criteria.
where convert(varchar(6), TranDate, 112) = convert(varchar(6), @FirstDayOfMonth)
-- or --
where convert(varchar(6), TranDate, 112) = left(@Year,4)+right('0' + left(@Month,2),2)
-- or --
where convert(varchar(6), TranDate, 112) = @YearMonth
These WHERE clauses all convert the TranDate to a VARCHAR in the format of YYYMMDD, but truncated at 6 characters, resulting in a YYYMM formatted string. Then, using different methods, that string is compared to a YYYYMM representation of the month we'd like to return. This is a bad approach to take because it requires that every TranDate be converted to a string in order to filter the rows, thus no indexes can be used. Using any conversions or string manipulations to filter data when better options exist is never the way to go. At worst, it is also very confusing to work with since it can be unclear exactly what some of these convert() functions and string expressions are doing.
where TranDate between @FirstDayOfMonth and @LastDayOfMonth
What if we calculate, for a given @Year and @Month, the first and last day of the month, store them in variables, and then use that range for our criteria? This will let us use indexes, it will be clear and concise, and it makes good sense.
However, there is a minor flaw in this logic: Suppose @Year is 2006, and @Month is 1. If we set @LastDayOfMonth to ‘2006-01-31 12:00:00 AM’, any transactions after 12:00:00 AM on the last day of the month will not be included! We could calculate the very last 1/300th of a millisecond on the last day of the month to use as our upper limit, but there is an even easier way ...
where TranDate >= @FirstDayOfMonth and TranDate < @FirstDayOfNextMonth
Here, we won’t have any issues with times other than midnight leading to missed transactions.
We already have the @FirstDayOfMonth, all we need is the @FirstDayOfNextMonth. That is easily obtained by using DateAdd() to add one month to the @FirstDayOfMonth, as shown:
Set @FirstDayOfNextMonth = DateAdd(month, @FirstDayOfMonth,1)
With that, we are all set! Indexes can be used and any datetime value within our range will be properly selected. I highly recommend using this approach whenever you need to select dates for a single month; there is no need to force date computations upon all values in your table when a simple range is all you need.
Other Time Periods
The same approach can be applied if you need to allow for a range of months to be specified, not just a single month. Simply calculate the first day of the starting month, and then calculate the first day of the month after the ending month. Then, you can use that date range to filter your data as shown above.
You should also follow these same guidelines if you'd like to filter data by a specified year, quarter, or week. "Year-to-date" filtering will also work the exact same way.
Conclusion
At the end of the day, accomplishing our task is very simple:
1. Accept either @Year and @Month parameters or a single @DateInMonth parameter to indicate which month should be returned
2. Calculate the @FirstDayOfMonth and the @FirstDayOfNextMonth using those parameters
3. Select your rows with a WHERE clause that returns transaction dates equal to or greater than @FirstDayOfMonth, and less than @FirstDayOfNextMonth, to ensure that all dates at the end of your range are captured and that indexes can be used.
Good luck, and remember: Declare your parameters to be clear and precise, and write your WHERE clauses to be as efficient and accurate as possible.
see also:
- Working with Time Spans and Durations in SQL Server
- Group by Month (and other time periods)
- Working with Date and/or Time values in SQL Server: Don't Format, Don't Convert -- just use DATETIME
- Data Types -- The Easiest Part of Database Design
- How to format a Date or DateTime in SQL Server
- Breaking apart the DateTime datatype -- Separating Dates from Times in your Tables
- Date Only and Time Only data types in SQL Server 2005 (without the CLR)
- Essential SQL Server Date, Time and DateTime Functions
Legacy Comments
Lee
2007-09-15 |
re: Filtering transactions by month (plus other time periods) Would tinyint not be more appropriate for @Month and smallint for @Year? |
Jeff
2007-09-17 |
re: Filtering transactions by month (plus other time periods) Lee -- sure, you can use that as well. Most of the T-SQL datetime functions use integers as parameters or return values, so I tend to use them in parameters as well. I suppose it's like C# or other languages -- we know that we should use byte if we only need an integer from 1-255, but 99% of programmers use an integer because that's the default data type to use for whole numbers. |
georgev
2007-09-17 |
re: Filtering transactions by month (plus other time periods) Another great article Jeff, thank you! Before now I have generally approached this problem using WHERE Year(theDate) = @year... etc and I never even considered my poor little indexes! * * * * * |
Oscar
2008-02-03 |
re: Filtering transactions by month (plus other time periods) Hi Jeff - First off thanks for a great article. This is why I love t-sql. In the last group by datetime article I commented about just this. And I wouldnt rate me amongst the "SQL beginners". Of these solutions, only the final would satisfy me too. In my particular case I was joining a table with periods rounded to whole monts on a table with days. And I want to group all this by monts. :) Bear with me... period table: Id, PeriodStart, PeriodEnd eg) 1, 2007-11-01, 2008-02-01 day table: Id, Time, Value select d.id, sum(d.value) from day d join period p on d.id=p.id and d.time >= p.periodstart and d.time < p.periodend group by DateAdd(month, DateDiff(month,0, d.time), 0) I'm not at my workstation, so might be many typos.. What do u think about this approach? |
Jeff
2008-02-07 |
re: Filtering transactions by month (plus other time periods) Oscar -- in the very first sentence of this post, I link to an article about grouping by month. Did you miss it? |