Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

A Quick Lesson on SQL Indexes

Just a quick post of some benchmarking code for our good friend Guayo at TheDailyWTF, in regards to this discussion.

The debate is, essentially: Assuming everything is properly indexed, which is faster/more efficient?

A) select * from …
WHERE DATEPART( … ) = @X and DATEPART( …) = @Y

B) select * from …
WHERE SomeDate BETWEEN @A and @B

C) select * from … inner join SomeLookupTable on ….
WHERE SomeLookUptable.X = @X and SomeLookupTable.Y = @Y

see for yourself ….. and enjoy the thread on the other site as well, for those who are curious.

Basically, below, we have:

Days – A lookup table of all possible dates our database needs to consider (1 row per date), along with a indexed Year and WeekNo columns
Test – A table of data covering many days within that date range, from which we will be querying for a specific week number and year.


 

create table Test

  (ID int primary key, TestDate datetime)

 

create index dt_index on Test (TestDate)

 

create table Days

(DateVal datetime primary key, [WeekNo] int, [year] int)

 

create index days_wd on Days ([WeekNo])

create index days_yr on Days ([Year])

 

go

 

declare @i int;

declare @d datetime;

 

set @i=0

set nocount on

while @i < 1000

begin

  insert into Test (ID,TestDate) values (@i, convert(datetime, '1/1/2000') + (@i/4))

  set @d = convert(datetime,'1/1/2000') + @i

  insert into Days (DateVal, [WeekNo], [year]) values (@d, datepart(ww,@d), Year(@d))  

  set @i=@i+1

end

 

set nocount off

 

-- Here's the slowest:

select * from Test

where Datepart(ww,TestDate) = 14 and DatePart(yy,testdate) = 2000

 

-- Here's the fastest

select * from Test where TestDate between '3/26/2000' and '4/1/2000'

 

-- Here's the second fastest (twice as fast as the first):

select Test.* from test

inner join Days on Test.TestDate = Days.dateval

where Days.WeekNo = 14 and Days.Year = 2000

 

go

drop table Days

drop table Test

Legacy Comments


Alex B.
2004-11-12
re: A Quick Lesson on Indexes
Exactly what do you mean by slowest, fastest, etc.? Some precise figures would be more effective. I think this comparison is pretty much moot, unless you can compare the combination of speed and management/upkeep of a solution.

Jeff
2004-11-12
re: A Quick Lesson on Indexes
I mean which SELECT consistently executes in the quickest amount of time.

Not sure what you mean by moot -- do the comparison and look at the difference in the execution plans. It's not exactly a "moot" difference in the performance statics and execution plans between these 3 possibilities. The best outperforms the worst by a factor of over 10.


Guayo
2004-11-12
re: A Quick Lesson on Indexes
I think you misunderstood me.
Yes, the day table could improve performance while queering your test table, but not always… the simpler example would be something like (using your db schema)

select * from test
where id = 5 and datepart(ww,testDate) = 14 and
datePart(yy,testdate) = 2000

the db engine would choose the primary key index no matter what way you try to do the condition on the dates. So simply saying that using datepart it’s a WTF is wrong.
In the snip posted in the daily WTF there is an additional condition besides the dates, if the column in that condition is indexed and it discards more effectively records in the table then you will gain nothing indexing the date column and using the auxiliary table.

Matthew Wills
2004-11-14
re: A Quick Lesson on Indexes
One potential issue is that:

select * from Test
where Datepart(ww,TestDate) = 14 and
DatePart(yy,testdate) = 2000

and

select * from Test
where TestDate between '3/26/2000' and '4/1/2000'

are not actually equivalent.

The former will match on 1st April 2000 4am, while the latter will not.

Jeff S
2004-11-14
re: A Quick Lesson on Indexes
This is obviously easily fixed as:

select * from Test
where TestDate >= '3/26/2000' and testDate < '4/2/2000'

Yes, it was an oversight on my part I suppose. I threw this example together pretty quick .....

It really has nothing to do with overall performance or index use, of course. However, thanks for pointing that out !!



Matthew Wills
2004-11-14
re: A Quick Lesson on Indexes
It also appears to me that:

create table Days2 (DateVal datetime primary key, [WeekNo] int, [year] int)
create index days_wd on Days2 ([Year], [WeekNo])

Would be more efficient (it seems to reduce the number of logical reads)?

Of course it will only be more efficient if Year is specified in the WHERE / JOIN clause (ie not just WeekNo)...

Matthew Wills
2004-11-14
re: A Quick Lesson on Indexes
Jeff,

<quote>
This is obviously easily fixed as:
</quote>

Its a common error I have seen, so I thought I would mention it just in case someone started 'fixing' all their code with BETWEEN.

Jeff S
2004-11-14
re: A Quick Lesson on Indexes
>>create index days_wd on Days2 ([Year], [WeekNo])

yeah, i thought of that as well.

As for the BETWEEN, that's a great point about a common error. At the SQLTeam forum, we must answer that question at least once per day ! of course, it all depends on whether or not you are recording the time in that particular datetime column as well. (I actually recommend separting the date from the time, if you need both, in almost all cases ... stay tuned for a quick blog on the pros/cons and some implementation ideas regarding this)

thanks again, Matt!

Guayo
2004-11-15
re: A Quick Lesson on Indexes
@Jeff S
The point I'm trying to make is that although sometimes there is necessary an auxiliary table to speed up or simplify querying a date range, in this particular example you can do that without the need of such table. So the thing I didn't agree with your post in the daily WTF is that using Datepart function it's a WTF with a large database. As I said sometimes your date range is not the best way of discard records in your query so an index in that column plus the extra table won't give you nothing. But once you got all those extra db objects you will always have to pay the cost they impose.
But if we talk only of the example in the daily WTF what I can say is that there is a way to calculate the starting and ending date of a week of a year just using build in functions, so for this scenario an extra table isn't justified.
For example:

case
when @w = 1 then convert(datetime, '01/01/'+cast(@a as varchar), 103)
else 1+((@w-1)*7) + convert(datetime, '01/01/'+cast(@y as varchar), 103) - datepart(dw,convert(datetime, '01/01/'+cast(@y as varchar), 103))
end

will give you the start date of a given week of a given year. If you wrap that code in a UDF you could reuse that code easily across your queries. The code for get the end date of a given week it's similar. SO why do we need an extra table in this case.
As I said sometimes such auxiliary tables are useful, I don't like them a lot as they are just a hack but hacks are sometimes the easy way of do something, in this case it's not the case IMO.

Jeff S
2004-11-15
re: A Quick Lesson on Indexes
Thanks, Guayo. But you really need to read things more carefully, both here and at TheDailyWTF, if you don't understand that I've already talked about that situationa and already indicated it is the BEST and provided it as the benchmark in my SQL example !! re-read my VERY FIRST POST at the DailyWTF !!

At this point, if you consider any use of Date tables or Tally tables "hacks" (as you stated), then I invite you to spend some time at the SQL Server forums to learn some of the more advanced T-SQL techniques to help improve your skills.

Jeff S
2004-11-15
re: A Quick Lesson on Indexes
by the way -- here's quick way to calculate your range of dates based on a year and a Week Number:

WHERE

someDate >= dateadd(ww, @WeekNo-1, dateadd(yy, @Year-1900, 0 ))

and

SomeDate < dateadd(ww,@WeekNo, dateadd(yy, @Year-1900,0))

Tweak as necessary.

Adam Machanic
2004-11-17
re: A Quick Lesson on Indexes
The only "WTF" here is why anyone would think that using DATEPART would be either faster or better in any other way (readability? code manageability?)... And that the "WTF" guys have the gall to criticize a solution they obviously don't understand (using a calendar table) -- yes, the guy used the calendar table incorrectly, but it's certainly not even close to being a "WTF"...

Jeff S
2004-11-17
re: A Quick Lesson on Indexes
Hey Adam -- if you ever want to start a SQL Server consulting business in the Boston area, let me know ! we have very similiar philosophies and views on SQL, it appears!