Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

Searching a column for all words, any words, or exact phrase in a SQL Table

(This is from the SQLTeam forums. I noticed a reply of mine to a user's problem might be useful to others .. so here it is!)

Here's one way to implement a search, similar to Google's, if you don't want to (or can't) use full text indexing. This allows you to pass a string of words, and indicate that either ALL words must match, ANY must match, or the exact string must match.

This is a stripped-down, highly edited version of code I wrote a while ago here at work. This will perform a search in the table "SomeTable", checking the text in the column "SomeColumn".

Use only if you are not searching lots of rows -- less than a few thousand should be fairly efficient; any more than that and you should use SQL's text indexing features.



create  procedure SearchSample @SearchString varchar(200), @MatchType int = 0

as

 

-- SeachString:  String of 1 or more search terms, all separated by spaces

-- MatchType: 0=match any, 1 =match all, 2 = exact match of entire expression only

 

-- Results are returned in order of relevance

 

declare @i1 int;

declare @i2 int;

 

declare @Word varchar(100);

declare @Words table (Word varchar(100) not null);

declare @WordCount as integer;

 

set nocount on

 

-- Parse the SearchString to extract all words:

 

if (@MatchType != 2)

  begin

  set @SearchString = ' ' + @SearchString  + ' ';

  set @i1 = 1;

  while (@i1 != 0)

    begin

    set @i2=charindex(' ', @SearchString, @i1+1)

    if (@i2 != 0)

      begin

      set @Word = rtrim(ltrim(substring(@SearchString, @i1+1, @i2-@i1)))

      if @Word != '' insert into @Words select @Word

      end

    set @i1 = @i2

    end

  end

else

  insert into @Words select ltrim(rtrim(@SearchString))

 

-- Get the total # of words:

 

set @WordCount = (select count(*) from @Words)

 

-- Return Results in order of relevance:

 

select

  a.MatchPct, T.*

from

  SomeTable T

inner join

  (

  select

    ID, Count(*)  * 1.0 / @WordCount as MatchPct

  from

    SomeTable T

  inner join

    @Words W on ' ' + T.SomeColumn + ' ' like '%[^a-z]' + Word + '[^a-z]%'

  group by

    ID

  ) a on T.ID = a.ID

where

  MatchPct = 1 or @MatchType <>1

order by

  MatchPct

 

Legacy Comments


Barry Dorrans
2004-11-03
re: Search Logic For Multiple Search Terms
Isn't this somewhat open to SQL injection attacks?

Jeff S
2004-11-03
re: Search Logic For Multiple Search Terms
Nope -- no dynamic SQL is used at all. At no point is any text passed into the paramater ever executed or concatenated into a SQL statement.

rahul
2005-09-17
re: Search Logic For Multiple Search Terms
this code thows an error and does not work.
:(

Jeff S
2005-09-17
re: Search Logic For Multiple Search Terms
>>this code thows an error and does not work.

How about a little more info? How are you calling it? What version of SQL Server are you using? Do you realize that you need to replace "SomeTable" and "SomeColumn" with actual objects in your database?

Alexander Munro
2005-09-26
re: Search Logic For Multiple Search Terms
Interesting script this. There is an error in the code though. I have this code working fine on Win 2000 Advanced Server running SQL Server 2000 - 8.00.2039

The script fails on Win 2003 sp1 running SQL Server 2000 - 8.00.2039 with a collation error. The error appears to occur because the parser is unable to interpret the 'a' set in your inner join.

It appears that, for whatever reason, tsql running under Win 2003 takes a strict approach to collation.

It makes sense though. When I fist looked at your code as well I thought huh? How could this work? It would never have occurred to me to pass rows into a set without first declaring the set, at least in quite this way - and I suspect this may be true of tsql under Win 2003 as well.

I’m going to have a play around and see whether I can simplify the script (for the parser) but still produce the logic of the original code.

If you already have a more robust rendition this would be very much appreciated as this little snippet of code is quite useful, particularly when called as a function and used in conjunction with text indexing - helps refine searches on all those erroneous fields not included in the indexing service :)

Jeff S
2005-09-26
re: Search Logic For Multiple Search Terms
Thanks, Alexander .... what collation are you using on that column in your table? Perhaps try rewriting the inner part of the query using a WHERE instead of a JOIN, and/or explcitily expressing the collation to use in that expression ?

Alexander Munro
2005-09-26
re: Search Logic For Multiple Search Terms
OK, that was easy. The error in Win 2003 is simply a collation error and nothing more. The logic within the script works just fine as is. If you are using Win 2003 consider making the following alteration:

Change

declare @Words table (Word varchar(200) not null);

To read

declare @Words table (Word varchar(200) COLLATE database_default not null);

This ensures that the temp table inherits the collation of the user database, which prevents a collation conflict.

Alexander Munro
2005-09-26
re: Search Logic For Multiple Search Terms
On a more technical note, trying to find why the collation conflict occurs at all is weird. The database collation default on both servers is SQL_Latin1_General_CP1_CI_AS. The collation on the table column I was referencing is

[nvKeyWord] [nvarchar] (128) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL

I though there may be a conflict between the datatypes - varchar vs nvarchar but this didnt make any difference, either works fine in this context.

Its beats me. No time left to work it out though.


Hasan
2006-02-24
re: Search Logic For Multiple Search Terms
Thanks for the articles.

As this article shows how to search multiple words in ONE column,
Any idea how to search in multiple columns???

Thanks

Jeff
2006-02-24
re: Search Logic For Multiple Search Terms
@hasan:

just change

' ' + T.SomeColumn + ' '

to

' ' + T.SomeColumn + ' ' + T.SomeOtherColumn + ... + ' '

cm
2006-03-05
re: Searching a Column for Multiple Terms in SQL
any idea on how to do this with columns of type 'text'

George
2006-03-09
re: Searching a Column for Multiple Terms in SQL
Been travelling and making music for 5 years. catching up on my IT stuff now as a freelance website engineer - articles like yours have helped me get up to speed in just 3 months - thanks a lot.

As for text and Nvarchar, etc. I simply used a convert.. probably up for a slapped wrist ;-) If so please let me know - figured it was quicker than resorting to PAT indexing..

Shridhar
2006-03-28
re: Searching a Column for Multiple Terms in SQL
Hi,
I am getting the following error @ line..
set @SearchString = "long_description";

Server: Msg 207, Level 16, State 3, Line 20
Invalid column name 'long_description'.

What value should be set here..

Thanks,
Shridhar..

Stu
2006-04-18
re: Searching a Column for Multiple Terms in SQL
What does this line do exactly:

inner join
@Words W on ' ' + T.SomeColumn + ' ' like '%[^a-z]' + Word + '[^a-z]%'

I realize that @Words is a table of seperated words (for @MatchType = 0 || 1) but, how is this a loop? How does this inner join statement loop through each seperate word. Or am I missing something?
Thanks,


Stu
2006-04-18
re: Searching a Column for Multiple Terms in SQL
I understand now. This inner join is no different then any other. The 'like' was throwing me off a bit. My new question is how @MatchType = 0 and @MatchType = 1 are treated differently?

Jeff
2006-04-18
re: Searching a Column for Multiple Terms in SQL
>>I understand now. This inner join is no different then any other. The 'like' was throwing me off a bit. My new question is how @MatchType = 0 and @MatchType = 1 are treated differently?

Stu --

it's handled in the WHERE clause:

where

MatchPct = 1 or @MatchType <>1

If @MatchType is 1, it doesn't matter what percent of the words match -- the results are returned. But if teh @MatchType is 1, then only results with a match pct of 100% is returned (as indicated in the specs).

It's just boolean logic.

Sanspirit
2007-08-28
re: Searching a column for all words, any words, or exact phrase in a SQL Table
How can i find out if the word is repeated more than once?

thanks

iman
2008-06-30
re: Searching a column for all words, any words, or exact phrase in a SQL Table
this code snippet is great, when i type the search word 'banks' for example , it is not treated as 'bank'
which is not right , am I missing something?

lopa
2008-07-02
re: Searching a column for all words, any words, or exact phrase in a SQL Table
Where does a.MatchPct come from? and/or a.ID?

David
2008-08-06
re: Searching a column for all words, any words, or exact phrase in a SQL Table
How could I return all rows if nothing was passed in or &quot;*&quot;?

Martin M
2008-11-05
re: Searching a column for all words, any words, or exact phrase in a SQL Table
Amazing. Thanks for this.

dave
2008-12-20
re: Searching a column for all words, any words, or exact phrase in a SQL Table
Hi,

This is exactly what i was looking for.

Thank you, very much appreciated.

Dave

dipesh
2009-01-20
re: Searching a column for all words, any words, or exact phrase in a SQL Table
how can we provide weight to perticular word

David
2009-08-05
re: Searching a column for all words, any words, or exact phrase in a SQL Table
Hi,
This is exactly what i was looking for.Its working dude.
Thank you very much.Good work

Arul
2009-08-05
re: Searching a column for all words, any words, or exact phrase in a SQL Table
Hi
The following would be much more simple i think

Declare @SEARCH NVARCHAR(200)
set @SEARCH='asdf'
SELECT * FROM Table_Name WHERE [column1] like '['+@SEARCH+']%' or [column2] like '['+@SEARCH+']%'