|
|
Monday, June 08, 2009
This is one of those things that you probably know, but have not run into it lately. That was the case for me recently as I was troubleshooting an issue in someone else's code, and spent several hours on it before stumbling across this cause. The issue at hand is that people forget about the true nature of NULL in a database column. Another way of thinking about NULL is to substitute the word UNKNOWN. NULL is not anything...it is unknown. NULL does not equal anything. That is why you have to use the magic phrase IS NULL instead of = NULL in your SQL queries: USE AdventureWorks
SELECT COUNT(*) FROM Person.Address
-- 19,614 total records in the table
SELECT * FROM Person.Address WHERE AddressLine2 = NULL
-- ZERO Records Returned
SELECT * FROM Person.Address WHERE AddressLine2 IS NULL
-- 19,252 Records Returned
And not only is NULL not equal to anything, it is not LIKE anything either. That's pretty easy for people to grasp, once they have bought into the equals issue...
SELECT * FROM Person.Address WHERE AddressLine2 LIKE '%2%'
-- 61 Records returned
Notice that none of the rows returned have NULL in the AddressLine2 column because NULL is not LIKE 2. But what trips up some database developers, and was at the root of my trouble today, is that while NULL is not LIKE anything, it also is not NOT LIKE anything either.
SELECT * FROM Person.Address WHERE AddressLine2 NOT LIKE '%2%'
-- 19,553 Records EXPECTED
-- Only 301 Records actually returned
In the above example, the developer was expecting the inverse of the LIKE clause, that is, he expected the other 19,553 records to be returned, but if you look at the actual results, you see that what you get are only the records that have a value (NOT NULL) and that value is not like the search value. All of the records that have a NULL AddressLine2 are excluded from the result set.
How do you fix it?
There are a few different approaches you could take to fixing it. My first suggestion would be to disallow NULLs in the column in the first place.
-- When removing NULL from an existing table, first UPDATE the table with values.
UPDATE Person.Address SET AddressLine2 = '' WHERE AddressLine2 IS NULL
-- NOW Alter the table to disallow NULL
ALTER TABLE Person.Address
ALTER COLUMN AddressLine2 NVARCHAR(60) NOT NULL
Now don't get me wrong, I'm not one of those guys who forbids NULL in his databases. I'm like the ability to have NULL in many situations. To me there is a certain amount of truthiness to saying, "Hey, I don't know what the value is." instead of just using an empty string or zero (especially zero...zero is a real value...there's a big difference between having zero dollars in your savings account and not knowing how much is in savings). On the other hand, there are times when a column absolutely must have a value. It is NOT acceptable for the bank to not know how much is in my savings account, for example.
So, if you are going to still allow NULL in your database, then you need to interpret or convert NULL on the fly using ISNULL or a CASE statement, or some other related approach.
SELECT * FROM Person.Address WHERE ISNULL(AddressLine2, '') LIKE '%2%'
-- 61 Records returned
SELECT * FROM Person.Address WHERE ISNULL(AddressLine2, '') NOT LIKE '%2%'
-- 19,553 Records returned
This returned the expected results, but be warned, there are serious performance ramifications to using functions (e.g. ISNULL() ) in your WHERE clause, and better options exist. Some recommended reading on that subject are the article Avoid enclosing Indexed Columns in a Function in the WHERE clause by Bill Graziano and the blog post Optimizing Conditional WHERE Clauses: Avoiding ORs and CASE Expressions by Jeff Smith.
P.S. For another interesting challenge with NULL, check out Peter Larson's NULL NOT IN Conundrum.
I recently upgraded to Internet Explorer 8 because I was having an issue with a collection of sites that I visit regularly (typically opening them in a tab group). There is something renegade that frequently caused IE7 to crash if I too-quickly closed a tab and tried to scroll or quickly close another tab...it was annoying, but not enough for me to truly troubleshoot and pinpoint the culprit. This didn't happen all the time, but when it did, usually that group was involved. Well, I thought maybe it was just something screwy with my dev machine, and maybe the IE8 upgrade would fix it. And it did...for a while. You see, IE8 installs with the default action of spanning new processes for new tabs (there is some formula where it's not exactly 1-for-1, but for the first few it looks that way). Well, this apparently handled my issue such that when whichever site or sites it was that were trouble, when I closed them, the process responsible for them might crash, but it wouldn't crash the whole of Internet Explorer (and my remaining tabs). Now, I'm not here to promote Internet Explorer, but it is our company standard, so I use it because I have to develop to it. And I know someone reading this will go off about Google Chrome or whatever other favorite tabbed browser they love and how it is far superior to IE. Fine, I'm not interested. You see, I'm not zealous about my browser, and I don't care too much, as long as it let's me do what I want to do. But I'm getting off-track. I just wanted to say, "hey, I upgraded to IE8 recently". But the point of this post is that when I upgraded to IE8 on my development machine, I ran into a very serious problem. A problem big enough that I started asking myself if I would be able to uninstall it and get back to IE7, even though the tab crashing thing was really annoying. The problem I ran into was that Visual Studio 2005 would no longer stop on any Breakpoints in my ASP.NET application. When you're in the middle of a big development effort, that's a very bad problem to have. Fortunately, with a quick search of the Internet, I found out what the problem was. That crash-proofing feature I espoused just paragraphs ago was causing problems with the debugger. In essence, the Visual Studio 2005 debugger couldn't figure out which process to attach to. See http://weblogs.asp.net/abdullaabdelhaq/archive/2009/06/01/VS-Debug-Problem-with-IE8.aspx for another explanation. And also https://blogs.msdn.com/askie/archive/2009/03/09/opening-a-new-tab-may-launch-a-new-process-with-internet-explorer-8-0.aspx for even more details. So, I did just as it recommended and added the registry key to tell IE to stop spawning separate processes, and PRESTO! the debugger was working again and recognizing my breakpoints. Hooray! (Unfortunately, that tab-crashing behavior came back with it. I'm not going to identify the sites in that group because I don't want to throw a bunch of innocent sites under the bus along with the guilty party. And I'm still not convinced it isn't something screwy about my own setup.) So I'm doing my part to spread the word for developers in case you run into this problem yourself. There are notes in one of those links I provided that suggest this may not be a problem for "newer development environments" whatever that means (Visual Studio 2008? The one after that?) and I'd love to upgrade to VS 2008, but that sort of change comes with its own set of baggage and you don't just jump into something like that with a project like this, without a little planning. I'll get to that later. In the meantime, I'm back in business, and if you have the same problem, now you know how to get unstuck, too.
Tuesday, March 24, 2009
The Professional Association for SQL Server (PASS) is getting a great, early, start on preparations for the 2009 community summit to be held in Seattle November 2nd - 5th. I have worked on the Program Committee for several of the past Summit events (and again this year) and I am impressed and excited by what has been accomplished already. First, the event site is up and running with a lot of information about the event and also some tips to help you emphasize the return on investment to your boss (or yourself). On the site you will also find testimonials from past attendees and links to several blog posts about last year's Summit, a preview of the agenda and tracks, and all the latest news. I am also happy to see the Speaker Resources Page (actually part of the main PASS and PASS Chapters site). This was an area of particular interest to me last year, and I like what they have put together already. There are options to register for a single-day or for the full conference. While a single day is better than nothing, I will tell you that the one year that I did that, I was sorely disappointed that I had not found a way to make it to the whole thing, as it was abundantly clear how much I was missing. Of course there are huge learning opportunities, but also the networking opportunities (and in this economy who couldn't use a little boost to their professional network?) There are vendor exhibits (been thinking about an auditing and compliance solution? Come see some of the competitors almost side-by-side!) Oh, and of course, you get access to all the recorded content online after the Summit is over, so for those sessions that you couldn't get to, you can still see them at no additional cost (available to full-conference attendees only). And I haven't even mentioned all of the free stuff you can get just for being there. It's pretty awesome. Simply put, if you make your living somehow related to SQL Server, you need to be at the PASS Community Summit! It will be here before you know it, so start planning, and saving, NOW.
Saturday, February 21, 2009
I find myself spending more time in .NET code lately. While I was verifying my understanding of one of the other attributes of the @Page directive under the ASP.NET 2.0 (Visual Studio 2005) release, I stumbled across the following note regarding the CodeBehind attribute: This attribute is included for compatibility with previous versions of ASP.NET, to implement the code-behind feature. In ASP.NET version 2.0, you should instead use the CodeFile attribute to specify the name of the source file, along with the Inherits attribute to specify the fully qualified name of the class. So, I thought I would go ahead and clean up this little bit of deprecated code while I was working on this page. It sounded like a good idea at the time, and everything ran fine for a while...but a while later, "all of a sudden" I started getting compiler errors saying that "The name 'myCheckBox' does not exist in the current context". What made this really strange was that I had selected it from the IntelliSense selection list, which is usually very good at only providing names of objects that are valid. And, on top of that, objects in the .aspx file should pretty much always be available and valid. In fact, I was not getting errors on any other objects on the page that were being referenced in the routine. So what happened? Well, what I found was that my checkbox, and several other controls I had added recently, were not referenced in the .designer.cs file. Therefore, when being compiled, the code-behind file didn't really know about the checkbox. It is interesting that IntelliSense was able to recognize it, but the compiler did not. The real culprit took some effort to find, because quite a bit of real-world time had elapsed between the change which caused the problem, and the change that discovered the problem. It turns out that simply switching "CodeFile" in the Page directive back to "CodeBehind" and saving it, was enough to trigger the automatic update to the .designer.cs file and it caught up on all objects that were missing. And suddenly everything compiled fine again. I don't know why using "CodeFile" caused the automatic updating to suddenly stop, but this appears to be related to the fact that we are using a Web Application Project and not a Web Site. While writing this blog post, I ran across this post by Steve Smith which appears to confirm that. So, among the explicit lesson learned above, it also is a good reminder that there is some wisdom in the old saying, "if it ain't broke, don't fix it". I'm generally in favor of refactoring code, but this was one case where I just threw it in along with a bunch of other changes I was making, and it bit me. Thankfully the problem came up quickly enough that I remembered having made that change and unwound the issue. If it had been the last thing I did for a long time on that page, I may never have made the connection.
A few months ago, I announced that I was going to do a test-run with Twitter. So now I thought I'd share a few thoughts. Probably the most telling bit of evidence that shows how my test of Twitter went, is that I haven't posted anything to my Twitter account since December 5th, a whopping 3 weeks after I started, and two and a half months ago. It was at that time also that I stopped reading tweets with any degree of regularity. It's not that I think Twitter is evil or utterly banal (although some of the users may be) it's just that I couldn't seem to make it fit into my life very well. Maybe I'm just getting old, I am now in the "over 40" crowd, but I had a helluva time keeping up on reading posts, and I was only following about 25 people. I don't know how anyone follows hundreds of people and actually gets anything else done, like holding down a full-time job and a family. Granted, I mostly was using my cell phone, and I know there are a lot of little apps out there that I could use to do my twittering on my PC, but frankly, the last thing I want while I'm in the middle of work, focusing on a programming or management task is to keep getting distracted by tweets popping up on my PC screen. Heck, it's bad enough all the distractions that email from coworkers causes. It was fun to tweet from the PASS Community Summit and read some others' tweets while I was there, but looking back, I can see that a lot of the tweets were duplicates of others, especially during the Keynote addresses each morning. And since I was sitting there watching the keynote (as well as adding to the redundant rebroadcast) it wasn't very valuable. Now, maybe if I was a professional journalist, I would find Twitter to be a valuable source of leads or commentary. Or maybe if I was trying to market my company, it would be a good way to get the word out and let people know what we were doing. But for where I am in life right now, not so much. The one Twitter account that got my attention recently, was the SQLTeam one where Graz has posted several tips, and maybe I'll get hooked again some day with a differently targeted group to follow, and, well, I'll keep my Twitter account in case I want to get back into it, but for now I don't plan to update it very often.
Tuesday, January 27, 2009
Since we're all super-geniuses, I'm sure you don't really need this, but you might "know a guy" who does, and you can pass this along... I stumbled upon some really good, short videos on the Marketplace Whiteboard that explain in layman's terms what the heck has been going on in the U.S. financial debacle. Start at the bottom and work your way up. At the very least, watch the first one, Uncorking CDOs. New ones are put out periodically.
Monday, November 24, 2008
As I was reviewing my notes from the PASS Community Summit, a couple more things from Greg Low's talk on Avoiding Recompiles stood out as worth repeating here. First, was his comment that using single-part names when referencing your objects causes a performance hit because they cannot be resolved until execution time. SQL Server does not know, when you refer to the table MyTable which one you mean. You may know that there is only one, for example, if all your objects are owned by DBO (in SQL 2000 and earlier) or are in the DBO schema (SQL 2005 and later). But SQL Server has to check, and the first check is to look for YourName.MyTable, and then, if it doesn't find it, SQL Server will fall back to checking for dbo.MyTable. This is similar to how SQL Server always looks in the Master database first for stored procedures when the name starts with "sp_". So to get better performance, you should always use at least 2-part naming of your objects. (There might be an argument for 3-part naming for performance, but I have not heard it yet.) I have to admit to having gotten sloppy in my code because all of our objects are owned by DBO. Thankfully, I have been using SQLPrompt lately, and there is an option to have it always qualify your object names, so that will cover me if I slip up. Second, and more intriguing in my opinion, is that when using ADO (for example in your web application) you should be sure to define the actual size of your parameters when calling SQL Server. I do this as a matter of style and habit, but it is not required. But if you do not, then, apparently, ADO sets the size information based on the actual value that you are passing. And the next time your program calls SQL, passing different values (in particular Strings/VarChars), the size part of the parameter definition will be different, and so SQL Server will not be able to match it against the cached query plan from the previous call, and it will have to generate the plan all over again. If I understood what Greg was saying, this is inherent with using LINQ, too. It just goes to show you that SQL Server is very forgiving in terms of what it allows you to get away with, and still get accurate results; but if you are writing for very high performance, just because you can do something, it does not mean that you should do it.
Sunday, November 23, 2008
Thursday and Friday of the PASS Community Summit blurred together a bit for me. Not because of the parties Thursday night which apparently left many people bleary-eyed, but the cold medicine I was taking sure didn't help things. But primarily because I bounced in and out of a few sessions and had some work stuff mixed in. And on top of that, my 5-month old waking up every couple of hours throughout the night did not make for very restful nights. Oh well, the things I endure for personal development. ;-) Here are a few of the miscellaneous lessons I learned or was reminded of: - When attending a conference, be sure to read the detailed description in the program guide of the sessions you are planning to attend. There were a couple of times where I went to a session based primarily on the Title of the session, only to find out that I had misinterpreted it, and the session was not at all about the topic I was expecting. Unfortunately, by the time I figured it out, it was a toss-up whether I could get into another session that interested me. There were several sessions that were Standing Room Only, so if you weren't there at the beginning of the session, you might be out of luck.
- XML is not necessarily evil. OK, I didn't really think it was evil, but I have been very skeptical of its value in the database. I am not one to abandon my Relational designs and thought patterns on a whim. But Denise gave a great presentation on her experience implementing XML for a large project, and I completely understand and agree with her reasoning for doing so. There were specific conditions in her project which made using XML palatable, and even attractive. In particular, she had collections of element and attribute data that was hierarchical, had to be portable, and the required structure would change (in fact she already had multiple "schemas" she needed to represent for different records). And best of all, in this particular system, the data that was being stored in XML would not be queried in segments. Primarily she would want all of the XML returned based on other data elements in the record. Sounds perfect to me.
- In SQL 2005, the Execution Plan when shown in XML will display information on indexes that SQL Server thinks would have been helpful, but are missing. In SQL 2008, this information is also visible in the graphical execution plan.
- If your Stored Procedure is only going to return one row (e.g. GetRecordByPK) then you should consider using Output Parameters instead. There is some overhead to producing a recordset and sending it down the line; and a handful of Output Parameters may be a lot more efficient.
- Beware of Implicit Data Conversions in your JOINS and WHERE clauses (e.g. comparing a VARCHAR column to an NVARCHAR variable or a DATETIME column to a string literal date). In SQL 2008, the impact of this was greatly lessened, but in SQL 2005 and earlier, the impact is similar to placing a function in the statement which is to say that it can seriously hurt performance.
- Remember in SQL 2005 that you can build covering indexes by "INCLUDE"ing columns in the index that are not actually key parts of the index. For example, you may build and index on LastName, FirstName and INCLUDE MiddleName so that whenever you need to search and display the full name, you don't have to do a Key Lookup to get the middle name, but you also will not have so many splits occurring because your index is keyed on just LastName, FirstName.
- The new DATETIMEOFFSET data type in SQL 2008, while being Time Zone aware, it is NOT intrinsically aware of Daylight Savings Time.
- Microsoft is considering a FOREACH construct in T-SQL for some future release of SQL Server. On the one hand, I like how much simpler it is to understand than the current CURSOR constructs. On the other hand, given the frequency with which they perform poorly, I don't really want to make creating cursors any easier for developers, lest we encourage them to do it more often.
- ENUM is another feature under consideration for some future release of SQL Server (T-SQL). This one I like. I can definitely see a few places where I would like to use an ENUM in some procedure code.
There was so much more to the conference than this, but here are few things to whet your appetite and help you see the benefit of attending next year. Remember that on top of this, there was a huge track on BI stuff, several presentations by Microsoft folks, and easy access to the SQL CAT group. I hope to see you all there next year!
Saturday, November 22, 2008
And you thought that you only got $200 for passing GO, didn't you? Well, this isn't Monopoly, my friends, this is SQL Server. And while attending the PASS Community Summit this year, I saw Joe Webb do this trick. I wasn't the only one in the crowd who went, "Wha???" when they saw it. Most of you readers already know that GO is not a SQL command, but rather it is the commonly used word to separate batches. In other words, the client tool (Query Analyzer, SQL Management Studio, OSQL.exe, etc.) parse whatever commands you execute, separating them into batches whenever GO is found. (Actually, GO is not a magic word by itself, you can set the "batch separator word" to be whatever you want in the Options panel of Query Analyzer and Management Studio.) Well, it turns out that in SQL 2005, a new option was implemented whereby you can put a number after the GO command and that will cause the batch to be executed that many times. This does not work in SQL 2000 and earlier. Here's an example if you want to test it out for yourself. -- Create a test table to use in our test.
CREATE TABLE dbo.TestGo ( RecID INT NOT NULL IDENTITY(1,1), SomeText VARCHAR(50) NULL )
GO
-- Now a simple test to prove that this next batch is being executed multiple times.
INSERT INTO dbo.TestGo VALUES ('This is a test.')
GO 500
-- Look at the records to prove it worked
SELECT *
FROM dbo.TestGo
-- Now let's cleanup after ourselves.
DROP TABLE dbo.TestGo
GO
So there you have it. The batch that precedes the GO with a number is executed however many times you specify. I cannot say at this point that I would recommend using that for anything other than ad-hoc testing, perhaps to simulate a load; or possibly to generate some quick test data. But I thought it was a neat trick and after proving to myself that it really worked, thought I would share it. And for those who are curious, yes, this is documented in Books On-Line.
UPDATED: Added formatting to code blocks.
Friday, November 21, 2008
I went to the Women in Technology lunch yesterday. Yes, some of you will be quick to realize that I am not a woman, but the lunch is not exclusive to women. Rather, it is about women who have achieved success in technology sharing tips and suggestions on how to do likewise. Maybe as a guy it never really occurred to you that A) Technology is a very male-dominated career field and B) there are issues that professional women have to deal with that men just never (or rarely) even think about. The luncheon was a huge success, with about 250 people in attendance. Six years ago when this started at PASS, it was a small fraction of that, in the range of 65 or so. Stephanie Higgins and others get a lot of credit for getting the annual event started, but I know that Denise McInerney deserves a lot of credit for her hard work over the last several years, too. While, as a man, I was in the distinct minority, I would guess that there were a dozen or two other men in the room, which would work out to be 5%-10%. The panelists this year were Billie Jo Murray, General Manager SQL Central Services from Microsoft; Crystal Robinson-Pipersburgh, Sr. Manager Database Administration from Intuit; Kimberly Tripp, SQL Server MVP and Microsoft Regional Director; Lynda Rab, Sr. Data Warehouse Analyst and PASS Program Director; and Kalen Delaney, SQL Server MVP, trainer and author. While some of the subjects specifically covered issues that women must deal with, much of the advice given would be valuable to anyone looking to improve their career. Here are a few highlights. - Figure out what you are passionate about and focus on that. Is it Performance Tuning? High Availability? Data Normalization?
- Get involved in the community. Start a blog, attend local chapter meetings, etc. You'll learn a lot, make new friends and contacts, and perhaps establish a reputation for yourself as being very skilled in the area that you are passionate about (see #1).
- What is your career goal / vision. Billie Jo Murray mentioned that her experience with mentoring people, both men and women, is that men tend to have a specific goal in mind ("I want to be VP of ...") and women tend to be focused on the process ("what should I learn next?"). If you know where you want to go, it is much easier to figure out how to get there.
- Get mentoring. Who, in the greater community, is in the position you seek, and will talk to you? Ask them questions!
- Continuous Improvement: What do you see where you are right now that could/should be improved? What ideas do you have for how to make it better? Work on being able to clearly communicate your ideas (solutions) to your boss or others. But don't complain about systems if you aren't willing to put in the work to make things better.
- Hang around people who enhance who you are, yet allow you to be who you are. Are your best friends in denial, or think that it is weird, that you work in technology? That will hold you back.
I applaud PASS and all the people involved in putting together this luncheon. Well done! And let's keep striving to break down barriers and open up opportunities for any and all people who want to make their career in technology.
|