Thinking outside the box

Patron Saint of Lost Yaks
posts - 199, comments - 684, trackbacks - 4

My Links

Advertisement

News

Archives

Post Categories

Wednesday, January 18, 2012

How to calculate the covariance in T-SQL

DECLARE @Sample TABLE
        (
            x INT NOT NULL,
            y INT NOT NULL
        )

INSERT  @Sample
VALUES  (3, 9),
        (2, 7),
        (4, 12),
        (5, 15),
        (6, 17)

;WITH cteSource(x, xAvg, y, yAvg, n)
AS (
        SELECT  1E * x,
                AVG(1E * x) OVER (PARTITION BY (SELECT NULL)),
                1E * y,
                AVG(1E * y) OVER (PARTITION BY (SELECT NULL)),
                COUNT(*) OVER (PARTITION BY (SELECT NULL))
        FROM    @Sample
)
SELECT  SUM((x - xAvg) *(y - yAvg)) / MAX(n) AS [COVAR(x,y)]
FROM    cteSource

posted @ Wednesday, January 18, 2012 1:01 PM | Feedback (0) | Filed Under [ Optimization SQL Server 2008 Algorithms SQL Server 2005 SQL Server 2000 Denali ]

Thursday, September 22, 2011

Avoid stupid mistakes

Today I had the opportunity to debug a system with a client. I have to confess it took a while to figure out the bug, but here it is


SELECT COUNT(*) OfflineData


Do you see the bug?


Yes, there should be a FROM clause before the table name. Without the from clause, SQL Server treats the name as an alias for the count column. And what do the COUNT always return in this case?

It returns 1.

So the bug had a severe implication. Now I now it's easy to forget to write a FROM in your query. How can we avoid these stupid mistakes?
An way is very easy; always prefix your table names with schema. Besides this bug there are a lot of other benefits from prefixing your tables names with schema.

In my client's case, if OfflineData had been prefixed with dbo, the query wouldn't parse and you get a compile error.
Next thing to do to avoid stupid mistakes is to put AS before alias names, and have alias names after the expression.

SELECT COUNT(*) AS MyCount FROM dbo.OfflineData

posted @ Thursday, September 22, 2011 8:38 AM | Feedback (1) | Filed Under [ SQL Server 2008 SQL Server 2005 SQL Server 2000 Miscellaneous Denali ]

Wednesday, September 14, 2011

Convert UTF-8 string to ANSI

CREATE FUNCTION dbo.fnConvertUtf8Ansi
(
    @Source VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @Value SMALLINT = 160,
            @Utf8 CHAR(2),
            @Ansi CHAR(1)

    IF @Source NOT LIKE '%[ÂÃ]%'
        RETURN  @Source

    WHILE @Value <= 255
        BEGIN
            SELECT  @Utf8 = CASE
                                WHEN @Value BETWEEN 160 AND 191 THEN CHAR(194) + CHAR(@Value)
                                WHEN @Value BETWEEN 192 AND 255 THEN CHAR(195) + CHAR(@Value - 64)
                                ELSE NULL
                            END,
                    @Ansi = CHAR(@Value)

            WHILE CHARINDEX(@Source, @Utf8) > 0
                SET    @Source = REPLACE(@Source, @Utf8, @Ansi)

            SET    @Value += 1
        END

    RETURN  @Source
END

posted @ Wednesday, September 14, 2011 6:30 AM | Feedback (2) | Filed Under [ Optimization SQL Server 2008 Algorithms SQL Server 2005 SQL Server 2000 Denali ]

Sunday, July 24, 2011

Do people want help? I mean, real help?


Or do they just want to continue with their old habits?

The reason for this blog post is that I the last week have tried to help people on several forums. Most of them just want to know how to solve their current problem and there is no harm in that. But when I recognize the same poster the very next day with a similar problem I ask myself; Did I really help him or her at all?

All I did was probably to help the poster keep his or her job. It sound harsh, but is probably true. Why would the poster else continue in the old habit? The most convincing post was about someone wanted to use SP_DBOPTIONS. He had an ugly procedure which used dynamic sql and other things done wrong.

I wrote to him he should stop using SP_DBOPTION because that procedure have been marked for deprecation and will not work on a SQL Server version after 2008R2, and that he should start using DATABASEPROPERTYEX() function instead.
His response was basically “Thanks, but no thanks”. Then some other MVP jumped in and gave him a solution using SP_DBOPTIONS and the original poster once again was a happy camper.

Another problem was posted by someone who wanted a unique sequence number like “T000001” to “T999999”. I suggested him to use a normal IDENTITY column and add a computed column and concatenate the “T” with the value from the IDENTITY column. Even if other people several times proposed my suggestion as an answer, the original poster (OP) unproposed my suggestion! Why?

The only reason I can think of, is that OP is not used to (or even heard of) computed columns. Some other guy posted and insinuated that computed columns don’t work on SQL Server 2000 and earlier. To that I just posted that computed columns did in fact work already back in SQL Server 7.

Are people so stuck in their old habit and inept to change for whatever reason that might be? Could it be they are not qualified, or lack enough experience, for their current position? Do they lack basic education about relational databases?

My question to you is, how do you really help people with these mindsets?

posted @ Sunday, July 24, 2011 8:08 AM | Feedback (9) | Filed Under [ Miscellaneous Denali ]

Thursday, July 21, 2011

Code Audit - The Beginning

For the next few months, I will be involved in an interesting project for a mission critical application that our company have outsourced to a well-known and respected consulting company here Sweden.
My part will be the technical details of the forecasting application now when our former DBA has left our company.

Today I took a brief look at the smallest building blocks; the Functions. No function is inline so I can assume some of the performance issues are derived from these.

One function I stumled across is very simple. All it does is to add a timepart from current execution time to the datepart from a variable.


CREATE FUNCTION dbo.GetDateTimeFromDate
(
   
@p_date DATE
)
RETURNS DATETIME
AS
BEGIN
   
DECLARE @ActualWorkDateTime DATETIME

   
SET @ActualWorkDateTime = CONVERT(VARCHAR, @p_date, 101) + ' '+ CONVERT(VARCHAR, GETDATE(), 114)

   
RETURN  @ActualWorkDateTime
END


This doesn't look to bad compared to what I have seen on the online forums. But there is a hidden performance issue here, besides being not an inline function, and that is the conversion to varchar and back to datetime. Also, this functions crashed in my tests when I changed dateformat to dmy. This is because the developer used style 101 in the convert function. If he had used style 112 the function would not have crashed no matter which dateformat value I use.

So to our next meeting I will explain to the consultants the issues I have with this function and the others that I've found.
A better choice for this function would be


CREATE FUNCTION dbo.GetDateTimeFromDate
(
   
@p_date DATE
)
RETURNS DATETIME
AS
BEGIN
    RETURN  (
               
SELECT  DATEADD(DAY, DATEDIFF(DAY, GETDATE(), @p_date), GETDATE())
           
)
END


See, now there is no conversion, it's inline and dateformat-safe! A generic function for adding the date part from one variable to the time part from another variable looks like this


CREATE FUNCTION dbo.fnGetDateTimeFromDatePartAndTimePart
(
   
@DatePart DATETIME,
   
@TimePart DATETIME
)
RETURNS DATETIME
AS
BEGIN
   
RETURN  (
               
SELECT  DATEADD(DAY, DATEDIFF(DAY, @TimePart, @DatePart), @TimePart)
           
)
END

posted @ Thursday, July 21, 2011 8:44 AM | Feedback (1) | Filed Under [ Optimization SQL Server 2008 Algorithms Administration SQL Server 2005 ]

Wednesday, July 13, 2011

A glance at SQL Server Denali CTP3 - DATEFROMPARTS

There is a new function in SQL Server Denali named DATEFROMPART. What is does, is to calculate a date from a number of user supplied parameters such as Year, Month and Date.

Previously you had to use a formula like this

DATEADD(MONTH, 12 * @Year + @Month - 22801, @Day)

to calculate the correct datevalue from the parameters. With the new DATEFROMPARTS, you simple write

DATEFROMPARTS(@Year, @Month, @Day)

and you get the same result, only slower by 22 percent. So why should you use the new function, if it's slower?
There are two good arguments for this

1) It is easier to remember
2) It has a built-in validator so that you cannot "spill" over the current month.

For the old way of doing this, using @Year = 2009, @Month = 2 and @Day = 29 you would end up with a date of 2009-02-28 and the DATEFROMPARTS will give you an error message.

posted @ Wednesday, July 13, 2011 9:18 AM | Feedback (3) | Filed Under [ Denali ]

Sunday, July 03, 2011

MVP renewed

I got an email last friday telling me I was to keep my MVP status!

What do one say about that? Except "Thank you". To all that reads my articles and posts. To all who attends my presentations.

//Peter

posted @ Sunday, July 03, 2011 3:20 PM | Feedback (0) | Filed Under [ Miscellaneous ]

Wednesday, May 18, 2011

Feedback from SQLBits 8

This years SQLBits occurred in Brighton. Although I didn’t have the opportunity to attend the full conference, I did a presentation at Saturday.

Getting to Brighton was easy. Drove to Copenhagen airport at 0415, flew 0605 and arrived at Gatwick 0735. Then I took the direct train to Brighton and showed up at 0830, just one hour before presenting. This was the easy part.

Getting home was much worse. Presentation ended at 1030 and I had to rush to the train station to get back to London, change to tube for Heathrow. Made it at the gate just 15 seconds before closing. That included a half mile run in the airport…

Anyway, yesterday I got the feedback for my presentation. It does look good, especially since English is not my first language.

This is the first graph

Seems to be just halfway between conference average and best session. I can live with that.

Second graph shows more detail about attendees voting.

It also look acceptable. A wider spread for the 9’s, but it is an inevitable effect from how attendees percept the session. I did get a lot of 8’s and the lower grades in an descending order. The two people voting 4 and 5 didn’t say why they voted this so I don’t know how to remedy this.

Third graph is about each category of votes.

 

Again, I find this acceptable. The Session abstract and Speaker’s knowledge seems to follow attendees expectations compared to conference average.
I seem to have met the attendees expectations (and some more) for the other four categories, also compared to conference average.

Since this did encourage me, I believe I will present some more at future meetings. I do have a new presentation about something all developers are doing every day but they may not know it.

I will also cover this new topic in the next Deep Dives II book.

Stay tuned!

//Peter

posted @ Wednesday, May 18, 2011 1:25 PM | Feedback (0) | Filed Under [ Miscellaneous ]

Wednesday, April 27, 2011

A tale from a Stalker

A tale from a Stalker who licked his wounds and got back 9 months later...

posted @ Wednesday, April 27, 2011 4:16 PM | Feedback (4) | Filed Under [ Algorithms Miscellaneous ]

Thursday, April 14, 2011

Simple function to get beginning or end of month

CREATE FUNCTION dbo.fnIsOnMonthEdge
(
    @theDate DATETIME
)
RETURNS SMALLINT
AS
BEGIN
    RETURN CASE @theDate
                WHEN '99991231' THEN 1
                ELSE DATEDIFF(MONTH, @theDate, DATEADD(DAY, 1, @theDate))
            END +
            CASE @theDate
                WHEN '17530101' THEN -1
                ELSE DATEDIFF(MONTH, @theDate, DATEADD(DAY, -1, @theDate))
            END
END

 

posted @ Thursday, April 14, 2011 1:40 PM | Feedback (0) | Filed Under [ Optimization SQL Server 2008 Algorithms SQL Server 2005 SQL Server 2000 ]

Powered by:
Powered By Subtext Powered By ASP.NET