Thinking outside the box

Patron Saint of Lost Yaks

ISO week calculation for all years 1-9999 without dependencies

CREATE FUNCTION dbo.fnISOWEEK
(
    @Year SMALLINT,
    @Month TINYINT,
    @Day TINYINT
)
RETURNS TINYINT
AS
BEGIN
    RETURN  (
                SELECT  CASE
                            WHEN nextYearStart <= theDate THEN 0
                            WHEN currYearStart <= theDate THEN (theDate - currYearStart) / 7
                            ELSE (theDate - prevYearStart) / 7
                        END + 1
                FROM    (
                            -- Day numbers count from 0001-01-01 = 0, which is a Monday,
                            -- so n / 7 * 7 is the Monday on or before day n
                            SELECT  (currJan4 - 365 - prevLeapYear) / 7 * 7 AS prevYearStart,
                                    currJan4 / 7 * 7 AS currYearStart,
                                    (currJan4 + 365 + currLeapYear) / 7 * 7 AS nextYearStart,
                                    CASE @Month
                                        WHEN 1 THEN @Day
                                        WHEN 2 THEN 31 + @Day
                                        WHEN 3 THEN 59 + @Day + currLeapYear
                                        WHEN 4 THEN 90 + @Day + currLeapYear
                                        WHEN 5 THEN 120 + @Day + currLeapYear
                                        WHEN 6 THEN 151 + @Day + currLeapYear
                                        WHEN 7 THEN 181 + @Day + currLeapYear
                                        WHEN 8 THEN 212 + @Day + currLeapYear
                                        WHEN 9 THEN 243 + @Day + currLeapYear
                                        WHEN 10 THEN 273 + @Day + currLeapYear
                                        WHEN 11 THEN 304 + @Day + currLeapYear
                                        WHEN 12 THEN 334 + @Day + currLeapYear
                                    END + currJan4 - 4 AS theDate
                            FROM    (
                                        SELECT  CASE
                                                    WHEN (@Year - 1) % 400 = 0 THEN 1
                                                    WHEN (@Year - 1) % 100 = 0 THEN 0
                                                    WHEN (@Year - 1) % 4 = 0 THEN 1
                                                    ELSE 0
                                                END AS prevLeapYear,
                                                CASE
                                                    WHEN @Year % 400 = 0 THEN 1
                                                    WHEN @Year % 100 = 0 THEN 0
                                                    WHEN @Year % 4 = 0 THEN 1
                                                    ELSE 0
                                                END AS currLeapYear,
                                                365 * (@Year - 1) + (@Year - 1) / 400 - (@Year - 1) / 100 + (@Year - 1) / 4 + 3 AS currJan4
                                        -- Reject invalid input; February is let through here and checked against the leap year below
                                        WHERE   @Year BETWEEN 1 AND 9999
                                                AND @Month BETWEEN 1 AND 12
                                                AND @Day >= 1
                                                AND 1 = CASE
                                                            WHEN @Month IN (1, 3, 5, 7, 8, 10, 12) AND @Day <= 31 THEN 1
                                                            WHEN @Month IN (4, 6, 9, 11) AND @Day <= 30 THEN 1
                                                            WHEN @Month = 2 AND @Day <= 29 THEN 1
                                                            ELSE 0
                                                        END
                                    ) AS d
                            -- Only February needs the leap year test
                            WHERE   CASE
                                        WHEN @Month <> 2 THEN 1
                                        WHEN currLeapYear = 1 AND @Day <= 29 THEN 1
                                        WHEN @Day <= 28 THEN 1
                                        ELSE 0
                                    END = 1
                        ) AS d
            )
END
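
A quick way to sanity-check the function is to compare it with the built-in DATEPART(ISO_WEEK, ...) for a few dates around year boundaries. This is only a sketch, assuming SQL Server 2008 or later (for ISO_WEEK and the VALUES constructor) and that dbo.fnISOWEEK has been created as above; the sample dates are my own picks.

```sql
-- Compare the function with the built-in ISO week for a few hand-picked dates
SELECT  d.theDate,
        dbo.fnISOWEEK(YEAR(d.theDate), MONTH(d.theDate), DAY(d.theDate)) AS fnWeek,
        DATEPART(ISO_WEEK, d.theDate) AS builtInWeek
FROM    (
            VALUES  (CAST('20130727' AS DATE)),  -- a Saturday in week 30
                    ('20100101'),                -- a Friday, still week 53 of 2009
                    ('20081229'),                -- a Monday, already week 1 of 2009
                    ('20160103')                 -- a Sunday, still week 53 of 2015
        ) AS d(theDate);
```

The two week columns should agree on every row. The built-in only covers what a DATE can hold, but that is the same 0001-9999 range, so the comparison works for any valid date.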

posted @ Saturday, July 27, 2013 10:07 AM | Feedback (0) | Filed Under [ Algorithms ]

Remove all Extended Properties in a database

During my tests to port several databases to SQL Azure, one of the recurring things that fails export is the Extended Properties. So I just wanted to remove them.
This is a simple way to list all column-level Extended Properties together with the corresponding delete statement.

SELECT  'EXEC sp_dropextendedproperty @name = ' + QUOTENAME(ep.name, '''')
        + ', @level0type = ''schema'', @level0name = ''dbo'''
        + ', @level1type = ''table'', @level1name = ' + QUOTENAME(OBJECT_NAME(c.[object_id]), '''')
        + ', @level2type = ''column'', @level2name = ' + QUOTENAME(c.name, '''')
        + ';'
FROM    sys.extended_properties AS ep
INNER JOIN sys.columns AS c ON c.[object_id] = ep.major_id
        AND c.column_id = ep.minor_id
WHERE   ep.class = 1    -- object and column properties only; other classes reuse major_id and minor_id for other things

posted @ Sunday, May 27, 2012 10:02 PM | Feedback (3) | Filed Under [ Denali ]

The one feature that would make me invest in SSIS 2012

This week I was invited by Microsoft to give two presentations in Slovenia. My presentations went well, I had good energy and the audience interacted with me.

When I had some time over from networking and partying, I attended a few other presentations, at least the ones that were held in English. One of these was "SQL Server Integration Services 2012 - All the News, and More", given by Davide Mauri, a co-worker from SolidQ.

We started to talk and soon got into the details of the new things in SSIS 2012. All of the official things Davide talked about are good stuff, but for me, the best thing is one he didn't cover in his presentation.

In versions of SSIS earlier than 2012, a stored procedure can act as a data source as long as it doesn't have a temp table in it. If it does, you will get an error message from SSIS that "Metadata could not be found".
This is still true with SSIS 2012, so the thing I am talking about is not really an SSIS feature, it's a SQL Server 2012 feature.

And this is the EXECUTE WITH RESULT SETS feature! With this, you can have a stored procedure with a temp table deliver the resultset to SSIS, if you execute the stored procedure from SSIS and add the "WITH RESULT SETS" option.

If you do this, SSIS is able to take the metadata from the code you write in SSIS and not from the stored procedure! And it's very fast too. In earlier versions, referencing a stored procedure in SSIS forced SSIS to call the stored procedure (which can take hours) just to retrieve the metadata. Now, with RESULT SETS, SSIS 2012 can continue in milliseconds!

This is because you provide the metadata in the RESULT SETS clause, and if the data from the stored procedure doesn't match this clause, you will get an error anyway, so it makes sense that Microsoft has provided this optimization for us.
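
To make the idea concrete, here is a minimal sketch of such a call on SQL Server 2012 or later. The procedure name dbo.uspDemoTotals, its temp table and its columns are all made up for illustration; the point is only that the caller declares the shape of the resultset in the EXECUTE statement.

```sql
-- Hypothetical procedure that stages its rows in a temp table before returning them
CREATE PROCEDURE dbo.uspDemoTotals
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #Stage
    (
        CustomerID INT NOT NULL,
        Total DECIMAL(18, 2) NOT NULL
    );

    INSERT  #Stage (CustomerID, Total)
    VALUES  (1, 100.00),
            (2, 250.50);

    SELECT  CustomerID,
            Total
    FROM    #Stage;
END
GO

-- The caller states the expected column names and types up front.
-- If the procedure returns something that does not fit, the EXECUTE fails.
EXECUTE dbo.uspDemoTotals
WITH RESULT SETS
(
    (
        CustomerID INT NOT NULL,
        Total DECIMAL(18, 2) NOT NULL
    )
);
```

In SSIS, the EXECUTE ... WITH RESULT SETS statement would be the SQL command text of the source component.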

posted @ Saturday, May 26, 2012 10:52 AM | Feedback (2) | Filed Under [ Optimization Denali SSIS ]

New Article series

I have started a new article series at Simple Talk. It's all about the transition from procedural programming to declarative programming.

http://www.simple-talk.com/sql/

It has already been viewed 5,500 times.

posted @ Friday, February 24, 2012 11:39 AM | Feedback (0) | Filed Under [ Miscellaneous ]

How to calculate the covariance in T-SQL

DECLARE @Sample TABLE
(
x INT NOT NULL,
y INT NOT NULL
)

INSERT  @Sample
VALUES  (3, 9),
(2, 7),
(4, 12),
(5, 15),
(6, 17)

-- Population covariance: the mean of the products of the deviations from each average
;WITH cteSource(x, xAvg, y, yAvg, n)
AS (
    SELECT  1E * x,     -- multiplying by 1E turns the INT into a FLOAT
            AVG(1E * x) OVER (PARTITION BY (SELECT NULL)),
            1E * y,
            AVG(1E * y) OVER (PARTITION BY (SELECT NULL)),
            COUNT(*) OVER (PARTITION BY (SELECT NULL))
    FROM    @Sample
)
SELECT  SUM((x - xAvg) * (y - yAvg)) / MAX(n) AS [COVAR(x,y)]
FROM    cteSource
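
For comparison, the same population covariance can also be written with plain aggregates, using the identity cov(x, y) = avg(x * y) - avg(x) * avg(y). The sketch below repeats the sample data so it runs on its own, and adds the sample covariance (dividing by n - 1) as a second column. The column names are my own.

```sql
DECLARE @Sample TABLE
(
    x INT NOT NULL,
    y INT NOT NULL
);

INSERT  @Sample
VALUES  (3, 9),
        (2, 7),
        (4, 12),
        (5, 15),
        (6, 17);

SELECT  -- population covariance: about 5.2 for this data
        AVG(1E * x * y) - AVG(1E * x) * AVG(1E * y) AS CovarPop,
        -- sample covariance: 6.5 for this data; NULLIF avoids dividing by zero for a single row
        (SUM(1E * x * y) - SUM(1E * x) * SUM(1E * y) / COUNT(*)) / NULLIF(COUNT(*) - 1, 0) AS CovarSamp
FROM    @Sample;
```

The aggregate form avoids the windowed averages, but it subtracts two large numbers and so loses precision sooner than the deviation form when the values are far from zero.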

Avoid stupid mistakes

Today I had the opportunity to debug a system with a client. I have to confess it took a while to figure out the bug, but here it is

SELECT COUNT(*) OfflineData

Do you see the bug?

Yes, there should be a FROM clause before the table name. Without the FROM clause, SQL Server treats the name as an alias for the count column. And what does COUNT always return in this case?

It returns 1.

So the bug had a severe implication. Now I know it's easy to forget to write a FROM in your query. How can we avoid these stupid mistakes?
One way is very easy: always prefix your table names with the schema. Besides catching this bug, there are a lot of other benefits from prefixing your table names with the schema.

In my client's case, if OfflineData had been prefixed with dbo, the query wouldn't have parsed and you would have gotten a compile error.
The next thing to do to avoid stupid mistakes is to put AS before alias names, and to put alias names after the expression.

SELECT COUNT(*) AS MyCount FROM dbo.OfflineData
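
The difference is easy to reproduce. The sketch below uses a table variable named @OfflineData as a stand-in for the client's table (an assumption, just so the script runs anywhere) and shows that the query without FROM never touches the data.

```sql
-- Stand-in for the real table, filled with three rows
DECLARE @OfflineData TABLE
(
    id INT NOT NULL
);

INSERT  @OfflineData
VALUES  (1),
        (2),
        (3);

-- No FROM clause: OfflineData is only a column alias, and the count is 1
SELECT  COUNT(*) OfflineData;

-- With AS and FROM the intent is unambiguous, and the count is 3
SELECT  COUNT(*) AS MyCount
FROM    @OfflineData;
```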

posted @ Thursday, September 22, 2011 8:38 AM | Feedback (3) | Filed Under [ SQL Server 2008 SQL Server 2005 SQL Server 2000 Miscellaneous Denali ]

Convert UTF-8 string to ANSI

CREATE FUNCTION dbo.fnConvertUtf8Ansi
(
@Source VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Value SMALLINT = 160,
@Utf8 CHAR(2),
@Ansi CHAR(1)

IF @Source NOT LIKE '%[ÂÃ]%'
RETURN  @Source

WHILE @Value <= 255
BEGIN
SELECT  @Utf8 = CASE
WHEN @Value BETWEEN 160 AND 191 THEN CHAR(194) + CHAR(@Value)
WHEN @Value BETWEEN 192 AND 255 THEN CHAR(195) + CHAR(@Value - 64)
ELSE NULL
END,
@Ansi = CHAR(@Value)

-- CHARINDEX takes the string to find first, then the string to search in
WHILE CHARINDEX(@Utf8, @Source) > 0
SET    @Source = REPLACE(@Source, @Utf8, @Ansi)

SET    @Value += 1
END

RETURN  @Source
END
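
The CASE expression in the loop encodes how UTF-8 stores code points 160 to 255: a lead byte of 194 or 195 followed by one trail byte. As a small worked example (a sketch only, with variable and column names of my own), here are the two bytes for é, which is code point 233 in code page 1252:

```sql
DECLARE @Value SMALLINT = 233;  -- é in code page 1252

SELECT  @Value AS CodePoint,
        CASE WHEN @Value BETWEEN 160 AND 191 THEN 194 ELSE 195 END AS LeadByte,
        CASE WHEN @Value BETWEEN 160 AND 191 THEN @Value ELSE @Value - 64 END AS TrailByte;
-- 233, 195, 169
```

Read as code page 1252 characters, bytes 195 and 169 show up as "Ã©", which is the typical look of UTF-8 text that was stored as ANSI and the pattern the function replaces.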

Do people want help? I mean, real help?

Or do they just want to continue with their old habits?

The reason for this blog post is that I have spent the last week trying to help people on several forums. Most of them just want to know how to solve their current problem and there is no harm in that. But when I recognize the same poster the very next day with a similar problem, I ask myself: did I really help him or her at all?

All I did was probably to help the poster keep his or her job. It sounds harsh, but it is probably true. Why else would the poster continue in the old habit? The most convincing post was from someone who wanted to use SP_DBOPTION. He had an ugly procedure which used dynamic SQL and had other things done wrong.

I wrote to him that he should stop using SP_DBOPTION because that procedure has been marked for deprecation and will not work on a SQL Server version after 2008R2, and that he should start using the DATABASEPROPERTYEX() function instead.
His response was basically “Thanks, but no thanks”. Then some other MVP jumped in and gave him a solution using SP_DBOPTION and the original poster once again was a happy camper.
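
For reference, reading database options with DATABASEPROPERTYEX() looks roughly like this. It is a sketch against the current database; the three property names are documented ones, but which options the original poster needed is not known.

```sql
-- Read a few options of the current database without SP_DBOPTION
SELECT  DATABASEPROPERTYEX(DB_NAME(), 'Recovery') AS RecoveryModel,
        DATABASEPROPERTYEX(DB_NAME(), 'IsAutoClose') AS IsAutoClose,
        DATABASEPROPERTYEX(DB_NAME(), 'Updateability') AS Updateability;
```

The function returns a SQL_VARIANT, and NULL for an unknown property name, so it only reads options. Changing them is done with ALTER DATABASE.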

Another problem was posted by someone who wanted a unique sequence number like “T000001” to “T999999”. I suggested that he use a normal IDENTITY column and add a computed column that concatenates the “T” with the value from the IDENTITY column. Even though other people proposed my suggestion as an answer several times, the original poster (OP) unproposed my suggestion! Why?

The only reason I can think of is that OP is not used to (or has never even heard of) computed columns. Some other guy posted and insinuated that computed columns don’t work on SQL Server 2000 and earlier. To that I just replied that computed columns did in fact work already back in SQL Server 7.
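
A minimal sketch of that suggestion, using a temp table and made-up names (#Ticket, TicketID, TicketNo, Descr) so it can be run anywhere. The zero padding is my assumption about what “T000001” requires.

```sql
CREATE TABLE #Ticket
(
    TicketID INT IDENTITY(1, 1) NOT NULL,
    -- Computed column: "T" followed by the identity value, left-padded with zeros to six digits
    TicketNo AS 'T' + RIGHT('000000' + CAST(TicketID AS VARCHAR(6)), 6),
    Descr VARCHAR(50) NOT NULL
);

INSERT  #Ticket (Descr)
VALUES  ('first'),
        ('second');

SELECT  TicketID,
        TicketNo,
        Descr
FROM    #Ticket
ORDER BY TicketID;
-- TicketNo is T000001 for the first row and T000002 for the second

DROP TABLE #Ticket;
```

The numbering itself stays with IDENTITY and the formatting is derived from it. Values above 999999 no longer fit in six digits, so a real table would need a CHECK constraint or a wider format.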

Are people so stuck in their old habits that they are unable to change, for whatever reason that might be? Could it be they are not qualified, or lack enough experience, for their current position? Do they lack basic education about relational databases?

My question to you is, how do you really help people with these mindsets?

posted @ Sunday, July 24, 2011 8:08 AM | Feedback (14) | Filed Under [ Miscellaneous Denali ]

Code Audit - The Beginning

For the next few months, I will be involved in an interesting project for a mission critical application that our company has outsourced to a well-known and respected consulting company here in Sweden.
My part will be the technical details of the forecasting application, now that our former DBA has left our company.

Today I took a brief look at the smallest building blocks: the functions. None of the functions is inline, so I can assume some of the performance issues derive from these.

One function I stumbled across is very simple. All it does is add the time part from the current execution time to the date part from a variable.

CREATE FUNCTION dbo.GetDateTimeFromDate
(

@p_date DATE
)
RETURNS DATETIME
AS
BEGIN

DECLARE @ActualWorkDateTime DATETIME

SET @ActualWorkDateTime = CONVERT(VARCHAR, @p_date, 101) + ' '+ CONVERT(VARCHAR, GETDATE(), 114)

RETURN  @ActualWorkDateTime
END

This doesn't look too bad compared to what I have seen on the online forums. But there is a hidden performance issue here, besides the function not being inline, and that is the conversion to varchar and back to datetime. Also, this function crashed in my tests when I changed dateformat to dmy. This is because the developer used style 101 in the convert function. If he had used style 112, the function would not have crashed no matter which dateformat value I used.
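
The dateformat problem can be reproduced without the function. The sketch below converts one date to a string with each style and back to DATETIME under SET DATEFORMAT dmy; the TRY...CATCH is only there so the script runs to the end.

```sql
SET DATEFORMAT dmy;

DECLARE @p_date DATE = '20110721';

-- Style 112 gives yyyymmdd, which is read the same way under every DATEFORMAT
SELECT  CAST(CONVERT(VARCHAR(8), @p_date, 112) AS DATETIME) AS Style112;

-- Style 101 gives mm/dd/yyyy, which dmy reads as dd/mm/yyyy, so "month 21" fails
BEGIN TRY
    SELECT  CAST(CONVERT(VARCHAR(10), @p_date, 101) AS DATETIME) AS Style101;
END TRY
BEGIN CATCH
    SELECT  ERROR_MESSAGE() AS Style101Error;
END CATCH;

-- Switch back (assumes the session started with mdy, the us_english default)
SET DATEFORMAT mdy;
```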

So at our next meeting I will explain to the consultants the issues I have with this function and the others that I've found.
A better choice for this function would be

CREATE FUNCTION dbo.GetDateTimeFromDate
(

@p_date DATE
)
RETURNS DATETIME
AS
BEGIN
RETURN  (

SELECT  DATEADD(DAY, DATEDIFF(DAY, GETDATE(), @p_date), GETDATE())

)
END

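See, now there is no conversion, the body is a single expression and it's dateformat-safe! Strictly speaking it is still a scalar function, so to get truly inline behavior you would put the expression directly in the query or wrap it in an inline table-valued function. A generic function for adding the date part from one variable to the time part from another variable looks like this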

CREATE FUNCTION dbo.fnGetDateTimeFromDatePartAndTimePart
(

@DatePart DATETIME,

@TimePart DATETIME
)
RETURNS DATETIME
AS
BEGIN

RETURN  (

SELECT  DATEADD(DAY, DATEDIFF(DAY, @TimePart, @DatePart), @TimePart)

)
END
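
If the goal is a function the optimizer can expand into the calling query, the same expression can be wrapped in an inline table-valued function. This is only a sketch of that variant; the name dbo.itvfGetDateTimeFromDatePartAndTimePart and the column name DateAndTime are made up.

```sql
CREATE FUNCTION dbo.itvfGetDateTimeFromDatePartAndTimePart
(
    @DatePart DATETIME,
    @TimePart DATETIME
)
RETURNS TABLE
AS
RETURN  (
            -- Move @TimePart forward or back by the whole days between the two values
            SELECT  DATEADD(DAY, DATEDIFF(DAY, @TimePart, @DatePart), @TimePart) AS DateAndTime
        );
GO

-- Usage: combine the date 2011-07-21 with the time 08:44
SELECT  f.DateAndTime
FROM    (
            VALUES  (CAST('20110721' AS DATETIME), CAST('19000101 08:44:00' AS DATETIME))
        ) AS v(d, t)
CROSS APPLY dbo.itvfGetDateTimeFromDatePartAndTimePart(v.d, v.t) AS f;
-- 2011-07-21 08:44:00.000
```

The price is that callers have to use CROSS APPLY instead of calling the function in the SELECT list.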

posted @ Thursday, July 21, 2011 8:44 AM | Feedback (2) | Filed Under [ Optimization SQL Server 2008 Algorithms Administration SQL Server 2005 ]

A glance at SQL Server Denali CTP3 - DATEFROMPARTS

There is a new function in SQL Server Denali named DATEFROMPARTS. What it does is calculate a date from a number of user-supplied parameters such as Year, Month and Day.

Previously you had to use a formula like this

DATEADD(MONTH, 12 * @Year + @Month - 22801, @Day - 1)

to calculate the correct date value from the parameters. With the new DATEFROMPARTS, you simply write

DATEFROMPARTS(@Year, @Month, @Day)

and you get the same result, only slower by 22 percent. So why should you use the new function, if it's slower?
There are two good arguments for this

1) It is easier to remember
2) It has a built-in validator so that you cannot "spill" over the current month.

With the old way of doing this, @Year = 2009, @Month = 2 and @Day = 29 silently ends up as 2009-02-28, whereas DATEFROMPARTS gives you an error message.
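
Both behaviors can be seen side by side with a short script. It is a sketch that needs SQL Server 2012 (Denali) or later for DATEFROMPARTS. The day offset is written as @Day - 1 because the integer 0 converts to 1900-01-01, and the TRY...CATCH only serves to show the error text.

```sql
DECLARE @Year INT = 2009,
        @Month INT = 2,
        @Day INT = 29;

-- Old way: 1900-01-29 plus 1309 months would be February 29, which is clipped to the 28th
SELECT  DATEADD(MONTH, 12 * @Year + @Month - 22801, @Day - 1) AS OldWay;
-- 2009-02-28 00:00:00.000

-- New way: the invalid combination is rejected
BEGIN TRY
    SELECT  DATEFROMPARTS(@Year, @Month, @Day) AS NewWay;
END TRY
BEGIN CATCH
    SELECT  ERROR_MESSAGE() AS NewWayError;
END CATCH;
```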

posted @ Wednesday, July 13, 2011 9:18 AM | Feedback (5) | Filed Under [ Denali ]