Blog Stats
- Blogs - 47
- Posts - 1810
- Articles - 61
- Comments - 10147
- Trackbacks - 441
Bloggers
-
31 Aug 2010 2:11 PM
-
09 Aug 2010 11:49 PM
-
24 Aug 2010 12:39 PM
-
08 Jul 2010 11:09 AM
-
30 Aug 2010 7:40 AM
-
23 Jul 2010 10:06 AM
-
30 Jun 2010 2:43 PM
-
23 Jun 2009 11:09 AM
-
03 Feb 2008 8:34 AM
-
11 Feb 2010 11:18 AM
-
11 Mar 2010 1:38 PM
-
30 Aug 2009 9:20 PM
-
22 Nov 2009 7:08 AM
-
19 Aug 2004 7:00 PM
-
21 Aug 2010 2:02 PM
-
19 Jul 2010 7:58 PM
-
05 Dec 2005 10:29 PM
-
26 Aug 2010 2:46 PM
-
11 Mar 2010 4:20 PM
-
28 Nov 2009 7:55 PM
-
19 Aug 2010 6:02 PM
-
31 Aug 2010 5:17 PM
-
20 Aug 2008 11:24 AM
-
26 Jun 2010 8:05 PM
-
20 Aug 2009 2:27 PM
-
03 Feb 2008 8:07 PM
-
04 Aug 2010 2:38 AM
-
02 Sep 2010 9:58 PM
-
01 Jan 2003 12:00 AM
-
08 May 2008 7:21 AM
-
11 Oct 2007 6:21 PM
-
07 Dec 2007 10:53 AM
-
01 Jan 2003 12:00 AM
-
23 Sep 2003 12:00 AM
-
23 Sep 2003 7:38 PM
|
Lately I’ve been working on a data migration SSIS package for a customer and some of it is just a straight forward execute sql tasks in SSIS on some very large tables. I pull 10 million rows into a staging area, massage some data into the proper format, insert it into it’s final production table resting place, easy enough. I’m testing it on a very low powered 32 bit server and it takes a while. At one point today I started wondering if it was just locked up or was still running. I’ve used the sys.partitions table before as an alternate method of getting row counts for a table as it already has that information handy for me. This time I was querying the table while the insert was running and I was able to see exactly what was happening. The code below will walk you through an easy example. This will work the same in 2005, or 2008. First, just create a throw away table: Create table Loadtest ( IDVal int identity(1,1), CustomerName varchar(200), CustomerCity varchar(200) ) Now let’s create a looping insert that gives us enough time to look at things. You can just stop it when you have had enough. It’s nothing exciting here, I’m just creating an loop 2 million times and doing an insert..like I said, stop it when you’ve had enough: declare @counter int set @counter=2000000 While @counter>0 BEGIN Insert into LoadTest(CustomerName,CustomerCity) values('Mike Femenella','Matthews') SET @counter=@counter-1 END While your loop is running, open up another query analyzer window to the same database and run this: SELECT sp.index_id,Rows,si.name,si.type_desc FROM sys.partitions sp inner join sys.indexes si on sp.object_id=si.object_id AND sp.index_id=si.index_id where object_name(sp.object_id)='LoadTest' Since the table is not indexed you’ll see the row count and that it’s populating the heap, hit F5 a few times and you can watch the rows field increment. If you have doubts your insert statement is actually running, this is a good way to check that it is alive and you can also get an idea of it’s progress if you have a guess as to how many rows “should” be there. --Unindexed, you see the heap being populated.
index_id Rows name type_desc 0 24252 NULL HEAP So that was mildly interesting and sort of cool. Of course we can take it a step further (and I always love doing that). Go ahead and stop the loop when you have have a million or so rows in there, you need a large amount of data in the LoadTest table so you have enough time to run the sys.partitions query again. I was letting my loop run while writing the first part of the blog so I have 2 million rows in LoadTest. Let’s create a mirror image of our LoadTest table and since I’m at the end of my work week and feeling very creative I’m going to call it LoadTest_Copy, go ahead and F5 and create the table.. Award winning material here isn’t it? Create table Loadtest_Copy ( IDVal int , CustomerName varchar(200), CustomerCity varchar(200) ) GO Create Clustered index ixc_IDVal on LoadTest_Copy(IDVal) Create index ix_CustomerName on LoadTest_Copy(CustomerName) Create index ix_CustomerCity on LoadTest_Copy(CustomerCity) GO In another window modify your sys.partition query and change the name from LoadTest to LoadTest_Copy. Run this now and you have 3 rows, one for the clustered index and one for each of the other indexes. Somewhat more realistic and more interesting. SELECT sp.index_id,Rows,si.name,si.type_desc FROM sys.partitions sp inner join sys.indexes si on sp.object_id=si.object_id AND sp.index_id=si.index_id where object_name(sp.object_id)='LoadTest_Copy' Results: index_id Rows name type_desc 1 0 ixc_IDVal CLUSTERED 2 0 ix_CustomerName NONCLUSTERED 3 0 ix_CustomerCity NONCLUSTERED Now, do an insert from LoadTest into LoadTest_Copy. Start it and then flip over to your sys.partition query, run it while the insert is happening and look at the results. You can keep hitting F5 and watch the numbers change. Notice it builds the clustered index first, which of course makes sense, and then will go through each of the other 2 indexes building them out. index_id Rows name type_desc 1 449900 ixc_IDVal CLUSTERED 2 0 ix_CustomerName NONCLUSTERED 3 0 ix_CustomerCity NONCLUSTERED index_id Rows name type_desc 1 2000000 ixc_IDVal CLUSTERED 2 1065245 ix_CustomerName NONCLUSTERED 3 0 ix_CustomerCity NONCLUSTERED index_id Rows name type_desc 1 2000000 ixc_IDVal CLUSTERED 2 2000000 ix_CustomerName NONCLUSTERED 3 126137 ix_CustomerCity NONCLUSTERED And finally now that it’s done: index_id Rows name type_desc 1 2000000 ixc_IDVal CLUSTERED 2 2000000 ix_CustomerName NONCLUSTERED 3 2000000 ix_CustomerCity NONCLUSTERED So next time you’re wondering if your insert or update statement is actually doing anything useful, you can leverage some of the underlying DMV data to make sure things are still on track and it’s a good way to tell if you have time to run out for coffee. That’s the last SQL related thing I’m doing, time for vacation…cheers!
Having the SSMS Tools Pack out in the wild enables me to get much “joy” from different bug reports people send me. And let me tell you, people send me back some seriously weird errors. But the most unexpected error message I’ve seen so far was the OverflowException when calling System.Data.SqlClient.SqlDataReader.GetDecimal(Int32 i). It turns out that SQL Server decimal data type is not mapped to the .Net decimal in it’s whole range. .Net decimal type only maps to SQL Server one in the decimal(29, X) range. That means that if you have a decimal column that has the precision higher than 29 and no matter the scale, you won’t be able to use the native .Net data type. So what to do? Let’s take a look with an example. The comments provide additional info. SQL Code: -- create a test table in tempdb with one valid and one invalid decimal mapping column. CREATE TABLE TestTable ( [ID] [INT] IDENTITY(1,1) NOT NULL, [ValidDecVal] [DECIMAL](29, 2) NOT NULL, [InvalidDecVal] [DECIMAL](30, 30) NOT NULL ) GO
-- insert some data INSERT INTO TestTable(ValidDecVal, InvalidDecVal) -- both values are in correct range SELECT 123456789012345678901234567.56, 0.123456789012345678901234567890
C# Code:
private void GetData() { using (SqlConnection conn = new SqlConnection(@"server=TestServer; database=tempdb; Integrated Security=SSPI;")) { using (SqlCommand cmd = new SqlCommand("SELECT ID, ValidDecVal, InvalidDecVal FROM TestTable", conn)) { conn.Open(); SqlDataReader rdr = cmd.ExecuteReader(CommandBehavior.CloseConnection); rdr.Read(); // get the id int id = rdr.GetInt32(0); // get the 29 precision value just fine decimal validDecimal = rdr.GetDecimal(1); // this errors out ... decimal invalidDecimal = rdr.GetDecimal(2); // so does this ... decimal invalidDecimal = rdr.GetSqlDecimal(2).Value;
// the only thing to do is to either pass around SqlDecimal class SqlDecimal invalidDecimalAsSqlDecimal = rdr.GetSqlDecimal(2); // or to cast it to string string invalidDecimalAsString = invalidDecimalAsSqlDecimal.ToString(); // ... do something with upper values Console.WriteLine(invalidDecimalAsSqlDecimal); Console.WriteLine(invalidDecimalAsString); } } }
If you’re working with large precision types I’d like to hear how you overcome this problem in .Net.
The only way I’ve found to deal with this is to either pass around the raw SqlDecimal data type or its string value.
A quick search revealed this Connect item that showed the problem with Linq2SQL. I don’t know why this isn’t fixed yet but I hope it will be soon. If you encounter this problem, vote it up.
Nothing earth-shattering here, I was just helping out a colleague with this so I thought I'd post up the example I gave him.
-- sample table:
create table People
(
Person varchar(1) primary key,
City varchar(10),
Age int
)
go
-- with some sample data:
insert into People
select 'A','Boston',23 union all -- odd #
select 'B','Boston',43 union all
select 'C','Boston',29 union all
select 'D','Chicago',15 union all -- single #
select 'E','NY',12 union all -- even #
select 'F','NY',55 union all
select 'G','NY',57 union all
select 'H','NY',61
go
-- here's our query, showing median age per city:
select city,
AVG(age) as MedianAge
from
(
select City, Person, Age,
ROW_NUMBER() over (partition by City order by Age ASC) as AgeRank,
COUNT(*) over (partition by City) as CityCount
from
People
) x
where
x.AgeRank in (x.CityCount/2+1, (x.CityCount+1)/2)
group by
x.City
go
-- clean it all up
drop table People
And here's the result:
city MedianAge
---------- -----------
Boston 29
Chicago 15
NY 56
(3 row(s) affected)
Simply remove "City" from the SELECT clause and the GROUP BY clause to get the median age for all.
There may be more efficient tricks out there, but this is certainly the shortest and simplest technique I am aware of.
Here's a handy link that shows all the query operator icons used in graphical SQL Server execution plans.
Your mission, if you choose to accept it, is to write a query or batch that displays all of them. :)
This tape/disk/database will self-destruct in 5 seconds. Good luck, Jim.
This is a release that fixes all known bugs. If you encounter any new ones don’t hesitate to report them. :)
The main feature list hasn’t changed.
A few improvements have been made though:
Save SQL Snippets to HTML and/or print them directly from SSMS.
In Window Connection Coloring the server names can now be regular expressions giving you the ability to color multiple servers with the same color.
Here’s a blog from David Levy (Blog|Twitter) about the new window connection coloring using regular expression and why is it really useful to him. Thanx for kind words Dave.
You can download the new version 1.9 here.
Enjoy it!
Microsoft makes a pretty good OLE DB Provider for DB2 that you can use for SSIS, DTS, and linked servers under SQL Server. It's sometimes difficult to configure but you can get some good starter settings here.
In yet another remarkable but not uncommon blunder on their part, Microsoft's installation package will only install the provider on a server with Enterprise or Developer Edition. Why they do this I don't know, but if you have a Standard Edition SQL Server and want to use this provider, here's a workaround:
- Install Enterprise or Developer Edition side-by-side on your Standard Edition server as a named instance, only the database engine components.
- Install the DB2 OLEDB Provider.
- Uninstall the side-by-side instance. Or keep it if you like it and have a valid license.
Notes:
- I've only tested this with Standard Edition, but I believe it will work with any edition of SQL Server.
- You may be able to use a trial version of SQL Server for the side-by-side install, I haven't tested it.
- This technique also works for SQL Server 2000 instances, as long as your side-by-side instance is 2005 or higher
If you're wondering why I'm posting this: <rant>
- It was in answer to this Tweet.
- I'm aggravated that Microsoft:
- Makes a great product
- That they give away for free
- And deprive their user base with a ridiculous and unnecessary SKU limitation
Sorry MS if you don't agree, but DB2 integration is NOT an Enterprise-only feature, nor one that should cost $30K+ to get. If you really think so, then charge $30K for the provider, don't tease your customers with a freebie.
</rant>
I added the parenthetical title because no one really cares if they can make their databases better. :)
Based on some recent tweets (one and two) and blogs I've been reading/writing, I thought I'd create an example of some excellent but underutilized features in SQL Server: defaults and user-defined types.
User-defined types (UDTs) are a way of renaming a standard SQL type and additionally providing a NULL or NOT NULL constraint on that type:
CREATE TYPE [USERNAME] AS varchar(64) NULL;
This can be used in a table definition like so:
CREATE TABLE [MyTable] [UserCreated] USERNAME;
It saves some typing, and I get an automatic NULL constraint on the column, which I can override with NOT NULL. (Normally I do the reverse; allowing nulls is not a "constraint")
Defaults are a deprecated feature that will be removed from SQL Server at a later date. Why? Because someone hates this:
CREATE DEFAULT [whoami] AS SUSER_SNAME();
They also hate this:
EXEC sp_bindefault 'whoami', 'USERNAME';
Which allows me to reduce this:
ALTER TABLE [dbo].[MyTable]
ADD [UserCreated] VARCHAR (64) CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) NULL,
[UserModified] VARCHAR (64) CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) NULL;
to this:
ALTER TABLE [dbo].[MyTable]
ADD [UserCreated] USERNAME,
[UserModified] USERNAME;
Yeah I know, when you add up the default and UDT definitions and bindings, you...still save a decent amount of typing, but not all that much...
...unless you have several hundred tables that use the exact same definition. To rehash from my blog post:
ALTER TABLE [dbo].[MyTable]
ADD [DateCreated] DATETIME CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) NULL,
[DateModified] DATETIME CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) NULL,
[UserCreated] VARCHAR (64) CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) NULL,
[UserModified] VARCHAR (64) CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) NULL;
With a quick mod to my Tweet:
CREATE TYPE [MYDATE] AS DATETIME;
GO
CREATE DEFAULT [now] AS GETDATE();
GO
EXEC sp_bindefault 'now', 'MYDATE';
I can really beat this horse to death:
ALTER TABLE [dbo].[MyTable]
ADD [DateCreated] MYDATE,
[DateModified] MYDATE,
[UserCreated] USERNAME,
[UserModified] USERNAME;
And reduce my typing by over 60%. By the way, did you like how the datetime UDT, default definition, and binding all fit in a single 140-character tweet? I sure did!
I'll leave it up to you to decide if consistent, reusable data types with shorter names and identical defaults are worthwhile. And in case you're asking, yes, I've had developers argue AGAINST this very sentence.
Rules are another non-ANSI feature that SQL Server inherited from Sybase, and Microsoft also feels are A Bad Thing™ that should be deprecated. I don't have a good example right now but I should have one in a future blog soon. In the meantime you can read some of my thoughts in this forum post.
I can understand Microsoft wanting to remove Sybase cruft, especially if it's non-ANSI standard, but so far I've not heard of any new features to replace them, such as CREATE DOMAIN, which is ANSI standard and neatly combines types, defaults, and rules into one package.
If you feel this is a useful feature to have, and a bad one to lose, please visit Microsoft's Connect site and vote on this item. It shouldn't be that hard for them to implement, since they can just borrow the code from the open-source PostGreSQL project. ;)
I recently had to modify tables to add some auditing columns and triggers. Being a good fair to middling crazed, insane DBA and wanting to ensure data integrity and good design, I made the columns not nullable, and therefore also had to add defaults for these new columns: ALTER TABLE [dbo].[MyTable]
ADD [DateCreated] DATETIME CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) NOT NULL,
[DateModified] DATETIME CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) NOT NULL,
[UserCreated] VARCHAR (64) CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) NOT NULL,
[UserModified] VARCHAR (64) CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) NOT NULL;
Unfortunately by making them not nullable, the ALTER statement has to rewrite all the data pages to insert the default values. This kinda-sorta locked down the table(s) and made the database unresponsive for about, oh, 2 hours or so. (Stooooooooopid multi-million row tables!) It also had the side effect of making months worth of transactions appear as if they were created instantly by the same user. For some odd reason my boss felt neither circumstance was a good idea to deploy to production.
Happily the solution to both problems was to make the new columns nullable:
ALTER TABLE [dbo].[MyTable]
ADD [DateCreated] DATETIME NULL,
[DateModified] DATETIME NULL,
[UserCreated] VARCHAR (64) NULL,
[UserModified] VARCHAR (64) NULL;
And add the constraints afterwards:
ALTER TABLE [dbo].[MyTable]
ADD CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) FOR [DateCreated],
CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) FOR [DateModified],
CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) FOR [UserCreated],
CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) FOR [UserModified];
The problem was turning the original, single statement into the 2 new statements, without doing it by hand. (Did I mention there were a few hundred tables in 8 different databases?) Playing around a bit I discovered that this syntax works perfectly:
ALTER TABLE [dbo].[MyTable]
ADD [DateCreated] DATETIME NULL, CONSTRAINT [DF_MyTable_DateCreated] DEFAULT (getdate()) FOR [DateCreated],
[DateModified] DATETIME NULL, CONSTRAINT [DF_MyTable_DateModified] DEFAULT (getdate()) FOR [DateModified],
[UserCreated] VARCHAR (64) NULL, CONSTRAINT [DF_MyTable_UserCreated] DEFAULT (suser_sname()) FOR [UserCreated],
[UserModified] VARCHAR (64) NULL, CONSTRAINT [DF_MyTable_UserModified] DEFAULT (suser_sname()) FOR [UserModified];
Wow, that's pretty close to the original! (And it runs instantly since it only requires schema locks on the table) It's still annoying because I have to copy the new column name and paste it at the end of the DEFAULT definition.
Normally this is where I break out my 1337 4@X0r skillz and INFORMATION_SCHEMA and generate the SQL I need. Sadly the cruel, cold mistress named CHANGE CONTROL had other plans: we have to build from source and deploy with zero changes to the script. Minor changes are allowed but must be exactly repeatable/reproducible. (And also checked into source control, but that's for another time.)
If you've ever played with regular expressions (regex) you'll know that it's not hard to reformat the first statement into the last. If you haven't used regex before, or always get frustrated by them, hopefully this example will convince you to spend more time with them. SQL Server Management Studio has a find/replace feature that can use regex. Here's what I used to get the job done (each regex follows after the "-> "):
Find-> {.*}{\[.*\]}{.*}{CONSTRAINT}{.*}{NOT NULL}{.*}
Replace-> \1\2\3 NULL, CONSTRAINT\5FOR \2\7
The Find expression breaks up each line into the following groups, using { and } to denote each:
1. All characters before left square bracket [ -> {.*}
2. All characters between (and including) square brackets -> {\[.*\]}
3. All characters between right square bracket ] and the word CONSTRAINT -> {.*}
4. The word CONSTRAINT -> {CONSTRAINT}
5. All characters between CONSTRAINT and NOT NULL -> {.*}
6. The words NOT NULL -> {NOT NULL}
7. All characters after NOT NULL -> {.*}
The Replace expression uses backreferences (\1, \2, etc.) to do the following:
1. Rebuild the first 3 captured groups exactly as before -> \1\2\3
2. Add a space and the word NULL -> NULL
3. Add a comma, space, and the word CONSTRAINT -> , CONSTRAINT
4. Rebuild the 5th captured group (all characters between CONSTRAINT and NOT NULL) -> \5
5. Add the FOR keyword and a space -> FOR
6. Add the 2nd captured group, which is the column name and its enclosing brackets -> \2
7. Add the remaining characters captured after NOT NULL -> \7
Try it out for yourself! You can, of course, do a lot more with regular expressions than this, but it's a good introduction for a relatively common problem. In case you're wondering, the find/replace ran in about 15-20 seconds, and the original 2 hour deployment now takes less than 1 minute.
One final note: if you've used regex in Perl or some other language, you'll probably wonder why I'm capturing groups with {} instead of the standard () characters. I wonder that too; my only explanation is Microsoft are idiots hate Unix and Perl people have their own unexplained reasons for using non-standard characters. It's not the first time.
Enjoy!
A couple of months ago I wrote a post about Odd Profiler Results with Entity Framework 4. Thanks to Olaf Tinnemeyer, we now have a simple work-around that he posted to my question on StackOverflow. When using the Visual Studio tools to create the Entity Data Model, it automatically builds the Connection String for you, and one of the default settings is to set the MultipleActiveResultSets option to True. If you change this setting to False, then Profiler will properly report that the query was executed within the application database. Please note that I have not done any load testing to determine what type of performance impact this change might have, so YMMV. But for me, I know it will perform acceptably for our needs, and the greater need in my view is to have accurate Profiler results. I have posted this work-around to the Connect item that I opened. I still believe that this is a bug in SQL Server and hope that Microsoft will address it in a future release. If you share my concern, I encourage you to login and vote this up.
This problem is designed to come up with a solution that uses the smallest amount of storage possible for a 1,000 seat restaurant.
I've come up with a solution that need only 125 bytes of storage. All other solutions covered in Mr Celko's book has at least 1,000 bytes of storage.
Here is my solution, complete with all procedures to assign and release seats, together with views to display current status of each and one seat.
-- Setup sample data
CREATE TABLE dbo.Restaurant
(
Seats BINARY(125) NOT NULL
)
-- Initialize an empty restaurant
INSERT dbo.Restaurant
(
Seats
)
SELECT 0x
GO
-- Create procedure for handling seat assignment
CREATE PROCEDURE dbo.spAssignSeat
(
@Seat SMALLINT
)
AS
DECLARE @Block TINYINT,
@Bit TINYINT
SELECT @Block = SUBSTRING(Seats, 1 + (@Seat - 1) / 8, 1),
@Bit = POWER(2, (@Seat - 1) % 8)
FROM dbo.Restaurant
UPDATE dbo.Restaurant
SET Seats = SUBSTRING(Seats, 1, (@Seat - 1) / 8) + CAST(@Bit | @Block AS BINARY(1)) + SUBSTRING(Seats, 2 + (@Seat - 1) / 8, 124 - (@Seat - 1) / 8)
GO
-- Create procedure for handling seat clearance
CREATE PROCEDURE dbo.spClearSeat
(
@Seat SMALLINT
)
AS
DECLARE @Block TINYINT,
@Bit TINYINT
SELECT @Block = SUBSTRING(Seats, 1 + @Seat / 8, 1),
@Bit = CAST(255 AS TINYINT) ^ POWER(2, (@Seat - 1) % 8)
FROM dbo.Restaurant
UPDATE dbo.Restaurant
SET Seats = SUBSTRING(Seats, 1, (@Seat - 1) / 8) + CAST(@Bit & @Block AS BINARY(1)) + SUBSTRING(Seats, 2 + (@Seat - 1) / 8, 124 - (@Seat - 1) / 8)
GO
-- Create tally view
CREATE VIEW dbo.vwNums
AS
WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY c) - 1 AS Number FROM L3)
SELECT TOP(125)
Number
FROM Nums
ORDER BY Number
GO
-- Create view CurrentSeatings
CREATE VIEW dbo.vwCurrentSeatings
AS
SELECT 8 * v.number + b.number + 1 AS Seat,
SIGN(SUBSTRING(Seats, 1 + v.number, 1) & POWER(2, b.number)) AS Taken
FROM dbo.Restaurant AS s
INNER JOIN dbo.vwNums AS v ON v.Number BETWEEN 0 AND 124
INNER JOIN dbo.vwNums AS b ON b.Number BETWEEN 0 AND 7
GO
-- Create the available seats sequence view
CREATE VIEW dbo.vwAvailableSeats
AS
WITH cteSource(Seat, Taken, grp)
AS (
SELECT Seat,
Taken,
Seat - ROW_NUMBER() OVER (PARTITION BY Taken ORDER BY Seat) AS grp
FROM dbo.vwCurrentSeatings
)
SELECT ROW_NUMBER() OVER (ORDER BY grp) AS Sequence,
MIN(Seat) AS FromSeat,
MAX(Seat) AS ToSeat
FROM cteSource
WHERE Taken = 0
GROUP BY grp
GO
-- Display the wanted result
SELECT Sequence,
FromSeat,
ToSeat
FROM dbo.vwAvailableSeats
ORDER BY Sequence
If you are interested in finding out the largest SQL projects in the world, you should check out this PowerPoint presentation. It’s from Kevin Cox of SQL CAT at Microsoft.
SQL Saturday is coming to my hometown, San Diego, on September 18th, 2010. I have submitted my session and hope that it gets approved. Let me know if anyone is attending the event in San Diego. If you aren’t in Southern California, check their site for the location nearest you.
A while back I wrote a CLR object to track database growth. The CLR object queries sp_databases for each passed in server and saves the data into a table. By using this CLR object, I can track all of my systems in one location. I blogged about this tool in this article. A few weeks ago, I noticed that my SQL Agent job that runs this CLR object was failing. It was only failing on one of my database servers, so I ran sp_databases on that system to figure out what was going on. I noticed that for one database, the DATABASE_SIZE column was NULL. After doing some digging, I realized that sp_databases has a bug in it for any database that is 2TB in size or greater. Sp_databases is using int data type for the DATABASE_SIZE column. Well that’s not big enough for a 2+TB database! A bigint should have been used instead. For backward compatibility reasons, Microsoft decided to display NULL for this situation rather than an error in SQL Server 2005 and 2008. In SQL Server 2000, it throws an error. I found out that there is no bug fix for this, so I had two options. The first option would be to manually patch sp_databases on each of the servers. The second option would be to patch my CLR object. If I patched my CLR object, I could either create a new stored procedure, perhaps name it sp_databases2, or put the query directly into the C# code. I didn’t like the option of patching sp_databases on potentially hundreds of servers even though I could easily deploy it using a batch file that calls sqlcmd. I don’t like the idea of modifying Microsoft’s code even though it’s a simple bug fix. With the second option of patching my CLR object, I didn’t like the idea of creating a new stored procedure as that adds a dependency to my tool. So I instead decided to patch the CLR object by putting the correct query directly in the code. You can download the new version here. Please reference the old article for how to call it, and especially how to call it for all of your servers. The old article also links to the new version of the code, since I didn’t change functionality. Here’s the query in sp_databases that has a bug: select
DATABASE_NAME = db_name(s_mf.database_id),
DATABASE_SIZE = convert(int,
case -- more than 2TB(maxint) worth of pages (by 8K each) can not fit an int...
when convert(bigint, sum(s_mf.size)) >= 268435456
then null
else sum(s_mf.size)*8 -- Convert from 8192 byte pages to Kb
end),
REMARKS = convert(varchar(254),null)
from
sys.master_files s_mf
where
s_mf.state = 0 and -- ONLINE
has_dbaccess(db_name(s_mf.database_id)) = 1 -- Only look at databases to which we have access
group by s_mf.database_id
order by 1
And here’s the bug fix:
select
DATABASE_NAME = db_name(s_mf.database_id),
DATABASE_SIZE = convert(bigint, convert(bigint, sum(s_mf.size))*8),
REMARKS = convert(varchar(254),null)
from sys.master_files s_mf
where
s_mf.state = 0 and
has_dbaccess(db_name(s_mf.database_id)) = 1
group by s_mf.database_id
order by 1
I regularly move logins between servers. Mostly this is between production and our DR site. I’ve used the code in KB246133 many, many times but it’s pretty limited. I started with that and wrote the script below. - There is a user-defined function that’s created in master. You can create it in any database you want but you’ll need to update the script. The function converts varbinary hashed passwords to a string representation.
- It keeps the password intact for SQL Server logins.
- It scripts both Windows logins and SQL Server logins. It also scripts role membership.
- It keeps the SID intact for SQL Server logins. This is important so you don’t have to remap users to logins.
- The script that is generated uses IF NOT EXISTS so that it doesn’t try to create logins that already exist.
- It DOES NOT handle removal of logins from roles. It does handle disabled accounts but I haven’t done much testing on that.
- I’ve tested this on SQL Server 2005 and SQL Server 2008.
- You’ll probably need to change your results so that you display more characters by default. Under Tools –> Options –> Query Results –> SQL Server –> Results to Text increase the maximum number of characters returned to 8192 (or a number high enough that the results aren’t truncated). You’ll want to set results to text before running this.
USE [master] GO
/****** Object: UserDefinedFunction [dbo].[fn_hexadecimal] ****/ SET ANSI_NULLS ON GO
SET QUOTED_IDENTIFIER ON GO
CREATE FUNCTION [dbo].[fn_hexadecimal] ( -- Add the parameters for the function here @binvalue varbinary(256) ) RETURNS VARCHAR(256) AS BEGIN
DECLARE @charvalue varchar(256) DECLARE @i int DECLARE @length int DECLARE @hexstring char(16) SELECT @charvalue = '0x' SELECT @i = 1 SELECT @length = DATALENGTH (@binvalue) SELECT @hexstring = '0123456789ABCDEF' WHILE (@i <= @length) BEGIN DECLARE @tempint int DECLARE @firstint int DECLARE @secondint int SELECT @tempint = CONVERT(int, SUBSTRING(@binvalue,@i,1)) SELECT @firstint = FLOOR(@tempint/16) SELECT @secondint = @tempint - (@firstint*16) SELECT @charvalue = @charvalue + SUBSTRING(@hexstring, @firstint+1, 1) + SUBSTRING(@hexstring, @secondint+1, 1) SELECT @i = @i + 1 END return @charvalue
END GO
SET NOCOUNT ON GO --use MASTER GO PRINT '-----------------------------------------------------------------------------' PRINT '-- Script created on ' + CAST(GETDATE() AS VARCHAR(100)) PRINT '-----------------------------------------------------------------------------' PRINT '' PRINT '-----------------------------------------------------------------------------' PRINT '-- Create the windows logins' PRINT '-----------------------------------------------------------------------------' SELECT 'IF NOT EXISTS (SELECT * FROM master.sys.server_principals WHERE [name] = ''' + [name] + ''') CREATE LOGIN [' + [name] + '] FROM WINDOWS WITH DEFAULT_DATABASE=[' + default_database_name + '], DEFAULT_LANGUAGE=[us_english] GO
' FROM master.sys.server_principals where type_desc In ('WINDOWS_GROUP', 'WINDOWS_LOGIN') AND [name] not like 'BUILTIN%' and [NAME] not like 'NT AUTHORITY%' and [name] not like '%\SQLServer%' GO
PRINT '-----------------------------------------------------------------------------' PRINT '-- Create the SQL Logins' PRINT '-----------------------------------------------------------------------------' select 'IF NOT EXISTS (SELECT * FROM master.sys.sql_logins WHERE [name] = ''' + [name] + ''') CREATE LOGIN [' + [name] + '] WITH PASSWORD=' + [master].[dbo].[fn_hexadecimal](password_hash) + ' HASHED, SID = ' + [master].[dbo].[fn_hexadecimal]([sid]) + ', DEFAULT_DATABASE=[' + default_database_name + '], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=' + CASE WHEN is_expiration_checked = 1 THEN 'ON' ELSE 'OFF' END + ', CHECK_POLICY=OFF GO IF EXISTS (SELECT * FROM master.sys.sql_logins WHERE [name] = ''' + [name] + ''') ALTER LOGIN [' + [name] + '] WITH CHECK_EXPIRATION=' + CASE WHEN is_expiration_checked = 1 THEN 'ON' ELSE 'OFF' END + ', CHECK_POLICY=' + CASE WHEN is_policy_checked = 1 THEN 'ON' ELSE 'OFF' END + ' GO
' --[name], [sid] , password_hash from master.sys.sql_logins where type_desc = 'SQL_LOGIN' and [name] not in ('sa', 'guest')
PRINT '-----------------------------------------------------------------------------' PRINT '-- Disable any logins' PRINT '-----------------------------------------------------------------------------' SELECT 'ALTER LOGIN [' + [name] + '] DISABLE GO ' from master.sys.server_principals where is_disabled = 1
PRINT '-----------------------------------------------------------------------------' PRINT '-- Assign groups' PRINT '-----------------------------------------------------------------------------' select 'EXEC master..sp_addsrvrolemember @loginame = N''' + l.name + ''', @rolename = N''' + r.name + ''' GO
' from master.sys.server_role_members rm join master.sys.server_principals r on r.principal_id = rm.role_principal_id join master.sys.server_principals l on l.principal_id = rm.member_principal_id where l.[name] not in ('sa') AND l.[name] not like 'BUILTIN%' and l.[NAME] not like 'NT AUTHORITY%' and l.[name] not like '%\SQLServer%'
Today, let's examine encoding with SQL Server and XML datatype.
DECLARE @Inf XML
SET @Inf = '<?xml version="1.0" encoding="utf-16"?>
<root>
<names>
<name>test</name>
</names>
<names>
<name>test1</name>
</names>
</root>
'
SELECT x.value('name[1]', 'VARCHAR(10)') AS Name
FROM @Inf.nodes('/root/names') AS t(x)
If you try to run the code above, you will get an error message like this
Msg 9402, Level 16, State 1, Line 3
XML parsing: line 1, character 39, unable to switch the encoding
Why is that? If you change the encoding to UTF-8, the code works.
The solution is to know that UTF-16 works like UNICODE, and how do we denote UNICODE strings in SQL Server? Yes, by prefixing the string with N.
So this code works with UTF-16 encoding and you can happily continue to work.
DECLARE @Inf XML
SET @Inf = N'<?xml version="1.0" encoding="utf-16"?>
<root>
<names>
<name>test</name>
</names>
<names>
<name>test1</name>
</names>
</root>
'
SELECT x.value('name[1]', 'VARCHAR(10)') AS Name
FROM @Inf.nodes('/root/names') AS t(x)
This question has been asked over and over again, and instead of having to redirect to any of my previous answers, I will cover the solution here in my blog.
Consider this sample data
DECLARE @Sample TABLE
(
ID INT,
Data VARCHAR(100)
)
INSERT @Sample
VALUES (1, 'Peso & Performance SQL'),
(1, 'MVP'),
(2, 'Need help <? /> -- '),
(2, 'With XML string concatenation ?')
The "trick" is to use the TYPE directive (to deal with entitization), and then use ".value" function to get the correct value out.
So here is the final query.
SELECT i.ID,
STUFF(f.Content.value('.', 'VARCHAR(MAX)'), 1, 1, '')
FROM (
SELECT ID
FROM @Sample
GROUP BY ID
) AS i
CROSS APPLY (
SELECT ', ' + w.Data
FROM @Sample AS w
WHERE w.ID = i.ID
FOR XML PATH(''),
TYPE
) AS f(Content)
And to deal with characters having ascii values less than 32 (space), you can use this
;WITH cteSource(ID, Content)
AS (
SELECT i.ID,
f.Content.value('.', 'NVARCHAR(MAX)')
FROM (
SELECT ID
FROM @Sample
GROUP BY ID
) AS i
CROSS APPLY (
SELECT CAST(', ' + w.Data AS VARBINARY(MAX))
FROM @Sample AS w
WHERE w.ID = i.ID
FOR XML PATH(''),
TYPE
) AS f(Content)
)
SELECT ID,
STUFF(CAST(Content AS NVARCHAR(MAX)), 1, 2, '') AS Content
FROM (
SELECT ID,
CAST(N'' AS XML).value('xs:base64Binary(sql:column("Content"))', 'VARBINARY(MAX)') AS Content
FROM cteSource
) AS d
Last November, I blogged about a weird bug with SQL Server 2005 on a Windows 2008 cluster. We were having issues with Database Mail and other things and learned that it was due to the server names being in lower case. To workaround the issue, we had to follow the steps outlined here as there was no bug fix at the time. I’ve now learned that this bug has been fixed! The fix is included in cumulative update package 9 for SQL Server 2005 service pack 3. Check it out here.
I got an email from Mr Celko and he correctly stated that my previous solution was not truly working with sets, so he posted me some solutions and proper sample data.
With this information at my hand, I started to investigate what really is needed to get this Relational Division to work properly with two sets; Dividend and Divisor.
Some of you know me well, and know I am not satisfied with just solving the problem. There have to be some tweaks, and I did that too with this solution. Not only is it only touching the Dividend table once and Divisor table once, you can also set if you want a division with no remainder (which means all records in Divisor should match and not a single record more), or allow a division with remainder (which means all the records should match and maybe more records).
Great? Just set 1 for "No remainder" and 0 for "Allow remainder".
Simple as that. So why does it work? Remember your old algebra? "Divide is the same thing as multiply with the inverse number..."
Now for the sample data (courtesy of Mr Celko)
CREATE TABLE dbo.Dividend
(
group_id INTEGER NOT NULL,
item_name VARCHAR(10) NOT NULL,
PRIMARY KEY (
group_id,
item_name
)
)
INSERT INTO dbo.Dividend
(
group_id,
item_name
)
VALUES (1, 'one'),
(1, 'two'),
(1, 'three'),
(1, 'four'),
(2, 'one'),
(2, 'two'),
(2, 'three'),
(3, 'one'),
(3, 'two')
CREATE TABLE dbo.Divisor
(
item_name VARCHAR(10) NOT NULL PRIMARY KEY
)
INSERT INTO dbo.Divisor
(
item_name
)
VALUES ('one'),
('two'),
('three')
Now for the 4 solutions posted by Mr Celko
-- Celko 1
SELECT D1.group_id
FROM Dividend AS D1
WHERE D1.item_name IN (SELECT item_name FROM Divisor)
GROUP BY D1.group_id
HAVING COUNT(DISTINCT D1.item_name) = (SELECT COUNT(*) FROM Divisor)
AND COUNT(DISTINCT D1.item_name) = (SELECT COUNT(*) FROM Dividend AS D2 WHERE D2.group_id = D1.group_id)
-- Celko 2
SELECT D1.group_id
FROM Dividend AS D1
WHERE D1.item_name IN (SELECT item_name FROM Divisor)
AND NOT EXISTS (
SELECT *
FROM Dividend AS D2
WHERE D2.group_id = D1.group_id
AND D2.item_name NOT IN (SELECT item_name FROM Divisor)
)
GROUP BY D1.group_id
HAVING COUNT(DISTINCT D1.item_name) = (SELECT COUNT(*) FROM Divisor)
-- Celko 3
SELECT D1.group_id
FROM (
SELECT group_id,
item_name,
COUNT(*) OVER (PARTITION BY group_id) AS cnt
FROM Dividend
) AS D1
WHERE D1.item_name IN (SELECT item_name FROM Divisor)
AND cnt = (SELECT COUNT(*) FROM Divisor)
GROUP BY D1.group_id
HAVING COUNT(D1.cnt) = (SELECT COUNT(*) FROM Divisor)
--Celko 4
;WITH Divisor2
AS (
SELECT group_id,
MIN(CASE WHEN item_name IN (SELECT item_name FROM Divisor) THEN 1 ELSE 0 END) OVER(PARTITION BY group_id) AS single,
SUM(CASE WHEN item_name IN (SELECT item_name FROM Divisor) THEN 1 ELSE 0 END) OVER(PARTITION BY group_id) AS full_basket
FROM Dividend
)
SELECT D.group_id
FROM Dividend AS D,
Divisor2
WHERE D.group_id = Divisor2.group_id
AND Divisor2.single = 1
AND Divisor2.full_basket = (SELECT COUNT(*) FROM Divisor)
GROUP BY D.group_id
You can copy and paste the code to a query window and run them. Investigate the execution plan and compare the 4 of them.
And now to my solution.
-- Peso 1
SELECT group_id
FROM (
SELECT t.group_id,
SUM(CASE WHEN t.item_name = n.item_name THEN 1 ELSE 0 END) AS cnt,
COUNT(*) AS Items
FROM dbo.Dividend AS t
CROSS JOIN dbo.Divisor AS n
GROUP BY t.group_id,
t.item_name
) AS d
GROUP BY group_id
HAVING SUM(cnt) = MIN(Items)
AND MIN(cnt) >= 1 -- 1 means no remainder, 0 means remainder
After some challenging with MVP Adam Machanic, here is another version
-- Peso v2
SELECT t.group_id
FROM (
SELECT group_id,
COUNT(*) AS cnt
FROM dbo.Dividend
GROUP BY group_id
) AS kc
INNER JOIN (
SELECT COUNT(*) AS cnt
FROM dbo.Divisor
) AS nc ON nc.cnt = kc.cnt
INNER JOIN dbo.Dividend AS t ON t.group_id = kc.group_id
INNER JOIN dbo.Divisor AS n ON n.item_name = t.item_name
GROUP BY t.group_id
HAVING COUNT(*) = MIN(nc.cnt)
Here is an algorithm (exact division) which is really fast, but not 100% accurate due to the implementation of CHECKSUM (see http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=70832).
-- Peso 3
SELECT group_id
FROM (
SELECT group_id,
CHECKSUM_AGG(CHECKSUM(item_name)) AS ca1,
CHECKSUM_AGG(CHECKSUM(REVERSE(item_name))) AS ca2
FROM dbo.Dividend
GROUP BY group_id
) AS t
INNER JOIN (
SELECT CHECKSUM_AGG(CHECKSUM(item_name)) AS ca1,
CHECKSUM_AGG(CHECKSUM(REVERSE(item_name))) AS ca2
FROM dbo.Divisor
) AS n ON n.ca1 = t.ca1
AND n.ca2 = t.ca2
Now copy my solutions and compare them to the other 4.
-- Celko 1
Table 'Dividend'. Scan count 3, logical reads 6.
Table 'Divisor'. Scan count 2, logical reads 20.
-- Celko 2
Table 'Divisor'. Scan count 3, logical reads 32.
Table 'Dividend'. Scan count 4, logical reads 8.
-- Celko 3
Table 'Divisor'. Scan count 3, logical reads 10.
Table 'Worktable'. Scan count 3, logical reads 31.
Table 'Dividend'. Scan count 1, logical reads 2.
-- Celko 4
Table 'Dividend'. Scan count 2, logical reads 9.
Table 'Worktable'. Scan count 3, logical reads 31.
Table 'Divisor'. Scan count 1, logical reads 38.
-- Peso 1
Table 'Divisor'. Scan count 1, logical reads 19.
Table 'Dividend'. Scan count 1, logical reads 2.
-- Peso 2
Table 'Divisor'. Scan count 1, logical reads 8.
Table 'Dividend'. Scan count 2, logical reads 4.
-- Peso 3
Table 'Divisor'. Scan count 1, logical reads 2.
Table 'Dividend'. Scan count 1, logical reads 2.
So it seems my solution is cleaner and faster than the previous existing. But the best thing is yet hidden. My solution cares for multi-column division (just expand the CASE and GROUP BY clauses) whereas the previous 4 do not. Well, not easily anyway.
It will involve some replacing for IN with EXISTS, and some string concatenation for the DISTINCT clauses.
//Peso
As of today I’m the third Slovenian SQL Server MVP.
Thanx to all who nominated me!
Let’s see how this year goes and i’m sure it’ll be a blast, but most importantly:
See you all at the MVP Summit next year! :))
As a follow-up to my previous post Odd Profiler Results with EF4, I have now logged a SQL Server bug to Microsoft Connect. If you have similar concerns, I encourage you to logon to Connect and vote it up. If you have a solution, I encourage you to reply to my blog, or respond to my still unanswered questions on the ASP.NET Forums or Stack Overflow or Server Fault. I will happily mark your answer as correct (assuming that it is).
|