Jeff Smith Blog

Random Thoughts & Cartesian Products with Microsoft SQL Server

How to calculate Median in SQL Server

Nothing earth-shattering here, I was just helping out a colleague with this so I thought I'd post up the example I gave him. -- sample table:create table People(Person varchar(1) primary key,City varchar(10),Age int)go-- with some sample data: insert into People select 'A','Boston',23 union all -- odd #select 'B','Boston',43 union allselect 'C','Boston',29 union allselect 'D','Chicago',15 union all -- single #select 'E','NY',12 union all -- even #select 'F','NY',55 union allselect 'G','NY',57 union allselect 'H','NY',61go-- here's our query, showing median age per city: select city,AVG(age) as MedianAgefrom (select City, Person, Age, ROW_NUMBER() over (partition by City order by Age ASC) as AgeRank,COUNT(*) over (partition by City) as CityCountfromPeople) xwherex. Read more →

UNPIVOT: Normalizing data on the fly

Everyone seems to want to "pivot" or "cross tab" data, but knowing how to do the opposite is equally important. In fact, I would argue that the skill of "unpivoting" data is more useful and more important and more relevant to a SQL programmer, since pivoting results in denormalized data, while unpivoting can transform non-normalized data into a normalized result set. Read more →

Top N Percent per Group

Here's a good question in the feedback from my post about using the T-SQL 2005 features to return the Top N per Group of a result set: Sani writes: What about Top n Percent per Group? Read more →

Implementing Table Inheritance with SQL Server

I have a new article up at SQLTeam: Implementing Table Inheritance with Sql Server It discusses the situation where you have multiple entities that are distinct, yet they have many attributes or relations in common. Read more →

Date Only and Time Only data types in SQL Server 2005 (without the CLR)

In this post, I showed a simple way to create simple but useful Date and Time user-defined data types in SQL Server 2000. Here's how to do it in SQL Server 2005, without the need for CLR types or anything fancy. Read more →

Taking a look at CROSS APPLY

Applying a Sub-Query, Joining a Derived Table … I think the easiest way to think of CROSS APPLY is that it is like doing a CROSS JOIN with a correlated sub-query instead of a derived table. Read more →

There's two interesting posts over at the MSDN blogs from Craig Freedman about the new PIVOT operator in SQL 2005. First, he gives a nice overview of the operator and how to use it, and then he follows it up with a performance analysis as well. Read more →

SQL Server 2005: CROSS APPLY

If you are using SQL 2005 and find User Defined Functions helpful, be sure to read my new article over at SQLTeam: Returning Complex Data from User-Defined Functions with CROSS APPLY Read more →

SQL Server 2005: Specifying Partitions for Aggregate Functions

Did you know that a new feature in SQL Server 2005 allows you to specify an OVER partition for aggregate functions in your SELECT statement?  For example: Read more →

SQL Server 2005: Using EXCEPT and INTERSECT to compare tables

Previously, I had written that UNION ALL (combined with a GROUP BY) is a really quick and easy way to compare two tables. You don't need to worry about NULLS, the code is fairly short and easy to follow, and you can view exceptions from both tables at the same time. Read more →

Regular Expression Replace in SQL 2005 (via the CLR)

I had to do some data clean up the other day, and really needed some regular expression replacements to do the job. Since .NET has a great RegularExpressions namespace, and since SQL 2005 allows you to integrate . Read more →

More with SQL Server 2005 : Top n Per Group, Paging, and Common Table Expressions

I previously wrote about a few of the new features in SQL 2005 and how they can be used to solve some old "classic" SQL problems very easily, and I thought I'd briefly discuss a few more. Read more →

SQL Server 2005: Using PARTITION and RANK in your criteria

The RANK and PARTITION features in 2005 are simply amazing. They make so many "classic" SQL problems very easy to solve. For example, consider the following table: create table Batting(Player varchar(10), Year int, Team varchar(10), HomeRuns int, primary key(Player,Year))insert into Battingselect 'A',2001,'Red Sox',13 union allselect 'A',2002,'Red Sox',23 union allselect 'A',2003,'Red Sox',19 union allselect 'A',2004,'Red Sox',14 union allselect 'A',2005,'Red Sox',11 union allselect 'B',2001,'Yankees',42 union allselect 'B',2002,'Yankees',39 union allselect 'B',2003,'Yankees',42 union allselect 'B',2004,'Yankees',29 union allselect 'C',2002,'Yankees',2 union allselect 'C',2003,'Yankees',3 union allselect 'C',2004,'Red Sox',6 union allselect 'C',2005,'Red Sox',9 Suppose we would like to find out which year each player hit their most home runs, and which team they played for. Read more →