Thursday, October 09, 2008
#

I have been working with Microsoft SQL Server for close to ten years now if you include my college days (yes, my college used SQL Server 7.0 for learning RDBMS purposes). I have literally built my entire career around Microsoft SQL Server and have used the product for development, administration, and now for building DW/BI solutions. Around the 2002 timeframe I became aware of this mysterious elite group of folks called Microsoft MVPs. I perceived that MVPs were the best of the best and thus by winning the award you would prove you were one of the best. On October 1st, 2008 I was informed of being awarded a SQL Server MVP and within just eight days of winning I believe I have grasped the award’s true meaning.
MVPs are technical community leaders that are highly proficient in their area of expertise and most importantly continuously give back to their respective communities. The Microsoft MVP tagline is ‘Independent Experts. Real World Answers’. To me, it is the giving dimension of the award that is most important. I believe that if someone was good (but not great) yet he or she continuously shared the knowledge they did have with their respective communities they would have a much better chance at winning the award as opposed to someone who is very strong technically yet never helps others in the community. Like most things in life, only once you’ve been there do you truly understand a context’s meaning. Finally, I give thanks to the Microsoft MVP program and my MVP lead for the award.
Derek Comingore
SQL Server MVP
Sunday, August 17, 2008
#
Business Intelligence projects (like software development projects) are prone to bad or completely absent planning prior to their start. Most of us that work in the technology industries are busy professionals that can never seem to find enough time. When you consider that mindset and then the fact that project planning requires additional time...well you get the point. By our very nature we do not want to waste time!
BI projects are large endeavors, rather BI enterprise projects are large endeavors. Personally, I have been fortunate enough to plan, design, and implement several enterprise systems over the years with some of the best BI professionals. And every time I was glad that my team made a plan!
'Scope Creep' is just as common in BI projects as it is in software development. It is easy to not remain focused through the entire lifecycle of an enterprise BI project. The plan can also serve as a deliverable schedule for your project sponsors (usually executives). In a consulting engagement scenario the importance of planning is of even greater importance as the entire project will be based on deliverables and their scheduling.
While I agree that we hate to waste time, I can tell you that proper planning (for any project) is NOT a waste of time.
Wednesday, August 06, 2008
#
My friends over at SQL Server Magazine posted this today...SQL Server 2008 RTM!
http://www.sqlmag.com/Article/ArticleID/99953/99953.html
Microsoft continues to up the 'ante' in the Business Intelligence space, cheers.
Update:
Official Press Release:
http://www.microsoft.com/Presspass/press/2008/aug08/08-06SQLServer2008PR.mspx
Friday, August 01, 2008
#
One of my favorite things to do each month is write SQL Server Magazine's Essential BI column. I write two posts for the column each month. About a month ago I started to focus on SharePoint for Business Intelligence (BI) topics. For those of you who don't know, SharePoint is Microsoft's BI portal platform.
If you have ever seen a Microsoft BI slide deck then you probably know of the ‘hamburger slide’. The ‘hamburger slide’ is a common slide which shows that SharePoint is the outer-most layer in the Microsoft BI stack. Below is the slide, notice how SharePoint Server is at the top with its various BI capabilities in the tabs:

My last two posts for the Essential BI column has been on SharePoint for BI topics. First, I wrote about SharePoint's overall key BI features including integration points with the other pieces of Microsoft BI software. Next, I created a piece on how you can use Windows SharePoint Services (WSS) 3.0 for BI. A lot of companies have access to WSS 3.0 via Windows Server licenses but do not have the larger Microsoft Office SharePoint Portal Server (MOSS) edition.
From here I have some additional topics, the first of which I have just begun to write:
· MOSS BI: Data Connection Libraries
· MOSS BI: Native KPIs/Dashboards/Filter WebParts
· MOSS BI: Enterprise Search for BI/Business Data Catalog (BDC)
*Excel Services was recently written about quite well by Paul Vaughn in this month’s edition of SQL Server Magazine (August 2008).
Where would like to see additional content on SharePoint BI? What SharePoint BI topics are important to you that I have not yet covered? Send me an email at dcomingore@scalabilityexperts.com or feel free to leave a comment on this blog.
Monday, July 21, 2008
#
A common question I hear from my clients is “how much storage do we need?” Now wait a second, I thought storage was cheap these days right? Wrong, consumer storage is cheap; enterprise storage is still not what I would classify as cheap. Add to this fact that Enterprise Data Warehouses (EDWs) commonly break into the several terabytes range and the matter of planning storage capacities becomes greater. Consulting firms that simply ignore this common client question are missing a value-add scenario. Your clients need to know how much space to plan for and if you simply say “roughly X TB” you are creating two net results. The first is that the client may purchase too much storage or may not purchase enough (very bad). Second, the client will have no idea as to how much storage to plan for in the coming years. Do both yourself and the client a favor; research the storage requirements of the BI solution you have proposed!
Estimating Data Warehouse Storage
To estimate a data warehouse’s storage requirements have a dimensional model designed in any number of tools including Visio, Erwin, or Excel. The main point is to have the actual data types required and their storage affecting settings such as variable length string columns designated. When estimating your data warehouse storage requirements make sure you assume the data types have their max allowed values being used for every record. The thinking is that it’s better to overestimate then under when it comes to storage (to a degree).
Now, obtain from the client how many records they believe will be processed and stored in the data warehoused based upon business requirements. Note how the storage requirements of fact tables are directly related to their grain definition. If your fact tables must store data by day you will certainly require more storage than if you only need to store the various measures by week. This is why it’s very important to have a solid dimensional model phase completed prior to entering other data warehousing tasks such as storage estimates. At this point you should instantiate the dimensional model in a SQL Server relational database.
Populate your schema with a sample percentage of the client’s estimates. For example, if the client has said that you can expect one million records per month and you must retain three years worth of data then you can expect to hold a total of around 36 million records. Based upon the estimates populate your fact tables with 1, 5, 10, or even 25 percent of the total records expected to be retained. Rebuild your fact table’s indexes. Finally, obtain the data and index spaced used by your fact tables and multiply those values according to the percentage of sampling you used.
Estimating Analysis Services Cube Storage
Clients not only need to know how much storage to obtain for the relational dimensional model but also the cubes (multidimensional objects). Estimating cube storage requirements is a more difficult task; however the same basic logic applies. Design and build your sample cubes as close as you can to how they will be built in production. The most important settings for estimating cube storage are the partition aggregations & storage mode settings. In a development environment you will probably only have a single partition, so there is an error of margin when estimating cube storage requirements since you will likely have multiple partitions with various storage settings in a production environment. Finally, perform a full process on the Analysis Services database once you are done creating the sample cubes and their related settings.
ROLAP & HOLAP w/0% Aggregations incur zero storage overhead excluding the cube’s metadata. If you apply aggregations to the HOLAP storage mode the aggregations will occur in the partition files (cube), thus consuming space. If you apply aggregations to the ROLAP storage mode, indexed views are created (and populated) in the underlying relational database (which means ROLAP with aggregations does in fact incur some storage overhead). MOLAP stores both grain data as well as aggregations in the partition files (cube). Finally, remote partitions can be used for placing less frequently access data onto cheaper disks.
With your sample cubes built and processed you can begin the cube estimation process. Obtain the sample cubes’ partition(s) estimated size in Management studio. Next, multiply the partition storage estimates according to the percentage of sampling you used (just like we did in the fact table estimates). Note that this method assumes both the storage mode and the aggregations are consistent throughout each cube (which is fairly unlikely in production environments). I spoke about the storage mode margin of error earlier but I should also mention that the % aggregations effect cube partition storage estimates as well. In the real world aggregation designs change, however for estimation purposes it’s easiest to assume a static setting of X%.
By using some variation of the methods mentioned above you will have a solid foundation for providing accurate estimates of storage required for both dimensional data warehouses as well as their associated cubes.
Monday, July 07, 2008
#
As a follow up to some recent feedback (actually a question) here is some more in depth information on becoming proficient in the Microsoft Data Warehousing/Business Intelligence stack.
Get the terms straight
Business Intelligence or BI is an umbrella term for various business processes, tools, and methodologies which support improved organizational decision making. Data Warehousing (and thus data warehouses) on the other hand are organizational repositories designed to facilitate highly scalable reporting & analytics. DW and BI go together like PB & J (ah the good ol' days). So is Data Warehousing synonymous with BI...NO! Data Warehousing is a common solution (due to its many advantages) implemented to enable BI in an organization....so when you see DW/BI you will now understand that a) DW/BI doesn’t mean they represent the same thing and b) that DW and BI merely complement each other very well thus the term's existence.
Now, understand before ever cracking open a BI book (or scouring the web all day for free information, who would do such a thing?) that REPORTING <> BI. The BI industry is plagued with terminology problems! So, DW <> BI and Reporting <> BI. And like DW, Reporting is merely one component of the larger field of BI.
Personal Microsoft Business Intelligence Professional Learning Plan
Understand before I dive into this that the words to follow are from a DW/BI architect/engineer and not a fulltime trainer. Actually, I wrote a seperate post on the pros/cons of fulltime trainers prior on this blog here. So with that disclaimer out of the way here are the topics in chronological order I would propose to anyone wishing to learn how to implement DW/BI systems on the Microsoft platform.
- Foundations (some of these topics overlap to various degrees)
- Data Warehousing (DW) Fundamentals (Normalized DW Vs Denormalized DW)
- Extract, Transform, and Load (ETL) Fundamentals
- Dimensional Modeling Fundamentals
- OnLine Analytical Processing (OLAP) Fundamentals
- Data Mining Fundamentals (I have more to learn here as well...in fact most do)
- Brief History of DSS (decision support systems) *You could argue you don't need this knowledge and in so far as getting the "job done" you don't.
- DW/BI Methodologies
- Kimball (www.kimballgroup.com) is pretty much the defacto standard for HOW to build an enterprise data warehouse on the Microsoft Platform. I personally have not read others work but do know of them and by all means use what ever methodology fits you and your DW/BI projects best.
- Microsoft DW/BI Tools (now to the technology)
- SQL Server 2005 (including Analysis/Reporting/Integration Services)
- Microsoft Office 2007 Suite (namely SharePoint, PerformancePoint & Excel)
Best "WWW Stops" (in no particular order)
What you see is what you get
So thats my list (and best "stops") for now. Those of you new to the Microsoft platform may laugh that there are only a handful of products to learn but be forewarned that both SQL Server and the Office products are VERY LARGE offerings. SharePoint & SQL Server in particular can (and often do) fill bookshelves worth of valuable content.
I hope that helps everyone out there who is considering learning more and/or making a career in the MSFT DW/BI space. There is a lot to learn, however you will find that having a solid background in databases (and a bit of programming) goes a long way in learning the actual tools. It also seems to be true according to the TDWI (www.tdwi.org) that most DW/BI Professionals in general (not just those using the Microsoft platform) have lived a "past life" in another part of IT (usually dev/dba). Most people do not go directly into the field of DW/BI (which I personally 100% support)!
Thursday, July 03, 2008
#
Hope everyone is doing well out there. A good friend of mine across the pond mentioned that he has started to read a lot of my content across the blogs, my SQL Magazine Articles/BI Column, etc. His feedback was excellent and relates to a general perception issue with the field of BI. "They are highly technology based" was his comment in so many words...
First a few clarifications are in order, writing for SQL Magazine means you will be discussing the actual data platform called SQL Server at some length. This blog and other places though I could write more BI neurtral content. And thus I have created a new post category called "Business Intelligence Nuggets" (which this post is the first in).
The perception issue is that Business Intelligence is all about technology! Business Intelligence is 100% about business, THE TOOLS ARE SIMPLY ENABLERS. A well thought out dimensional model will work well on any database platform. A solid ETL design will run great on any ETL platform it is implemented in. The larger point is, a great BI architect will be good (and thus his designs) on any platform or toolset.
There are layers to any BI solution nomatter how or what it is constructed of. First is the Business and it's logic/requirements. Second, is design and methodologies...the words Kimball & Inmon come to mind. Third, are the tools...in my case SQL Server, Analysis Services, Reporting Services, and Integration Services.
Tools (SQL Server etc.)
Design (Design Patterns, Best Practices, & Methodologies)
Business Requirements (What needs to be measured?)
Sunday, June 15, 2008
#

Well, the new SQL Server 2008 build Release Candidate 0 (RC0) has been available for a bit now and I have had some time with the new build. The word "build" brings up a good point actually, before we go any further I want to review the last few SQL Server 2008 version numbers. You can retrieve your SQL Server 2008's database instance's version number using either @@Version or the serverproperty function, for more information see one of Euan's old posts here .The last few SQL Server 2008 versions are as follows:
- November 2007 CTP: 10.0.1075.23
- February 2008 CTP: 10.0.1300.13
- June 2008 RC0: 10.0.1442.32
So what is brand new in the RC0 release? Well, thus far I am not finding a lot but I believe this is because the product is now in RC status and thus much closer to the ship date! However, there are some new items I have come across and there are probably more. If you happen to know of a SQL Server 2008 RC0 feature that is not in my list please comment the post and I will add it.
*SQL Server 2008 Books Online (BOL) RC0 unfortunately is not explicit about what the new features are in the RC0 release. There were some components in the past SQL Server 2008 CTPs that were documented in this manner and I wish BOL would remain consistent in that documentation convention while the products are in CTP/RC status.
- Marketing
- We have a new logo for the 2008 product (see above), you can find more background information here.
- Tools
- In Management Studio, when we connect to ANY SQL Server server we are given an option to specify additional connection string parameters.
- Enhanced installation routine which also requires a new version of Windows Installer
- Enhanced installation Center utility
- New SSIS Connections Project which serves as an alternative to the Import/Export Wizard
- Report Designer Preview morphed into what is now known as Report Builder 2.0 and is part of the accompanying Feature Pack.
- New Activity Monitor (per Graz on SQLTeam, thanks!)
3. Samples
- Updated RC0 Sample Databases (www.codeplex.com)

Tuesday, June 10, 2008
#
Friday, June 06, 2008
#
With the impending arrival of Microsoft
SQL Server 2008 come several new enhancements. These enhancements can be categorized as new capabilities, old constraints removed, and improvements in scalability. In the SQL Server community (including this blog) you can find numerous amounts of information on the new features found in
SQL Server 2008…however you seldom see much on deprecated features. Deprecated features are abilities of the new release which are either being removed completely or are being replaced by a future build of the software product. Discontinued features on the other hand are software features removed in the new release of the product.
Before I dive into the ‘list’ I wanted to mention that
SQL Server 2008 includes a performance counter which will increment each time you use a deprecated feature. The counter is called
SQL Server:Deprecated Features. Plan accordingly prior to upgrading to
SQL Server 2008 and of course leverage the SQL Server Upgrade Advisor which is bundled with the product as well as Scalability Expert’s SQL Server Upgrade Assistant which can be found at
http://www.scalabilityexperts.com/default.asp?action=article&ID=43 .
Here is the deprecated & discontinued list by component, MSDN URLs:
SQL Server 2008 Reporting Services Discontinued Features