Geoff N. Hiten Blog

SQL Server thoughts, observations, and comments

Zunicide

Yes, I own a 30 GB Zune.  Yes, it crashed today.  Yes, I am unhappy.

Having worked in the computer industry for many years now, I have watched many companies deal with failed products.  Such failures are inevitable in an industry that gives the biggest rewards to the first implementation that is “good enough”.  More importantly, I have seen companies deal with failures in various ways.  Some handled things well; others no longer exist or have lost market dominance because they reacted poorly.  Here are a few observations that I am sure Microsoft will ignore, partly because you have to plan ahead for failure in order to handle it well.

This may sound strange, but data professionals who play in the High Availability space understand this at a near-instinctual level.  Every action we take has to have a failure plan as well as a success plan.  It is by limiting the duration and impact of failures that we can build and operate systems with very high service levels.

An essential part of any failure plan is communication.  Once the failure leaves the server room, you have to tell people what is going on.  I have seen this done poorly (SQL 2005 SP2) and done extremely well (SQL 2008 CTP Leap day bug).  The key is to have the communication plan in place before anything goes wrong.  You won’t have time to decide who to tell, much less find their contact information.  Go ahead and have a generic “Sorry, We Failed” web page with a place to write short notices and updates if that is appropriate.  Even if your service level agreement states you have four hours to restore service, telling people when the clock started counting and when they can expect service restoration is still important.  Don’t try to make things look better than they are.  Truth, no matter how painful, is preferable to silence or meaningless verbiage intended to deflect blame.
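
For the SLA clock in particular, the notice is simple enough to script ahead of time.  Here is a minimal Python sketch, not anything from a real incident toolkit: the function name outage_notice, the default four-hour window, the message wording, and the example start time are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def outage_notice(started_at, sla_hours=4, summary="Service is unavailable."):
    """Build a plain-text notice saying when the SLA clock started
    and when the SLA window runs out.

    The name, defaults, and wording are hypothetical, for illustration only.
    """
    deadline = started_at + timedelta(hours=sla_hours)
    return (
        f"{summary}\n"
        f"Outage began: {started_at:%Y-%m-%d %H:%M} UTC\n"
        f"Restoration expected by: {deadline:%Y-%m-%d %H:%M} UTC"
    )


# Illustrative start time, not taken from any real outage.
started = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
notice = outage_notice(started)
print(notice)
```

The point is not the code; it is that the template exists before the failure does, so the only things left to fill in under pressure are the start time and a one-line summary.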

Note that the Zune.net home page has no announcement.  The only information about the problem is on the Zune user help forums and on social network sites like Facebook and Twitter.  

Lesson:  Don’t let your user community lead the announcement effort.  Let them assist, but make sure you have official information for them to share.  There will still be speculation, but it won’t dominate the discussion.  As of this writing, the community has no official assurance that the Zune team is even aware that they have a problem.

For now, I will give them the benefit of the doubt and see how this plays out.  Unfortunately, Microsoft Zune is off to a poor start handling this problem.

Legacy Comments


Greg
2008-12-31
re: Zunicide
While it's not on their main home page, at least it's at the top of their main support page... http://www.zune.net/en-US/support/default.htm (but even there it is a "we know about it and are looking into it" message... but that's better than nothing at least?)

Mike Walsh
2009-01-01
re: Zunicide
Very good parallels drawn up here. Failure is failure regardless of whether it is a database, a Zune or a power grid.

The same patterns and practices should be followed in the event of a failure of any of the above or anything that can fail. Great reminders. I just recently posted about troubleshooting skills and completely failed to highlight communication. I smell a follow-up coming now. http://straightpathsql.squarespace.com/blog/2008/12/31/troubleshooting-methodology.html


I only wanted to spend the money on the 8GB, so I guess it paid off this time. Hopefully these get resolved soon and in the right manner.