Are you a DBA Monkey?

Fri Aug 1, 2008 by Dan Guzman

We have a number of de facto standards in our shop, such as always placing data and indexes on separate filegroups. I asked why we bothered with the separate filegroups since the underlying files were often on the same physical disks anyway. No one really knew why. The practice had been followed before the current DBAs joined the organization and has continued.

This reminded me of a parable my boss told me several years ago. Five monkeys are placed in a cage with a step ladder leading to a bunch of bananas suspended from the top. The first monkey rushes up the ladder eagerly only to get forcibly knocked off by a stream of water from fire hose when he reaches the top. The other monkeys are also drenched with the ice cold water. After recovering from the shock, another monkey climbs the ladder with the same result. Eventually, each monkey tries to reach the bananas and learns the futility and unpleasantness of the effort. Any monkey that even tries to climb the ladder is promptly tackled by his peers.

Now attrition begins. One of the original monkeys is removed and replaced with a new monkey. He looks at the bananas and back at the other monkeys, wondering why they ignore such a tasty treat. He starts up the ladder and is immediately pulled off and beaten by the other monkeys. After a couple of attempts, the new monkey learns to avoid the ladder. Another original monkey is replaced with a new monkey. When he goes for the bananas, he also gets tackled by the other monkeys, including the one that’s never been soaked. The same process is repeated; original monkeys are replaced and new monkeys quickly learn to avoid the ladder despite the prize at the top. The last original monkey is eventually replaced.

We now have a group of monkeys who’ve never been soaked by the fire hose, won’t climb the ladder and won’t allow others to climb it either. It doesn’t matter if the fire hose is removed or how many generations of monkeys follow. They don’t know why they do what they do.

Like the monkeys, institutional memory causes us to follow practices mindlessly and be slow to adapt to changes in our environment. Going back to my data/index filegroup question, there may have once been a good reason for the separate filegroups. Perhaps the separate groups improved performance of a specific application workload by isolating random and sequential I/O on local disk storage. Our current environment is quite different, however. Physical disks on the current SAN storage are often shared with other hosts/LUNs so placing data and indexes on different filegroups no longer isolates the I/O. Furthermore, the separate filegroup approach was since generalized to apply to all databases regardless of whether or not it made sense for a given application workload. We continue the data/index filegroup practice because the reasoning behind the separate filegroups is not known or understood.

It may be easier to robotically continue with the status quo but I think we need to understand exactly why a given filegroup approach might be advantageous so that we can choose what is best for the situation at hand. For example, I/O Performance of our high transaction OLTP databases is best achieved by balancing the heavy random I/O workload over as many disks as possible because this maximizes the disk transfers/sec rate. I don't think segregating data and index files is appropriate for our OLTP applications because it artificially caps the maximum I/O rate and can actually introduce an unnecessary I/O bottleneck.

In contrast, a single filegroup for both data and indexes distributes I/O proportionally according to the size of the underlying files with less complexity and administration than the separate filegroup approach. In my experience, this single user-defined filegroup strategy works well for both OLTP and mixed workloads where sequential I/O cannot be accurately predicted. Note that we need to work closely with the SAN storage team to meet performance objectives regardless of the filegroup strategy we choose since the physical implementation is outside our control.

We also have databases containing large amounts of historical data. In that case, we can use filegroups to place infrequently accessed older data on read-only filegroups with the files on relatively slow inexpensive storage. Often-queried recent data can then be stored on the more expensive fast storage devices via a read-write filegroup. The key here is that the additional maintenance and administration complexity is justified by the hardware savings. Similarly, our large reporting databases can benefit from a customized filegroup strategy to facilitate backup/recovery or improve performance by isolating random and sequential I/O for specialized (and predictable) workloads.

Like many of the choices we make, there is no single filegroup strategy that is optimal for all cases. It’s important to understand the rationale behind a given approach so that one can choose the best one for the current situation and create new strategies when needed. A key differentiator between average DBA and a good one is a thorough knowledge of why practices are followed. Don’t be a monkey.

Legacy Comments


Uri Dimant 2008-08-03	Are you a DBA Monkey? Hi Dan ,thanks for this very nice article along with very good parable :-))) .Often, you do not have power to change the policy being followed by years , I completely agree with "A key differentiator between average DBA and a good one is a thorough knowledge of why practices are followed. Don’t be a monkey." You said that the separate groups improved performance of a specific application workload by isolating random and sequential I/O on local disk storage, but by separating LOG (seq) and DATA(random) on diff disks we gain the same goal, I mean why do we need a filegroup? BTW , don't you affraid your boss to read this arcticle? :-))))))

Dan Guzman 2008-08-03	re: Are you a DBA Monkey? You are absolutely correct, Uri. The main reasons that placing the log file on a different drive is a SQL Server Best Practice are that this isolates the sequential log file access from the other database file I/O and the separation ensures data I/O does not slow down the critical synchronous log writes during COMMIT. > I mean why do we need a filegroup? Once log file I/O is isolated, data files can be spread over all remaining physical disks to maximize typical OLTP application performance. User-defined filegroups are not required to accomplish this. Filegroups may still be desirable to address other backup/recovery or manageability objectives. > BTW , don't you affraid your boss to read this arcticle? :-)))))) On the contrary, I think open and frank discussion is healthy. This is one if the differentiators been us and monkeys :-)

Regan 2008-08-06	re: Are you a DBA Monkey? Useful article. I re-inforces what I've long held as a principle when making decisions, and doing documentation (design, standard, etc. etc.), and that is: Do not just give the decision (or statement) - include the reasoning for it. If there had been a "standard's" document at the company, when you got there, and it followed the above principle, then "new monkey's" coming in could have read that "you don't climb the ladder because of the fire hose", and been able to challenge the status quo. Another important documentation principle (imho) is that, when considering design, etc., it can be very useful to document not just the design (and reasons), but rejected designs (and reasons) - for example: "This system was deployed on SQL 2005 and using file system deployed images. SQL 2008 and filestream data type was considered, but because the solution goes live before the release of SQL 2008, it was decided that was not a viable design." This helps someone in the future not only understand why something was done, but also why another option was not used, and may help them choose to go that route for the future.

Adam Kahtava 2008-09-30	re: Are you a DBA Monkey? Great post! Enjoyed the parable, food for thought.

Patrick Ikhifa 2009-03-11	re: Are you a DBA Monkey? Good Post and good parable. Dan I believe I remember you from the SQL Server Professional days when it was only available in print

Parag Bandkar 2009-09-10	re: Are you a DBA Monkey? Good Article. Really liked it

James 2012-04-01	re: Are you a DBA Monkey? Good article. Enjoyed the parable. Your explanations are very clear and accurate. Good work dude