Expansion Diary Part 6
All of those new Sales Guys are starting to pour in now :) Did you notice? That was a smiley! I think we're going to be able to handle the load after all!
Our ongoing transition from ASP.OLD to ASP.NET is helping, along with lots of SQL optimizations. But the single biggest thing we are implementing is... Caching! Of course, as everyone will gladly tell you, caching is an immense performance enhancer - IF you can stand the data being "stale."
I think that what we've done is quite interesting, so of course I'll talk about it here :) Remember the Work Queues I discussed before? With caching I've been able to solve ALL the problems in a very elegant way. Here's the scoop:
1) We set up a WorkQueue Server. Yes, a server all its own, with lots of RAM.
2) The WorkQueue Server hosts one ASP.NET WebService which knows the definitions of each Work Queue. You call into the WebService asking for a certain Queue, passing in parameters that tell the service to remove certain columns, sort the data a certain way, filter certain rows, return only the ID fields, etc. (There's a rough sketch of the whole thing after this list.)
3) The WebService holds a DataSet in CACHE for each Work Queue. When a request comes in, it takes the appropriate DataSet from cache, filters it accordingly, and sends it out. The DataSets in cache are "total" sets of data. In other words, enough rows and columns to support all the various places that need the data.
4) So then if we have 4 different places that need the same Queue, but sliced and diced differently, we call the same WebService, with different args, and that's it.
5) The WebService auto-refills the cache on expiration, etc. We have introduced a randomization factor so that not all Queues need refreshing at the same time :)
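To make that concrete, here's a rough sketch of the idea. This is simplified, not our production code - names like GetQueue, LoadQueueFromDatabase, and the usp_Queue_* sproc naming are just for illustration:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Web.Caching;
using System.Web.Services;

public class QueueService : WebService
{
    private const int BaseCacheMinutes = 10;       // our cache base time
    private static readonly Random jitter = new Random();
    private const string connectionString = "..."; // illustrative

    [WebMethod]
    public DataSet GetQueue(string queueName, string rowFilter,
                            string sort, string[] columns)
    {
        DataSet full = (DataSet)Context.Cache[queueName];
        if (full == null)
        {
            // Cache miss: run the Queue's one parameterless sproc and
            // cache the "total" DataSet. The random extra seconds keep
            // all the Queues from expiring (and refilling) at once.
            full = LoadQueueFromDatabase(queueName);
            Context.Cache.Insert(queueName, full, null,
                DateTime.Now.AddMinutes(BaseCacheMinutes).AddSeconds(jitter.Next(0, 120)),
                Cache.NoSlidingExpiration);
        }

        // All slicing and dicing happens in RAM, via a DataView.
        DataView view = new DataView(full.Tables[0], rowFilter, sort,
            DataViewRowState.CurrentRows);

        // Copy only the columns this particular caller asked for.
        DataTable slice = new DataTable(queueName);
        foreach (string col in columns)
            slice.Columns.Add(col, full.Tables[0].Columns[col].DataType);
        foreach (DataRowView rv in view)
        {
            DataRow row = slice.NewRow();
            foreach (string col in columns)
                row[col] = rv[col];
            slice.Rows.Add(row);
        }

        DataSet result = new DataSet();
        result.Tables.Add(slice);
        return result;
    }

    private static DataSet LoadQueueFromDatabase(string queueName)
    {
        // One parameterless stored procedure per Queue. Real code would
        // check queueName against the known Queue definitions first.
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            SqlDataAdapter da = new SqlDataAdapter("usp_Queue_" + queueName, conn);
            da.SelectCommand.CommandType = CommandType.StoredProcedure;
            DataSet ds = new DataSet();
            da.Fill(ds);
            return ds;
        }
    }
}
```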
The benefits of this are enormous. We now have one stored procedure per Queue, and that procedure gets run on average once every 10 minutes (our cache base time). From there, it's all done in RAM on the Queue WebService. This gives us all the benefits of "dynamic SQL" without paying the price for it. Let's not argue about what that price might be; suffice it to say we "don't like dynamic SQL." Our database interaction is through stored procedures, with NO PARAMETERS. The furthest that any badly formed dynamic expression can make it is into the definition of the DataView used to work with the DataSet in cache - never to the database. That's beautiful.
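To be concrete about that last point, here's the kind of thing I mean (again just a sketch - SliceQueue is an illustrative name, not real code):

```csharp
using System;
using System.Data;

// The caller's "dynamic" part is only ever a DataView expression.
public static DataView SliceQueue(DataTable queue, string callerFilter, string callerSort)
{
    DataView view = new DataView(queue);
    try
    {
        view.RowFilter = callerFilter; // e.g. "State = 'PA' AND Balance > 500"
        view.Sort = callerSort;        // e.g. "LastName ASC"
    }
    catch (DataException)
    {
        // A malformed (or malicious) expression dies right here, inside
        // the service. The database never sees it.
        throw new ArgumentException("Bad filter or sort expression.");
    }
    return view;
}
```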
Legacy Comments
Lavos
2004-04-01
re: Expansion Diary Part 6 Why no parameters on the Stored Procs? Also, if you dig around you can find some interesting ways to invalidate the cache when SQL Server gets updated. I remember someone making an extended procedure to "touch" a file (something I picked up in my unix days scripting a shell :) that the ASP.NET process has a File Dependency on. Of course, whether the effort is worth it depends on how many calls you could actually save and how much (or little) tolerance you have for staleness.
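The trick Lavos describes maps onto ASP.NET's CacheDependency pretty directly. A minimal sketch (the marker-file path and helper name are made up):

```csharp
using System.Data;
using System.Web;
using System.Web.Caching;

// Cache a Queue's DataSet with a dependency on a marker file. A SQL
// Server trigger can call an extended proc that "touches" the file,
// and ASP.NET will evict the cached entry the moment it changes.
public static void CacheWithFileDependency(string queueName, DataSet data)
{
    string markerFile = @"C:\cache-markers\" + queueName + ".touch"; // made-up path
    HttpRuntime.Cache.Insert(queueName, data, new CacheDependency(markerFile));
}
```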
Travis Laborde
2004-04-02
re: Expansion Diary Part 6 No parameters on the sprocs is really just a by-product of the way the system is set up. Since each Queue is populated by a sproc that has to return all of the rows and all of the columns needed by all of the various callers of the Queue, it doesn't need parameters. When different callers want only "part" of the Queue, those parts are designated in the call to the QueueService, which then uses a DataView to do what the SQL parameters would have done.
Wesley Brown
2004-04-05
re: Expansion Diary Part 6 Hello, Your blog posts on this have been interesting to read. I have a couple questions about the architectural designs you're using. Maybe I don't understand your solution here, but it seems like your caching solution ends up with the worst of both caching and not caching. Your primary cost in obtaining data from a DB is getting it across the wire, given well-designed SPs/tables/views and such. All your data clients (in this case your ASP.NET webservers) still have to go across the wire to get to the cached data. So even if the dataset is cached, you're going to be paying a whole lot more than if each client kept its own local cache. Your ability to pass in parameters, select the fields you want, sort by columns and all is very cool; however, it seems like you're replicating SQL Server's query execution engine. Maybe you wrote it in C++ and optimized the hell out of it (and if so, kudos to you! :) ); however, if you didn't, then it is almost certainly slower than just getting the data out of SQL Server in a SP or filtering a local dataset. I hope I'm not coming off like a jerk; I think this sounds like a very interesting project. I am just curious about what factors came into your design. Thanks, Wes
Travis Laborde
2004-04-12
re: Expansion Diary Part 6 Wes, I'm glad you find the posts interesting. I'll try to answer...

First, my primary cost is not in getting the data over the wire, but rather in getting it out of the database. The database hits are major, and quite frequent, so caching is a huge plus. Here's an example: 50 Sales Guys, each hitting his "home" page every few minutes. The page contains Queue information for maybe 20 Work Queues, showing a quick index of how many accounts are in each queue, etc. That's a LOT of database hits, and heavy ones too. Now increase that to over 100 Sales Guys. Obviously caching is a good thing.

Second, we often use essentially the same data on many pages in the site, just viewed in a different way. For example, the managers get summary pages where the reps get their individual pages, or support personnel need to see limited subsets of the same data - different columns needed on each view, etc.

Third, we bring it over the wire rather than keeping it in the main IIS machine's RAM for a couple of reasons: mainly, it takes a LOT of RAM to do this, and because of the dynamic use of it, it is a good hit on the CPU as well. This way, our actual "webserver" is under less strain.

Using WebServices as your data tier is definitely the slowest way to get your data. But honestly, that doesn't matter in a lot of cases. eBay probably wouldn't want to get their data that way. But for me, with 120 users? The difference between getting the data via WebService and in-process ADO.NET calls is not even detectable. I'll probably have an entire post on that subject soon. Thanks for your questions. I hope my answers made at least a little sense.
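For what it's worth, calling the QueueService from a page is nearly a one-liner once you have a proxy class. This fragment assumes a proxy generated with wsdl.exe against the hypothetical service sketched in the post:

```csharp
// Hypothetical wsdl.exe-generated proxy for the QueueService sketch above.
QueueService svc = new QueueService();
DataSet mine = svc.GetQueue(
    "NewLeads",                                                // which Work Queue
    "RepId = 42",                                              // DataView filter
    "Priority DESC",                                           // DataView sort
    new string[] { "AccountId", "AccountName", "Priority" });  // columns wanted
```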