Gah.
I spent all day yesterday trying to get MOM to send email.
I set up test rules to watch my local install of SQL Server, and fired over 50 messages into my event log with RAISERROR (by the way, what's with the parentheses on that command anyway?) and didn't get any emails back. I tweaked the detection rules all day long, thinking I'd screwed something up. It was firing the alerts, but it wasn't doing ANY of the notifications.
Then I went to Google and searched, and found this.
A brief history. SMTP originally had no authentication, and allowed anyone to send email from anywhere. Eventually spammers figured this out and started using these open relays to send spam, so two standard methods of stopping open relays were created. The first is to limit relaying by restricting which subnets or IP addresses are allowed to use the system as a relay; the second is to require a login (either cleartext or MD5 challenge/response) to authenticate the user before performing the relay. Typically, clients first attempt to use the relay as an open relay (which will succeed if it really is an open relay, or if it's a subnet-restricted relay and the client is on an allowed subnet), and if that doesn't work they tell the user to enter a username and password.
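For the curious, here's a rough sketch in Python of that "try it open first, then fall back to a login" behavior. This is my own illustration using the standard smtplib module, not anything MOM actually does. The function name send_with_fallback is made up, and I'm assuming the server object behaves like smtplib.SMTP.

```python
import smtplib


def send_with_fallback(server, msg, username=None, password=None):
    """Try to relay without logging in; log in and retry if the relay refuses.

    `server` is assumed to behave like a connected smtplib.SMTP object.
    Returns a string saying which path worked, and raises if neither does,
    so a failure is never silent.
    """
    try:
        # Works on an open relay, or a subnet-restricted relay that allows us.
        server.send_message(msg)
        return "unauthenticated"
    except (smtplib.SMTPSenderRefused, smtplib.SMTPRecipientsRefused):
        if username is None:
            # No credentials to fall back on: let the error surface loudly.
            raise
    # smtplib picks a mechanism the server advertises (CRAM-MD5, PLAIN, LOGIN).
    server.login(username, password)
    server.send_message(msg)
    return "authenticated"


# Usage (hypothetical host and credentials):
#   from email.message import EmailMessage
#   msg = EmailMessage()
#   msg["From"], msg["To"], msg["Subject"] = "mom@example.com", "me@example.com", "test"
#   msg.set_content("Test notification")
#   with smtplib.SMTP("mail.example.com") as server:
#       print(send_with_fallback(server, msg, "user", "secret"))
```

Note that smtplib doesn't do NTLM out of the box, so this only covers the two common cases described above.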
For MOM, Microsoft elected to do two things that made yesterday hell. First, they don't log any errors when an email fails to send. You basically just have to wait 5-10 minutes and then assume it fell in a hole somewhere. The KBase article I linked above mentions a "Failed to send SMTP message" error. I got no error in any log. That would have been very helpful, because I assumed for most of the day that the problem was that the event wasn't being detected, not that the notification was failing.
The second thing Microsoft did, and the root cause of the problem, is that the SMTP client in MOM authenticates via NTLM by default. There isn't any GUI, short of regedit, to change that setting, and no mention in the docs that an NTLM-authenticating SMTP server is required for notifications. So the common SMTP authorization methods (by subnet or MD5 challenge/response) don't work; only NTLM works. And if you don't have NTLM, it doesn't give you an error, it just silently fails.
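If you want to check whether your own mail server even offers NTLM, look at what it says in response to EHLO (you can get that from a telnet session to port 25). Here's a small Python sketch that pulls the advertised AUTH mechanisms out of that response. The function name and the sample response are made up for illustration; the only thing I'm relying on is that ESMTP servers list their mechanisms on an AUTH line.

```python
import re


def advertised_auth_mechanisms(ehlo_response):
    """Return the set of AUTH mechanisms listed in a raw EHLO response.

    Handles both the standard "250-AUTH LOGIN NTLM" form and the older
    "250-AUTH=LOGIN" form that some servers still send.
    """
    mechanisms = set()
    for line in ehlo_response.splitlines():
        # Drop the "250-" or "250 " reply code prefix, if present.
        text = re.sub(r"^250[- ]", "", line.strip())
        match = re.match(r"AUTH[ =](.*)", text, re.IGNORECASE)
        if match:
            mechanisms.update(match.group(1).upper().split())
    return mechanisms


# Made-up example of a server that does NOT offer NTLM.
sample = "\n".join([
    "250-mail.example.com Hello",
    "250-SIZE 10240000",
    "250-AUTH LOGIN CRAM-MD5",
    "250 HELP",
])

if "NTLM" not in advertised_auth_mechanisms(sample):
    print("This server does not advertise NTLM")
```

If NTLM isn't in the list, a client that only speaks NTLM has nothing to authenticate with, which matches what I saw.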
Nothing should EVER silently fail. Any failure in any system, especially in a systems management provider, should fail very noisily. I should be able to hear it fail around the world via my mobile phone.
MOM should have a button on every operator setup screen that says "test messaging to this operator".
On the good news side, at least MOM 2005 uses SMTP instead of some kind of goofy MAPI-based (*cough* SQLMail Sucks *cough*) implementation that leaks memory like crazy.
Other than that, MOM is a pretty nifty tool. It has a lot of features that NTManage (now Intuit Network Monitor) is lacking, such as good reporting capabilities, the ability to run as a service instead of requiring a console login, easy-to-configure perfmon stats monitoring (NTManage has this, but it's a real pain in the ass to set up), very flexible operator scheduling, and a lot more. I'm going to spend the rest of the day migrating alert setups over and setting up new operators.