Recently, I rebooted all nodes in the 4-node cluster at our disaster recovery site. I had attempted to install another SQL Server 2005 instance, but the installer indicated that the nodes needed to be rebooted first. I rebooted the nodes one at a time, as is best practice for a cluster. After the reboots, I was able to install the new instance successfully.
At some point after the reboots, database mirroring entered the suspended state for most of our databases at the primary site. Because log records keep accumulating at the principal while mirroring is suspended, we eventually ran out of disk space at the primary site on the mount point that holds the transaction logs. Backing up the transaction log does not free the space, because the principal has to keep the log records until they have been sent to the mirror. You also cannot truncate the transaction log, since this isn't allowed when the database is mirrored. The only thing I could do to fix the problem quickly was to break database mirroring.
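For anyone who wants to catch this before the disk fills up, here is a rough sketch of the kind of check I have in mind. The catalog views and columns (sys.database_mirroring, sys.databases, mirroring_state_desc, log_reuse_wait_desc) are the ones SQL Server 2005 exposes; the Python helper names and the sample rows are made up for illustration, and running the query against a server is left to whatever driver you use. The RESUME statement is only a candidate to try; I do not know whether it would have worked in our case.

```python
# Query to run on the principal. It lists every mirrored database with its
# mirroring state and the reason its log cannot currently be reused.
MIRRORING_STATE_QUERY = """
SELECT d.name,
       m.mirroring_role_desc,
       m.mirroring_state_desc,
       d.log_reuse_wait_desc
FROM sys.database_mirroring AS m
JOIN sys.databases AS d ON d.database_id = m.database_id
WHERE m.mirroring_guid IS NOT NULL;
"""


def suspended_databases(rows):
    """Return the names of databases whose mirroring state is SUSPENDED.

    rows: iterable of (name, role_desc, state_desc, log_reuse_wait_desc)
    tuples, in the column order of MIRRORING_STATE_QUERY.
    """
    return [name for name, _role, state, _wait in rows if state == "SUSPENDED"]


def resume_statement(db_name):
    """Build the T-SQL statement that asks a suspended session to resume."""
    # Bracket-quote the name; a literal ] inside the name is doubled.
    quoted = "[" + db_name.replace("]", "]]") + "]"
    return f"ALTER DATABASE {quoted} SET PARTNER RESUME;"


if __name__ == "__main__":
    # Made-up sample rows, standing in for the query result.
    sample = [
        ("Sales", "PRINCIPAL", "SUSPENDED", "DATABASE_MIRRORING"),
        ("HR", "PRINCIPAL", "SYNCHRONIZED", "NOTHING"),
    ]
    for name in suspended_databases(sample):
        print(resume_statement(name))
```

Breaking mirroring, which is what I ended up doing, corresponds to ALTER DATABASE ... SET PARTNER OFF instead of SET PARTNER RESUME.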
I am not sure when database mirroring entered the suspended state, as the Event Log had already rolled over, but I worked under the assumption that the reboots caused the problem. I had planned on opening a case with Microsoft as soon as I could duplicate the issue.
Since we run production at our primary site, I am able to reboot the nodes at the DR site without impacting our applications. So I began attempting to duplicate the issue by rebooting the nodes one at a time. Each time the cluster groups failed over to another node, I checked the state of database mirroring for each database. As expected, the state was disconnected while SQL Server was failing over to another node. As soon as the cluster group came online, the state changed to synchronizing and then eventually to synchronized. I then started rebooting two or three nodes at a time, but database mirroring still recovered fine. Next, I took one of the cluster groups offline for several minutes. Database mirroring remained in the disconnected state while the cluster group was offline. I then brought it back online and database mirroring resumed without any problems.
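If you want to repeat this kind of test without watching every database by hand, the pass/fail rule I was applying can be written down as a small check. This is only a sketch: the function name is hypothetical, and it assumes you have already collected the mirroring_state_desc values for one database, in order, by polling sys.database_mirroring during the test.

```python
def recovered_cleanly(observed_states):
    """Decide whether one database came through a failover test cleanly.

    observed_states: mirroring_state_desc values for a single database,
    in the order they were observed during the test.

    Returns True only if the database never reported SUSPENDED and the
    last observed state is SYNCHRONIZED.
    """
    states = [s.upper() for s in observed_states]
    if not states:
        # Nothing was observed, so there is nothing to conclude.
        return False
    return "SUSPENDED" not in states and states[-1] == "SYNCHRONIZED"


if __name__ == "__main__":
    # The sequence I saw on every reboot: disconnected during failover,
    # then synchronizing, then synchronized.
    normal = ["DISCONNECTED", "SYNCHRONIZING", "SYNCHRONIZED"]
    print(recovered_cleanly(normal))
```

A sequence that contains SUSPENDED anywhere, or that ends in any state other than SYNCHRONIZED, is reported as a failure, which is the condition I was trying and failing to reproduce.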
In the end, I was unable to get database mirroring into a suspended state, so I doubt that my reboots caused this problem. If anyone has any ideas about possible culprits, please let me know so that I can test them.