HADR and Heartbeat Timing Settings

One of the hardest things to figure out was the correct timing settings. For most folks, setting these values is just a matter of preference (i.e., how soon you want things to fail over). But timing becomes critical when dealing with resources like a HADR database, which, when it fails over, takes some time to do so. If you don't configure these numbers correctly, you can end up with a "split-brain" scenario. This is essentially a situation where both the HADR primary and HADR standby believe they are the primary for the database. They are then "independent", with neither one knowing what the other is doing nor shipping logs to the other. After a lot of digging and some judicious translation of German texts, we found the following suggestions to avoid split brain:
On our tuned systems at the lab, we use the following values, because the systems sit on the same network segment and are fairly "unloaded":