From 58ddb4bbe43dcf804e391919cdca4c257f91ac4f Mon Sep 17 00:00:00 2001
From: Ken Gaillot
Date: Dec 15 2020 18:02:05 +0000
Subject: Doc: Pacemaker Explained: document new on-fail="demote" option

---

diff --git a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt
index d8e7115..9df9243 100644
--- a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt
+++ b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt
@@ -676,6 +676,10 @@ a|The action to take if this action ever fails. Allowed values:
 * +ignore:+ Pretend the resource did not fail.
 * +block:+ Don't perform any further operations on the resource.
 * +stop:+ Stop the resource and do not start it elsewhere.
+* +demote:+ Demote the resource, without a full restart. This is valid only for
+  +promote+ actions, and for +monitor+ actions with both a nonzero +interval+
+  and +role+ set to +Master+; for any other action, a configuration error will
+  be logged, and the default behavior will be used.
 * +restart:+ Stop the resource and start it again (possibly on a different node).
 * +fence:+ STONITH the node on which the resource failed.
 * +standby:+ Move _all_ resources away from the node on which the resource failed.
@@ -714,6 +718,38 @@ indexterm:[Action,Property,on-fail]
 |=========================================================
+[NOTE]
+====
+When +on-fail+ is set to +demote+, recovery from failure by a successful demote
+causes the cluster to recalculate whether and where a new instance should be
+promoted. The node with the failure is eligible, so if master scores have not
+changed, it will be promoted again.
+
+There is no direct equivalent of +migration-threshold+ for the master role, but
+the same effect can be achieved with a location constraint using a
+<> with a node attribute expression for the resource's fail
+count.
+
+For example, to immediately ban the master role from a node with any failed
+promote or master monitor:
+[source,XML]
+----
+
+
+
+
+
+
+----
+
+This example assumes that there is a promotable clone of the +my_primitive+
+resource (note that the primitive name, not the clone name, is used in the
+rule), and that there is a recurring 10-second-interval monitor configured for
+the master role (fail count attributes specify the interval in milliseconds).
+====
+
 [[s-resource-monitoring]]
 === Monitoring Resources for Failure ===