:compat-mode: legacy
= Configure Fencing =

== What is Fencing? ==

Fencing protects your data from being corrupted, and your application from
becoming unavailable, due to unintended concurrent access by rogue nodes.

Just because a node is unresponsive doesn't mean it has stopped
accessing your data. The only way to be 100% sure that your data is
safe is to use fencing to ensure that the node is truly
offline before allowing the data to be accessed from another node.

Fencing also has a role to play in the event that a clustered service
cannot be stopped. In this case, the cluster uses fencing to force the
whole node offline, thereby making it safe to start the service
elsewhere.

Fencing is also known as STONITH, an acronym for "Shoot The Other Node In The
Head", since the most popular form of fencing is cutting a host's power.

In order to guarantee the safety of your data,
footnote:[If the data is corrupt, there is little point in continuing to make it available]
fencing is enabled by default.

[NOTE]
====
It is possible to tell the cluster not to use fencing, by setting the
*stonith-enabled* cluster option to false:
----
[root@pcmk-1 ~]# pcs property set stonith-enabled=false
[root@pcmk-1 ~]# crm_verify -L
----

However, this is completely inappropriate for a production cluster. It tells
the cluster to simply pretend that failed nodes are safely powered off. Some
vendors will refuse to support clusters that have fencing disabled. Even
disabling it for a test cluster means you won't be able to test real failure
scenarios.
====
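
As a small illustration of the note above, the following shell sketch
classifies a *stonith-enabled* value and prints a warning when fencing is off.
The function name `fencing_state` is hypothetical, and the list of accepted
boolean spellings is an assumption rather than a statement of what the cluster
accepts. An unset value is treated as enabled, because fencing is enabled by
default.

```shell
#!/bin/sh
# Classify a stonith-enabled value (hypothetical helper, not part of pcs).
# Assumption: true/on/yes/y/1 mean enabled and false/off/no/n/0 mean disabled,
# compared case-insensitively.
fencing_state() {
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        "")               echo "enabled" ;;   # unset: fencing is on by default
        true|on|yes|y|1)  echo "enabled" ;;
        false|off|no|n|0) echo "disabled" ;;
        *)                echo "unknown" ;;
    esac
}

# Example: pass in the value however you obtained it, for instance copied
# from the "stonith-enabled:" line of the `pcs property` output.
state=$(fencing_state "false")
if [ "$state" != "enabled" ]; then
    echo "WARNING: fencing is $state; not suitable for a production cluster"
fi
```

The helper only interprets a value that you supply; it does not query the
cluster itself.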

== Choose a Fence Device ==

The two broad categories of fence device are power fencing, which cuts off
power to the target, and fabric fencing, which cuts off the target's access to
some critical resource, such as a shared disk or the local network.

Power fencing devices include:

* Intelligent power switches
* IPMI
* Hardware watchdog devices (alone, or in combination with shared storage used
  as a "poison pill" mechanism)

Fabric fencing devices include:

* Shared storage that can be cut off for a target host by another host (for
  example, an external storage device that supports SCSI-3 persistent
  reservations)
* Intelligent network switches

Using IPMI as a power fencing device may seem like a good choice. However,
if the IPMI device shares power and/or network access with the host (as most
onboard IPMI controllers do), a power or network failure will cause both the
host and its fencing device to fail. The cluster will be unable to recover,
and must stop all resources to avoid a possible split-brain situation.

Likewise, any device that relies on the machine being active (such as
SSH-based "devices" sometimes used during testing) is inappropriate,
because fencing will be required when the node is completely unresponsive.

== Configure the Cluster for Fencing ==

. Install the fence agent(s). To see what packages are available, run `yum
  search fence-`. Be sure to install the package(s) on all cluster nodes.

. Configure the fence device itself to be able to fence your nodes and accept
  fencing requests. This includes any necessary configuration on the device and
  on the nodes, and any firewall or SELinux changes needed. Test the
  communication between the device and your nodes.

. Find the name of the correct fence agent: `pcs stonith list`

. Find the parameters associated with the device:
  +pcs stonith describe pass:[<replaceable>agent_name</replaceable>]+

. Create a local copy of the CIB: `pcs cluster cib stonith_cfg`

. Create the fencing resource: +pcs -f stonith_cfg stonith create pass:[<replaceable>stonith_id
  stonith_device_type [stonith_device_options]</replaceable>]+
+
Any flags that do not take arguments, such as +--ssl+, should be passed as +ssl=1+.

. Enable fencing in the cluster: `pcs -f stonith_cfg property set stonith-enabled=true`

. If the device does not know how to fence nodes based on their cluster node
  name, you may also need to set the special *pcmk_host_map* parameter. See
  `man pacemaker-fenced` for details.

. If the device does not support the *list* command, you may also need
  to set the special *pcmk_host_list* and/or *pcmk_host_check*
  parameters. See `man pacemaker-fenced` for details.

. If the device does not expect the target node to be specified with the
  *port* parameter, you may also need to set the special
  *pcmk_host_argument* parameter. See `man pacemaker-fenced` for details.

. Commit the new configuration: `pcs cluster cib-push stonith_cfg`

. Once the fence device resource is running, test it (you might want to stop
  the cluster on that machine first):
  +stonith_admin --reboot pass:[<replaceable>nodename</replaceable>]+
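
The rule in the resource creation step, that a flag without an argument such as
+--ssl+ is passed as +ssl=1+, can be sketched as a small shell helper. The
function name `to_pcs_params` is hypothetical. Turning dashes in option names
into underscores is an assumption, so check the parameter names reported by
`pcs stonith describe` before relying on it.

```shell
#!/bin/sh
# Convert fence agent command-line options into pcs name=value parameters.
# Hypothetical helper for illustration only:
#   --ssl              -> ssl=1           (flag without an argument)
#   --ipaddr=10.0.0.1  -> ipaddr=10.0.0.1 (option with a value)
#   --port-as-ip       -> port_as_ip=1    (assumption: dashes become underscores)
to_pcs_params() {
    out=""
    for opt in "$@"; do
        opt=${opt#--}                        # strip the leading dashes
        case "$opt" in
            *=*) name=${opt%%=*}; value=${opt#*=} ;;
            *)   name=$opt;       value=1 ;;
        esac
        name=$(printf '%s' "$name" | sed 's/-/_/g')
        out="${out:+$out }$name=$value"      # join parameters with spaces
    done
    printf '%s\n' "$out"
}

to_pcs_params --ssl --ipaddr=10.0.0.1 --port-as-ip
```

The resulting string is meant to be pasted in place of the
+stonith_device_options+ placeholder; the helper does not run `pcs` itself.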

== Example ==

For this example, assume we have a chassis containing four nodes
and a separately powered IPMI device active on 10.0.0.1. Following the steps
above, the process would go something like this:

Step 1: Install the *fence-agents-ipmilan* package on both cluster nodes.

Step 2: Configure the IP address, authentication credentials, etc. in the IPMI device itself.

Step 3: Choose the *fence_ipmilan* STONITH agent.

Step 4: Obtain the agent's possible parameters:
----
[root@pcmk-1 ~]# pcs stonith describe fence_ipmilan
fence_ipmilan - Fence agent for IPMI

fence_ipmilan is an I/O Fencing agentwhich can be used with machines controlled by IPMI.This agent calls support software ipmitool (http://ipmitool.sf.net/). WARNING! This fence agent might report success before the node is powered off. You should use -m/method onoff if your fence device works correctly with that option.

Stonith options:
  ipport: TCP/UDP port to use for connection with device
  hexadecimal_kg: Hexadecimal-encoded Kg key for IPMIv2 authentication
  port: IP address or hostname of fencing device (together with --port-as-ip)
  inet6_only: Forces agent to use IPv6 addresses only
  ipaddr: IP Address or Hostname
  passwd_script: Script to retrieve password
  method: Method to fence (onoff|cycle)
  inet4_only: Forces agent to use IPv4 addresses only
  passwd: Login password or passphrase
  lanplus: Use Lanplus to improve security of connection
  auth: IPMI Lan Auth type.
  cipher: Ciphersuite to use (same as ipmitool -C parameter)
  target: Bridge IPMI requests to the remote target address
  privlvl: Privilege level on IPMI device
  timeout: Timeout (sec) for IPMI operation
  login: Login Name
  verbose: Verbose mode
  debug: Write debug information to given file
  power_wait: Wait X seconds after issuing ON/OFF
  login_timeout: Wait X seconds for cmd prompt after login
  delay: Wait X seconds before fencing is started
  power_timeout: Test X seconds for status change after ON/OFF
  ipmitool_path: Path to ipmitool binary
  shell_timeout: Wait X seconds for cmd prompt after issuing command
  port_as_ip: Make "port/plug" to be an alias to IP address
  retry_on: Count of attempts to retry power on
  sudo: Use sudo (without password) when calling 3rd party sotfware.
  priority: The priority of the stonith resource. Devices are tried in order of highest priority to lowest.
  pcmk_host_map: A mapping of host names to ports numbers for devices that do not support host names. Eg. node1:1;node2:2,3 would tell the cluster to use port 1 for node1 and ports 2 and
                 3 for node2
  pcmk_host_list: A list of machines controlled by this device (Optional unless pcmk_host_check=static-list).
  pcmk_host_check: How to determine which machines are controlled by the device. Allowed values: dynamic-list (query the device), static-list (check the pcmk_host_list attribute), none
                   (assume every device can fence every machine)
  pcmk_delay_max: Enable a random delay for stonith actions and specify the maximum of random delay. This prevents double fencing when using slow devices such as sbd. Use this to enable a
                  random delay for stonith actions. The overall delay is derived from this random delay value adding a static delay so that the sum is kept below the maximum delay.
  pcmk_delay_base: Enable a base delay for stonith actions and specify base delay value. This prevents double fencing when different delays are configured on the nodes. Use this to enable
                   a static delay for stonith actions. The overall delay is derived from a random delay value adding this static delay so that the sum is kept below the maximum delay.
  pcmk_action_limit: The maximum number of actions can be performed in parallel on this device Pengine property concurrent-fencing=true needs to be configured first. Then use this to
                     specify the maximum number of actions can be performed in parallel on this device. -1 is unlimited.

Default operations:
  monitor: interval=60s
----
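
The *pcmk_host_map* format described in the output above (`node1:1;node2:2,3`)
can be read as a list of `node:ports` entries separated by semicolons. As a
minimal sketch of that reading, the hypothetical shell function
`ports_for_node` below looks up the port(s) for one node. It is an
illustration of the format only, not how the cluster parses the value.

```shell
#!/bin/sh
# Look up the port(s) mapped to one node in a pcmk_host_map-style string.
# Hypothetical helper; prints the ports and returns 0, or returns 1 if the
# node has no entry in the map.
ports_for_node() {
    map=$1
    node=$2
    old_ifs=$IFS
    IFS=';'                                  # entries are separated by ';'
    for entry in $map; do
        if [ "${entry%%:*}" = "$node" ]; then
            IFS=$old_ifs
            printf '%s\n' "${entry#*:}"      # everything after the first ':'
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Uses the example map from the describe output.
ports_for_node "node1:1;node2:2,3" node2
```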

Step 5: `pcs cluster cib stonith_cfg`

Step 6: Here are example parameters for creating our fence device resource:
----
[root@pcmk-1 ~]# pcs -f stonith_cfg stonith create ipmi-fencing fence_ipmilan \
      pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser \
      passwd=acd123 op monitor interval=60s
[root@pcmk-1 ~]# pcs -f stonith_cfg stonith
 ipmi-fencing	(stonith:fence_ipmilan):	Stopped
----

Steps 7-10: Enable fencing in the cluster:
----
[root@pcmk-1 ~]# pcs -f stonith_cfg property set stonith-enabled=true
[root@pcmk-1 ~]# pcs -f stonith_cfg property
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: mycluster
 dc-version: 1.1.18-11.el7_5.3-2b07d5c5a9
 have-watchdog: false
 stonith-enabled: true
----

Step 11: `pcs cluster cib-push stonith_cfg --config`

Step 12: Test:
----
[root@pcmk-1 ~]# pcs cluster stop pcmk-2
[root@pcmk-1 ~]# stonith_admin --reboot pcmk-2
----

After a successful test, log in to any rebooted nodes, and start the cluster
(with `pcs cluster start`).
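
Steps 5 through 11 of this example could be collected into a script along the
following lines. This is only a sketch: the `run` and `configure_fencing`
functions and the `DRY_RUN` and `CIB_FILE` variables are hypothetical names,
and the `pcs` commands are copied from the steps above. By default the script
only prints the commands. The fencing test in step 12 is left as a manual
step, since it reboots a node.

```shell
#!/bin/sh
# Replay steps 5 to 11 of the example.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 on a
# real cluster node to execute them.
DRY_RUN=${DRY_RUN:-1}
CIB_FILE=stonith_cfg    # scratch CIB file name used in the example

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configure_fencing() {
    # Each step runs only if the previous one succeeded, so a failed
    # "stonith create" never leads to a cib-push.
    run pcs cluster cib "$CIB_FILE" &&
    run pcs -f "$CIB_FILE" stonith create ipmi-fencing fence_ipmilan \
        pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser \
        passwd=acd123 op monitor interval=60s &&
    run pcs -f "$CIB_FILE" property set stonith-enabled=true &&
    run pcs cluster cib-push "$CIB_FILE" --config
}

configure_fencing
```

Working on a scratch CIB file and pushing it once at the end, as the steps
above do, means the live cluster sees the fence device and the
*stonith-enabled* change together.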