Blob Blame History Raw
:compat-mode: legacy
= Remote Node Walk-through =

*What this tutorial is:* An in-depth walk-through of how to get Pacemaker to
integrate a remote node into the cluster as a node capable of running cluster
resources.

*What this tutorial is not:* A realistic deployment scenario. The steps shown
here are meant to get users familiar with the concept of remote nodes as
quickly as possible.

This tutorial requires three machines: two to act as cluster nodes, and
a third to act as the remote node.

== Configure Remote Node ==

=== Configure Firewall on Remote Node ===

Allow cluster-related services through the local firewall:
----
# firewall-cmd --permanent --add-service=high-availability
success
# firewall-cmd --reload
success
----

[NOTE]
======
If you are using iptables directly, or some other firewall solution besides
firewalld, simply open the following ports, which can be used by various
clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405.

If you run into any problems during testing, you might want to disable
the firewall and SELinux entirely until you have everything working.
This may create significant security issues and should not be performed on
machines that will be exposed to the outside world, but may be appropriate
during development and testing on a protected host.

To disable security measures:
----
# setenforce 0
# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
# systemctl mask firewalld.service
# systemctl stop firewalld.service
# iptables --flush
----
======

=== Configure pacemaker_remote on Remote Node ===

Install the pacemaker_remote daemon on the remote node.
----
# yum install -y pacemaker-remote resource-agents pcs
----

Create a location for the shared authentication key:
----
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
----

All nodes (both cluster nodes and remote nodes) must have the same
authentication key installed for the communication to work correctly.
If you already have a key on an existing node, copy it to the new
remote node. Otherwise, create a new key, for example:
----
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
----

Now start and enable the pacemaker_remote daemon on the remote node.
----
# systemctl enable pacemaker_remote.service
# systemctl start pacemaker_remote.service
----

Verify the start is successful.

----
# systemctl status pacemaker_remote
pacemaker_remote.service - Pacemaker Remote Service
   Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)
   Active: active (running) since Fri 2018-01-12 15:21:20 CDT; 20s ago
 Main PID: 21273 (pacemaker_remot)
   CGroup: /system.slice/pacemaker_remote.service
           └─21273 /usr/sbin/pacemaker-remoted

Jan 12 15:21:20 remote1 systemd[1]: Starting Pacemaker Remote Service...
Jan 12 15:21:20 remote1 systemd[1]: Started Pacemaker Remote Service.
Jan 12 15:21:20 remote1 pacemaker-remoted[21273]: notice: crm_add_logfile: Additional logging available in /var/log/pacemaker.log
Jan 12 15:21:20 remote1 pacemaker-remoted[21273]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.
Jan 12 15:21:20 remote1 pacemaker-remoted[21273]: notice: bind_and_listen: Listening on address ::
----

== Verify Connection to Remote Node ==

Before moving forward, it's worth verifying that the cluster nodes
can contact the remote node on port 3121. Here's a trick you can use.
Connect using ssh from each of the cluster nodes. The connection will get
destroyed, but how it is destroyed tells you whether it worked or not.

First, add the remote node's hostname (we're using *remote1* in this tutorial)
to the cluster nodes' +/etc/hosts+ files if you haven't already. This
is required unless you have DNS set up in a way where remote1's address can be
discovered.

Execute the following on each cluster node, replacing the IP address with the
actual IP address of the remote node.
----
# cat << END >> /etc/hosts
192.168.122.10    remote1
END
----

If running the ssh command on one of the cluster nodes results in this
output before disconnecting, the connection works:
----
# ssh -p 3121 remote1
ssh_exchange_identification: read: Connection reset by peer
----

If you see one of these, the connection is not working:
----
# ssh -p 3121 remote1
ssh: connect to host remote1 port 3121: No route to host
----
----
# ssh -p 3121 remote1
ssh: connect to host remote1 port 3121: Connection refused
----

Once you can successfully connect to the remote node from the both
cluster nodes, move on to setting up Pacemaker on the cluster nodes.

== Configure Cluster Nodes ==

=== Configure Firewall on Cluster Nodes ===

On each cluster node, allow cluster-related services through the local
firewall, following the same procedure as in <<_configure_firewall_on_remote_node>>.

=== Install Pacemaker on Cluster Nodes ===

On the two cluster nodes, install the following packages.

----
# yum install -y pacemaker corosync pcs resource-agents
----

=== Copy Authentication Key to Cluster Nodes ===

Create a location for the shared authentication key,
and copy it from any existing node:
----
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
# scp remote1:/etc/pacemaker/authkey /etc/pacemaker/authkey
----

=== Configure Corosync on Cluster Nodes ===

Corosync handles Pacemaker's cluster membership and messaging. The corosync
config file is located in +/etc/corosync/corosync.conf+. That config file must be
initialized with information about the two cluster nodes before pacemaker can
start.

To initialize the corosync config file, execute the following pcs command on
both nodes, filling in the information in <> with your nodes' information.
----
# pcs cluster setup --force --local --name mycluster <node1 ip or hostname> <node2 ip or hostname>
----

=== Start Pacemaker on Cluster Nodes ===

Start the cluster stack on both cluster nodes using the following command.

----
# pcs cluster start
----

Verify corosync membership

....
# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1 (local)
....

Verify Pacemaker status. At first, the `pcs cluster status` output will look
like this.

----
# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: NONE
Last updated: Fri Jan 12 16:14:05 2018
Last change: Fri Jan 12 14:02:14 2018

1 node configured
0 resources configured
----

After about a minute, you should see your two cluster nodes come online.

----
# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.5-94ff4df) - partition with quorum
Last updated: Fri Jan 12 16:16:32 2018
Last change: Fri Jan 12 14:02:14 2018

2 nodes configured
0 resources configured

Online: [ node1 node2 ]
----

For the sake of this tutorial, we are going to disable stonith to avoid having to cover fencing device configuration.

----
# pcs property set stonith-enabled=false
----

== Integrate Remote Node into Cluster ==

Integrating a remote node into the cluster is achieved through the
creation of a remote node connection resource. The remote node connection
resource both establishes the connection to the remote node and defines that
the remote node exists. Note that this resource is actually internal to
Pacemaker's controller. A metadata file for this resource can be found in
the +/usr/lib/ocf/resource.d/pacemaker/remote+ file that describes what options
are available, but there is no actual *ocf:pacemaker:remote* resource agent
script that performs any work.

Define the remote node connection resource to our remote node,
*remote1*, using the following command on any cluster node.

----
# pcs resource create remote1 ocf:pacemaker:remote
----

That's it.  After a moment you should see the remote node come online.

----
Cluster name: mycluster
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.5-94ff4df) - partition with quorum
Last updated: Fri Jan 12 17:13:09 2018
Last change: Fri Jan 12 17:02:02 2018

3 nodes configured
1 resources configured

Online: [ node1 node2 ]
RemoteOnline: [ remote1 ]

Full list of resources:

 remote1 (ocf::pacemaker:remote):	Started node1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
----

== Starting Resources on Remote Node ==

Once the remote node is integrated into the cluster, starting resources on a
remote node is the exact same as on cluster nodes. Refer to the
http://clusterlabs.org/doc/['Clusters from Scratch'] document for examples of
resource creation.

[WARNING]
=========
Never involve a remote node connection resource in a resource group,
colocation constraint, or order constraint.
=========

== Fencing Remote Nodes ==

Remote nodes are fenced the same way as cluster nodes. No special
considerations are required. Configure fencing resources for use with
remote nodes the same as you would with cluster nodes.

Note, however, that remote nodes can never 'initiate' a fencing action. Only
cluster nodes are capable of actually executing a fencing operation against
another node.

== Accessing Cluster Tools from a Remote Node ==

Besides allowing the cluster to manage resources on a remote node,
pacemaker_remote has one other trick. The pacemaker_remote daemon allows
nearly all the pacemaker tools (`crm_resource`, `crm_mon`, `crm_attribute`,
`crm_master`, etc.) to work on remote nodes natively.

Try it: Run `crm_mon` on the remote node after pacemaker has
integrated it into the cluster. These tools just work. These means resource
agents such as promotable resources (which need access to tools like
`crm_master`) work seamlessly on the remote nodes.

Higher-level command shells such as `pcs` may have partial support
on remote nodes, but it is recommended to run them from a cluster node.