:compat-mode: legacy
= Troubleshooting Cluster Problems =

== Logging ==

Pacemaker by default logs messages of notice severity and higher to the system
log, and messages of info severity and higher to the detail log, which by
default is +/var/log/pacemaker/pacemaker.log+.

Logging options can be controlled via environment variables at Pacemaker
start-up. Where these are set varies by operating system (often
+/etc/sysconfig/pacemaker+ or +/etc/default/pacemaker+).
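
As a minimal sketch, such a file might contain entries like the following. The
variable names (+PCMK_logfile+, +PCMK_logpriority+, +PCMK_debug+) and their
values are assumptions chosen for illustration and are not taken from this
section; the comments in the file shipped with your Pacemaker version are the
authority on which options it supports.

----
# Location of the detail log (assumed to match the default described above)
PCMK_logfile=/var/log/pacemaker/pacemaker.log

# Lowest severity sent to the system log (assumed to match the default)
PCMK_logpriority=notice

# Leave debug logging off unless actively investigating a problem
PCMK_debug=no
----

Because these variables are read at start-up, a change here would normally
take effect only after Pacemaker is restarted on that node.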

Because cluster problems are often highly complex, involving multiple machines,
cluster daemons, and managed services, Pacemaker logs rather verbosely to
provide as much context as possible. It is an ongoing priority to make these
logs more user-friendly, but by necessity there is a lot of obscure, low-level
information that can make them difficult to follow.

The default log rotation configuration shipped with Pacemaker (typically
installed in +/etc/logrotate.d/pacemaker+) rotates the log when it reaches 100MB
in size, or weekly, whichever comes first.

If you configure debug or (Heaven forbid) trace-level logging, the logs can
grow enormous quite quickly. Because rotated logs are by default named with the
year, month, and day only, this can cause name collisions if your logs exceed
100MB in a single day. You can add +dateformat -%Y%m%d-%H+ to the rotation
configuration to avoid this.
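
The fragment below sketches where that directive might go. Only the
+dateformat+ line comes from the text above; the surrounding stanza (the log
path, +weekly+, +maxsize+, and +dateext+) is an assumption about what a typical
logrotate configuration looks like, so compare it with the file actually
installed on your system rather than copying it verbatim.

----
/var/log/pacemaker/pacemaker.log {
    # Assumed existing directives: rotate weekly or at 100MB
    weekly
    maxsize 100M

    # Name rotated logs by date, and include the hour to avoid collisions
    dateext
    dateformat -%Y%m%d-%H
}
----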

== Transitions ==

A key concept in understanding how a Pacemaker cluster functions is a
'transition'. A transition is a set of actions that need to be taken to bring
the cluster from its current state to the desired state (as expressed by the
configuration).

Whenever a relevant event happens (a node joining or leaving the cluster,
a resource failing, etc.), the controller will ask the scheduler to recalculate
the status of the cluster, which generates a new transition. The controller
then performs the actions in the transition in the proper order.

Each transition can be identified in the logs by a line like:

----
Nov 30 20:28:16 rhel7-1 pacemaker-schedulerd[36417] (process_pe_message) notice: Calculated transition 19, saving inputs in /var/lib/pacemaker/pengine/pe-input-1463.bz2
----

The file listed as the "inputs" is a snapshot of the cluster configuration and
state at that moment (the CIB). This file can help determine why particular
actions were scheduled. The `crm_simulate` command, described in
<<s-crm_simulate>>, can be used to replay the file.
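
As a rough sketch of that workflow, the commands below first list the
transitions recorded in the detail log and then replay the inputs of the
transition from the sample log line. The log path is the default mentioned
earlier, and the +--simulate+ and +--xml-file+ options are assumptions about
how `crm_simulate` is typically invoked; see the referenced section or
`crm_simulate --help` for the options your version accepts.

----
# grep "Calculated transition" /var/log/pacemaker/pacemaker.log
# crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-1463.bz2
----

The replay is expected to show the cluster state captured in the file together
with the actions the scheduler chose, which can then be compared with what the
logs show the controller doing for that transition.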

== Further Information About Troubleshooting ==

Andrew Beekhof wrote a series of articles about troubleshooting in his blog,
http://blog.clusterlabs.org/[The Cluster Guy]:

* http://blog.clusterlabs.org/blog/2013/debugging-pacemaker[Debugging Pacemaker]
* http://blog.clusterlabs.org/blog/2013/debugging-pengine[Debugging the Policy Engine]
* http://blog.clusterlabs.org/blog/2013/pacemaker-logging[Pacemaker Logging]

The articles were written for an earlier version of Pacemaker, so many of the
specific names and log messages to look for have changed, but the concepts are
still valid.