Commit: 150f0b6385d66feff25de684c1b958a24fd89aa7

:compat-mode: legacy
= Troubleshooting Cluster Problems =
 
== Logging ==
 
Pacemaker by default logs messages of notice severity and higher to the system
log, and messages of info severity and higher to the detail log, which by
default is +/var/log/pacemaker/pacemaker.log+.
 
Logging options can be controlled via environment variables at Pacemaker
start-up. Where these are set varies by operating system (often
+/etc/sysconfig/pacemaker+ or +/etc/default/pacemaker+).
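
As an illustration, a fragment of such a file might look like the following.
The variable names are ones Pacemaker reads from its environment; the values
shown are only an example that restates the defaults described above, not a
recommendation. The comments in the file shipped by your distribution are the
authoritative list of available options.

----
# Location of the detail log (the default described above)
PCMK_logfile=/var/log/pacemaker/pacemaker.log

# Minimum severity of messages sent to the system log
PCMK_logpriority=notice

# Whether debug-level messages are written to the detail log
PCMK_debug=no
----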
 
Because cluster problems are often highly complex, involving multiple machines,
cluster daemons, and managed services, Pacemaker logs rather verbosely to
provide as much context as possible. It is an ongoing priority to make these
logs more user-friendly, but by necessity there is a lot of obscure, low-level
information that can make them difficult to follow.
 
The default log rotation configuration shipped with Pacemaker (typically
installed in +/etc/logrotate.d/pacemaker+) rotates the log when it reaches 100MB
in size, or weekly, whichever comes first.
 
If you configure debug or (Heaven forbid) trace-level logging, the logs can
grow enormous quite quickly. Because rotated logs are by default named with the
year, month, and day only, name collisions can occur if the log grows past
100MB, and is therefore rotated, more than once in a single day. You can add
+dateformat -%Y%m%d-%H+ to the rotation configuration to include the hour in
the rotated file name and avoid this.
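
A sketch of what an adjusted rotation configuration could look like follows.
Only the +dateformat+ line comes from the text above; the other directives are
assumptions chosen to match the behavior described (weekly, or at 100MB), and
the file shipped with your packages may differ. Note that logrotate applies
+dateformat+ only when +dateext+ is in effect.

----
/var/log/pacemaker/pacemaker.log {
        # Rotate weekly, or sooner once the file exceeds 100MB
        weekly
        maxsize 100M
        # Name rotated files by date, including the hour
        dateext
        dateformat -%Y%m%d-%H
        # Assumed housekeeping options; adjust to local policy
        compress
        copytruncate
        missingok
        notifempty
}
----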
 
== Transitions ==
 
A key concept in understanding how a Pacemaker cluster functions is a
'transition'. A transition is a set of actions that need to be taken to bring
the cluster from its current state to the desired state (as expressed by the
configuration).
 
Whenever a relevant event happens (a node joining or leaving the cluster,
a resource failing, etc.), the controller will ask the scheduler to recalculate
the status of the cluster, which generates a new transition. The controller
then performs the actions in the transition in the proper order.
 
Each transition can be identified in the logs by a line like:
 
----
notice: Calculated transition 19, saving inputs in /var/lib/pacemaker/pengine/pe-input-1463.bz2
----
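
To locate these lines in a large log, a small shell helper along the following
lines may be useful. It is only a sketch: the function name `list_transitions`
is hypothetical, and the pattern assumes the message wording shown above, which
may differ between Pacemaker versions.

----
# Print "<transition number> <input file>" for each transition in a log.
# Usage: list_transitions /var/log/pacemaker/pacemaker.log
list_transitions() {
    sed -n 's/.*Calculated transition \([0-9][0-9]*\).*saving inputs in \([^ ]*\).*/\1 \2/p' "$1"
}
----

Each output line pairs a transition number with the file saved for it, which
makes it easier to pick the transition closest to the time a problem occurred.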
 
The file listed as the "inputs" is a snapshot of the cluster configuration and
state at that moment (the CIB). This file can help determine why particular
actions were scheduled. The `crm_simulate` command, described in
<<s-crm_simulate>>, can be used to replay the file.
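
For instance, the input file named in the log line above could be replayed
roughly as follows. The wrapper function `replay_transition` is a hypothetical
convenience, not part of Pacemaker; the `--simulate` and `--xml-file` options
are those listed by `crm_simulate --help`. The command works on the saved file
only and does not change the live cluster.

----
# Replay a saved scheduler input and show the resulting transition.
# Usage: replay_transition /var/lib/pacemaker/pengine/pe-input-1463.bz2
replay_transition() {
    if [ ! -r "$1" ]; then
        echo "cannot read input file: $1" >&2
        return 1
    fi
    crm_simulate --simulate --xml-file "$1"
}
----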
 
== Further Information About Troubleshooting ==
 
Andrew Beekhof wrote a series of articles about troubleshooting on his blog,
http://blog.clusterlabs.org/[The Cluster Guy]:
 
* http://blog.clusterlabs.org/blog/2013/debugging-pacemaker[Debugging Pacemaker]
* http://blog.clusterlabs.org/blog/2013/debugging-pengine[Debugging the Policy Engine]
* http://blog.clusterlabs.org/blog/2013/pacemaker-logging[Pacemaker Logging]
 
The articles were written for an earlier version of Pacemaker, so many of the
specific names and log messages to look for have changed, but the concepts are
still valid.

