$Id: overview.txt,v 1.2 2004/07/17 00:30:49 mikpe Exp $

AN OVERVIEW OF PERFCTR
======================
The perfctr package adds support to the Linux kernel for using
the performance-monitoring counters found in many processors.

Perfctr is internally organised in three layers:

- The low-level drivers, one for each supported architecture.
  Currently there are two, one for 32-bit and 64-bit x86 processors,
  and one for 32-bit PowerPC processors.

  low-level-api.txt documents the model of the performance counters
  used in this package, and the internal API to the low-level drivers.

  low-level-{x86,ppc}.txt provide documentation specific to those
  architectures and their low-level drivers.

- The high-level services.
  There is currently one, a kernel extension adding support for
  virtualised per-process performance counters.
  See virtual.txt for documentation on this kernel extension.

  [There used to be a second high-level service, a simple driver
  to control and access all performance counters in all processors.
  This driver is currently removed, pending an acceptable new API.]

- The top-level, which performs initialisation and implements
  common procedures and system calls.

Rationale
---------
The perfctr package solves three problems:

- Hardware invariably restricts programming of the performance
  counter registers to kernel-level code, and sometimes also
  restricts reading the counters to kernel-level code.

  Perfctr adds APIs allowing user-space code to access the counters.
  In the case of the per-process counters kernel extension,
  even non-privileged processes are allowed access.

- Hardware often limits the precision of the hardware counters,
  making them unsuitable for storing total event counts.

  The counts are instead maintained as 64-bit values in software,
  with the hardware counters used to derive increments over given
  time periods.

- In an unmodified kernel, the thread state does not include the
  performance monitoring counters, and the context switch code
  does not save and restore them. In this situation the counters
  are system-wide, making them unreliable and inaccurate when used
  for monitoring specific processes or specific segments of code.

  The per-process counters kernel extension treats the counter state as
  part of the thread state, solving the reliability and accuracy problems.
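
The second point above (64-bit software counts derived from narrower
hardware counters) can be illustrated with a minimal sketch. It is not
taken from the perfctr sources: the 'struct soft_counter' type, its
fields, and both function names are hypothetical, and the sketch assumes
the hardware counter wraps at most once between two samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-counter state; the names are illustrative only. */
struct soft_counter {
    uint64_t sum;      /* accumulated 64-bit event count */
    uint64_t start;    /* hardware value seen at the previous sample */
    unsigned width;    /* number of valid bits in the hardware counter */
};

/* Mask selecting the valid low 'width' bits of a hardware counter. */
static uint64_t hw_mask(unsigned width)
{
    assert(width >= 1 && width <= 64);
    return width == 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* Fold a new hardware reading into the 64-bit sum.  The subtraction
   is done modulo 2^width, so a single wrap-around of the hardware
   counter between samples still yields the correct increment. */
static void soft_counter_update(struct soft_counter *c, uint64_t now)
{
    uint64_t mask = hw_mask(c->width);
    uint64_t delta = (now - c->start) & mask;

    c->sum += delta;
    c->start = now & mask;
}
```

A caller would invoke soft_counter_update() at every sampling point,
for example at a context switch, often enough that the hardware
counter cannot wrap twice in between.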

Non-goals
---------
Providing high-level interfaces that abstract and hide the
underlying hardware is a non-goal. Such abstractions can
and should be implemented in user-space, for several reasons:

- The complexity and variability of the hardware mean that
  any abstraction would be inaccurate. There would be both
  loss of functionality, and presence of functionality which
  isn't supportable on any given processor. User-space tools
  and libraries can implement this, on top of the
  processor-specific interfaces provided by the kernel.

- The implementation of such an abstraction would be large
  and complex. (Consider ESCR register assignment on P4.)
  Performing complex actions in user-space simplifies the
  kernel, allowing it to concentrate on validating control
  data, managing processes, and driving the hardware.
  (Cf. the role of compilers.)

- The abstraction is purely a user convenience. The
  kernel-level components have no need for it.

Common System Calls
===================
This section lists the system calls that are not tied to
a specific high-level service/driver.

Querying CPU and Driver Information
-----------------------------------
int err = sys_perfctr_info(struct perfctr_info *info,
                           struct perfctr_cpu_mask *cpus,
                           struct perfctr_cpu_mask *forbidden);

This operation retrieves information from the kernel about
the processors in the system.

If non-NULL, '*info' will be updated with information about the
capabilities of the processor and the low-level driver.

If non-NULL, '*cpus' will be updated with a bitmask listing the
set of processors in the system. The size of this bitmask is not
statically known, so the protocol is:

1. User-space initialises cpus->nrwords to the number of elements
   allocated for cpus->mask[].
2. The kernel reads cpus->nrwords, and then writes the required
   number of words to cpus->nrwords.
3. If the required number of words is greater than the original value
   of cpus->nrwords, then an EOVERFLOW error is signalled.
4. Otherwise, the kernel converts its internal cpumask_t value
   to the external format and writes that to cpus->mask[].
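
The size-negotiation protocol above can be sketched from the user-space
side as follows. This is only an illustration under stated assumptions:
'struct cpu_mask' is a local stand-in for 'struct perfctr_cpu_mask',
mock_get_cpus() is a hypothetical replacement for the kernel side (it
is not sys_perfctr_info), and the sketch assumes EOVERFLOW is signalled
when the caller's buffer is too small for the required number of words.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout, modelled on the description in the text; it is not
   copied from the perfctr headers. */
struct cpu_mask {
    uint32_t nrwords;   /* in: words allocated; out: words required */
    uint32_t mask[];    /* bitmask stored as 32-bit words */
};

/* Hypothetical stand-in for the kernel side of the protocol.  It
   pretends that the system needs two words of mask. */
static int mock_get_cpus(struct cpu_mask *m)
{
    static const uint32_t kernel_words[] = { 0x0000000fu, 0x00000001u };
    const uint32_t required = 2;
    uint32_t allocated;

    assert(m != NULL);
    allocated = m->nrwords;          /* step 2: read the caller's size */
    m->nrwords = required;           /* ... then report the required size */
    if (required > allocated)        /* step 3: buffer too small */
        return -EOVERFLOW;
    memcpy(m->mask, kernel_words, sizeof kernel_words);   /* step 4 */
    return 0;
}

/* User-space side: probe with zero words, then retry once with the
   size the first call reported.  Returns NULL on failure; the caller
   frees the result. */
static struct cpu_mask *query_cpus(void)
{
    struct cpu_mask *m = NULL;
    uint32_t n = 0;

    for (int attempt = 0; attempt < 2; attempt++) {
        struct cpu_mask *p = realloc(m, sizeof *m + n * sizeof m->mask[0]);
        if (p == NULL)
            break;
        m = p;
        m->nrwords = n;              /* step 1 */
        int err = mock_get_cpus(m);
        if (err == 0)
            return m;
        if (err != -EOVERFLOW)
            break;
        n = m->nrwords;              /* required size, as reported */
    }
    free(m);
    return NULL;
}
```

Because the required size is written back even when the call fails,
one failed probe is enough to learn the correct allocation.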

If non-NULL, '*forbidden' will be updated with a bitmask listing
the set of processors in the system on which users must not try
to use performance counters. This is currently only relevant for
hyper-threaded Pentium 4/Xeon systems. The protocol is the same
as for '*cpus'.
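
Once both masks have been retrieved, a caller might combine them as in
the sketch below to decide whether a given processor may be used. The
helper names are hypothetical, and the bit layout (processor n stored
as bit n%32 of word n/32) is an assumption that this text does not
state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Test whether processor 'cpu' is set in a mask of 32-bit words.
   Assumes processor n is bit (n % 32) of word (n / 32). */
static bool mask_has_cpu(const uint32_t *words, size_t nrwords, unsigned cpu)
{
    size_t idx = cpu / 32;

    assert(words != NULL || nrwords == 0);
    if (idx >= nrwords)
        return false;                /* beyond the mask: not present */
    return (words[idx] >> (cpu % 32)) & 1u;
}

/* A processor is usable if it is present and not forbidden. */
static bool cpu_is_usable(const uint32_t *cpus, size_t cpus_words,
                          const uint32_t *forbidden, size_t forbidden_words,
                          unsigned cpu)
{
    return mask_has_cpu(cpus, cpus_words, cpu)
        && !mask_has_cpu(forbidden, forbidden_words, cpu);
}
```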

Notes:
- The internal representation of a cpumask_t is as an array of
  unsigned long. This representation is unsuitable for user-space,
  because it is not binary-compatible between 32-bit and 64-bit
  variants of a big-endian processor. The 'struct perfctr_cpu_mask'
  type uses an array of unsigned 32-bit integers.
- The protocol for retrieving a 'struct perfctr_cpu_mask' was
  designed to allow user-space to quickly determine the correct
  size of the 'mask[]' array. Other system calls use weaker protocols,
  which force user-space to guess increasingly large values in a
  loop, until finally an acceptable value is guessed.
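
The first note can be made concrete with a sketch of one way to convert
an array of unsigned long into 32-bit words. It is not the kernel's
actual conversion routine: longs_to_words() and BITS_PER_LONG are names
chosen for this example. The conversion works on values, not on raw
memory, so the resulting words are the same whether unsigned long is
32 or 64 bits wide and whatever the byte order.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Copy the first 'nbits' bits of 'in' (an array of unsigned long,
   bit n being bit n % BITS_PER_LONG of element n / BITS_PER_LONG)
   into 32-bit words.  'out' must hold (nbits + 31) / 32 words. */
static void longs_to_words(const unsigned long *in, size_t nbits,
                           uint32_t *out)
{
    size_t nwords = (nbits + 31) / 32;

    assert(in != NULL && out != NULL);
    for (size_t w = 0; w < nwords; w++) {
        size_t bit = w * 32;         /* first bit covered by word w */
        unsigned long l = in[bit / BITS_PER_LONG];

        /* The shift is 0 or 32, always less than the width of long. */
        out[w] = (uint32_t)(l >> (bit % BITS_PER_LONG));
    }
}
```

Copying the bytes of the unsigned long array directly would instead
place the high half of each 64-bit element first on a big-endian
machine, which is the incompatibility the note describes.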