$Id: README,v 1.48 2007/10/06 13:02:07 mikpe Exp $ Linux Performance-Monitoring Counters Driver Mikael Pettersson <mikpe@it.uu.se> ======================================================================== Overview -------- This package adds support to the Linux kernel (2.4.16 or newer) for using the Performance-Monitoring Counters (PMCs) found in many modern processors. Supported processors are: - All Intel Pentium processors, i.e., Pentium, Pentium MMX, Pentium Pro, Pentium II, Pentium III, Pentium M and Pentium 4, including Celeron and Xeon versions. - The AMD K7 and K8 processor families. - Cyrix 6x86MX, MII, and III. - VIA C3 (Cyrix III). - Centaur WinChip C6/2/3. - PowerPC 604, 7xx, and 74xx processors. PMCs are "event counters" capable of recording any of a large number of performance-related events during execution. These events typically include instructions executed, cache misses, TLB misses, stalls, and other events specific to the microarchitecture of the processor being used. PMCs are primarily used to identify low-level performance problems, and to validate code changes intended to improve performance. Limited support is available for generic x86 processors with a Time-Stamp Counter but no PMCs, such as the AMD K6 family. For these processors, only TSC-based cycle-count measurements are possible. However, all high-level facilities implemented by the driver are still available. Features -------- Each Linux process has its own set of "virtual" PMCs. That is, to a process the PMCs appear to be private and unrelated to the activities of other processes in the system. The virtual PMCs have 64-bit precision, even though current processors only implement 32, 40, or 48-bit PMCs. Each process also has a virtual Time-Stamp Counter (TSC). On most machines, the virtual PMCs can be sampled entirely in user-space without incurring the overhead of a system call. A process accesses its virtual PMCs by opening /dev/perfctr and issuing system calls on the resulting file descriptor. A user-space library is included which provides a more high-level interface. The driver also supports global-mode or system-wide PMCs. In this mode, each PMC on each processor can be controlled and read. The PMCs and TSC on active processors are sampled periodically and the accumulated sums have 64-bit precision. Global-mode PMCs are accessed via the /dev/perfctr device file; the user-space library provides a more high-level interface. The user-space library is accompanied by several example programs that illustrate how the driver and the library can be used. Support for performance-counter overflow interrupts is provided for Intel P4 and P6, and AMD K7 and K8 processors. Limitations ----------- - Kernels older than 2.4.16 are not supported since perfctr-2.6. You can use the previous stable series, perfctr-2.4, if you must use an older kernel, but this has several limitations: * Older kernels do not support AMD64 (x86-64). * The performance counters in hyper-threaded P4s/Xeons cannot be used with kernels older than 2.4.15. You'd have to disable hyper-threading or SMP, or restrict yourself to TSC sampling. * No profiling using counter overflow interrupts, except in 2.4.10 and newer kernels, and some early 2.4-ac/redhat kernels. * Application code compiled for perfctr-2.4 is not compatible with perfctr-2.6, and vice versa. * The perfctr-2.4 series does not support 2.6 kernels. Some of these limitations may be fixable. Contact the author if you are willing to fund development in this direction. - The performance counter interrupt facility requires SMP or uniprocessor APIC support. In the latter case, the BIOS must be reasonably non-buggy. Unfortunately, this is often not the case. - Neither the kernel driver nor the sample user-space library attempt to hide any processor-specific details from the user. - This package makes it possible to compute aggregate event and cycle counts for sections of code. Since many x86-type processors use out-of-order execution, it is impossible to attribute exact event or cycle counts to individual instructions. See the "Continuous Profiling" and "ProfileMe" papers at Compaq's DCPI web site for more information on this issue. (The URL is listed in the OTHERS file.) - Centaur WinChip C6/2/3 support requires that the TSC is disabled. See linux/drivers/perfctr/x86.c for further information. Availability ------------ This and future versions of this package can be downloaded from <http://user.it.uu.se/~mikpe/linux/perfctr/>. The perfctr-devel mailing list is an open forum for driver update announcements and general discussions about the perfctr driver and its usage. To subscribe to perfctr-devel, visit <http://lists.sourceforge.net/lists/listinfo/perfctr-devel>. Licensing --------- Copyright (C) 1999-2007 Mikael Pettersson <mikpe@it.uu.se> This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA