* File:    INSTALL.txt
* CVS:     $Id$
* Author:  Kevin London
* Mods:    Dan Terpstra
* Mods:    Philip Mucci
* Mods:    <your name here>
*          <your email address>


On some of the systems that PAPI supports, you can install PAPI right 
out of the box without any additional setup. Others require drivers or 
patches to be installed first.

The general installation steps are below, but first find your particular 
Operating System's section for any additional steps that may be necessary.
NOTE: the configure and make files are located in the papi/src directory.

General Installation

1.	% ./configure
	% make

2.	Check for errors. 

	a) Run a simple test case: (This will run ctests/zero)

	% make test

	If you get good counts, you can optionally run all the test programs
	with the included test harness. This will run the tests in quiet mode, 
	which will print PASSED, FAILED, or SKIPPED. Tests are SKIPPED if the
	functionality being tested is not supported by that platform.

	% make fulltest (This will run ./

	To run the tests in verbose mode:

	% ./ -v

3.	Create a PAPI binary distribution or install PAPI directly.

	a) To install PAPI libraries and header files from the build tree:

	% make install

	b) To install PAPI manual pages from the build tree:

	% make install-man

	c) To install PAPI test programs from the build tree:

	% make install-tests

	d) To install all of the above in one step from the build tree:

	% make install-all

	e) To create a binary kit, papi-<arch>.tgz:

	% make dist
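
The quiet-mode harness output from step 2 lends itself to a quick tally.
The following is a minimal sketch and not part of PAPI. It assumes the
harness output was saved to a log file (the name results.log is
hypothetical) and that each result line contains exactly one of the words
PASSED, FAILED, or SKIPPED.

```shell
#!/bin/sh
# Count PASSED / FAILED / SKIPPED lines in a saved test log.
# Assumption: one status word per result line; the log name is made up.
tally_results() {
    log=$1
    for status in PASSED FAILED SKIPPED; do
        # grep -c prints the number of matching lines (0 if none);
        # "|| true" keeps a zero count from being treated as an error.
        count=$(grep -c "$status" "$log" || true)
        echo "$status: $count"
    done
}

# Example use, after saving the harness output:
#   make fulltest > results.log 2>&1
#   tally_results results.log
```

Any nonzero FAILED count is worth a closer look by rerunning that test in
verbose mode.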


There is an extensive array of options available from the configure 
command line. These can differ significantly from version to version of
PAPI. For complete details on the command-line options, use:
	% ./configure --help


PAPI now ships with documentation generated by doxygen.
Documentation for the public APIs can be created by running 
doxygen from the doc directory. 

More complete documentation of all internal APIs and structures can be 
generated with: 
	% doxygen Doxyfile-html

Doxygen documentation for the currently released version of PAPI is also
available on the website.

Operating System Specific Installation Steps (In Alphabetical Order by OS)

PAPI is supported on AIX 5.x for POWER5 and POWER6.
PAPI is also tested on AIX 6.1 for POWER7.
Use ./configure to select the desired make options for your system, 
specifying --with-bitmode=32 or --with-bitmode=64 to select the word length.
32 bits is the default.

1.	On AIX 5.x, bos.pmapi is a product-level fileset (part of the OS).
	However, it is not installed by default. Consult your sysadmin to 
	make sure it is installed. 
2.	Follow the general instructions for installing PAPI.

WARNING: PAPI requires XLC version 6 or greater.
Your version can be determined by running 'lslpp -a -l | grep -i xlc'.

BG/P is a cross-compiled environment. The machine on which PAPI is compiled
is not the machine on which PAPI runs. To compile PAPI on BG/P, specify the
BG/P environment as shown below:

	% ./configure --with-OS=bgp
	% make
NOTE: ./configure might fail if the cross compiler is not in your path.
	 If that is the case, just add it to your path and everything should work:

	% export PATH=$PATH:/bgsys/drivers/ppcfloor/gnu-linux/bin

By default this will make a subset of the tests in the ctests directory and all
tests in the ftests directory.

There is an additional C test program provided for the BG/P environment
that exercises the specific BG/P events and demonstrates how to
intermix the PAPI and BG/P UPC native calls. This test program is built with
the normal make sequence and can be found in the ctests/bgp directory.

The testing targets in the make file will not work in the BG/P environment.
Since BG/P supports multiple queuing systems, you must manually execute
individual programs in the ctests and ftests directories to check for successful
library creation. You can also manually edit the script to
automate testing for your installation.

Most papi utilities work for BGP, including papi_avail, papi_native_avail, and
papi_command_line. Many ctests pass for BGP, but many others produce errors due
to the non-traditional architecture of BGP. In particular, PAPI_TOT_CYC always
seems to produce 0 counts, although papi_get_virt_usec and papi_get_real_usec
appear to work.

The IBM RedPaper: provides
further discussion about PAPI on BGP along with other performance issues.

Five new components have been added to PAPI to support hardware performance
monitoring on the BG/Q platform, covering the BG/Q network, the I/O system, and
the Compute Node Kernel in addition to the processing core. The L2unit, IOunit,
NWunit, and CNKunit components have no configure scripts of their own. To
configure PAPI for BG/Q, use the following configure options at the papi/src level:
% ./configure --prefix=< your_choice >  \
			  --with-OS=bgq  \
			  --with-bgpm_installdir=/bgsys/drivers/ppcfloor  \
			  CC=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gcc  \
			  F77=/bgsys/drivers/ppcfloor/gnu-linux/bin/powerpc64-bgq-linux-gfortran  \
			  --with-components="bgpm/L2unit bgpm/CNKunit bgpm/IOunit bgpm/NWunit"

CLE - Cray XT and XE Opteron
The Cray XT/XE is a cross-compiled environment. You must specify the
perfmon version to configure as shown below.

Before running configure to create the makefile that supports a Cray XT/XE CLE
build of PAPI, execute the following module commands:
    % module purge
    % module load gcc
Note: do not load the programming environment module (e.g. PrgEnv-gnu)
but the compiler module (e.g. gcc) as shown above.

Check CLE compute nodes for the version of perfmon2 that it supports:
    % aprun -b -a xt cat /sys/kernel/perfmon/version

and use this version when configuring PAPI for a perfmon2 substrate:
    % configure CFLAGS="-D__crayxt" \
	--with-perfmon=2.82 --prefix=<install-dir> \
	--with-virtualtimer=times --with-tls=__thread \
	--with-walltimer=cycle --with-ffsll --with-shared-lib=no

Configure PAPI for a perf events substrate:
    % configure CFLAGS="-D__crayxt" \
        --with-perf-events --with-pe-incdir=<perf-events-hdr-dir> \
        --with-assumed-kernel=2.6.34 --prefix=<install-dir> \
        --with-virtualtimer=times --with-tls=__thread \
        --with-walltimer=cycle --with-ffsll --with-shared-lib=no

Invoke the make accordingly:

The testing targets in the makefile will not work in the XT/XE CLE environment.
It is necessary to log into an interactive session and run the tests
manually through the job submission system. For example, instead of:
    % make test
use:
    % aprun -n1 ctests/zero
and instead of:
    % make fulltest
use:
    % ./
after substituting "aprun -n1" for "yod -sz 1" in
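
Running the tests by hand through the job launcher can be scripted in a
few lines. The following is a rough sketch and not the PAPI test harness.
The function name is hypothetical, and it assumes that each test program
signals success through a zero exit status, which should be verified
against the verbose output of the tests themselves.

```shell
#!/bin/sh
# Run each given program through a launcher prefix and report the result.
# Assumption: a zero exit status means the test succeeded.
run_tests_with() {
    launcher=$1
    shift
    for t in "$@"; do
        # $launcher is deliberately unquoted so "aprun -n1" splits into
        # the command and its option.
        if $launcher "$t" >/dev/null 2>&1; then
            echo "ok $t"
        else
            echo "not ok $t"
        fi
    done
}

# Example use from an interactive session on a Cray XT/XE:
#   run_tests_with "aprun -n1" ctests/zero
```

Passing the launcher as a parameter keeps the same loop usable with other
job submission commands.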

FreeBSD - i386 & amd64
PAPI requires FreeBSD 6 or higher to work.

The kernel needs some modifications to give PAPI access to the performance 
monitoring counters. Add "options HWPMC_HOOKS" and "device hwpmc" to
the kernel configuration file. On i386 systems, also add "device apic".
(See hwpmc(4) for more information, and NOTE 1 below for the supported
hardware.)

After this step, just recompile the kernel and boot it.

FreeBSD 7 (or greater) does not ship with a Fortran compiler. To compile the
Fortran tests you will need to install a Fortran compiler first (e.g.
from /usr/ports/lang/gcc42) and set the F77 environment
variable to the compiler you want to use (e.g. gfortran42). 

Fortran compilers may issue errors such as "Integer too big for its kind *".
If so, add a compiler option to the FFLAGS environment variable that makes
int*8 the default integer kind (in gfortran42 it is -fdefault-integer-8).
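
As a concrete illustration of the two environment settings described
above, the following sketch uses the gfortran42 example values from this
section. Substitute the name and flag of whichever Fortran compiler you
actually installed.

```shell
#!/bin/sh
# Example values taken from the gfortran42 case described above;
# adjust both for the Fortran compiler you installed.
F77=gfortran42
FFLAGS=-fdefault-integer-8
export F77 FFLAGS

# Then continue with the general steps in the same shell, e.g.:
#   ./configure
#   make
```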

Follow the "General Installation" steps.

NOTE 1: 
HWPMC driver supports the following processors: Intel Pentium 2,
Intel Pentium Pro, Intel Pentium 3, Intel Pentium M, Intel Celeron,
Intel Pentium 4, AMD K7 (AMD Athlon) and AMD K8 (AMD Athlon64 / Opteron).

FreeBSD 8 also adds support for Core/Core2/Core-i[357]/Atom processors.
There is also a patch for FreeBSD 7/7.1 in

Linux - Xeon Phi [MIC, KNC, Knight's Corner]
Full PAPI support of the MIC card requires MPSS Gold Update 2 or above, and a
cross-compilation toolchain from Intel, the Intel C compiler is also

The compiler
* Download one of the MPSS full source bundles at
* Untar the download. 
* Extract gpl/package-cross-k1om.tar.bz2

Building PAPI - gcc cross compiler
* Add usr/linux-k1om-4.7/bin or equivalent to your PATH so PAPI can find the
	cross-build utils. (see above for instructions on acquiring the cross
	compilation toolchain)
* You will need to invoke configure with options:
	> ./configure --with-mic --host=x86_64-k1om-linux --with-arch=k1om

	This sets up cross-compilation and sets options needed by PAPI.
* Run make to build the library.

Building PAPI - icc
If icc is in your path,
    > ./configure --with-mic
You may have to provide additional configuration options... try
    > ./configure --with-mic --with-ffsll --with-walltimer=cycle --with-tls=__thread --with-virtualtimer=clock_thread_cputime_id 
This builds a mic native version of the library. 

Offload Code
To use PAPI in MIC offload code, build a mic-native version of PAPI 
as detailed above. 

The PAPI utility programs can be run on the MIC using the
micnativeloadex tool provided by Intel.  The MIC events may require
additional qualifiers to set the exclude_guest and exclude_host bits
to 0 (eventname:mg=1:mh=1).  For example, get a list of events
available on the MIC by calling:
micnativeloadex ./utils/papi_native_avail 
Then get an event count while setting the appropriate qualifiers
micnativeloadex ./utils/papi_command_line -a "CPU_CLK_UNHALTED:mg=1:mh=1"

To add offload code into your program, wrap the papi.h header as
#pragma offload_attribute (push,target(mic)) 
#include "papi.h" 
#pragma offload_attribute (pop)

Make PAPI calls from offload code as normal.

Finally, add -offload-option,mic,ld,$(path_to_papi)/libpapi.a
to your compile command. If that does not pick up the PAPI library, try
-offload-option,mic,compiler,"-lpapi -L<path/to dir containing libpapi.a>"
instead.

Linux - Itanium II & Montecito
PAPI on Itanium Linux links to the perfmon library. The library version and 
the Itanium version are automatically determined by configure.
If you wish to override the defaults, a number of pfm options are available
to configure. Use:
	% ./configure --help
to learn more about these options.

Follow the general installation instructions to complete your installation.

The earprofile test fails under perfmon for Itanium II. It has been
reconfigured to work on the upcoming perfmon2 interface.

Linux - PPC64 (POWER5, POWER5+, POWER6 and PowerPC970)
Linux/PPC64 requires that the kernel be patched and recompiled with the
PerfCtr patch if the kernel is version 2.6.30 or older. The required patches 
and complete installation instructions are provided in the 
papi/src/perfctr-2.7.x directory. PPC64 is the ONLY platform that REQUIRES 
use of PerfCtr 2.7.x.


WARNING: You should always use a PerfCtr distribution that has been distributed
with a version of PAPI or your build will fail. The reason for this is that
PAPI builds a shared library of the Perfctr runtime, on which
depends. PAPI also depends on the .a file, which it decomposes into component
object files and includes in the libpapi.a file for convenience. If you
install a new perfctr, even a shared library, YOU MUST REBUILD PAPI to get
a proper, working libpapi.a.

There are several options in configure to allow you to specify your perfctr 
version and location. Use:
	% ./configure --help
to learn more about these options.

Follow the general installation instructions to complete your installation.

Linux Perf Events ( with kernel 2.6.32 and newer )

Performance counter support has been merged as the "Perf Events"
subsystem as of Linux 2.6.32.  This means that PAPI can be built
without patching the kernel on new enough systems.

Perf Events support is new, and certain functionality does not work.
If you need any of the functionality listed below, we recommend
you install the PerfCtr patchset and use that in conjunction with PAPI.

   + PAPI requires at least Linux kernel 2.6.32, as the earlier 2.6.31
     version had some significant API changes.
   + Kernels before 2.6.33 have extra overhead when determining
     whether events conflict or not.
   + Counter multiplexing is handled by PAPI (rather than perf_events) 
     on kernels before 2.6.33 due to a bug in the kernel perf_events code.
   + Nehalem EX support requires kernel 2.6.34 or newer.
   + Pentium 4 support requires kernel 2.6.35 or newer.
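
The version thresholds above can be checked against a running kernel with
a short script. This is a minimal sketch under stated assumptions: the
function names are hypothetical, and the kernel release string is assumed
to start with a dotted major.minor.patch number, as the output of
'uname -r' usually does.

```shell
#!/bin/sh
# Convert "A.B.C[-suffix]" into one integer so versions can be compared.
kver_to_int() {
    # Drop everything from the first character that is not a digit or dot.
    v=$(printf '%s' "$1" | sed 's/[^0-9.].*$//')
    IFS=. read -r major minor patch <<EOF
$v
EOF
    echo $(( ${major:-0} * 1000000 + ${minor:-0} * 1000 + ${patch:-0} ))
}

# Print the limitations from the list above that apply to a kernel version.
perf_events_notes() {
    k=$(kver_to_int "$1")
    if [ "$k" -lt "$(kver_to_int 2.6.32)" ]; then
        echo "too old for Perf Events: use PerfCtr"
        return 0
    fi
    [ "$k" -lt "$(kver_to_int 2.6.33)" ] && echo "PAPI handles multiplexing; extra event-conflict overhead"
    [ "$k" -lt "$(kver_to_int 2.6.34)" ] && echo "no Nehalem EX support"
    [ "$k" -lt "$(kver_to_int 2.6.35)" ] && echo "no Pentium 4 support"
    return 0
}

# Example use on the machine where PAPI will run:
#   perf_events_notes "$(uname -r)"
```

No output means none of the listed limitations applies to that kernel.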

The PAPI configure script should auto-detect the availability of
Perf Events on new enough distributions (this mainly requires
that perf_event.h be available in /usr/include/linux)

On older distributions (even ones that include the 2.6.32 kernel) 
the perf_event.h file might not be there.  One fix is to install
your distribution's Linux kernel headers package, which is often
an optional package not installed by default.

If you cannot install the kernel headers, you can obtain the
perf_event.h file from your kernel source and run configure as follows:
   ./configure --with-pe-incdir=INCDIR
replacing INCDIR with the directory that perf_event.h is in.
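
The choice between the two configure invocations can be sketched as a
small helper. This is an illustration only: the function name and its
arguments are hypothetical, and only the --with-pe-incdir option and the
/usr/include/linux location come from the text above.

```shell
#!/bin/sh
# Print the configure command suited to where perf_event.h was found.
#   $1: system include root (defaults to /usr/include)
#   $2: optional directory holding a copied perf_event.h
pe_configure_cmd() {
    sys_inc=${1:-/usr/include}
    fallback=$2
    if [ -f "$sys_inc/linux/perf_event.h" ]; then
        # Header is where configure auto-detects it.
        echo "./configure"
    elif [ -n "$fallback" ] && [ -f "$fallback/perf_event.h" ]; then
        # Point configure at the directory holding the copied header.
        echo "./configure --with-pe-incdir=$fallback"
    else
        echo "perf_event.h not found" >&2
        return 1
    fi
}

# Example use:
#   pe_configure_cmd /usr/include "$HOME/pe-headers"
```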

Linux PerfCtr (requires patching the kernel)
When using Linux kernels before 2.6.32 the kernel must be patched with
the PerfCtr patch set.  (This patchset can also be used on more recent
kernels if the support provided by Perf Events is not enough for your
workload). The required patches and complete installation instructions 
are provided in the papi/src/perfctr-x.y directory. Please see the INSTALL 
file in that directory.

Remember that you also need to build your kernel with APIC support in order
for hardware overflow to work. This is very important for accurate statistical
profiling a la gprof via the hardware counters.

So, when you configure your kernel to build with PERFCTR as above, make
sure you turn on APIC support in the "Processor type and features" section.
This should be enabled by default if you are on an SMP, but it is disabled
by default on a UP. 
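
One way to confirm the setting before building is to inspect the kernel
configuration file from a script. The following is a sketch under
assumptions: the function name is hypothetical, and because the exact
option names differ between kernel versions it simply looks for any
enabled option whose name contains APIC.

```shell
#!/bin/sh
# Succeed if the kernel config enables at least one APIC-related option.
# Assumption: enabled options appear as CONFIG_<NAME>=y, and disabled
# ones as comments ("# CONFIG_<NAME> is not set").
apic_enabled() {
    config=${1:-/usr/src/linux/.config}
    grep -q '^CONFIG_[A-Z0-9_]*APIC[A-Z0-9_]*=y' "$config"
}

# Example use:
#   apic_enabled /usr/src/linux/.config && echo "APIC support is enabled"
```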

In our 2.4.x kernels:
> grep PIC /usr/src/linux/.config

You can verify the APIC is working after rebooting with the new kernel
by running the 'perfex -i' command found in the perfctr/examples/perfex

PAPI on x86 assumes PerfCtr 2.6.x. NOTE: THE VERSIONS OF PERFCTR DO NOT 


WARNING: You should always use a PerfCtr distribution that has been distributed
with a version of PAPI or your build may fail. Newer versions with backward
compatibility may also work. PAPI builds a shared library of the Perfctr 
runtime, on which depends. PAPI also depends on the .a file, 
which it decomposes into component object files and includes in the libpapi.a 
file for convenience. If you install a new PerfCtr, even a shared library, 
YOU MUST REBUILD PAPI to get a proper, working libpapi.a. 

There are several options in configure to allow you to specify your perfctr 
version and location. Use:
	% ./configure --help
to learn more about these options.

Follow the general installation instructions to complete your installation.


You may be running udev, which is not smart enough to know the permissions of 
dynamically created devices. To fix this, find your udev/devices directory, 
often /lib/udev/devices or /etc/udev/devices and perform the following actions:

 mknod perfctr c 10 182
 chmod 644 perfctr

On Ubuntu 6.06 (and probably other Debian-based distributions), add a line to 
/etc/udev/rules.d/40-permissions.rules like this:

KERNEL=="perfctr", MODE="0666"

On SuSE, you may need to add something like the following to
 (SuSE does not have the 40-permissions.rules file in it.)

# cpu devices
KERNEL=="cpu[0-9]*",            NAME="cpu/%n/cpuid"
KERNEL=="msr[0-9]*",            NAME="cpu/%n/msr"
KERNEL=="microcode",            NAME="cpu/microcode", MODE="0600"
KERNEL=="perfctr",              NAME="perfctr", MODE="0644"

These lines tell udev to always create the device file with the appropriate permissions.
Use 'perfex -i' from the perfctr distribution to test this fix.

Opteron fails the matrix-hl test because the default definition of PAPI_FP_OPS
overcounts speculative floating point operations.

Solaris 8 - Ultrasparc
The only requirement for Solaris is that you must be running version 2.8 or 
newer.  As long as that requirement is met, no additional steps are required 
to install PAPI and you can follow the general installation guide.

Solaris 10 - UltraSPARC T2/Niagara 2
PAPI supports the Niagara 2 on Solaris 10. The substrate supports 
common basic operations such as adding and reading events, as well as the
advanced features of multiplexing (see below), overflow handling, and
profiling. The implementation for Solaris 10 is based on libcpc 2, which
provides access to the underlying performance counters. Performance counters
for the UltraSPARC architecture are described in general in the UltraSPARC
architecture manual, with detailed descriptions in the manual for the specific
processor. For this substrate, the documentation for the performance counters
can be found at:


To install PAPI on this platform, make sure the packages SUNWcpc and
SUNWcpcu are installed. Sun Studio 12 was used for compilation while the
substrate was being developed. GNU GCC has not been tested and would require
modifying the makefiles Makefile.solaris-niagara2 (32 bit) and
Makefile.solaris-niagara2-64bit (64 bit).

The steps required for installation are as follows:

	./configure --with-bitmode=[32|64] --prefix=/is/optional
		If no --with-bitmode parameter is present a default of
		32 bit is assumed.

		If no --prefix is used, a default of /usr/local is assumed.

	make install

If you want to link your application against your installation you should
make sure to include at least the following linker options:

	-lpapi -lcpc

PLEASE NOTE: This is the first revision of Niagara 2/libcpc 2/Solaris 10
support and needs further testing! Contributions, especially for the preset
definitions, would be much appreciated.

MULTIPLEXING: As the Niagara 2 offers no native event to count the cycles
elapsed, a "synthetic event" was created offering access to the cycle count.
This event is not as accurate as the native events, nor should it be
used for anything other than the multiplexing mode, which needs the cycle
count in order to work. Therefore multiplexing and the preset PAPI_TOT_CYC
should only be used with caution. BEWARE OF WRONG COUNTER RESULTS!

Windows XP/2000/Server 2003 - Intel Pentium III or AMD Athlon / Opteron
Please use PAPI 3.7 (

The Windows source tree comes with Microsoft Visual Studio Version 8 projects
to build a graphical shell application, the PAPI library as a DLL, a kernel 
driver to provide access to the counters, and a collection of C test programs.

The WinPMC driver must be installed with administrator privileges. See the 
winpmc.html file in the papi/win2k/winpmc directory for details on building 
and installing this driver.

The general installation instructions are irrelevant for Windows.

Other Platforms
PAPI can be compiled and installed on most platforms that have GNU compilers 
regardless of operating system or hardware. This includes, for example, 
Macintosh systems running recent versions of OS X. However, PAPI can only 
provide access to the CPU hardware counters on platforms that are directly 
supported. On unsupported platforms PAPI will run, but only provide basic timing 
functions and potentially access to some non-CPU components.


Basic instructions on how to create a new component can be found in 
src/components/README. The components directory contains several components 
developed by the PAPI team along with a simple yet functional "example" 
component which can be used as a guide to aid third-party developers. 
Assuming components are developed according to the specified guidelines, 
they will function within the PAPI framework without requiring any changes 
to PAPI source code.

Before running any component that requires configuration, the configure 
script for that component must be executed in order to generate the 
Makefile which contains the configuration settings. Normally, the script 
will only need to be executed once. Depending on the component, configure 
may require that one or more configuration settings be specified by the user.

The components to be added to PAPI are specified during the configuration of
PAPI by adding the --with-components=<component list> command line option to
configure. For example, to add the acpi, lustre, and net components, the 
option would be:
   	% ./configure --with-components="acpi lustre net"

Attempting to add a component to PAPI which requires configuration and has 
not been configured will result in a compilation error because the PAPI 
build environment will be unable to find the Makefile for that component.
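
The component list can be sanity-checked before it is handed to configure.
The helper below is a sketch, not part of the PAPI build system. Its name
is hypothetical, and it assumes that each component name corresponds to a
subdirectory of the components directory. It does not check whether a
component has been configured.

```shell
#!/bin/sh
# Build the --with-components option after checking that every named
# component has a directory under the given components root.
#   $1: components root (assumed layout: <root>/<component name>)
#   remaining arguments: component names
components_configure_cmd() {
    root=$1
    shift
    list=""
    for c in "$@"; do
        if [ ! -d "$root/$c" ]; then
            echo "unknown component: $c" >&2
            return 1
        fi
        # Join names with single spaces, without a leading space.
        list="${list:+$list }$c"
    done
    echo "./configure --with-components=\"$list\""
}

# Example use from the papi/src directory:
#   components_configure_cmd components acpi lustre net
```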