Tue Dec 5 20:10:50 2017 -0800  William Cohen <>

	* src/libpfm4/lib/events/power9_events.h,
	  src/libpfm4/tests/validate_power.c: Update libpfm4 Current with
	  commit 206dea666e7c259c7ca53b16f934660344293475  Ensure unique
	  names for IBM Power 9 events  Older versions of PAPI use the event
	  name to look up the libpfm event number when doing the enumeration
	  of the available events.  If there were multiple events with the
	  same name in libpfm, the earliest one would be selected.  This
	  selection would cause the enumeration of events in
	  papi_native_avail to get stuck looping on the first duplicated
	  named event in a pmu.  In the case of IBM Power 9 the enumeration
	  would get stuck on PM_CO0_BUSY. Gave each event a unique name to
	  avoid this unfortunate behavior.
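
	  The failure mode described above can be reproduced in miniature.
	  The sketch below is a simplified model, not PAPI's or libpfm4's
	  actual code: the table contents and the demo_* names are
	  hypothetical, and it assumes the "next event" step looks up the
	  current event by name and then advances by one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical event table with one duplicated name. */
static const char *const demo_events[] = {
    "PM_CYC", "PM_CO0_BUSY", "PM_CO0_BUSY", "PM_INST_CMPL"
};
#define DEMO_NUM_EVENTS (sizeof(demo_events) / sizeof(demo_events[0]))

/* Return the index of the first event whose name matches, or -1. */
static int demo_find_by_name(const char *name)
{
    size_t i;

    for (i = 0; i < DEMO_NUM_EVENTS; i++)
        if (strcmp(demo_events[i], name) == 0)
            return (int)i;
    return -1;
}

/* Name-based "next event": find the current event by name, then step
 * forward by one.  With duplicated names the lookup always lands on
 * the earliest copy, so the walk can never move past the second copy. */
static int demo_next_by_name(const char *current_name)
{
    int idx = demo_find_by_name(current_name);

    if (idx < 0 || (size_t)idx + 1 >= DEMO_NUM_EVENTS)
        return -1;
    return idx + 1;
}
```

	  Starting from index 2 (the second PM_CO0_BUSY), the next index
	  computed is again 2, so PM_INST_CMPL is never reached.  Unique
	  names remove the ambiguity in the lookup.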

2017-11-16  Will Schmidt <>

	* src/papi_events.csv: revised papi_derived patch.  [PATCH, papi]
	  Updated derived entries for power9.  This is a re-implementation of
	  the patch that Will Cohen posted earlier, which uses the (newly
	  defined) PM_LD_MISS_ALT entry instead of the PM_LD_MISS_FIN .
	  Thanks, -Will

2017-12-05  Heike Jagode <>

	* release_procedure.txt: Updated notes for release procedure.

2017-12-05  Vince Weaver <>

	* src/extras.c: extras.c: add string.h include to make the ffsll
	  warning go away

2017-12-04  Heike Jagode <>

	* src/configure, src/ Fixed configure bug:  Once ffsll
	  support is detected, set HAVE_FFSLL to 1 in config.h.  Tested
	  without configure flag --with-ffsll, with --with-ffsll=yes, --with-

2017-12-04  Vince Weaver <>

	* src/ctests/Makefile.recipies, src/ctests/locks_pthreads.c: ctests:
	  locks_pthreads: adjust run count again  linear slowdown makes
	  things run really quickly. This patch scales it down by the square
	  root of the number of cores which is maybe a better compromise.
	* src/ctests/locks_pthreads.c: ctests: locks_pthreads, minor cleanups
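
	  A minimal sketch of the square-root scaling mentioned above.  The
	  function names and the exact formula are assumptions for
	  illustration; the real test may compute its run count differently.
	  An integer square root is used so no math library is needed.

```c
/* Largest r such that r * r <= n (for n >= 0). */
static long demo_isqrt(long n)
{
    long r = 0;

    while ((r + 1) * (r + 1) <= n)
        r++;
    return r;
}

/* Scale a base iteration count down by sqrt(num_cores).
 * Hypothetical helper, not taken from locks_pthreads.c. */
static long demo_iterations_per_thread(long base_iterations, int num_cores)
{
    long root;

    if (num_cores < 1)
        num_cores = 1;
    root = demo_isqrt(num_cores);   /* at least 1 because num_cores >= 1 */
    return base_iterations / root;
}
```

	  With a base of 1000000 iterations, 16 cores give 250000 per
	  thread and 100 cores give 100000, a gentler reduction than
	  dividing by the core count itself.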

2017-11-20  William Cohen <>

	* src/ctests/locks_pthreads.c: Keep locks_pthreads test's amount of
	  work reasonable on many core machines  The runtime of
	  locks_pthreads test scaled by the number of processors on the
	  machine because of the serialized increment operation in the test.
	  As more machines are available with 100+ processors the runtime of
	  locks_pthreads is becoming excessive.  Revised the test to specify
	  the approximate total number of iterations and split the work the

Fri Dec 4 11:31:46 2015 -0500  sangamesh <>

	* src/extras.c, src/papi.h: Revert change that added ffsll to papi.h
	  This reverts commit 2f1ec33a9e585df1b6343a0ea735f79974c080df.
	  commit 2f1ec33a9e585df1b6343a0ea735f79974c080df  changed #if
	  (!defined(HAVE_FFSLL) || defined(__bgp__)) int ffsll( long long lli
	  ); #endif --- to --- extern int ffsll( long long lli  in extras.c
	  to avoid warning when --with-ffsll is used as config option
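
	  For context, the guarded declaration above exists for platforms
	  whose C library does not provide ffsll.  A fallback of that kind
	  could look like the sketch below; the name demo_ffsll is
	  hypothetical and this is not the implementation in extras.c.  It
	  follows the usual ffs convention: bit positions are 1-based and 0
	  means no bit is set.

```c
/* Return the 1-based position of the least significant set bit,
 * or 0 when no bit is set. */
static int demo_ffsll(long long lli)
{
    unsigned long long bits = (unsigned long long)lli;
    int position = 1;

    if (bits == 0)
        return 0;
    while ((bits & 1ULL) == 0) {
        bits >>= 1;
        position++;
    }
    return position;
}
```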

Thu Apr 20 11:31:38 2017 -0400  Stephen Wood <>

	* src/extras.c, src/papi.h: revert part of patch that added extra
	  attributes to ffsll  This manually reverts part of:  commit
	  9e199a8aee48f5a2c62d891f0b2c1701b496a9ca  cast pointers
	  appropriately to avoid warnings and errors

Sun Dec 3 09:42:44 2017 -0800  Will Schmidt <>

	* src/libpfm4/lib/events/power9_events.h,
	  src/libpfm4/tests/validate_power.c: Updated libpfm4  Current with:
	  ---------------- commit ed3f51c4690685675cf2766edb90acbc0c1cdb67
	  (HEAD -> master, origin/master, origin/HEAD)  Add alternate event
	  numbers for power9.  I had previously missed adding the _ALT
	  entries, which allow some events to be specified on different
	  counters. This patch fills those in.  This patch also adds a few
	  validation tests for the ALT events.  ----------------

2017-11-28  Heike Jagode <>

	* src/utils/papi_avail.c, src/utils/papi_native_avail.c: Fixed
	  utility option inconsistencies between papi_avail and
	  papi_native_avail. There are more inconsistencies with other PAPI
	  utilities, which will be addressed eventually.

2017-11-28  Heike Jagode <>

	* edited online with Bitbucket
	* edited online with Bitbucket
	* edited online with Bitbucket
	* edited online with Bitbucket

2017-11-27  Heike Jagode <>

	* src/components/powercap/linux-powercap.c: More clean-ups and
	  checking of return values.

Mon Nov 13 23:15:53 2017 -0800  Thomas Richter <>

	* src/libpfm4/lib/pfmlib_common.c: Update libpfm4  Current with
	  commit
	  f5331b7cbc96d9f9441df6a54a6f3b6e0fab3fb9  better fix for
	  pfmlib_getl()  The following commit:  commit
	  9c69edf67f6899d9c6870e9cb54dcd0990974f81  better param check in
	  pfmlib_getl()  Fixed parameter checking of pfmlib_getl() but missed
	  one condition on the buffer argument. It is char **buffer.
	  Therefore we need to check if *buffer is not NULL before we can
	  check *len.
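
	  The ordering of the checks described above can be sketched as
	  follows.  This is an illustration only: demo_getl_params_ok is a
	  hypothetical helper, and the len >= 2 requirement is taken from
	  the earlier libpfm4 commit message quoted elsewhere in this file.

```c
#include <stddef.h>

/* Return 1 when the arguments are usable, 0 otherwise. */
static int demo_getl_params_ok(char **buffer, const size_t *len)
{
    /* The pointers themselves must be valid before dereferencing. */
    if (buffer == NULL || len == NULL)
        return 0;

    /* *len only describes a buffer when one was actually supplied,
     * so test *buffer first and *len second. */
    if (*buffer != NULL && *len < 2)
        return 0;

    return 1;
}
```

	  A NULL *buffer with any *len passes, which leaves room for the
	  callee to allocate the buffer itself; a supplied buffer shorter
	  than two bytes is rejected.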

2017-11-19  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c: CUDA component: Bug fix for
	  releasing and resetting event list  When an event addition failed
	  because the event (or metric) requires multiple-runs the eventlist
	  and event-context structure was not being cleaned up properly.
	  This fixes the event cleanup process.

2017-11-17  Asim YarKhan <>

	* src/components/powercap/tests/powercap_basic.c,
	  src/components/powercap/tests/powercap_limit.c: Powercap component:
	  Updated tests to handle no-event-counters (num_cntrs==0) and skip
	  some compiler warnings (argv, argc unused)

2017-11-16  William Cohen <>

	* src/components/lmsensors/linux-lmsensors.c: Make more of lmsensors
	  component internal state hidden  There are a number of function
	  pointers stored in variables that are only used within the lmsensors
	  component.  Making those static ensures they are not visible
	  outside the lmsensors component.
	* src/components/lmsensors/linux-lmsensors.c: Make internal
	  cached_counts variable static  Want to make as little information
	  about the internals of the PAPI lmsensors component visible to the
	  outside.  Thus, making cached_counts variable static.

2017-11-15  William Cohen <>

	* src/components/lmsensors/linux-lmsensors.c: Avoid statically
	  limiting the number of lmsensor events allowed  Some high-end
	  server machines provide more events than the 512 entries limit
	  imposed by the LM_SENSORS_MAX_COUNTERS define in the lmsensor
	  component (observed 577 entries on one machine).  When this limit
	  was exceeded the lmsensor component would write beyond the array
	  bounds causing ctests/all_native_events to crash.  Modified the
	  lmsensor code to dynamically allocate the required space for all
	  the available lmsensor entries on the machine. This allows
	  ctests/all_native_events to run to completion.
	* src/components/appio/appio.c, src/components/coretemp/linux-
	  coretemp.c, src/components/example/example.c,
	  src/components/infiniband/linux-infiniband.c, src/components/lustre
	  /linux-lustre.c, src/components/rapl/linux-rapl.c: Use correct
	  argument order for calloc function calls  Some calls to calloc in
	  PAPI have the order of the arguments reversed. According to the
	  calloc man page the number of elements is the first argument and
	  the size of each element is the second argument.  Due to alignment
	  constraints the second argument might be rounded up.  Thus, it is
	  best not to swap the arguments to calloc.
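
	  Both changes in this entry come down to sizing an allocation from
	  a run-time count and passing calloc its arguments in the
	  documented order.  A minimal sketch, with a hypothetical struct
	  and helper name rather than the lmsensors component's own types:

```c
#include <stdlib.h>

/* Hypothetical per-event record. */
struct demo_sensor_value {
    long long count;
    int valid;
};

/* Allocate zeroed storage for exactly num_events records instead of
 * relying on a fixed compile-time maximum. */
static struct demo_sensor_value *demo_alloc_values(size_t num_events)
{
    if (num_events == 0)
        return NULL;
    /* calloc(nmemb, size): element count first, element size second. */
    return calloc(num_events, sizeof(struct demo_sensor_value));
}
```

	  A machine reporting 577 entries simply gets 577 records; the
	  caller frees the block when the component shuts down.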

2017-11-15  Philip Vaccaro <>

	* src/components/powercap/linux-powercap.c,
	  src/components/powercap/tests/powercap_basic.c: Updates and changes
	  to the powercap component to address a few areas.. Various things
	  were changed but mainly things were simplified and made more
	  streamlined.  Main focus was on simplifying managing the system

Mon Nov 13 23:15:53 2017 -0800  Thomas Richter <>

	* src/libpfm4/docs/man3/pfm_get_event_encoding.3,
	  src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_priv.h,
	  src/libpfm4/tests/validate_x86.c: Update libpfm4  Current with
	  commit 9c69edf67f6899d9c6870e9cb54dcd0990974f81  better param check
	  in pfmlib_getl()  This patch ensures that len >= 2 because we do: m
	  = l - 2;  Reviewed-by: Hendrik Brueckner

2017-11-13  Vince Weaver <>

	* src/components/perf_event/pe_libpfm4_events.c: pe_libpfm4_events:
	  properly notice if trying to add invalid umask  this passes the
	  broken-event test case and all of the unit tests, but it would be
	  good to test this on codes that do a lot of native event tests.
	  the pe_libpfm4_events code *really* needs a once-over, it is
	  currently a confusing mess.
	* src/components/perf_event/tests/Makefile,
	  src/components/perf_event/tests/event_name_lib.h: perf_event/tests:
	  add broken event name test  we were wrongly accepting event names
	  with invalid umasks

2017-11-13  Philip Mucci <>

	* src/utils/print_header.c: Removed extraneous colon in VM vendor

2017-11-10  Vince Weaver <>

	* src/validation_tests/papi_l1_dcm.c,
	  src/validation_tests/papi_l2_dcw.c: validation_tests: fix compiler
	  warnings on arm32  On Raspberry Pi we were getting warnings where
	  we were printing sizeof() values with %ld.  Convert to %zu instead.
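
	  The reason for the warning is that sizeof yields a size_t, which
	  is not a long on 32-bit ARM.  The %zu conversion matches size_t
	  on every platform.  A small sketch (demo_format_size is a
	  hypothetical helper, not part of the validation tests):

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t portably; %zu is the C99 conversion for size_t. */
static int demo_format_size(char *out, size_t out_len, size_t value)
{
    return snprintf(out, out_len, "%zu", value);
}
```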

2017-11-09  Vince Weaver <>

	* src/validation_tests/papi_l2_dca.c: validation_tests: papi_l2_dca
	  fix crash on ARM32  On raspberry pi it's not possible to detect L2
	  cache size so the test was dividing by zero.
	* src/linux-common.c: linux-common: remove warning on not finding mhz
	  in cpuinfo  This was added recently and is not needed. Most ARM32
	  devices don't have MHz in the cpuinfo file and it's not really a
	* src/components/perf_event/perf_event.c: perf_event: disable the old
	  pre-Linux-2.6.34 workarounds by default  There were a number of
	  bugs in perf_event that PAPI had to work around, but most of these
	  were fixed by 2.6.34  In order to hit these bugs you would need to
	  be running a kernel from before 2010 which wouldn't support any
	  recent hardware.  Unfortunately these bugs are hard to test for.
	  We were enabling things based on kernel versions, but this caught
	  vendors (such as Redhat) shipping 2.6.32 kernels that had
	  backported fixes.  This fix just #ifdefs things out, if no one
	  complains then we can fully remove the code.
	* src/components/perf_event/perf_event.c: perf_event: decrement the
	  available counter count if NMI_WATCHDOG is stealing one
	* src/components/perf_event/perf_event.c: perf_event: move the
	  paranoid handling code to its own function
	* src/components/perf_event/perf_event.c: perf_event: centralize
	  fast_counter_read flag  just use the component version of the flag,
	  rather than having a shadow global version.

2017-11-09  William Cohen <>

	* src/linux-memory.c: Make the fallback generic_get_memory_info
	  function more robust  On the aarch64 processor linux 4.11.0 kernels
	  /sys/devices/system/cpu/cpu0/cache is available, but the index[0-9]
	  subdirectories are not fully populated with information about cache
	  and line size, associativity, or number of sets.  These missing
	  files would cause the generic_get_memory_info function to attempt
	  to read data using a NULL file descriptor causing the program to
	  crash.  Added checks to see if every fopen and fscanf call was
	  successful and just say there is no cache if there is any failure.
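
	  The pattern described above, checking both the open and the read
	  before trusting the value, can be sketched like this.  The helper
	  name is hypothetical and the code is not taken from
	  linux-memory.c.

```c
#include <stdio.h>

/* Read one integer from a file.  Return 1 and store the value on
 * success; return 0 and leave *value untouched on any failure. */
static int demo_read_int_file(const char *path, int *value)
{
    FILE *fp;
    int parsed;
    int ok;

    if (path == NULL || value == NULL)
        return 0;

    fp = fopen(path, "r");
    if (fp == NULL)             /* file missing: do not touch fp again */
        return 0;

    ok = (fscanf(fp, "%d", &parsed) == 1);
    fclose(fp);

    if (ok)
        *value = parsed;
    return ok;
}
```

	  A caller can treat any 0 return as "no cache information
	  available" rather than continuing with a NULL stream.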

2017-11-09  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c,
	  src/components/nvml/tests/Makefile, src/configure,
	  src/ Enable icc and nvcc to work together in cuda and
	  nvml components.  For nvcc to work with Intel icc to compile cuda
	  and nvml components and tests , it needs to use nvcc -ccbin=<$CC-
	  compilerbin> . The compiler name in CC also needs to be clean, so
	  CC=<compilerbin> and any other flags are pushed to CFLAGS (changed
	  in src/ script).
	* src/ctests/mpifirst.c: Minor correction to mpifirst.c test

2017-11-09  Vince Weaver <>

	* src/utils/print_header.c: utils: print fast_counter_read (rdpmc)
	  status in the utils header

2017-11-08  William Cohen <>

	* src/validation_tests/cache_helper.c: Ensure access to array within
	  bounds  Coverity reported the following issues.  Need the test to
	  be "type>=MAX_CACHE" rather than "type>MAX_CACHE".  Error: OVERRUN
	  (CWE-119): papi-5.5.2/src/validation_tests/cache_helper.c:85:
	  cond_at_most: Checking "type > 4" implies that "type" may be up to
	  4 on the false branch.
	  papi-5.5.2/src/validation_tests/cache_helper.c:90: overrun-local:
	  Overrunning array "cache_info" of 4 24-byte elements at element
	  index 4 (byte offset 96) using index "type" (which evaluates to 4).
	  Error: OVERRUN (CWE-119):
	  papi-5.5.2/src/validation_tests/cache_helper.c:101: cond_at_most:
	  Checking "type > 4" implies that "type" may be up to 4 on the false
	  branch. papi-5.5.2/src/validation_tests/cache_helper.c:106:
	  overrun-local: Overrunning array "cache_info" of 4 24-byte elements
	  at element index 4 (byte offset 96) using index "type" (which
	  evaluates to 4).  Error: OVERRUN (CWE-119):
	  papi-5.5.2/src/validation_tests/cache_helper.c:117: cond_at_most:
	  Checking "type > 4" implies that "type" may be up to 4 on the false
	  branch. papi-5.5.2/src/validation_tests/cache_helper.c:122:
	  overrun-local: Overrunning array "cache_info" of 4 24-byte elements
	  at element index 4 (byte offset 96) using index "type" (which
	  evaluates to 4).
	* src/ctests/overflow_pthreads.c: Eliminate coverity overflow warning
	  about expression
	* src/components/perf_event_uncore/tests/perf_event_uncore_lib.c:
	  Remove dead code from perf_event_uncore_lib.c
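
	  The Coverity report for cache_helper.c above is a classic
	  off-by-one: an array of 4 elements has valid indices 0 through 3,
	  so the guard must reject index 4 as well.  A minimal sketch with
	  hypothetical names and table contents:

```c
#define DEMO_MAX_CACHE 4

/* Simplified stand-in for the real cache_info structure. */
struct demo_cache_info {
    int size;
};

static const struct demo_cache_info demo_cache[DEMO_MAX_CACHE] = {
    { 32768 }, { 32768 }, { 262144 }, { 8388608 }
};

/* Return the cache size for a given type, or -1 if the type is out
 * of range.  Note ">=": "type > DEMO_MAX_CACHE" would let type == 4
 * through and read one element past the end of the array. */
static int demo_get_cache_size(int type)
{
    if (type < 0 || type >= DEMO_MAX_CACHE)
        return -1;
    return demo_cache[type].size;
}
```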

2017-11-09  Vince Weaver <>

	* src/components/perf_event/perf_event.c: perf_event: don't
	  initialize globals statically  from the mucci-5.5.2 tree

2017-11-08 <>

	* src/linux-common.c: linux-common: clean up the /proc/cpuinfo
	  parsing code  From the mucci-cleanup branch
	* src/components/perf_event/perf_event.c,
	  src/papi_libpfm4_events.c, src/papi_libpfm4_events.h: perf_event:
	  clean up _papi_libpfm4_shutdown()  From the mucci-cleanup branch
	* src/utils/print_header.c: utils: clean up the cpuinfo header  From
	  the mucci-cleanup branch
	* src/papi_internal.c, src/papi_internal.h: papi_internal: add
	  PAPI_WARN() function  From the mucci-cleanup branch
	* src/components/perf_event/pe_libpfm4_events.c: perf_event: clean up
	  pe_libpfm4_events  From the mucci-cleanup branch  --

2017-11-08  Vince Weaver <>

	* src/utils/papi_avail.c: utils/papi_avail: update the manpage info
	  based on changes by Phil Mucci
	* .../perf_event/tests/perf_event_system_wide.c: perf_event tests:
	  perf_event_system_wide: don't fail if permissions restrict system-
	  wide events  right now we just skip if we get EPERM, we should also
	  maybe check the perf_event_paranoid setting and print a more
	  meaningful report
	* src/ctests/locks_pthreads.c: ctests/locks_pthreads: avoid printing
	  values when in quiet mode

2017-08-31 <>

	* src/ Better symlink creation for shared library in
	  make phase

2017-08-28 <>

	* doc/Makefile, src/.gitignore, src/,
	  src/components/.gitignore, src/components/Makefile_comp_tests,
	  src/ctests/.gitignore, src/ctests/Makefile.recipies,
	  src/ftests/.gitignore, src/ftests/Makefile.recipies,
	  src/testlib/.gitignore, src/utils/.gitignore, src/utils/Makefile,
	  src/validation_tests/Makefile.recipies: Full cleanup, including
	  removal of .gitignore files that prevented us from realizing we
	  were really cleaning/clobbering properly
	* src/validation_tests/.gitignore: .gitignore
	* src/papi.c: Remove PAPI_VERB_ECONT setting by default from
	  initialization path. This prints all kinds of needless errors on
	  virtual platforms.
	* src/x86_cpuid_info.c: Remove leftover printf

2017-08-21 <>

	* src/ctests/locks_pthreads.c: Test now performs a fixed number of
	  iterations, and reports lock/unlock timings per thread.
	* src/components/perf_event/perf_event.c: Added more descriptive
	  error message to exclude_guest check
	* src/papi_internal.c: Removed leading newline and trailing . from
	  error messages
	* src/papi_preset.c: Updated message for derived event failures

2017-11-07  Vince Weaver <>

	* src/, src/ctests/Makefile,
	  src/ctests/, src/ftests/Makefile,
	  src/ftests/, src/testlib/,
	  src/utils/, src/validation_tests/Makefile,
	  src/validation_tests/ tests: make sure DESTDIR
	  and DATADIR are passed in when doing an install
	* src/ctests/Makefile, src/ctests/,
	  src/ftests/Makefile, src/ftests/,
	  src/utils/Makefile, src/utils/,
	  ctests/ftests/utils/validation_tests: get shared library linking
	  working again  This should let the various tests and utils be
	  linked as shared libraries again.
	* src/validation_tests/Makefile: validation_tests: add an
	  installation target  this makes the validation tests have an
	  install target, like the ctests and ftests
	* src/ctests/Makefile, src/ftests/Makefile: ctests/ftests: fix
	  "install" target  at some point DATADIR was renamed datadir and the
	  install targets were not updated.

2017-11-07  Asim YarKhan <>

	* bitbucket-pipelines.yml: Bitbucket pipeline testing: Inspired by
	  Phil Mucci's branch; copied the functionality tests run in that
	* src/components/lmsensors/linux-lmsensors.c: lmsensors component:
	  Changed event names to use lm_sensors (only once) instead of
	  LM_SENSORS (twice) to be consistent with other events

2017-11-02  William Cohen <>

	* src/components/appio/tests/iozone/gnu3d.dem: gnu3d.dem should not
	  be executed by the test framework  This file is a gnuplot file and
	  should not be executed as part of the tests. Removing the
	  executable perms will signal to the testing framework that it
	  shouldn't be executed.
	* src/components/appio/tests/iozone/Gnuplot.txt: Gnuplot.txt should
	  not be executed by the test framework  This file is a readme file
	  and should not be executed as part of the tests. Removing the
	  executable perms will signal to the testing framework that it
	  shouldn't be executed.
	* .../appio/tests/iozone/,
	  src/components/appio/tests/iozone/ Fix perl scripts so
	  they run on Linux machines  The DOS style newlines were preventing
	  Linux from selecting the appropriate interpreter for these scripts
	  and causing these tests to fail.

2017-11-07  Asim YarKhan <>

	* src/components/lmsensors/configure: lmsensors component: Regenerate
	  the configure file for the component

2017-11-02  William Cohen <>

	* src/components/lmsensors/,
	  src/components/lmsensors/, src/components/lmsensors
	  /linux-lmsensors.c: Make the lmsensors dynamically load the needed
	  shared library  When attempting to build the current git repo of
	  papi the build of the files in the utils subdirectory failed
	  because the lmsensors libraries were not being linked in.  Rather
	  than forcing the papi to link in the lmsensor library during the
	  build the lmsensors component has been modified to dynamically load
	  the needed libraries and enable the lmsensors events when
	  available.  This allows machines without the lmsensor libraries
	  installed to still use papi.

2017-11-06  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c: CUDA component: On architectures
	  without CUDA Metrics (e.g. Tesla C2050), skip metric registration
	  rather than returning errors

2017-11-06  Vince Weaver <>

	* src/validation_tests/papi_l2_dca.c,
	  src/validation_tests/papi_l2_dcw.c: validation_tests: make the
	  papi_l2 tests fail with warnings  On Haswell/Broadwell and newer
	  these tests fail for unknown reasons.  This isn't new behavior,
	  it's just that the tests are new.  It's unlikely we will have time
	  to completely sort this out before the upcoming release, so change
	  the FAIL to WARN so testers won't be unnecessarily alarmed.

2017-11-05  Vince Weaver <>

	* src/components/perf_event/perf_event.c, src/configure,
	  src/ perf_event: enable rdpmc support by default  It
	  can still be disabled at configure time with --enable-perfevent-
	  rdpmc=no  This speeds up PAPI_read() by at least a factor of 5x
	  (see the ESPT'17 workshop presentation)  It is only enabled on
	  Linux 4.13 and newer due to bugs in previous versions.
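
	  A version gate like the one described above can be sketched as a
	  comparison on the kernel release string.  This is an assumption
	  about the general approach, not the code in perf_event.c, and
	  demo_kernel_at_least is a hypothetical helper.

```c
#include <stdio.h>

/* Return 1 if a release string such as "4.13.0-rc1" is at least
 * want_major.want_minor, 0 otherwise or if it cannot be parsed. */
static int demo_kernel_at_least(const char *release,
                                int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (release == NULL ||
        sscanf(release, "%d.%d", &major, &minor) != 2)
        return 0;

    if (major != want_major)
        return major > want_major;
    return minor >= want_minor;
}
```

	  On Linux the release string would typically come from the
	  release field filled in by uname(2).  As another entry in this
	  file notes, version checks can misjudge vendor kernels with
	  backported fixes, so a gate like this is only a heuristic.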

2017-11-03  Vince Weaver <>

	* src/ctests/sdsc-mpx.c: ctests: sdsc: fix issue where the error
	  message is not printed correctly

2017-11-01  Heike Jagode <>

	* src/components/powercap/linux-powercap.c: Intermediate check-in:
	  Fixed a whole bunch of careless file handling (missing closing of
	  open files, missing setting of open/close flag, etc). Still more
	  rigorous checks needed.

Mon Oct 30 17:16:32 2017 -0700  Stephane Eranian <>

	* src/libpfm4/lib/events/intel_skl_events.h: Update libpfm4  Current
	  with commit
	  21405fb3c247a0d16861483daf0696cf4fa0cc43  update SW_PREFETCH event
	  for Intel Skylake  Event was renamed SW_PREFETCH_ACCESS, but we
	  keep SW_PREFETCH as an alias.  Added PREFETCHW umask.  Enabled
	  support for both Skylake client and server as per official event
	  table from 10/27/2017. See

2017-10-30  Vince Weaver <>

	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/cycles_validation.c: validation_tests: add
	  cycles_validation test  this is the old zero test, which does a
	  number of cycles tests  It should be extended to add more.

2017-10-30  Vince Weaver <>

	* src/ctests/attach2.c, src/ctests/attach3.c, src/ctests/calibrate.c,
	  src/ctests/child_overflow.c, src/ctests/code2name.c,
	  src/ctests/earprofile.c, src/ctests/exec_overflow.c,
	  src/ctests/fork_overflow.c, src/ctests/hwinfo.c, src/ctests/mendes-
	  alt.c, src/ctests/prof_utils.c, src/ctests/prof_utils.h,
	  src/ctests/profile.c, src/ctests/remove_events.c,
	  src/ctests/shlib.c, src/ctests/system_child_overflow.c,
	  src/ctests/system_overflow.c, src/ctests/zero_named.c,
	  src/testlib/papi_test.h, src/testlib/test_utils.c: papi: c++11
	  fixes: fix various ctests that c++ complains on  mostly just const
	  warnings, some K+R function declarations, and possibly an actual
	  char/char* bug.
	* src/papi.c, src/papi.h: papi: c++11 conversion:
	* src/papi.c, src/papi.h: papi: c++11 conversion: convert
	* src/aix.c, src/components/appio/appio.c,
	  src/components/bgpm/NWunit/linux-NWunit.c, src/components/emon
	  /linux-emon.c, src/components/net/linux-net.c,
	  src/components/perfmon_ia64/perfmon-ia64.c, src/freebsd.c, src
	  /linux-bgq.c, src/papi.c, src/papi.h, src/papi_internal.c,
	  src/papi_internal.h, src/papi_libpfm3_events.c,
	  src/papi_libpfm_events.h, src/papi_vector.c, src/papi_vector.h:
	  papi: start converting papi.h to be C++11 clean  Most of the issues
	  have to do with string to char * conversion.  This first patch
	  converts PAPI_event_name_to_code()  The issue was first reported by
	  Brian Van Straalen
	* src/validation_tests/papi_l2_dca.c: validation_tests/papi_l2_dca:
	  update some comments
	* src/ctests/zero.c, src/validation_tests/cycles.c: ctests/zero: make
	  test pass on recent intel machines  The test was failing due to the
	  PAPI_get_real_cycles() validation on recent Intel chips.  This is
	  probably something that should be tested in a separate test and not
	  in zero which is supposed to be a bare-bones are-things-working

2017-10-27  Philip Vaccaro <>

	* src/components/powercap/README: updated powercap README to be more
	  concise. includes more details on interacting with energy counters
	  and power limits.

2017-10-27  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c, src/components/nvml/linux-nvml.c:
	  CUDA/NVML components: Handled segfault which can occur when
	  dlclosing libcudart from both components by adding an additional
	  flag to dlopen

2017-10-24  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c,
	  src/components/cuda/tests/ CUDA component: Clean
	  up fulltest by moving some output from stdout to SUBDBG, removed
	  some commented out lines
	* src/components/nvml/linux-nvml.c: nvml component: To support V100
	  (Volta) updated to get nvmlDevice handle ordered by index rather
	  than pci busid.

2017-10-23  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c: CUDA component: Minor fix to
	  remove some unneeded stdout which shows up during fulltest

2017-10-20  Asim YarKhan <>

	* src/components/cuda/linux-cuda.c,
	  src/components/cuda/tests/ CUDA component test
	  update: Remove some debug output.  Do not build cupti_only test

Thu Oct 19 11:23:44 2017 -0700  Stephane Eranian <>

	* src/libpfm4/examples/showevtinfo.c,
	  src/libpfm4/lib/events/intel_skl_events.h: Update libpfm4  Current
	  with commit
	  2e98642dd331b15382256caa380834d01b63bef8  Fix Intel Skylake
	  EXE_ACTIVITY.1_PORTS_UTIL event  Was missing a umask name.

2017-10-17  Vince Weaver <>

	* src/ctests/version.c: ctests: version, add INCREMENT field  at the
	  request of Steve Kaufmann
	* src/ctests/Makefile.recipies, src/ctests/version.c: ctests: re-
	  enable version test  not sure why it was disabled
	* src/ctests/Makefile.recipies: ctests: alphabetize SERIAL tests in

2017-10-13  Philip Vaccaro <>

	* src/components/powercap/tests/Makefile,
	  src/components/powercap/tests/powercap_limit.c: added simple limit
	  test for the powercap component.

2017-10-09  Asim YarKhan <>

	* src/components/nvml/linux-nvml.c: Bug Fix NVML component: Fix
	  problem with names when there are multiple identical GPUs  If
	  multiple identical GPUs were available, the names were not mapped
	  correctly.  Fixed event names to be
	  "nvml:::Tesla_K40c:device_0:myevent" rather than

Fri Sep 29 00:25:09 2017 -0700  Stephane Eranian <>

	* src/libpfm4/include/perfmon/perf_event.h,
	  src/libpfm4/perf_examples/perf_util.c: Update libpfm4  Current
	  with commit d1e7c96df60a00a371fdaa3b635ad4a38cee4c2f  add new
	  branch_smpl.c perf_events example  This patch adds a new example to
	  demo how to sample and parse the PERF_SAMPLE_BRANCH_STACK record
	  format of perf_events. It will dump branches taken from the sampled

2017-10-05  Asim YarKhan <>

	* src/components/nvml/README, src/components/nvml/linux-nvml.c,
	  .../nvml/tests/ Update NVML component:
	  Support for power limiting using NVML  PAPI has added support for
	  power limiting using NVML (on supported devices from the Kepler
	  family or later).  The executable needs to have root permissions to
	  change the power limits on the device.  We have added new events to
	  the NVML component to support power management limits.  The
	  nvml:::DEVICE:power_management_limit can be written (as well as
	  read), but requires higher permissions (root level).  The limit is
	  constrained between a min and a max value, which can be read.
	  When the component is unloaded, the power_management_limit should
	  be reset to the initial value.
	  nvml:::DEVICE:power_management_limit_constraint_max  A new test
	  (nvml/tests/ was written to check if
	  the writing functionality works (with the proper hardware and

2017-10-04  Asim YarKhan <>

	* src/components/nvml/linux-nvml.c, src/components/nvml/linux-nvml.h,
	  src/components/nvml/tests/ Style consistency and
	  refactoring via astyle command.  No changes to the actual code were
	  made here.

2017-10-04  Vince Weaver <>

	* src/components/rapl/linux-rapl.c: rapl: add support for some Intel
	  Atom models Goldmont / Gemini_Lake / Denverton
	* src/components/rapl/linux-rapl.c: rapl: fix skylake SoC measurement
	* src/components/rapl/linux-rapl.c: rapl: add support for skylake SoC
	  energy measurements
	* src/components/rapl/linux-rapl.c: rapl: add Skylake-X / Kabylake
	* src/components/rapl/linux-rapl.c: rapl: centralize the "different
	  DRAM units" code
	* src/components/rapl/linux-rapl.c: rapl: merge like processors
	* src/components/rapl/linux-rapl.c: rapl: convert chip detection to a
	  switch statement
	* src/components/rapl/linux-rapl.c: rapl: update the whitespace a bit

2017-09-12  Heike Jagode <>

	* .../infiniband_umad/linux-infiniband_umad.c, .../infiniband_umad
	  /linux-infiniband_umad.h: Fixed papi_vector for infiniband_umad
	  component.  The array of function pointers that the component
	  defines must use the naming convention papi_vector_t _x_vector
	  where x is the name of the component directory.  In this case, the
	  name of the component directory is infiniband_umad and not
	  infiniband.  This change has not been tested yet due to OFED lib
	  issues on our local machines. There may be more changes required in
	  order to get the infiniband_umad component to work properly.

2017-09-11  Hanumanth <>

	* man/man1/papi_avail.1, man/man1/papi_native_avail.1,
	  src/utils/papi_avail.c, src/utils/papi_native_avail.c: Updating man
	  and help pages for papi_avail and papi_native_avail

2017-09-07  Asim YarKhan <>

	* src/components/cuda/tests/,
	  .../cuda/tests/ Update to CUDA
	  component to support NVLink.  The CUDA component has been cleaned
	  up and updated to support NVLink. NVLink metrics can not be
	  measured properly in KERNEL event collection mode, so the CUPTI
	  EventCollectionMode is transparently set to
	  being measured in an eventset.  For all other events and metrics,
	  the CUDA component uses the KERNEL event collection mode.  A bug in
	  the earlier version was that repeated calls to add CUDA events were
	  failing because some structures were not cleaned up.  This should
	  now be fixed.  A new nvlink test was added to the CUDA component

2017-08-31  Phil Mucci <>

	* man/man1/papi_avail.1, man/man1/papi_clockres.1,
	  man/man1/papi_command_line.1, man/man1/papi_component_avail.1,
	  man/man1/papi_cost.1, man/man1/papi_decode.1,
	  man/man1/papi_error_codes.1, man/man1/papi_event_chooser.1,
	  man/man1/papi_hybrid_native_avail.1, man/man1/papi_mem_info.1,
	  man/man1/papi_multiplex_cost.1, man/man1/papi_native_avail.1,
	  man/man1/papi_version.1, man/man1/papi_xml_event_info.1,
	  man/man3/PAPI_cleanup_eventset.3, man/man3/PAPI_destroy_eventset.3:
	  Updating options for papi_avail/native_avail as well as all
	  references to old mailing list

2017-08-31  Asim YarKhan <>

	* src/components/nvml/linux-nvml.c,
	  src/components/nvml/tests/Makefile: Minor updates to NVML component
	  to enable it to compile and run without complaints

2017-08-30  Vince Weaver <>

	* src/validation_tests/papi_br_prc.c,
	  src/validation_tests/papi_br_tkn.c: validation: update papi_br_prc
	  and papi_br_tkn for amd fam15h  amd fam15h doesn't have a
	  conditional branch event so the measures have to be against total.
	  for now print warning, maybe we should let it go w/o a warning.
	* src/papi_events.csv: papi_events: add PAPI_BR_PRC event to amd
	* src/papi_events.csv: papi_events: update PAPI_BR_PRC and
	  PAPI_BR_TKN on sandybridge/ivybridge  They were using TOTAL
	  branches for the derived branch events rather than CONDITIONAL like
	  the other modern x86 processors were using.
	* src/validation_tests/papi_br_tkn.c: validation_tests: papi_br_tkn:
	  update to only count conditional branches
	* src/validation_tests/papi_br_prc.c: validation_tests: papi_br_prc:
	  make sure it is comparing conditional branches  was doing total
	  branches, which made the test fail on skylake

Mon Aug 21 23:55:46 2017 -0700  Stephane Eranian <>

	* src/libpfm4/lib/pfmlib_intel_x86.c: Update libpfm4  Current
	  with commit a290dead7c1f351f8269a265c0d4a5f38a60ba29  fix usage
	  of is_model_event() for Intel X86  This patch fixes a couple of
	  problems introduced by commit: 77a5ac9d43b1 add model field to
	  intel_x86_entry_t  The code in pfm_intel_x86_get_event_first() was
	  incorrect. It was calling is_model_event() before checking if the
	  index was within bounds. It should have been the opposite. Same
	  issue in pfm_intel_x86_get_next_event(). This could cause SEGFAULT
	  as reported by Phil Mucci.  The patch also fixes the return value of
	  pfm_intel_x86_get_event_first(). It was not calculated correctly.
	  Reported-by: Phil Mucci <>
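
	  The ordering problem described above, testing the index against
	  the table bounds before calling a predicate that dereferences it,
	  can be sketched as follows.  The struct, table layout, and demo_*
	  names are hypothetical simplifications, not libpfm4's own
	  definitions.

```c
#include <stddef.h>

/* Simplified event entry; model 0 means "valid on all models". */
struct demo_entry {
    const char *name;
    int model;
};

/* Predicate that reads table[idx]; only safe when idx is in bounds. */
static int demo_is_model_event(const struct demo_entry *table,
                               size_t idx, int model)
{
    return table[idx].model == 0 || table[idx].model == model;
}

/* Return the index of the first event at or after start that is
 * valid for the given model, or -1 if there is none. */
static int demo_first_event_from(const struct demo_entry *table,
                                 size_t count, size_t start, int model)
{
    size_t idx;

    /* The loop condition checks the bound first, so the predicate
     * never sees an out-of-range index. */
    for (idx = start; idx < count; idx++)
        if (demo_is_model_event(table, idx, model))
            return (int)idx;
    return -1;
}
```

	  Calling the predicate before the bounds test would read past the
	  end of the table whenever start equals count, which matches the
	  kind of crash reported.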

2017-08-20  Vince Weaver <>

	* src/ctests/Makefile.recipies, src/ctests/failed_events.c: ctests:
	  add failed_events test  it tries to create invalid events to make
	  sure the event parser properly handles invalid events.

2017-08-19  Vince Weaver <>

	* src/components/perf_event_uncore/tests/Makefile,
	  .../tests/perf_event_uncore_attach.c: perf_event_uncore: tests:
	  update perf_event_uncore to use :cpu=0  This is the more common way
	  of specifying uncore events. Rename the old test that uses
	  PAPI_set_opt() to perf_event_uncore_attach
	* .../tests/perf_event_uncore_cbox.c,
	  .../tests/perf_event_uncore_lib.h: perf_event_uncore: tests: update
	  uncore events for recent processors
	* src/ctests/zero_pthreads.c: ctests: zero_pthreads: remove
	  extraneous printf when in quiet mode
	* .../tests/perf_event_uncore_lib.c: perf_event_uncore: event list,
	  add recent processors  libpfm4 still doesn't support regular
	  Haswell, Broadwell, or Skylake machines
	* .../perf_event_uncore/tests/perf_event_uncore.c,
	  .../tests/perf_event_uncore_multiple.c: perf_event_uncore: tests:
	  print a message indicating the problem on skip  also some
	  whitespace cleanups
	* src/components/perf_event/tests/event_name_lib.c: perf_event:
	  tests: update event_name_lib for recent Intel processors
	* src/components/perf_event/tests/event_name_lib.c: perf_event:
	  tests: event_name_lib, clean up whitespace
	* .../perf_event/tests/perf_event_offcore_response.c: perf_event:
	  tests: update perf_event_offcore_response test  print an indicator
	  of why we are skipping the test also some gratuitous whitespace
	* src/ctests/zero_shmem.c: ctests: zero_shmem: document the code a
	  little better
	* src/ctests/zero_smp.c: ctests: zero_smp: make it actually do
	  something on Linux  Linux can use the pthread code just like AIX
	  although we don't validate the results, so this test could be
	  another candidate for not being necessary anymore.
	* src/ctests/zero_shmem.c: ctests: zero_shmem: minor cleanups  we
	  pretty much always skip this test.  Is it needed anymore? What was
	  it testing in the first place?  The code it calls (start_pes() )
	  doesn't seem to exist anymore
	* src/ctests/zero_omp.c, src/ctests/zero_pthreads.c: ctests: zero_omp
	  and zero_pthreads were skipping due to a typo  when updating the
	  code I had left a stray ! before PAPI_query_event()

2017-08-19  Vince Weaver <>

	* src/papi_events.csv: papi_events: the skylake fixes broke hsw/bdw
	  this skylake-x change is way more trouble than it was worth.

2017-08-19  Vince Weaver <>

	* src/papi_events.csv: papi_events: on skylake the SNP_FWD umask was
	  renamed to SNP_HIT_WITH_FWD  This broke presets on skylake,
	* src/components/perf_event/pe_libpfm4_events.c: perf_event: fix
	  uninitialized descr issue reported by valgrind  I don't think this
	  is the skylake-x bug though

2017-08-18  Vince Weaver <>

	* src/components/perf_event/pe_libpfm4_events.c: perf_event: clean up
	  some whitespace in pe_libpfm4_events.c
	* src/linux-memory.c: linux-memory: various errors when compiling
	  with debug enabled  the new proc memory code had some mistakes in
	  the debug messages that only appeared when compiled with
	  --with-debug  Reported-by: Steve Kaufmann <>

2017-08-17  Vince Weaver <>

	* src/papi_events.csv: papi_events: missed one of the skx event

2017-08-16  Vince Weaver <>

	* src/papi_events.csv: papi_events: enable Skylake X support

Sun Aug 6 00:22:52 2017 -0700  Stephane Eranian <>

	* src/libpfm4/include/perfmon/pfmlib.h,
	  src/libpfm4/lib/pfmlib_priv.h, src/libpfm4/tests/validate_x86.c:
	  Update libpfm4  Current with commit
	  efd16920194999fdf1146e9dab3f7435608a9479  add support for Intel
	  Skylake X  This patch adds support for Intel Skylake X core PMU
	  events. Based on  New PMU is
	  called skx.

2017-08-07  Vince Weaver <>

	* src/papi_events.csv: papi_events: add initial AMD fam17h support
	  not tested on actual hardware yet
	* src/papi_events.csv: papi_events: fix the amd_fam16h PMU name  The
	  way libpfm4 reports fam16h was modified a bit from my initial
	  patches.  fam16h seems to be working now.

Thu Jul 27 23:30:20 2017 -0700  Stephane Eranian <>

	* src/libpfm4/README, src/libpfm4/docs/Makefile,
	  src/libpfm4/include/perfmon/pfmlib.h, src/libpfm4/lib/Makefile,
	  src/libpfm4/tests/validate_x86.c: Update libpfm4  Current with
	  commit 72474c59d88512e49d9be7c4baa4355e8d8ad10a  fix typo in AMD
	  Fam17h man page  PMU name was mistyped.

2017-08-04  Vince Weaver <>

	* src/validation_tests/papi_l1_dcm.c,
	  src/validation_tests/papi_l2_dcm.c: validation_tests: for the DCM
	  tests up the allowed error to 5%  We don't want to fail too easily,
	  and 5% seems reasonable. This lets the test pass on ARM64
	  Dragonboard 410c
	* src/linux-memory.c: linux-memory: add fallback generic Linux /sys
	  cache size detection  This will allow getting cache sizes on
	  architectures we don't have custom code for.  Currently this mostly
	  means ARM64.
	* src/validation_tests/papi_l1_dcm.c,
	  src/validation_tests/papi_l2_dcm.c: validation_tests: don't crash
	  if cachesize reported as zero
	* src/validation_tests/branches_testcode.c: branches_testcode: add
	  arm64 support

2017-07-27  Vince Weaver <>

	* src/papi_events.csv, src/validation_tests/papi_l2_dca.c:
	  validation_tests: trying to find out why PAPI_L2_DCA fails on
	  Haswell  it's a mystery still.  One alternative is to switch the
	  event to be the same as PAPI_L1_DCM but that seems like it would be
	* src/validation_tests/papi_l2_dcw.c: validation_tests: papi_l2_dcw:
	  shorten a warning message
	* src/papi_events.csv: papi_events: note that libpfm4 Kaby Lake
	  support is treated as part of Skylake
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_l2_dcw.c: validation_tests: add
	  PAPI_L2_DCW test
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_l2_dcr.c: validation_tests: add
	  PAPI_L2_DCR test
	* src/validation_tests/papi_l2_dcm.c: validation_tests: PAPI_L2_DCM
	  figured out a test that made sense
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_l1_dcm.c: validation_tests: add
	  PAPI_L1_DCM test
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/testcode.h: validation_tests: first attempt at
	  papi_l2_dcm test  disabled for now, as it's really hard to make a
	  workable cache miss test on modern hardware.

2017-07-26  Vince Weaver <>

	* src/ctests/Makefile, src/ctests/Makefile.recipies,
	  src/ctests/child_overflow.c, src/ctests/exec_overflow.c,
	  src/validation_tests/busy_work.c, src/validation_tests/testcode.h:
	  ctests: clean up the exec/child overflow tests  The exec_overflow
	  test segfaults when using rdpmc  This is a bug in Linux.  I'm
	  working on getting it fixed.

2017-07-21  Vince Weaver <>

	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/testcode.h: validation_tests: add PAPI_L2_DCA
	  test  also adds some generic cache testing infrastructure
	* src/validation_tests/papi_l1_dca.c: validation_tests: PAPI_L1_DCA
	  fixes  had to find a machine that actually supported the event.  On
	  AMD Fam15h the write count is 3x expected?  Need to investigate
	* src/validation_tests/papi_br_prc.c: validation_tests: papi_br_prc,
	  properly skip if event not found
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_l1_dca.c: validation_tests: add
	  PAPI_L1_DCA test

2017-07-20  Vince Weaver <>

	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_br_prc.c: validation_tests: add
	  PAPI_BR_PRC test
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_br_tkn.c: validation_tests: add
	  PAPI_BR_TKN test
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_br_ntk.c: validation_tests: add
	  PAPI_BR_NTK test

2017-07-07  Vince Weaver <>

	* src/papi_events.csv: papi_events: move haswell, skylake, and
	  broadwell to traditional PAPI_REF_CYC  there's a slight chance this
	  might break things for people, if so we can revert it.
	* src/linux-timer.c: linux-timer: fix build warning on non-power
	* src/ctests/flops.c, src/validation_tests/flops_testcode.c,
	  src/validation_tests/papi_sp_ops.c: validation: make the flops
	  tests handle that POWER has fused multiply-add  PAPI_DP_OPS and
	  PAPI_SP_OPS still fail, need to audit what the event is doing
	* src/papi_events.csv: POWER8: add a few branch preset events  they
	  pass the validation tests, not sure why they weren't enabled
	* src/validation_tests/branches_testcode.c: validation: add POWER
	  branches testcode  not sure I got the clobbers right
	* src/components/perf_event/perf_helpers.h,
	  src/validation_tests/papi_tot_ins.c: POWER: fix some compiler

2016-10-18  Phil Mucci <>

	* src/linux-timer.c: Ensure stdint gets included for all Linuxen.
	* src/linux-timer.c: Some Linuxen need stdint to get the uint64_t

2016-10-14  Phil Mucci <>

	* src/linux-lock.h: Restructured unlock code to avoid warnings.
	  Tested against 80 threads on Power8

2016-10-12  Phil Mucci <>

	* src/linux-timer.c: PPC64/PPC fast timer fixup.

2017-07-07  Vince Weaver <>

	* src/linux-timer.c: linux-timer: allow using fast timer for
	  get_real_cycles() on POWER

2016-07-12  Phil Mucci <>

	* src/linux-timer.c, src/linux-timer.h: First pass at good rdtsc for

2017-07-03  Vince Weaver <>

	* src/ctests/flops.c, src/ctests/hl_rates.c,
	  src/validation_tests/testcode.h: validation_tests: add tests for
	  PAPI_SP_OPS and PAPI_DP_OPS  extend the flops_testcode as well, to
	  have both float and double versions.
	* src/validation_tests/papi_ref_cyc.c: validation_tests:
	  papi_ref_cyc: update test to work on older systems  it's actually
	  the newer (haswell/broadwell/skylake) that are using a different
	  event than the older systems.  Make the test check for the old

2017-07-02  Vince Weaver <>

	* src/ctests/Makefile.recipies, src/ctests/cycle_ratio.c,
	  src/validation_tests/testcode.h: validation_tests: move cycle_ratio
	  test to be papi_ref_cyc test
	* src/ctests/cycle_ratio.c: ctests: rewrite cycle_ratio test  on
	  Intel platforms PAPI_REF_CYC is a fixed 100MHz cycle count  the
	  test was making the assumption that PAPI_REF_CYC was equal to the
	  max design freq (not turboboost) and thus as far as I can tell it
	  never would return the right answer.  This test should probably be
	  moved to validation_tests.

2017-07-01  Vince Weaver <>

	* src/ctests/Makefile.recipies, src/ctests/branches.c,
	  src/ctests/sdsc-mpx.c, src/ctests/sdsc2.c: ctests: migrate all
	  other users of dummy3() workload
	* src/ctests/Makefile.recipies, src/ctests/sdsc4-mpx.c,
	  src/validation_tests/testcode.h: ctests: move the "dummy3" workload
	  to the common workload library
	* src/ctests/sdsc4-mpx.c: ctests: sdsc4-mpx: fix failing on recent
	  Intel machines  the multiplexing of an event with small results
	  (PAPI_SR_INS in this case) has high variance, so don't use it for
	  validation.  There was code trying to do this but it wasn't

2017-06-30  Vince Weaver <>

	* src/ctests/first.c, src/ctests/matrix-hl.c, src/ctests/zero_omp.c,
	  src/ctests/zero_pthreads.c: ctests: catch lack of CPU component
	  earlier  gets rid of extraneous SKIPPED in the output of
	* src/components/cuda/tests/,
	  src/components/cuda/tests/Makefile: tests:cuda: make the HelloWorld
	  test more like a standard PAPI test
	* src/validation_tests/Makefile.recipies: validation_tests: fix
	  linking against a CUDA enabled PAPI  Fix suggested by Steve
	  Kaufmann <>
	* src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: make it
	  so it can compile with c++  this lets us link against it from the
	  CUDA tests
	* src/components/cuda/sampling/gpu_activity.c: tests: cuda: fix
	  sampling/gpu_activity to compile without warnings
	* src/ tests: make the component tests build command be
	  the same as ctests/ftests
	* src/ctests/calibrate.c: ctests: calibrate: turn off printf if
	  TEST_QUIET  missed this one when testing because test machine
	  skipped it due to lack of floating point events

2017-06-29  Vince Weaver <>

	* .../tests/perf_event_amd_northbridge.c,
	  src/ctests/Makefile.recipies, src/ctests/cycle_ratio.c,
	  src/ctests/derived.c, src/ctests/multiplex1_pthreads.c,
	  src/ctests/multiplex3_pthreads.c, src/ctests/overflow.c,
	  src/ctests/overflow_allcounters.c, src/ctests/overflow_index.c,
	  src/ctests/overflow_pthreads.c, src/ctests/overflow_twoevents.c,
	  src/ctests/prof_utils.c, src/ctests/prof_utils.h,
	  src/ctests/profile.c, src/ctests/profile_twoevents.c,
	  src/ctests/realtime.c, src/ctests/reset.c,
	  src/ctests/reset_multiplex.c, src/ctests/sdsc-mpx.c,
	  src/ctests/sdsc.c, src/ctests/sdsc4-mpx.c, src/ctests/sdsc4.c,
	  src/ctests/shlib.c, src/ctests/tenth.c, src/ctests/thrspecific.c,
	  src/testlib/papi_test.h: testlib: remove the hack where all
	  printf's are #defined to something else  Explicitly check
	  everywhere for TESTS_QUIET or equivalent, rather than using
	  C preprocessor macros to redefine printf
	* src/papi.c, src/testlib/test_utils.c: tests: set the ctest debug
	  mode to VERBOSE by default for tests  the TESTS_QUIET mode was
	  turning *off* verbose debugging, which meant that PAPIERROR() calls
	  wouldn't show up during a ./
	* src/components/perf_event/perf_event.c: perf_event: properly
	  initialize the mmap_addr structure  It wasn't always being set to
	  NULL, and so on some tests the code would try to munmap() it even
	  though it wasn't mapped.
	* src/testlib/test_utils.c: tests: enable color in test status
	  messages  this has been an optional feature for a long time, if you
	  set the environment variable TESTS_COLOR=y  this change makes
	  it default to being on (you can disable with export TESTS_COLOR=n)
	  also it should automatically detect if you are piping to a file and
	  disable colors in that case too
	* src/validation_tests/Makefile,
	  src/validation_tests/Makefile.recipies: validation_tests: always
	  include -lrt on the tests  Should be harmless, and I don't always
	  test on an old enough machine to trigger the problem.
	* src/ctests/forkexec.c, src/ctests/forkexec2.c,
	  src/ctests/forkexec3.c, src/ctests/forkexec4.c,
	  src/ctests/system_child_overflow.c: ctests: make the fork/exec
	  tests only print "PASSED" once  this makes the input
	  look a lot nicer
	* src/, src/testlib/test_utils.c: tests: make the output
	  from more compact

2017-06-28  Vince Weaver <>

	* .../perf_event/tests/perf_event_system_wide.c: perf_event: tests,
	  make perf_event_system_wide use INS rather than CYC  cycles varied
	  too much, making the validation fail
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_br_ucn.c: validation_tests: add tests for
	* src/validation_tests/flops.c: validation_tests: flops: wasn't
	  falling back properly if no FLOPS event
	* src/utils/Makefile, src/validation_tests/Makefile.recipies: tests:
	  clean up the Makefiles
	* src/utils/print_header.c: utils: print_header: print the operating
	  system version in the header
	* .../tests/perf_event_amd_northbridge.c: perf_event_uncore: the
	  perf_event_amd_northbridge test wasn't working  it maybe never
	  worked at all?  It was hardcoded to assume it was always running
	  on a 3.9 kernel.
	* src/ctests/Makefile, src/ctests/Makefile.recipies,
	  src/ctests/zero.c: ctests: zero: complete transition from FLOPS to
	  INS as metric  this will make it more likely to be runnable on
	  modern machines.
	* src/ctests/vector.c, src/validation_tests/vector_testcode.c:
	  validation_tests: move the unused vector.c code  maybe we should
	  remove it.  It was never built as far as I can tell.
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/flops.c: validation_tests: add a generic flops
	  test based on hl_rates  we do a lot of testing of the high-level
	  interface but not as much of the regular PAPI interface.
	* src/ctests/Makefile.recipies, src/ctests/hl_rates.c,
	  src/validation_tests/testcode.h: ctests: hl_rates: clean up and fix
	  extraneous error message  the error message was due to the way
	  TESTS_QUIET is passed as a command line argument.  also made it use
	  the same matrix-multiply code that the flops test uses.  also added
	  some validation to the results.
	* src/ctests/all_events.c: ctests: all_events: issue warning if
	  preset cannot be created  specifically this came up on an AMD
	  fam15h system where the PAPI_L1_ICH event cannot be created due to
	  Linux stealing a counter for the NMI watchdog
	* src/validation_tests/papi_hw_int.c: validation_tests: papi_hw_int
	  explicitly mark large constant as ULL  compiler was warning on
	  32-bit machine
	* src/validation_tests/papi_ld_ins.c,
	  src/validation_tests/papi_tot_cyc.c: validation_tests:  a few tests
	  had the !quiet check inverted
	* src/validation_tests/papi_hw_int.c: validation_tests: fix
	  papi_hw_int looping forever  somehow the loop exit line got lost
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_sr_ins.c: validation_tests: add
	  PAPI_SR_INS test
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_ld_ins.c: validation_tests: add
	  PAPI_LD_INS test
	* src/, src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_hw_int.c: validation_tests: add
	  PAPI_HW_INT test

2017-06-27  Vince Weaver <>

	* src/run_tests_exclude.txt: run_tests_exclude: add attach_target
	  not really a test so we shouldn't run it
	* src/ctests/byte_profile.c, src/ctests/earprofile.c,
	  src/ctests/prof_utils.c, src/ctests/prof_utils.h:
	  ctests/prof_utils: remove prof_init() helper  It didn't do much
	  more than a papi_init, probably better to have each file do that in
	  the open.
	* src/ctests/inherit.c, src/ctests/ipc.c, src/ctests/johnmay2.c,
	  src/ctests/krentel_pthreads.c, src/ctests/kufrin.c,
	  src/ctests/low-level.c, src/ctests/mendes-alt.c, src/ctests/multiplex1.c,
	  src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c,
	  src/ctests/multiplex3_pthreads.c, src/ctests/overflow.c,
	  src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c,
	  src/ctests/overflow_allcounters.c, src/ctests/overflow_index.c,
	  src/ctests/overflow_twoevents.c, src/ctests/prof_utils.c,
	  src/ctests/profile.c, src/ctests/profile_pthreads.c,
	  src/ctests/profile_twoevents.c, src/ctests/remove_events.c,
	  src/ctests/sprofile.c, src/ctests/zero.c, src/ctests/zero_flip.c,
	  src/ctests/zero_named.c, src/testlib/test_utils.c: ctests: skip
	  rather than fail if no events available

2017-06-26  Vince Weaver <>

	* src/ctests/first.c, src/ctests/mpifirst.c,
	  src/ctests/multiattach.c, src/ctests/multiattach2.c,
	  src/testlib/test_utils.c: testlib: fix add_two_events()  was not
	  setting some values, causing many tests to fail
	* src/ctests/attach2.c, src/ctests/system_overflow.c: ctests:
	  compiler warning caught two lack-of-braces mistakes
	* src/ctests/byte_profile.c, src/ctests/code2name.c,
	  src/ctests/describe.c, src/testlib/test_utils.c: tests: more
	  changes to skip instead of fail if no events available
	* src/ctests/Makefile.recipies, src/ctests/child_overflow.c,
	  src/ctests/exec_overflow.c, src/ctests/fork_exec_overflow.c,
	  src/ctests/fork_overflow.c, src/ctests/system_child_overflow.c,
	  src/ctests/system_overflow.c: ctests: break up the
	  fork_exec_overflow test  it was really four benchmarks with some
	  ifdefs  the proper way to do that would be to have a common C file
	  and link against it for the shared routines, rather than using the
	* src/ctests/attach2.c, src/ctests/attach3.c,
	  src/ctests/attach_cpu.c: ctests: have attach tests cleanly skip if
	  no events available
	* src/testlib/test_utils.c: testlib: update add_two_events to skip()
	  if no events found
	* src/ctests/mendes-alt.c, src/ctests/multiplex2.c,
	  src/ctests/multiplex3_pthreads.c, src/ctests/sdsc.c,
	  src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/testlib/papi_test.h,
	  src/testlib/test_utils.c: testutils: remove init_multiplex() test
	  helper  the only benefit it had over calling PAPI_multiplex_init()
	  was a domain workaround for perfctr+power6 systems.  Ideally not
	  many of those systems are around anymore, and in any case a proper
	  fix would have the perfctr component handle that, not the testing
	* .../perf_event/tests/perf_event_system_wide.c,
	  .../perf_event/tests/perf_event_user_kernel.c, src/ctests/api.c,
	  src/ctests/byte_profile.c, src/ctests/high-level.c,
	  src/ctests/hl_rates.c, src/validation_tests/papi_br_ins.c,
	  src/validation_tests/papi_tot_ins.c: tests: try to "skip" rather
	  than "fail" if no events available
	* src/ctests/derived.c: ctests: derived: fix warning found on older
	* src/ctests/high-level2.c: ctests: clean up high-level2 test  skip
	  on machine without flops/flips event
	* src/components/ components test: fix
	  another build issue  be sure to use local copy of papi.h
	* src/components/ component tests: fix
	  build issue  was trying to use the system version of libpapi.a
	  instead of local version
	* src/components/appio/tests/Makefile,
	  src/components/stealtime/tests/Makefile: components: update
	  component test Makefiles to include
	* src/components/ components: update  should now be usable by the
	  components without many Makefile changes
	* src/components/perf_event/tests/Makefile,
	  src/ctests/Makefile.recipies, src/ctests/nmi_watchdog.c: ctests:
	  nmi_watchdog is a perf_event specific test, move it there
	* src/components/,
	  src/components/README, src/components/perf_event/tests/Makefile:
	  components: update the autoconfigure to generate more useful  although I don't think most components are
	  using it at all

2017-06-26  Asim YarKhan <>

	* src/components/cuda/, src/components/cuda/README,
	  src/components/cuda/Rules.cuda, src/components/cuda/configure,
	  src/components/cuda/, src/components/cuda/linux-cuda.c,
	  src/components/cuda/tests/ CUDA component update:
	  Support for CUPTI metrics (early release)  This commit adds support
	  for CUPTI metrics, which are higher level measures that may be
	  decomposed into multiple lower level CUPTI events.  Known problems
	  and limitations in early release of metric support * Only sets of
	  metrics and events that can be gathered in a single pass are
	  supported.  Transparent multi-pass support is expected * All
	  metrics are returned as long long integers, which means that CUPTI
	  double precision values will be truncated, possibly severely. * The
	  NVLink metrics have been disabled for this alpha release.

2017-06-23  Vince Weaver <>

	* src/validation_tests/papi_fp_ops.c: validation: papi_fp_ops, skip
	  (not fail) if PAPI_FP_OPS unavailable
	* src/ctests/Makefile, src/ctests/Makefile.recipies,
	  src/ctests/, src/ctests/flops.c: ctests: flops,
	  update to use some of the validation_tests infrastructure
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/testcode.h: validation_tests: add papi_fp_ops
	  test  tested on an AMD fam15h machine
	* src/components/powercap/tests/powercap_basic.c: powercap: fix
	  compiler warnings in the powercap_basic test
	* src/ctests/flops.c: ctests: update flops test
	* src/ctests/api.c: ctests: update api test  only seems to test the
	  high-level API
	* src/ctests/all_native_events.c: ctests: update all_native_events
	  removed some ancient warnings about uncore/offcore events. Should
	  not be a problem on libpfm4/perf_event
	* src/ctests/all_events.c: ctests: clean up all_events test
	* src/components/appio/tests/appio_list_events.c,
	  src/ctests/all_events.c, src/ctests/all_native_events.c,
	  src/ctests/api.c, src/ctests/attach2.c, src/ctests/attach3.c,
	  src/ctests/attach_cpu.c, src/ctests/branches.c,
	  src/ctests/byte_profile.c, src/ctests/calibrate.c,
	  src/ctests/case1.c, src/ctests/case2.c,
	  src/ctests/clockres_pthreads.c, src/ctests/cmpinfo.c,
	  src/ctests/code2name.c, src/ctests/cycle_ratio.c,
	  src/ctests/data_range.c, src/ctests/derived.c,
	  src/ctests/describe.c, src/ctests/disable_component.c,
	  src/ctests/dmem_info.c, src/ctests/earprofile.c,
	  src/ctests/eventname.c, src/ctests/exec.c, src/ctests/exec2.c,
	  src/ctests/exeinfo.c, src/ctests/first.c, src/ctests/flops.c,
	  src/ctests/fork.c, src/ctests/fork2.c,
	  src/ctests/fork_exec_overflow.c, src/ctests/forkexec.c,
	  src/ctests/forkexec2.c, src/ctests/forkexec3.c,
	  src/ctests/forkexec4.c, src/ctests/get_event_component.c,
	  src/ctests/high-level.c, src/ctests/high-level2.c,
	  src/ctests/hl_rates.c, src/ctests/hwinfo.c, src/ctests/inherit.c,
	  src/ctests/ipc.c, src/ctests/johnmay2.c,
	  src/ctests/krentel_pthreads.c, src/ctests/kufrin.c,
	  src/ctests/locks_pthreads.c, src/ctests/low-level.c,
	  src/ctests/matrix-hl.c, src/ctests/max_multiplex.c, src/ctests/memory.c,
	  src/ctests/mendes-alt.c, src/ctests/multiattach.c,
	  src/ctests/multiattach2.c, src/ctests/multiplex1.c,
	  src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c,
	  src/ctests/multiplex3_pthreads.c, src/ctests/nmi_watchdog.c,
	  src/ctests/omptough.c, src/ctests/overflow.c,
	  src/ctests/overflow2.c, src/ctests/overflow3_pthreads.c,
	  src/ctests/overflow_force_software.c, src/ctests/overflow_index.c,
	  src/ctests/overflow_one_and_read.c, src/ctests/overflow_pthreads.c,
	  src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c,
	  src/ctests/profile.c, src/ctests/profile_pthreads.c,
	  src/ctests/profile_twoevents.c, src/ctests/pthrtough.c,
	  src/ctests/pthrtough2.c, src/ctests/realtime.c,
	  src/ctests/remove_events.c, src/ctests/reset.c,
	  src/ctests/reset_multiplex.c, src/ctests/sdsc.c,
	  src/ctests/sdsc2.c, src/ctests/sdsc4.c, src/ctests/second.c,
	  src/ctests/shlib.c, src/ctests/sprofile.c, src/ctests/tenth.c,
	  src/ctests/thrspecific.c, src/ctests/timer_overflow.c,
	  src/ctests/virttime.c, src/ctests/zero.c, src/ctests/zero_attach.c,
	  src/ctests/zero_flip.c, src/ctests/zero_fork.c,
	  src/ctests/zero_named.c, src/ctests/zero_omp.c,
	  src/ctests/zero_pthreads.c, src/ctests/zero_smp.c,
	  src/testlib/papi_test.h, src/testlib/test_utils.c,
	  src/validation_tests/papi_tot_ins.c: testlib: remove the "free
	  variables" option from test_pass()  It was only used by a small
	  handful of tests, and wasn't really strictly necessary anyway.
	  test_pass() should pass the test and that's all.
	* src/ctests/zero.c: ctests: zero: start cleaning up this test
	* src/validation_tests/Makefile.recipies: validation_tests:
	  clock_gettime() requires -lrt on older versions of glibc

2017-06-22  Will Schmidt <>

	* src/linux-memory.c, src/papi_events.csv: PAPI power9 event list
	  presets  Here is an initial set of events and changes to help
	  support Power9.  This is based on similar changes that were made
	  for power8 when initial support was added there.  I've updated the
	  event names to match what we expect to have in power9, and have
	  done compile/build/ sniff tests.

2017-06-22  Vince Weaver <>

	* src/ftests/ ftests: fortran tests weren't
	  getting the TOPTFLAGS var set
	* src/testlib/test_utils.c: testlib: fix colors not turning off in
	  pass/fail indicator
	* src/ctests/api.c, src/ctests/attach2.c, src/ctests/attach3.c,
	  src/ctests/attach_cpu.c, src/ctests/inherit.c,
	  src/ctests/multiattach.c, src/ctests/multiattach2.c,
	  src/ctests/zero_attach.c, src/testlib/papi_test.h,
	  src/testlib/test_utils.c: testlib: update the way pass/fail is
	  printed  It's been bugging me for years that they don't line up
	* src/ run the validation tests too
	* src/ make it compile the
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_br_msp.c: validation_tests: add
	  papi_br_msp test
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/testcode.h: validation_tests: add papi_br_ins
	* src/validation_tests/Makefile.recipies,
	  src/validation_tests/papi_tot_cyc.c: validation_tests: add
	  papi_tot_cyc test
	* src/ fix "make install-all"  had some extraneous ".."
	  after some previous changes
	* src/configure, src/,
	  src/validation_tests/papi_tot_ins.c: validation_tests: update
	  configure so it sets up the Makefile
	* src/testlib/papi_test.h, src/testlib/test_utils.c: testlib:
	  papi_print_header() lives with the utils code now
	* src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: make
	  tests_quiet() return an integer  This way we don't have to depend
	  on the global var TESTS_QUIET if we don't want to.
	* src/validation_tests/Makefile,
	  src/validation_tests/testcode.h: validation_tests: add initial
	  papi_tot_ins test  it is not hooked up to the build system yet
	* src/ctests/multiplex1.c, src/ctests/multiplex2.c,
	  src/ctests/second.c, src/ctests/sprofile.c, src/ctests/virttime.c,
	  src/ctests/zero_attach.c, src/ctests/zero_flip.c,
	  src/ctests/zero_fork.c, src/ctests/zero_omp.c,
	  src/ctests/zero_pthreads.c: ctests: more printf/TESTS_QUIET
	* src/testlib/fpapi_test.h: ftests: missing define was making
	  second.F fail
	* src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c,
	  src/ctests/kufrin.c, src/ctests/locks_pthreads.c,
	  src/ctests/memory.c, src/ctests/multiattach.c,
	  src/ctests/multiattach2.c, src/ctests/multiplex1.c: ctests: more
	  printf/TESTS_QUIET fixes

2017-06-21  Vince Weaver <>

	* src/ctests/all_events.c, src/ctests/all_native_events.c,
	  src/ctests/attach2.c, src/ctests/attach3.c,
	  src/ctests/attach_cpu.c, src/ctests/byte_profile.c,
	  src/ctests/calibrate.c, src/ctests/cmpinfo.c,
	  src/ctests/code2name.c, src/ctests/cycle_ratio.c,
	  src/ctests/exeinfo.c, src/ctests/fork_exec_overflow.c,
	  src/ctests/hl_rates.c, src/ctests/hwinfo.c: ctests: explicitly
	  block printfs with TESTS_QUIET  There was some hackery with the
	  preprocessor to avoid this but that wasn't a good solution.
	* src/testlib/do_loops.h, src/testlib/papi_test.h,
	  src/testlib/test_utils.c: testlib: minor papi_test.h cleanups
	* .../perf_event/tests/perf_event_offcore_response.c,
	  .../tests/perf_event_uncore_multiple.c, src/ctests/attach2.c,
	  src/ctests/attach3.c, src/ctests/attach_cpu.c,
	  src/ctests/attach_target.c, src/ctests/branches.c,
	  src/ctests/burn.c, src/ctests/byte_profile.c,
	  src/ctests/cycle_ratio.c, src/ctests/derived.c,
	  src/ctests/dmem_info.c, src/ctests/earprofile.c,
	  src/ctests/first.c, src/ctests/high-level.c, src/ctests/inherit.c,
	  src/ctests/johnmay2.c, src/ctests/krentel_pthreads.c,
	  src/ctests/kufrin.c, src/ctests/locks_pthreads.c,
	  src/ctests/low-level.c, src/ctests/matrix-hl.c, src/ctests/memory.c,
	  src/ctests/multiattach.c, src/ctests/multiattach2.c,
	  src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c,
	  src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c,
	  src/ctests/overflow.c, src/ctests/overflow2.c,
	  src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c,
	  src/ctests/overflow_force_software.c, src/ctests/overflow_index.c,
	  src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c,
	  src/ctests/prof_utils.c, src/ctests/profile.c,
	  src/ctests/profile_twoevents.c, src/ctests/remove_events.c,
	  src/ctests/reset.c, src/ctests/reset_multiplex.c,
	  src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c,
	  src/ctests/second.c, src/ctests/sprofile.c, src/ctests/tenth.c,
	  src/ctests/zero.c, src/ctests/zero_attach.c,
	  src/ctests/zero_flip.c, src/ctests/zero_fork.c,
	  src/ctests/zero_named.c, src/ctests/zero_omp.c,
	  src/ctests/zero_pthreads.c, src/ctests/zero_shmem.c,
	  src/ctests/zero_smp.c, src/testlib/Makefile,
	  src/testlib/fpapi_test.h, src/testlib/papi_test.h,
	  src/testlib/test_utils.h: testlib: more papi_test.h reduction
	* src/testlib/Makefile: testlib: turn off optimization on the
	  validation loops  it's making tests fail, need to go back and be
	  sure we are properly tricking the compiler.
	* src/, src/components/Makefile_comp_tests,
	  src/components/rapl/tests/rapl_overflow.c, src/ctests/Makefile,
	  src/ctests/Makefile.recipies, src/ctests/overflow_pthreads.c,
	  src/ctests/profile_pthreads.c, src/ftests/Makefile,
	  src/ftests/Makefile.recipies, src/ftests/,
	  src/testlib/Makefile, src/testlib/do_loops.c,
	  src/testlib/do_loops.h, src/testlib/papi_test.h: testlib: start
	  splitting the validation code off from the pass/fail code
	* src/components/perf_event/tests/perf_event_offcore_response.c,
	  src/components/perf_event/tests/perf_event_user_kernel.c, src/compo
	  src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c, sr
	  src/ctests/all_native_events.c, src/ctests/attach2.c,
	  src/ctests/attach3.c, src/ctests/attach_cpu.c,
	  src/ctests/attach_target.c, src/ctests/branches.c,
	  src/ctests/burn.c, src/ctests/byte_profile.c,
	  src/ctests/calibrate.c, src/ctests/case1.c, src/ctests/case2.c,
	  src/ctests/clockres_pthreads.c, src/ctests/cmpinfo.c,
	  src/ctests/code2name.c, src/ctests/cycle_ratio.c,
	  src/ctests/data_range.c, src/ctests/derived.c,
	  src/ctests/describe.c, src/ctests/disable_component.c,
	  src/ctests/dmem_info.c, src/ctests/earprofile.c,
	  src/ctests/eventname.c, src/ctests/exec.c, src/ctests/exec2.c,
	  src/ctests/exeinfo.c, src/ctests/first.c, src/ctests/flops.c,
	  src/ctests/fork.c, src/ctests/fork2.c, src/ctests/forkexec.c,
	  src/ctests/forkexec2.c, src/ctests/forkexec3.c,
	  src/ctests/forkexec4.c, src/ctests/get_event_component.c,
	  src/ctests/high-level.c, src/ctests/high-level2.c,
	  src/ctests/hl_rates.c, src/ctests/hwinfo.c, src/ctests/inherit.c,
	  src/ctests/ipc.c, src/ctests/johnmay2.c,
	  src/ctests/krentel_pthreads.c, src/ctests/kufrin.c,
	  src/ctests/locks_pthreads.c, src/ctests/low-level.c, src/ctests
	  /matrix-hl.c, src/ctests/memory.c, src/ctests/mendes-alt.c,
	  src/ctests/multiattach.c, src/ctests/multiattach2.c,
	  src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c,
	  src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c,
	  src/ctests/nmi_watchdog.c, src/ctests/omptough.c,
	  src/ctests/overflow.c, src/ctests/overflow2.c,
	  src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c,
	  src/ctests/overflow_force_software.c, src/ctests/overflow_index.c,
	  src/ctests/overflow_one_and_read.c, src/ctests/overflow_pthreads.c,
	  src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c,
	  src/ctests/prof_utils.c, src/ctests/profile.c,
	  src/ctests/profile_pthreads.c, src/ctests/profile_twoevents.c,
	  src/ctests/pthrtough.c, src/ctests/pthrtough2.c,
	  src/ctests/realtime.c, src/ctests/remove_events.c,
	  src/ctests/reset.c, src/ctests/reset_multiplex.c,
	  src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c,
	  src/ctests/second.c, src/ctests/shlib.c, src/ctests/sprofile.c,
	  src/ctests/tenth.c, src/ctests/thrspecific.c,
	  src/ctests/timer_overflow.c, src/ctests/virttime.c,
	  src/ctests/zero.c, src/ctests/zero_attach.c,
	  src/ctests/zero_flip.c, src/ctests/zero_fork.c,
	  src/ctests/zero_named.c, src/ctests/zero_omp.c,
	  src/ctests/zero_pthreads.c, src/ctests/zero_shmem.c,
	  src/ctests/zero_smp.c, src/testlib/do_loops.c,
	  src/testlib/papi_test.h, src/testlib/test_utils.c: testlib: remove
	  include of papi.h  Need to explicitly include it in your test if
	  you need it.
	* src/testlib/Makefile, src/testlib/do_loops.c,
	  src/testlib/do_loops.h, src/testlib/dummy.c, src/utils/Makefile,
	  src/utils/papi_command_line.c, src/utils/papi_cost.c: utils: remove
	  last uses of testlib
	* src/utils/Makefile, src/utils/papi_hybrid_native_avail.c: utils:
	  update papi_hybrid_native_avail to not depend on testlib
	* src/utils/papi_multiplex_cost.c: utils: clean up
	  papi_multiplex_cost  remove dependencies on papi_test.h  print
	  message warning that it can take a long time to run
	* .../perf_event/tests/perf_event_offcore_response.c,
	  src/ctests/all_native_events.c, src/ctests/attach2.c,
	  src/ctests/attach3.c, src/ctests/branches.c,
	  src/ctests/byte_profile.c, src/ctests/calibrate.c,
	  src/ctests/data_range.c, src/ctests/describe.c,
	  src/ctests/disable_component.c, src/ctests/earprofile.c,
	  src/ctests/exec.c, src/ctests/exec2.c, src/ctests/exeinfo.c,
	  src/ctests/first.c, src/ctests/forkexec.c, src/ctests/forkexec2.c,
	  src/ctests/forkexec3.c, src/ctests/forkexec4.c,
	  src/ctests/get_event_component.c, src/ctests/inherit.c,
	  src/ctests/krentel_pthreads.c, src/ctests/kufrin.c, src/ctests
	  /matrix-hl.c, src/ctests/multiplex1.c,
	  src/ctests/multiplex1_pthreads.c, src/ctests/multiplex2.c,
	  src/ctests/nmi_watchdog.c, src/ctests/overflow_allcounters.c,
	  src/ctests/overflow_pthreads.c, src/ctests/overflow_single_event.c,
	  src/ctests/overflow_twoevents.c, src/ctests/prof_utils.c,
	  src/ctests/profile_pthreads.c, src/ctests/remove_events.c,
	  src/ctests/reset.c, src/ctests/reset_multiplex.c,
	  src/ctests/sdsc.c, src/ctests/sdsc2.c, src/ctests/sdsc4.c,
	  src/ctests/second.c, src/ctests/shlib.c,
	  src/ctests/timer_overflow.c, src/ctests/zero_named.c,
	  src/testlib/do_loops.c, src/testlib/papi_test.h,
	  src/testlib/test_utils.c, src/utils/Makefile,
	  src/utils/cost_utils.c, src/utils/papi_command_line.c,
	  src/utils/papi_cost.c, src/utils/papi_event_chooser.c: testlib:
	  more header removal from papi_test.h
	* src/components/perf_event/tests/perf_event_system_wide.c,
	  src/ctests/attach2.c, src/ctests/attach3.c,
	  src/ctests/multiattach.c, src/ctests/multiattach2.c,
	  src/ctests/zero_attach.c, src/testlib/papi_test.h,
	  src/utils/cost_utils.c: testlib: remove a few more includes from
	* src/components/rapl/tests/rapl_basic.c, src/ctests/all_events.c,
	  src/ctests/all_native_events.c, src/ctests/api.c,
	  src/ctests/attach2.c, src/ctests/attach3.c,
	  src/ctests/attach_cpu.c, src/ctests/attach_target.c,
	  src/ctests/branches.c, src/ctests/burn.c, src/ctests/calibrate.c,
	  src/ctests/case1.c, src/ctests/case2.c,
	  src/ctests/clockres_pthreads.c, src/ctests/code2name.c,
	  src/ctests/cycle_ratio.c, src/ctests/data_range.c,
	  src/ctests/derived.c, src/ctests/describe.c,
	  src/ctests/dmem_info.c, src/ctests/earprofile.c,
	  src/ctests/eventname.c, src/ctests/exec.c, src/ctests/exec2.c,
	  src/ctests/exeinfo.c, src/ctests/flops.c, src/ctests/fork.c,
	  src/ctests/fork2.c, src/ctests/forkexec.c, src/ctests/forkexec2.c,
	  src/ctests/forkexec3.c, src/ctests/forkexec4.c, src/ctests/high-
	  level.c, src/ctests/high-level2.c, src/ctests/hl_rates.c,
	  src/ctests/hwinfo.c, src/ctests/inherit.c, src/ctests/ipc.c,
	  src/ctests/johnmay2.c, src/ctests/kufrin.c,
	  src/ctests/locks_pthreads.c, src/ctests/low-level.c,
	  src/ctests/max_multiplex.c, src/ctests/memory.c,
	  src/ctests/multiattach.c, src/ctests/multiattach2.c,
	  src/ctests/multiplex1.c, src/ctests/multiplex1_pthreads.c,
	  src/ctests/multiplex2.c, src/ctests/multiplex3_pthreads.c,
	  src/ctests/overflow.c, src/ctests/overflow2.c,
	  src/ctests/overflow3_pthreads.c, src/ctests/overflow_allcounters.c,
	  src/ctests/overflow_force_software.c, src/ctests/overflow_index.c,
	  src/ctests/overflow_one_and_read.c, src/ctests/overflow_pthreads.c,
	  src/ctests/overflow_twoevents.c, src/ctests/p4_lst_ins.c,
	  src/ctests/prof_utils.c, src/ctests/profile.c,
	  src/ctests/profile_pthreads.c, src/ctests/profile_twoevents.c,
	  src/ctests/pthrtough.c, src/ctests/pthrtough2.c,
	  src/ctests/realtime.c, src/ctests/sdsc.c, src/ctests/sdsc2.c,
	  src/ctests/sdsc4.c, src/ctests/second.c, src/ctests/shlib.c,
	  src/ctests/sprofile.c, src/ctests/tenth.c,
	  src/ctests/thrspecific.c, src/ctests/timer_overflow.c,
	  src/ctests/virttime.c, src/ctests/zero.c, src/ctests/zero_attach.c,
	  src/ctests/zero_flip.c, src/ctests/zero_fork.c,
	  src/ctests/zero_omp.c, src/ctests/zero_pthreads.c,
	  src/ctests/zero_shmem.c, src/ctests/zero_smp.c,
	  src/testlib/do_loops.c, src/testlib/dummy.c,
	  src/testlib/papi_test.h, src/testlib/test_utils.c,
	  src/utils/papi_command_line.c, src/utils/papi_cost.c: testlib:
	  split some headers out of papi_test.h  Too much is going on in that
	  header, no need to have every include in the world in it.  Trying
	  to make the testcode more standalone so it is easier to follow.
	* src/testlib/Makefile, src/testlib/ testlib: let
	  testlib build properly from within the testlib directory
	* src/testlib/clockcore.c: testlib: clockcore wasn't protecting all
	  the output with !quiet
	* src/ctests/Makefile: ctests: make sure tests link against the right
	  papi.h file
	* src/, src/ctests/Makefile,
	  src/ctests/ ctests: allow running "make" in the
	  ctests directory to work

2017-06-20  Vince Weaver <>

	* src/Matlab/PAPI_Matlab.readme, src/papi.c, src/utils/papi_avail.c,
	  src/utils/papi_clockres.c, src/utils/papi_command_line.c,
	  src/utils/papi_component_avail.c, src/utils/papi_cost.c,
	  src/utils/papi_decode.c, src/utils/papi_error_codes.c,
	  src/utils/papi_hybrid_native_avail.c, src/utils/papi_mem_info.c,
	  src/utils/papi_multiplex_cost.c, src/utils/papi_native_avail.c,
	  src/utils/papi_version.c, src/utils/papi_xml_event_info.c: update
	  the ptools-perfapi e-mail address  in the auto-generated manpages
	  it was still using the old address.
	* doc/Makefile: docs: fix the manpage build after renaming the utils
	  Thanks to Steve Kaufmann for catching this.
	* src/utils/Makefile, src/utils/papi_native_avail.c: utils:
	  papi_native_avail: remove extraneous testing code
	* src/utils/Makefile, src/utils/papi_mem_info.c: utils:
	  papi_mem_info: remove extraneous test code
	* src/utils/Makefile, src/utils/papi_xml_event_info.c: utils:
	  papi_xml_event_info: remove extraneous test code
	* src/utils/Makefile, src/utils/papi_decode.c: utils: papi_decode:
	  remove extraneous test code
	* src/utils/Makefile, src/utils/papi_error_codes.c: utils:
	  papi_error_codes: remove extraneous test code
	* src/utils/Makefile, src/utils/papi_component_avail.c: utils:
	  papi_component_avail: remove extraneous test code
	* src/ctests/clockres_pthreads.c, src/testlib/clockcore.c,
	  src/testlib/clockcore.h, src/testlib/papi_test.h,
	  src/utils/Makefile, src/utils/papi_clockres.c: utils:
	  papi_clockres, remove extraneous test code
	* src/utils/Makefile, src/utils/papi_avail.c,
	  src/utils/print_header.c, src/utils/print_header.h: utils: update
	  papi_avail to not depend on testlibs  It's not a test.
	* src/utils/Makefile: utils: add target for papi_hybrid_native_avail
	  do not build it by default though?  Should only be built if
	  compiling for MIC?
	* src/utils/Makefile, src/utils/avail.c, src/utils/clockres.c,
	  src/utils/command_line.c, src/utils/component.c, src/utils/cost.c,
	  src/utils/decode.c, src/utils/error_codes.c,
	  src/utils/event_chooser.c, src/utils/event_info.c,
	  src/utils/hybrid_native_avail.c, src/utils/mem_info.c,
	  src/utils/multiplex_cost.c, src/utils/native_avail.c,
	  src/utils/papi_avail.c, src/utils/papi_clockres.c,
	  src/utils/papi_command_line.c, src/utils/papi_component_avail.c,
	  src/utils/papi_cost.c, src/utils/papi_decode.c,
	  src/utils/papi_error_codes.c, src/utils/papi_event_chooser.c,
	  src/utils/papi_hybrid_native_avail.c, src/utils/papi_mem_info.c,
	  src/utils/papi_multiplex_cost.c, src/utils/papi_native_avail.c,
	  src/utils/papi_xml_event_info.c: utils: rename the utils so the
	  executable matches the filename  This has bothered me for years,
	  you want to fix "papi_native_avail" but there is no file in the
	  tree called "papi_native_avail.c"
	* src/utils/Makefile, src/utils/papi_version.c, src/utils/version.c:
	  utils: rename version.c to papi_version.c  Also minor cleanups to
	  the utility.
	* src/, src/configure, src/,
	  src/utils/Makefile, src/utils/ utils: clean up
	  Makefile and build process of utils  Now should be able to run
	  "make" in the utils subdir and have it build.  Also move the list
	  of util files to build out of configure as I don't think there's
	  any reason for having them there.
	* src/components/perf_event/pe_libpfm4_events.c: perf: fall back to
	  operating system default events if libpfm4 lacks support  This will
	  allow use of PAPI on machines that Linux has support for, but
	  libpfm4 has not added events yet.  Still some limitations, for
	  example the PAPI preset events won't work.
	* src/components/perf_event/pe_libpfm4_events.c,
	  src/components/perf_event/perf_event.c: perf: report better errors
	  if libpfm4 initialization fails
	* src/components/perf_event/pe_libpfm4_events.c: perf:
	  pe_libpfm4_events: minor whitespace fixup
	* src/components/perf_event/pe_libpfm4_events.c: perf:
	  pe_libpfm4_events: whitespace changes to make code easier to follow

2017-06-19  Vince Weaver <>

	* src/ctests/code2name.c: ctests/code2name: fix uninitialized
	  variable warning
	* src/ctests/calibrate.c: ctests/calibrate: fix uninitialized
	  variable warning
	* src/ctests/thrspecific.c: ctests: thrspecific fix so it finishes
	  It's actually really unclear what this code is trying to test, but
	  with optimization enabled it hung forever.  Marking the variable
	  being spun on as volatile fixes things but I think there is more
	  wrong with the test than just that.
	* src/ctests/branches.c, src/ctests/sdsc.c, src/ctests/sdsc4.c:
	  ctests: fix tests using "dummy3()" as a workload  Now that we
	  enable optimization on the ctests this breaks some of the
	  benchmarks.  dummy3() was being optimized away which caused
	  segfaults and other problems.  The tests don't crash now, but they
	  still fail.  Still investigating.
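
	  Sketch (not PAPI code): the entries above describe workloads and
	  spin variables being optimized away once the ctests are built with
	  optimization. A minimal illustration of the volatile approach,
	  assuming a hypothetical workload named spin_work() rather than the
	  real dummy3():

```c
#include <assert.h>

/* Hypothetical stand-in for a test workload; not the real dummy3(). */
double spin_work(long iterations)
{
    assert(iterations >= 0);

    /* volatile forces a real load and store on every iteration, so the
     * optimizer cannot delete the loop or fold it into a constant. */
    volatile double acc = 0.0;

    for (long i = 0; i < iterations; i++) {
        acc += 1.0;
    }
    return acc;
}
```

	  Calling spin_work(1000) performs 1000 additions at any optimization
	  level. For a flag shared between threads, volatile alone may not be
	  enough and C11 atomics are likely the safer choice.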

2016-10-12  Phil Mucci <>

	* src/configure: Regenerated configure with recent autoconf
	* src/ By default, we want -O1 on tests (TOPTFLAGS). -O0
	  is too literal and causes a number of tests that depend on peephole
	  optimization to run.
	* src/utils/Makefile: Utils are installed therefore they should be
	  built with production flags not test/debug flags
	* src/ Make clean should not clean up libpfm. That's for
	  make distclean. We're not developing libpfm!

2016-07-04  Phil Mucci <>

	* src/ctests/mendes-alt.c, src/ctests/zero.c: Moved functions
	  definitions to top of file to eliminate non-ANSI-C prototypes
	  inside main. Modified message in zero to note that turbo boost will also
	  cause errors (cycles > real-time-cycle
	* src/, src/, src/configure, src/
	  Remove EXTRA_CFLAGS, now CFLAGS. Added FTOPTS so compiling Fortran
	  tests have same flags as ctests. Fix proper testing at configure
	  time of libpfm for proper combinations of libpfm options
	* src/ftests/Makefile: Homogenize include flags
	* src/ctests/Makefile: Homogenize include flags
	* src/testlib/Makefile: Removed unnecessary defs and options
	* src/utils/Makefile: Removed unnecessary definitions and compiler

2016-07-01  Phil Mucci <>

	* src/, src/, src/Rules.perfctr-pfm,
	  src/Rules.perfmon2, src/Rules.pfm4_pe,
	  src/components/perf_event/pe_libpfm4_events.c, src/configure,
	  src/, src/ctests/Makefile,
	  src/ctests/, src/ftests/Makefile,
	  src/ftests/ - Removed DEBUGFLAGS,
	  NOTLS, PAPI_EVENTS_TABLE from being generated. These were not
	  properly used. - Added LIBCFLAGS generated from configure for
	  CFLAGS that ONLY apply to the library and the library code. NOT
	  tests nor utilities. Previously we were propagating all kinds of
	  bogus flags to the tests and utils. - CFLAGS is now properly set
	  for compiler flags not defines etc. - Put
	  papi_events_table.h in the right place. This is always the same
	  name. Previous attempts at parameterizing this were broken and/or
	  unnecessary. - Added dependency for the above in the right place
	  and ALWAYS generate it, regardless of whether we actually include
	  it in the library (vs load the CSV at runtime).  Rules.perfctr-pfm
	  - Removed conditional removal of events table during clean.
	  Rules.perfmon2 - Removed conditional removal of events table during
	  clean.  Rules.pfm4_pe - Stopped mussing with CFLAGS which would
	  pollute child builds but refer to LIBCFLAGS. CFLAGS is for
	  everything! - Removed conditional removal of events table during
	  clean. - Removed duplicate reference to papi_events_table.h
	  components/perf_event/pe_libpfm4_events.c: - Removed HARDCODED
	  include of a libpfm4 private header file. Wrong path and
	  unnecessary include. This would break if you linked against another
	  libpfm using any of the config options.
	  components/perf_event/peu_libpfm4_events.c: - Removed HARDCODED
	  include of a libpfm4 private header file. Wrong path and
	  unnecessary include. This would break if you linked against another
	  libpfm using any of the config options.
	  components/ - Refer to datarootdir to
	  make autoconf happy  configure/ Regenerated using
	  autoconf 2.69 and many modifications to serious brokenness. Lots
	  of fixes: - Sanitize options for static inclusion of user and papi
	  presets - Fix options that do not print out a result - Fix
	  debug=yes to not include PAPI_MEMORY_MANAGEMENT. That's only
	  enabled with debug=memory. This will reduce false positives when we
	  debug. We don't want our own malloc/free changing behavior when we
	  are trying to debug! - Fix CFLAGS/LIBCFLAGS/DEBUGFLAGS. configure
	  now exports a variable called PAPICFLAGS which gets stuffed into
	  LIBCFLAGS in This variable IS ONLY for compiler flags
	  relevant to the library. Previously we were exporting all sorts of
	  stuff that would make our passes behave differently than user code.
	  _GNU_SOURCE and -D_REENTRANT. That stuff is for the library and
	  components. Not user code. - Update compile tests to use
	  AC_LANG_SOURCE as required. - Fix clock timer checking output to
	  now say what timer we picked instead of just skipping an answer -
	  Same for virtual clock timer - Remove broken --with-papi-events
	  option. - Fixed --with-static-tools option - Fixed/added --with-
	  static-papi-events option (default) and --with-static-user-events
	  option. - Fixed modalities of configuring whether to build a
	  static/shared or both. - Fixed link of tests with shared libraries
	  when above options don't support it. Modality again. Remove
	  SETPATH/LIBPATH define, which won't work for ANY combination of
	  --with-pfm-prefix/root/libdir except our included library. Woefully
	  broken and would result in many false positive failures. If you are
	  going to run the tests on the shared library it is now the user's
	  responsibility to set LD_LIBRARY_PATH/LIBPATH correctly. I suspect
	  this may irritate some, but broken 90% of the time is no excuse for
	  correct 10% of the time especially when it could generate bug
	  reports falsely. - Fixed with-static-tools, with-shlib-tools
	  options to correct modalities. - Fixed all modalities with --with-
	  pfm-prefix/root/libdir/incdir. Previously the build, configure and
	  source files were still referring to pieces of code INSIDE our
	  libpfm4 resulting in version skew and breakage. The way to test
	  this stuff is to use --root or --prefix after removing the internal
	  libpfm4 library. - Removed unnecessary and confusing
	  force_pfm_incdir - Fixed with-pe-incdir option which, like before
	  was most of the time referring to the libpfm4 included header file.
	  Not good if one has a custom kernel! PECFLAGS now only appended to
	  PAPICFLAGS(LIBCFLAGS). - Removal of DEBUGFLAGS. aix.c needs
	  testing. Anyone have one? - Fixed CFLAGS for BSD - Add message for
	  papi_events.csv  ctests/Makefile ftests/Makefile - Don't redefine
	  CC/CC_R/CFLAGS/FFLAGS. - Make these files consistent
	  ctests/ ftests/ - refer to
	  datarootdir as required

2016-06-27  Phil Mucci <>

	* src/testlib/Makefile, src/testlib/ Added
	  explicit target for libtestlib.a. The all target should have been
	  marked as .PHONY so as to avoid constant rebuilding.  Also, we really
	  should merge these two files into a master and an include.
	  Maintaining two makefiles stinks!

2017-06-16  Vince Weaver <>

	* src/papi_fwrappers.c: fwrappers: papif_unregister_thread was
	  misspelled as papif_unregster_thread  This was noticed by Vedran
	  Novakovic  For an extremely long time (10+ years?) the fortran
	  wrapper was misspelled as papif_unregster_thread()  It's probably
	  too late to fix this without potentially breaking things, so just
	  add a duplicate function with the proper spelling and leave the old
	  one too.
	* src/papi_preset.c: papi_preset: fix compiler warning  This really
	  confusing warning has been around for a while.  gcc-6.3 reports it
	  in a really odd way:  papi_preset.c: In function
	  ‘check_derived_events’: papi_preset.c:513:19: warning:
	  ‘__s’ may be used uninitialized in this function$ int val =
	  atoi(&subtoken[1]); ^~~~~~~~~~~~ papi_preset.c:464:1: note:
	  ‘__s’ was declared here ops_string_merge(char **original, char
	  *insertion, int replaces, int start_ind$ ^~~~~~~~~~~~~~~~  But
	  there is no __s variable, or anything to do with where the arrows
	  are pointing.  gcc-5 gives a better warning:  papi_preset.c: In
	  function ‘check_derived_events’: papi_preset.c:513:14: warning:
	  ‘tok_save_ptr’ may be used uninitialized in this$ int val =
	  atoi(&subtoken[1]); ^ papi_preset.c:472:8: note: ‘tok_save_ptr’
	  was declared here char *tok_save_ptr;  So the thing it seems to be
	  complaining about is that the *saveptr parameter to strtok_r() is
	  not set to NULL.  According to the manpage I don't think this
	  should be needed? But I think it should be safe to initialize it
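
	  Sketch (not PAPI code): the strtok_r() pattern discussed above,
	  with the save pointer explicitly initialized. The helper name
	  sum_token_indices() and the "N0|N2|N5" token format are assumptions
	  made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sums the number that follows the first character
 * of each '|'-separated token, e.g. "N0|N2|N5" gives 0 + 2 + 5. */
int sum_token_indices(const char *formula)
{
    assert(formula != NULL);

    /* strtok_r() modifies its input, so work on a private copy. */
    char *copy = malloc(strlen(formula) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, formula);

    /* POSIX does not require this initialization, because the first
     * call ignores the old value, but it silences the warning. */
    char *save_ptr = NULL;
    int total = 0;

    for (char *tok = strtok_r(copy, "|", &save_ptr);
         tok != NULL;
         tok = strtok_r(NULL, "|", &save_ptr)) {
        total += atoi(&tok[1]);
    }

    free(copy);
    return total;
}
```

	  The initialization costs nothing at run time and does not change
	  the behavior of the tokenizer.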

Tue Jun 6 11:09:17 2017 -0500  Will Schmidt <>

	* src/libpfm4/lib/events/power9_events.h,
	  src/libpfm4/tests/validate_power.c: Update libpfm4  Current with
	  commit ce5b320031f75f9a9881333c13902d5541f91cc8  add power9 entries
	  to validate_power.c  Hi,  Update the validate_power test to include
	  power9 entries.  sniff-test run output: $ ./validate Libpfm
	  structure tests: libpfm ABI version : 0 pfm_pmu_info_t : Passed
	  pfm_event_info_t : Passed pfm_event_attr_info_t : Passed
	  pfm_pmu_encode_arg_t : Passed pfm_perf_encode_arg_t : Passed Libpfm
	  internal table tests: <snip...> checking power9 (946 events):
	  Passed Architecture specific tests: 20 PowerPC events: 0 errors All
	  tests passed

2017-06-15  Vince Weaver <>

	* src/components/perf_event/pe_libpfm4_events.c,
	  .../perf_event_uncore/peu_libpfm4_events.h: perf_event: merge the
	  libpfm4 helper libraries  perf_event and perf_event_uncore had
	  their own almost exactly the same libpfm4 helper libraries.
	  Maintaining both was a chore, and it looks like it is possible to
	  just share one copy.  This does mean that it is now not possible to
	  configure the perf_event_uncore component without perf_event being
	  enabled, but I am not sure if that was even possible to begin with.
	* src/components/perf_event/pe_libpfm4_events.c,
	  .../perf_event_uncore/peu_libpfm4_events.h: perf_event_uncore: make
	  the libpfm4 routines match even more
	* src/components/perf_event/pe_libpfm4_events.c,
	  .../perf_event_uncore/peu_libpfm4_events.c: perf_event: make
	  perf_event and perf_event uncore libpfm4 more similar  it's a bad
	  idea to have more or less two copies of the same code
	* src/components/perf_event/pe_libpfm4_events.c,
	  .../perf_event_uncore/peu_libpfm4_events.c: perf_event: Avoid
	  unintended libpfm build dependency due to PFM_PMU_MAX enum  This
	  patch is based on one sent by William Cohen <>
	  The libpfm pfmlib.h file enumerates each of the performance
	  monitoring units (PMUs) it can program in the pfm_pmu_t type.  The last
	  enum in this type is PFM_PMU_MAX.  Depending on which specific
	  version of libpfm is being used this specific value could vary.  The
	  problem is that PFM_PMU_MAX is statically defined in the pfmlib.h
	  file and this was being used as a loop bound when iterating to
	  determine which PMUs are potentially available.  If PAPI was built
	  with an older version of libpfm and then run with a newer libpfm
	  shared library on a machine with a larger PFM_PMU_MAX value, none
	  of the PMUs past the smaller PFM_PMU_MAX used for the build
	  would be examined or enabled.
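
	  Sketch (not libpfm or PAPI code): why a compile-time loop bound can
	  miss entries that only exist in a newer runtime library. The names
	  BUILD_TIME_PMU_MAX, runtime_pmus and query_pmu() are hypothetical,
	  and the real library's enumeration may contain gaps that this
	  simplified table does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the enum limit baked in when the caller was compiled. */
#define BUILD_TIME_PMU_MAX 3

/* Stand-in for the table inside a newer shared library. */
static const char *const runtime_pmus[] = {
    "pmu_a", "pmu_b", "pmu_c", "pmu_d", "pmu_e"
};

/* Hypothetical runtime lookup: returns NULL once the index is past the
 * end of what the loaded library actually knows about. */
const char *query_pmu(int index)
{
    assert(index >= 0);
    size_t count = sizeof runtime_pmus / sizeof runtime_pmus[0];
    return ((size_t)index < count) ? runtime_pmus[index] : NULL;
}

/* Bounded by the build-time constant: misses pmu_d and pmu_e. */
int count_pmus_static(void)
{
    int found = 0;
    for (int i = 0; i < BUILD_TIME_PMU_MAX; i++) {
        if (query_pmu(i) != NULL) {
            found++;
        }
    }
    return found;
}

/* Bounded by what the runtime library reports: sees every entry. */
int count_pmus_dynamic(void)
{
    int found = 0;
    for (int i = 0; query_pmu(i) != NULL; i++) {
        found++;
    }
    return found;
}
```

	  With the table above, the static loop stops at the build-time limit
	  while the dynamic loop keeps asking the library until it reports
	  that no further entry exists.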

2017-06-15  Heike Jagode ( <jagode@d00.descartes>

	* src/components/infiniband/linux-infiniband.c: Updated infiniband
	  component so that it works for mofed driver version 4.0, where
	  directory counters_ext in sysfs fs has changed to hw_counters.
	  This update to the component makes it work for both directory
	  names: - counters_ext for mofed driver version <4.0, and -
	  hw_counters for mofed driver version >=4.0  This change has not
	  been fully tested yet due to missing access to machine with updated
	  version of mofed driver. (CORAL machines will have an updated
	  version of this driver.)
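
	  Sketch (not the component's code): one way to support both
	  directory names is to probe a list of candidates and use the first
	  one that exists. The helper first_existing_subdir() is hypothetical;
	  only the names counters_ext and hw_counters come from the entry
	  above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns the index of the first name that exists as a directory under
 * base, or -1 if none of the candidates is present. */
int first_existing_subdir(const char *base, const char *const names[],
                          int count)
{
    assert(base != NULL && names != NULL);

    char path[4096];
    struct stat st;

    for (int i = 0; i < count; i++) {
        int n = snprintf(path, sizeof path, "%s/%s", base, names[i]);
        if (n < 0 || (size_t)n >= sizeof path) {
            continue; /* path would not fit, skip this candidate */
        }
        if (stat(path, &st) == 0 && S_ISDIR(st.st_mode)) {
            return i;
        }
    }
    return -1;
}
```

	  A caller could pass {"hw_counters", "counters_ext"} as the
	  candidate list together with the sysfs directory of a port, and
	  treat -1 as "no extended counters available".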

2017-05-04  Vince Weaver <>

	* src/components/rapl/linux-rapl.c: rapl: broadwell-ep DRAM units are
	  special (like Haswell-EP)  The Linux kernel perf interface had this
	  wrong too.  I noticed this in my cluster computing class, the
	  Broadwell-EP DRAM results were unrealistically high values.

Fri Apr 21 17:33:15 2017 -0700  William Cohen <>

	* src/libpfm4/README, src/libpfm4/include/perfmon/pfmlib.h,
	  src/libpfm4/lib/Makefile, src/libpfm4/lib/events/power9_events.h,
	  src/libpfm4/lib/pfmlib_common.c, src/libpfm4/lib/pfmlib_power9.c,
	  src/libpfm4/lib/pfmlib_power_priv.h, src/libpfm4/lib/pfmlib_priv.h,
	  src/libpfm4/lib/pfmlib_s390x_cpumf.c: Update libpfm4  Current
	  with commit 8385268c98553cb5dec9ca86bbad3e5c44a2ab16  fix
	  internal pfm_event_attr_info_t use for S390X  Commit 321133e
	  converted most of the architectures to use the internal
	  perflib_event_attr_info_t type.  However, the s390 was missed in
	  that previous commit.  This patch corrects the issue so libpfm
	  compiles on s390.

2017-04-20  Stephen Wood <>

	* src/extras.c, src/papi.h, src/papi_fwrappers.c, src/papi_hl.c,
	  src/papi_internal.c: cast pointers appropriately to avoid warnings
	  and errors

2017-04-19  Sangamesh Ragate <>

	* src/papi_events.csv: Mapped PAPI_L2_ICM preset event to
	  PM_INST_FROM_L2MISS native event for Power8

2017-04-06  Asim YarKhan <>

	* src/ftests/fmatrixlowpapi.F: Fixed: This fortran test exceeded 72
	  columns and made the default Intel ifort compilation unhappy

Wed Apr 5 23:35:44 2017 -0700  Andreas Beckmann <>

	* src/libpfm4/docs/man3/libpfm_arm_ac53.3,
	  src/libpfm4/docs/man3/libpfm_arm_xgene.3, src/libpfm4/lib/Makefile,
	  src/libpfm4/perf_examples/self_smpl_multi.c: Update
	  libpfm4  Current with commit
	  71a960d9c17b663137a2023ce63edd2f3ca115f5  fix various event
	  description typos  This patch fixes the typos in several event
	  descriptions for Intel, Arm, and Power event tables.

2017-03-30  William Cohen <>

	* src/ftests/cost.F, src/ftests/first.F, src/ftests/fmatrixlowpapi.F,
	  src/ftests/second.F: Eliminate warnings about implicit type
	  conversions in Fortran tests  The gfortran compiler on Fedora 25
	  was giving warnings indicating that a few of the tests were doing
	  implicit type conversion between reals and ints.  Those implicit
	  conversions have been made explicit to eliminate the fortran
	  compiler warning messages.

Tue Apr 4 09:42:25 2017 -0700  Stephane Eranian <>

	* src/libpfm4/include/perfmon/pfmlib.h,
	  src/libpfm4/lib/pfmlib_amd64_priv.h, src/libpfm4/lib/pfmlib_arm.c,
	  src/libpfm4/lib/pfmlib_arm_priv.h, src/libpfm4/lib/pfmlib_common.c,
	  src/libpfm4/lib/pfmlib_mips.c, src/libpfm4/lib/pfmlib_mips_priv.h,
	  src/libpfm4/lib/pfmlib_powerpc.c, src/libpfm4/lib/pfmlib_priv.h,
	  src/libpfm4/lib/pfmlib_torrent.c, src/libpfm4/tests/validate.c,
	  src/libpfm4/tests/validate_x86.c: Update libpfm4  Current with
	  commit 5e311841e5d70efb93d11826109cb5acab6e051c  enable 38-bit raw
	  umasks for Intel offcore_response events  This patch enables
	  support for passing and encoding of 38-bit offcore_response matrix
	  umask. Without the patch, the raw umask was limited to 32-bit which
	  is not enough to cover all the possible bits of the
	  offcore_response event available since Intel SandyBridge.  $
	  examples/check_events offcore_response_0:0xffffff Requested Event:
	  offcore_response_0:0xffffff Actual    Event:
	  ivb::OFFCORE_RESPONSE_0:0xffffff:k=1:u=1:e=0:i=0:c=0:t=0 PMU
	  : Intel Ivy Bridge IDX            : 155189325 Codes          :
	  0x5301b7 0xffffff  The patch also adds tests to the validation
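
	  Sketch (not libpfm code): the width limit described above comes
	  down to whether a raw value has bits set at or above a given
	  position. The helper fits_in_bits() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if value can be stored in the low `bits` bits, else 0. */
int fits_in_bits(uint64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);

    if (bits == 64) {
        return 1; /* shifting a 64-bit value by 64 is undefined in C */
    }
    return (value >> bits) == 0;
}
```

	  A full 38-bit mask such as 0x3fffffffff passes the 38-bit check but
	  fails the 32-bit one, which is the case the old limit could not
	  encode.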

2017-03-29  Vince Weaver <>

	* src/components/perfctr/perfctr-x86.c: perfctr: fix perfctr
	  component to actually work  Simple one-line typo means perfctr was
	  not working, probably for years.  I've tested on a 2.6.32-perfctr
	  kernel and it works again.

2017-03-28  Vince Weaver <>

	* src/papi_events.csv: papi_events: add AMD fam16h jaguar events
	  These will become useful if/when the contributed libpfm4 jaguar
	  patches get applied.

2017-03-27  Vince Weaver <>

	* src/papi_events.csv: events: p4: change the PAPI_TOT_CYC event
	  PAPI_TOT_CYC wasn't working on Pentium4 because the
	  GLOBAL_POWER_EVENT:RUNNING event was being grabbed by the hardware
	  watchdog.  perf cycles:u was still working, that's because the
	  kernel transparently remaps the cycles event to an alias when
	  global_power_event's slot is taken.  The aliased event is the
	  unwieldy: execution_event:nbogus0:nbogus1:nbogus2:nbogus3:bogus0:b
	  ogus1:bogus2:bogus3:cmpl:thr=15 which does seem to give the right
	  results.  Use this event instead by default on Pentium 4
	* src/components/perf_event/perf_event.c: perf_event: fix warning
	  when compiling with debug enabled  the flags field is an unsigned
	  long, not an int

2017-03-22  Vince Weaver <>

	* src/components/perf_event/perf_event.c: perf_event: don't allocate
	  a mmap page if not rdpmc or sampling
	* src/components/perf_event/perf_event.c: perf_event: only allocate 1
	  mmap page (rather than 3) if not sampling  Next step is to allocate
	  0 mmap pages unless rdpmc is enabled
	* src/components/perf_event/perf_event.c,
	  src/components/perf_event/perf_event_lib.h: perf_event: update the
	  _pe_set_overflow() call  Working on making it more obvious which
	  events are sampling (and thus need mmap buffers) or not.  Also
	  there were some bugs in the handling of having multiple overflow
	  sources per eventset, though I'm not sure if PAPI actually handles
	* src/components/perf_event/perf_event.c: perf_event: turn off
	  fast_counter_read if mmaps fail  By default on Linux perf_event
	  can't use more than 516kB of mmap space.  So perf_event-rdpmc would
	  fail after you added a large number (>32) of events.  This shows up
	  on the kufrin benchmark on some machines.  This fix makes PAPI fall
	  back to non-rdpmc if an mmap error happens. I'm also going to try
	  to tune the mmap usage a bit to make the limits a bit higher.

2017-03-21  Asim YarKhan <>

	* src/configure: configure script updated using autoconf-2.59

2017-03-20  Vince Weaver <>

	* src/components/perf_event/perf_event.c, src/
	  configure: enable rdpmc with --enable-perfevent-rdpmc=yes  Make
	  this an option to configure.  Defaults to no.  Need to find a
	  machine with autoconf 2.59 on and I'll regenerate configure as

2017-03-16  Vince Weaver <>

	* src/components/perf_event/perf_event.c: perf_event: try to work
	  around exclude_guest issue  run a test at startup to see if events
	  with exclude_guest fail.  libpfm4 sets this by default, but older
	  kernels will fail because this was previously a reserved (must be
	  zero) field.

2017-03-14  Vince Weaver <>

	* src/ctests/multiattach.c: tests: multiattach:
	  whitespace/comments/clarifications  digging through the code trying
	  to figure out why it fails with rdpmc enabled.  it turns out it is
	  seeing wrong running/enabled multiplexing results even though we
	  aren't multiplexing  tracking this down is a pain because we can't
	  strace/ltrace due to the code using ptrace to start/stop processes.

2017-03-09  Vince Weaver <>

	* src/components/perf_event/perf_event.c: perf_event: can't mmap() an
	  inherited event  this is why the inherit test was failing
	* src/components/perf_event/perf_event.c,
	  src/components/perf_event/perf_helpers.h: perf_event: add rdpmc
	  support (but disabled)  finally add the rdpmc code, but it still
	  fails on a few tests so it is disabled by default.
	* src/components/perf_event/perf_event.c,
	  src/components/perf_event/perf_event_lib.h: perf_event: make all
	  events come with a mmap buffer  This wastes some address space, but
	  having separate codepaths for rdpmc/regular/sampling/profiling
	  would be hard to maintain.  Had to remove some assumptions from the
	  profiling/sampling code that mmap_buf means sampling is happening.
	* src/components/perf_event/perf_event.c: perf_event: add check for
	  paranoid==3  Recent distributions are *completely* disabling
	  perf_event by default with their vendor kernels (this is not
	  upstream yet).  Have PAPI detect and disable the perf_event
	  component if this is detected.
	* src/components/perf_event/perf_event.c: perf_event: split
	  close_pe_events() into two functions
	* src/components/perf_event/perf_event.c,
	  src/components/perf_event/perf_helpers.h: perf_event: more
	  whitespace / rearrangement  There should not be any changes to actual
	  code; this is just whitespace/comment/function movement.  I know changes
	  like this make the git history harder to follow, but it really
	  helps when trying to follow the code when working on major changes.
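
The paranoid==3 check in the 2017-03-09 entries above could look roughly like the sketch below. It assumes the setting is read from a text file such as /proc/sys/kernel/perf_event_paranoid. The function names are hypothetical and not PAPI's. The path is a parameter so the logic can be exercised without touching /proc.

```c
#include <stdio.h>

/* Hypothetical helper: read a single integer from the file at path.
 * Returns 0 on success and stores the value in *level, -1 otherwise. */
static int read_paranoid_level(const char *path, int *level)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", level) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Returns 1 if the paranoid level is 3 (perf_event fully disabled on
 * the vendor kernels the entry mentions), 0 otherwise.  If the file
 * cannot be read, this sketch assumes perf_event is still usable. */
static int perf_event_disabled(const char *path)
{
    int level;

    if (read_paranoid_level(path, &level) != 0)
        return 0;
    return level == 3;
}
```

On a live system the call would be perf_event_disabled("/proc/sys/kernel/perf_event_paranoid").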

2017-03-08  Vince Weaver <>

	* src/components/perf_event/perf_event.c: perf_event: more
	  whitespace/comment cleanups  digging through the code, still
	  prepping for rdpmc

2017-03-07  Vince Weaver <>

	* src/components/perf_event/perf_helpers.h: perf_event: rdpmc: need
	  to sign extend offset too  Otherwise things stop working after a
	* src/components/perf_event/perf_event.c: perf_event: split up
	  _pe_read()  makes the code a bit easier to follow.  also prep for
	* src/components/perf_event/perf_event.c: perf_event: clean up
	  whitespace in _pe_read
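
The sign-extension note in the perf_helpers.h entry above can be illustrated with a small sketch. It shows one portable way to sign-extend a value that is only `width` bits wide, for example a counter whose width the kernel reports as pmc_width. The function name is hypothetical, and how PAPI applies the extension to the offset is not shown in this log.

```c
#include <stdint.h>

/* Hypothetical helper: treat the low `width` bits of value as a
 * two's-complement number and widen it to 64 bits.
 * Widths of 0 or >= 64 are returned unchanged. */
static int64_t sign_extend(uint64_t value, unsigned width)
{
    uint64_t mask, sign;

    if (width == 0 || width >= 64)
        return (int64_t)value;

    mask = (1ULL << width) - 1;     /* keep only the low width bits */
    sign = 1ULL << (width - 1);     /* position of the sign bit */
    value &= mask;

    /* Flipping the sign bit and then subtracting it propagates the
     * sign into the upper bits without relying on signed shifts. */
    return (int64_t)((value ^ sign) - sign);
}
```

With a 48-bit width, an all-ones input becomes -1 instead of a large positive number, which is the kind of error that only shows up once a value goes negative.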

2017-03-08  Vince Weaver <>

	* src/ctests/first.c: ctests: first: white space cleanups  minor
	  things noticed when trying to figure out why it was failing with
	  rdpmc (the answer was rdpmc code not handling PAPI_reset())

2017-03-07  Vince Weaver <>

	* src/components/perf_event/perf_helpers.h: perf_event: recent
	  changes broke build on non-x86  an ifdef was in the wrong location.
	* src/components/perf_event/perf_event.c,
	  src/components/perf_event/perf_helpers.h: perf_event: update rdpmc
	* src/utils/component.c: utils: component_avail: clean up -d
	  (detailed) results  print rdpmc status, as well as line things up.
	  Also don't print redundant info, now that a lot more fields are
	  printed by default.
	* src/utils/component.c: utils: component_avail: whitespace/grammar
	* src/components/perf_event/Rules.perf_event,
	  src/components/perf_event/perf_helpers.h: perf_event: add
	  mmap/rdpmc routine  we don't use it yet

2017-03-06  Vince Weaver <>

	* src/components/perf_event/perf_helpers.h: perf_event: add rdtsc()
	  and rdpmc() inline-assembly
	* src/components/perf_event/perf_event.c,
	  src/components/perf_event/perf_helpers.h: perf_event: move
	  perf_event_open() code to a helper file  We'll be adding some other
	  helpers to this file too.

2017-03-03  Vince Weaver <>

	* src/components/perf_event/perf_event.c: perf_event: move
	  bug_sync_read() check out of line  we should eventually just phase
	  out a lot of these checks for older kernels, but it gets tricky as
	  long as RHEL is shipping 2.6.32.  With this change on my IVB
	  machine PAPI_read() cost went from mean cycles: 932.158549, std
	  deviation: 358.752461 to mean cycles: 896.642644, std deviation:
	  305.568268
	* src/components/perf_event/pe_libpfm4_events.c,
	  src/components/perf_event/perf_event.c: perf_event: remove
	  _pe_libpfm4_get_cidx() helper function  easier to explicitly pass
	  it to the libpfm4 event code
	* src/components/perf_event/perf_event_lib.h: perf_event: wakeup_mode
	  field is no longer used
	* src/components/perf_event/perf_event.c: perf_event: remove
	  WAKEUP_MODE_ defines  These date back to initial perf_event
	  support, but were never used.  Probably were meant in case advanced
	  sampling/profiling was ever implemented, but it wasn't.
	* src/components/perf_event/perf_event.c: perf_event.c: split
	  setup_mmap() to its own function  non-sampling events will need to
	  have mmap buffers when we move to rdpmc()
	* src/components/perf_event/perf_event.c: perf_event: rename
	  tune_up_fd to configure_fd_for_sampling  makes it a bit more clear
	  what is going on
	* src/components/perf_event/perf_event.c: perf_event: remove
	  extraneous whitespace

2017-02-24  Vince Weaver <>

	* src/utils/cost.c: papi_cost: wasn't properly resetting the event
	  search after POSTFIX  This means some architectures could have
	  skipped the ADD/SUB test even though such events were available.

Wed Feb 22 01:16:42 2017 -0800  Stephane Eranian <>

	* src/libpfm4/lib/events/intel_bdw_events.h,
	  src/libpfm4/tests/validate_x86.c: Update libpfm4 Current with
	  commit 1bd352eef242f53e130c3b025bbf7881a5fb5d1e  update Intel RAPL
	  processor support  Added Kabylake, Skylake X  Added PSYS RAPL event
	  for Skylake client.

2017-02-17  Vince Weaver <>

	* src/utils/cost.c: papi_cost: clear eventset before derived add test
	  We weren't clearing the eventset after the derived postfix test, so
	  the add test was actually measuring two derived events.  This was
	  noticed on broadwell-ep where papi_cost would fail due to the lack
	  of enough counters to have both the postfix and add events at the
	  same time.

2017-01-23  Asim YarKhan <>

	* RELEASENOTES.txt: Fixing the date in the RELEASENOTES file.