
  * 8f524875 RELEASENOTES.txt: Prepare release notes for a 5.4.0 release


  * a8b4613b man/man1/papi_avail.1 man/man1/papi_clockres.1
  man/man1/papi_command_line.1...: Rebuild the doxygen manpages

  * fbea4897 src/run_tests_exclude.txt: Remove omptough from standard testing  On some platforms (e.g. some AMD machines), if
  OMP_NUM_THREADS is not set, then this test does not complete in a reasonable
  time.  That is because on these platforms too many threads are spawned inside
  a large loop.


  * 23c2705b src/components/bgpm/CNKunit/linux-CNKunit.c
  src/components/bgpm/IOunit/linux-IOunit.h...: Fix number of counters and
  events for each of the 5 BGPM units as well as emon on BG/Q


  * 93c69ded src/linux-timer.c: Patch linux-timer.c to provide cycle counter
  support on aarch64 (64-bit ARMv8 architecture)  Thanks to William Cohen for
  this patch and the message below: --- The aarch64 has cycle counter available
  to userspace and this resource should be made available in papi. --- This
  patch is not tested by the PAPI team (no easily available hardware).


  * 038b2f31 src/components/rapl/linux-rapl.c: Extension of the RAPL energy
  measurements on Intel via msr-safe.  (msr-safe is a Linux kernel
  module that allows user access to a whitelisted set of MSRs. It is nearly
  identical in structure to the stock msr kernel module, with the important
  exception that the "capabilities" check has been removed.  The LLNL sysadmins
  did a security review for the whitelist.)

  * 67e0b3f6 src/components/rapl/tests/rapl_basic.c: Fixed string null


  * 2a1805ec src/components/perf_event/pe_libpfm4_events.c
  src/components/perf_event/perf_event.c...: Patch to reduce cost of using
  PAPI_name_to_code and add list of supported pmu names to papi_component_avail
  output  Thanks to Gary Mohr for this patch and its documentation: --- This
  patch file contains code to look for either pmu names or component names on
  the front of event strings passed to PAPI_name_to_code calls.  If found the
  name will be compared to values in each component info structure to see if
  the component supports this event.  If the pmu name or component name does
  not match the values in the component info structure then there is no need to
  call this component for this event.  If the event string does not contain
  either a pmu name or a component name then all components will be called. 
  This reduces the overhead in PAPI when converting event names to event codes
  when either component names or pmu names are provided in the event name
  string.  To support the above checks, there is also code in this patch to add
  an array of pmu names to the component info structure and modifications to
  the core and uncore components to save the pmu names supported by each of
  these components in this new array.  This patch also adds code to the
  papi_component_avail tool to display the pmu names supported by each active
  component. ---
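
  The prefix check described above can be sketched roughly as follows.  This
  is an illustration only, not the PAPI source: the struct
  component_sketch_t, the function component_may_support(), and the names
  used with them are hypothetical, and the sketch assumes the prefix is
  whatever precedes the first "::" in the event string.

```c
#include <string.h>

#define MAX_PMUS 4

/* Hypothetical stand-in for the component info structure. */
typedef struct {
    const char *name;                /* component name */
    const char *pmu_names[MAX_PMUS]; /* supported pmu names, NULL-padded */
} component_sketch_t;

/* True when 'name' is exactly the first 'len' characters of 'event'. */
static int prefix_equals(const char *name, const char *event, size_t len)
{
    return strlen(name) == len && strncmp(name, event, len) == 0;
}

/* Return 1 if this component should be asked to resolve the event,
 * 0 if it can be skipped. */
int component_may_support(const component_sketch_t *cmp, const char *event)
{
    const char *sep = strstr(event, "::");
    if (sep == NULL)
        return 1; /* no prefix given: every component is called */

    size_t len = (size_t)(sep - event);
    if (prefix_equals(cmp->name, event, len))
        return 1;
    for (int i = 0; i < MAX_PMUS && cmp->pmu_names[i] != NULL; i++) {
        if (prefix_equals(cmp->pmu_names[i], event, len))
            return 1;
    }
    return 0; /* prefix names some other component or pmu */
}
```

  With such a check, an event string whose prefix matches neither the
  component name nor one of its pmu names never reaches that component.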


  * a91db97b src/components/net/linux-net.c src/components/nvml/linux-nvml.c
  src/components/perf_event/perf_event.c...: This patch file contains
  additional changes to resolve defects reported by Coverity.  Thanks to Gary
  Mohr for this patch. ------ This patch file contains additional changes to
  resolve defects reported by Coverity.  Mostly these just make sure that
  character buffers get null terminated so they can be used as C strings. 
  There is also a change in the RAPL component to improve the message to
  identify why the component may have been disabled. ------


  * 3f913658 src/ctests/tenth.c: Fix percent error calculation in
  ctests/tenth.c  Thanks to Carl Love for this patch and the following
  documentation:  Do the division first then multiply by 100 when calculating
  the percent error.  This will keep the magnitude of the numbers closer.  If
  you multiply by 100 before dividing you may exceed the largest representable
  value.  Additionally, by casting the values to floats then dividing we
  get more accuracy in the calculation of the percent error.  The integer
  division will not give us a percent error less than 1.
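
  A minimal sketch of the two orderings, for illustration.  The function
  names are hypothetical and the sketch uses double rather than float; it is
  not the code from ctests/tenth.c.

```c
/* Integer version: multiplies first, so the intermediate product can
 * overflow for large counts, and the result is truncated to whole percent. */
long long percent_error_int(long long measured, long long expected)
{
    return (100 * (measured - expected)) / expected;
}

/* Floating-point version: cast, divide first, then scale by 100.
 * Keeps the intermediate values small and preserves fractions. */
double percent_error_fp(long long measured, long long expected)
{
    return ((double)(measured - expected) / (double)expected) * 100.0;
}
```

  For measured = 1005 and expected = 1000 the integer version truncates to 0,
  while the floating-point version yields roughly 0.5.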

  * ba5ef24a src/papi_events.csv: PPC64, fix L1 data cache read, write and all
  access equations.  Thanks to Carl Love for this patch and the following
  documentation:  The current POWER 7 equations for all accesses overcount
  because they include non-load accesses to the cache.  The equation was changed
  to be the sum of the reads and the writes.  The read accesses to the two
  units can be counted with the single event PM_LD_REF_L1 rather than counting
  the events to the two LSU units independently.  The number of reads to the L1
  must be adjusted by subtracting the misses as these become writes.  Power 8
  has four LSU units. The same equations can be used since PM_LD_REF_L1 counts
  across all four LSU units.

  * 882f5765 src/utils/native_avail.c: This patch file fixes two problems and
  adds a performance improvement in "papi_native_avail".  Thanks to Gary Mohr
  for this patch and the following information:  First it corrects a problem
  when using the -i or -x options.  The code was putting out too many event
  divider lines (lines with all '-' characters).  This has been corrected. 
  Second it improves the results from "papi_native_avail --validate" when being
  used on SNBEP systems.  This system has some events which require multiple
  masks to be provided for them to be valid.  The validate code was showing
  these events as not available because it did not try to use the event with
  the correct combination of masks.  The fix checks to see if a valid form of
  the event has not yet been found and if so then it tries the event with
  specific combinations of masks that have been found to make these events
  work.  It also adds a check before trying to validate the event with a new
  mask to see if a valid form of the event has already been found.  If it has
  then there is no need to try and validate the event again.

  * 94985c8a src/ src/configure src/ Fix build error
  when no Fortran is installed  Thanks to Maynard Johnson for this patch. Fix up
  the build mechanism to properly handle the case where no Fortran compiler is
  installed -- i.e., don't build or install testlib/ftest_util.o or the ftests.


  * de05a9d8 src/linux-common.c: PPC64 add support for the Power non
  virtualized platform  Thanks to Carl Love for this patch and the following
  description: The Power 8 system can be run as a non-virtualized machine.  In
  this case, the platform is "PowerNV".  This patch adds the platform to the
  possible IBM platform types.

  * 547f4412 src/ctests/byte_profile.c: byte_profile.c: PPC64 add support for
  PPC64 Little Endian to byte_profile.c  Thanks to Carl Love for this patch and
  the following description: The POWER 8 platform is Little Endian.  It uses
  ELF version 2 which does not use function descriptors.  This patch adds the
  needed #ifdef support to correctly compile the test case for Big Endian or
  Little Endian.  This patch is untested by the PAPI developers (hardware not
  easily accessible).


  * 14f70ebc src/ctests/sprofile.c: PPC64 add support for PPC64 Little Endian
  to sprofile.c  Thanks to Carl Love for this patch and the following
  description: The POWER 8 platform is Little Endian.  It uses ELF version 2
  which does not use function descriptors.  This patch adds the needed #ifdef
  support to correctly compile the test case for Big Endian or Little Endian.

  * 6d41e208 src/linux-memory.c: PPC64 sys_mem_info array size is wrong  Thanks
  to Carl Love for this patch and the following description: The variable
  sys_mem_info is an array of type PAPI_mh_info_t.  It is statically declared
  as size 4.  The data for POWER8 is statically declared in entry 4 of the
  array which is beyond the allocated array.  The array should be declared
  without a size so the compiler will automatically determine the correct size
  based on the number of elements being initialized.  This patch makes the
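
  The sizing idiom described above, as a small sketch.  The type mem_info_t
  and the initializer values are hypothetical placeholders and are not the
  PAPI_mh_info_t data from linux-memory.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for PAPI_mh_info_t. */
typedef struct {
    int levels;
} mem_info_t;

/* No explicit size: the compiler sizes the array from the initializer
 * list, so adding an entry can never overrun a stale hard-coded bound. */
static const mem_info_t sys_mem_info_sketch[] = {
    { 2 }, { 3 }, { 3 }, { 3 }, { 4 }
};

/* Element count derived from the array itself. */
size_t sys_mem_info_count(void)
{
    return sizeof(sys_mem_info_sketch) / sizeof(sys_mem_info_sketch[0]);
}
```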

  * 061817e0 src/papi_events.csv: Remove stray Intel Haswell events from Intel
  Ivy Bridge presets  Thanks to William Cohen for this patch and the following
  description: 'Commit 4c87d753ab56688acad5bf0cb3b95eae8aa80458 added some
  events meant for Intel Haswell to the Intel Ivy bridge presets.  This patch
  removes those stray events.  Without this patch, Intel Ivy Bridge machines
  would see messages like the following: PAPI Error: papi_preset: Error finding
  event L2_TRANS:DEMAND_DATA_RD. PAPI Error: papi_preset: Error finding event
  L2_RQSTS:ALL_DEMAND_REFERENCES.'  This patch was not tested by the PAPI team
  (no appropriate hardware).


  * 8bc1ff85 src/papi_events.csv: Update papi_events.csv to match libpfm
  support for Intel family 6 model 63 (hsw_ep)  Thanks to William Cohen for
  this patch and its information: 'A recent September 11, 2014 patch (98c00b)
  to the upstream libpfm split out Intel family 6 model 63 into its own name of
  "hsw_ep".  The papi_events.csv needs to be updated to support that new name. 
  This should have no impact for older libpfms that still identify Intel family
  6 model 63 as "hswv" and "hsw_ep" map to the same papi presets.'

  * 32a8b758 src/papi_events.csv: Support for the ARM X-Gene processor.  Thanks
  to William Cohen for this patch. The events for the Applied Micro X-Gene
  processor are slightly different from other ARM processors.  Thus, need to
  define those presets for the X-Gene processor.  Note: This patch is not
  tested by the PAPI team because we do not have the appropriate hardware.

  * 0a97f54e src/components/perf_event/pe_libpfm4_events.c
  .../perf_event_uncore/peu_libpfm4_events.c src/papi_internal.c...: Thanks to
  Gary Mohr for the patch  ---------------------  Fix for bugs in
  PAPI_get_event_info when using the core and uncore components: 
  PAPI_get_event_info returns incorrect / incomplete information.  The errors
  were in how the code handled event masks and their descriptions so the errors
  would not lead to program failures, just the possibility of incorrect
  labeling of output.  ---------------------


  * 77960f71 src/components/perf_event/pe_libpfm4_events.c: Record
  encode_failed error by setting attr.config to 0xFFFFFFF.  This causes any
  later validate attempts to fail, which is the desired behavior.  Note: as of
  2014/10/09 this has not been thoroughly tested, since a failure case is not
  known.  This patch simply copies a fix that was applied to the
  perf_event_uncore component.


  * 00ae8c1e src/components/perf_event_uncore/peu_libpfm4_events.c: Based on
  Gary Mohr's suggestion.  If an event fails when we try to add it
  (encode_failed), then we note that error by setting attr.config = 0xFFFFFF
  for that event.  Then, if there is a later check to validate this event, the
  check will correctly return an error.

  * 3801faaf src/utils/native_avail.c: Adding the NativeAvailValidate patch
  provided by Gary Mohr.  The problem being addressed is that if there were any
  problems validating event masks, then those problems would result in the
  entire event being invalid.  The desired action was to test each event mask,
  and if any basic event mask can make the event succeed, then the event should
  be returned as valid and available.  The solution is to create a large buffer
  and write events and masks into this buffer as they are processed, tracking
  their validity.  At the end go back and mark the validity of the entire
  event.  This matches the standard output of PAPI.
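
  The validity rule can be sketched as below.  This is a simplification: it
  tracks one flag per mask in an array instead of the text buffer the patch
  describes, and the function name is hypothetical.

```c
/* mask_ok[i] is nonzero when the event validated with mask i.
 * The event as a whole is valid if at least one mask validated. */
int event_is_valid(const int *mask_ok, int num_masks)
{
    for (int i = 0; i < num_masks; i++) {
        if (mask_ok[i])
            return 1;
    }
    return 0;
}
```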


  * 6abc8196 src/components/emon/README src/components/emon/Rules.emon
  src/components/emon/linux-emon.c: Emon power component for BG/Q


  * 62b9f2a9 .../perf_event_uncore/perf_event_uncore.c: perf_event_uncore.c:
  Check schedulability of events  Patch by Gary Mohr, ------------------- This
  patch file adds code to the uncore component to check and make sure that the
  events being opened from an event set can be scheduled by the kernel.  This
  kind of code exists in the core component but was not moved into the uncore
  component because it was felt that it would not be an issue with uncore. 
  Turns out the kernel has the same kind of issues when scheduling uncore
  events.  The symptoms of this problem will show that the kernel will report
  that all events in the event set were opened successfully but when trying to
  read the results one (or more) of the events will get a read failure.  Seen
  in the traces and on stderr (if papi configured with debug=yes) as a "short
  read" error.  The logic is slightly different than what is in the core
  component because the events in the core component are grouped and the ones
  in the uncore are not grouped.  When events are grouped, you only need to
  enable/disable and read results on the group leader. But when they are not
  grouped you need to do these operations on each event in the event set.


  * 8e6bf887 src/components/cuda/linux-cuda.c
  src/components/lustre/linux-lustre.c...: Address coverity defects in
  src/components  Thanks Gary Mohr ----------------  This patch file contains
  fixes for defects reported by Coverity in the /src/components directory.
  Mostly these changes just make sure that char buffers get null terminated so
  that when they get used as a C string (they usually do) we will not end up
  with unpredictable results.  A problem has been reported by one of our
  testers that the lustre component produced very long event names with lots of
  unprintable garbage in the names.  It turns out this was caused by a buffer
  that filled up and never got null terminated.  Then string functions were
  used on the buffer which picked up the whole buffer and lots more.  These
  changes fixed the problem.
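
  A minimal sketch of the kind of fix described, assuming a bounded copy into
  a fixed-size buffer.  The helper name copy_terminated() is hypothetical and
  does not appear in the PAPI sources.

```c
#include <string.h>

/* Copy src into dst (capacity dst_size bytes) and always null-terminate.
 * strncpy alone leaves dst unterminated when src fills the buffer. */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

  Without the explicit terminator, a later strlen() or strcpy() on the buffer
  would read past its end, which matches the symptom of long event names with
  trailing garbage.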

  * 266c61a4 src/linux-common.c src/papi_hl.c src/papi_internal.c...: Address
  coverity reported issues in src/  Thanks to Gary Mohr -------------------
  Changes in this patch file:  linux-common.c:  Add code to ensure that cpu
  info vendor_string and model_string buffers are NULL terminated strings. 
  Also ensure that the value which gets read into mdi->exe_info.fullname gets
  NULL terminated. This makes it safe to use the  'strxxx' functions on the
  value (which is done immediately after it is read in).  papi_hl.c:  Fix call
  to _hl_rate_calls() where the third argument was not the correct data type. 
  papi_internal.c:  Add code to ensure that event info name, short_desc, and
  long_desc buffers are NULL terminated strings.  papi_user_events.c:  While
  processing define symbols, ensure that the 'local_line', 'name', and 'value'
  buffers get NULL terminated (so we can safely use 'strxxx' functions on
  them). Ensure that the 'symbol' field in the user defined event ends up NULL
  terminated. Rearrange code to avoid falling through from one case to the next
  in a switch statement. Coverity flagged falling out the bottom of a case
  statement as a potential defect but it was doing what it should. 
  sw_multiplex.c:  Unnecessary test.  The value of ESI can not be NULL when
  this code is reached.  x86_cpuid_info.c:  The variable need_leaf4 is set but
  not used.  The only place it gets set returns without checking its value. 
  The place that checks its value never could have set its value non-zero.


  * d72277fc release_procedure.txt: Update release procedure, check buildbot!


  * a0e4f9a7 src/components/perf_event_uncore/perf_event_uncore.c: Uncore
  component fix:  By Gary Mohr, The line that sets exclude_guests in the uncore
  component is there because it is also there in the core component.  But when I
  look at it closer, it is an error in both cases.  I will submit a patch to
  remove them and get rid of some commented out code that no longer belongs in
  the source.  The uncore events do not support the concept of excluding the
  host or guest os so we should never set either bit.  But the core events do
  support this concept and libpfm4 provides event masks "mg" and "mh" to
  control counting in these domains.  By default if neither is set then libpfm4
  excludes counting in the guest os, if either "mh" or "mg" is provided as an
  event mask then it is counted but the other is excluded, and if both are
  provided then both are counted.  So when the code forces the exclude_guest
  bit to be set it breaks the ability to fully control what will happen with
  the masks.  I did not notice the uncore part of this problem when testing on
  my SNB system, probably because we use an older kernel which tolerated the
  bit being set (or maybe because HSW is handled differently).
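
  The "mh" / "mg" behavior as described in this entry can be written out as a
  small truth table.  This sketch only restates the description above; it is
  not taken from libpfm4, and the type and function names are hypothetical.

```c
/* Hypothetical pair of exclude bits, named after the perf_event fields. */
typedef struct {
    int exclude_host;
    int exclude_guest;
} domain_bits_t;

/* mh / mg are nonzero when the corresponding event mask was supplied. */
domain_bits_t resolve_domains(int mh, int mg)
{
    domain_bits_t bits;
    if (!mh && !mg) {
        /* Default per the description: guest counting is excluded. */
        bits.exclude_host = 0;
        bits.exclude_guest = 1;
    } else {
        /* A supplied mask enables its domain; the other one is excluded. */
        bits.exclude_host = !mh;
        bits.exclude_guest = !mg;
    }
    return bits;
}
```

  Forcing exclude_guest to 1 after this step would override the "mg" case,
  which is the loss of control the entry describes.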


  * 4499fee7 src/papi_internal.c: Thanks to Gary Mohr for the patch
  --------------------- Fix in papi_internal.c where it was trying to look up
  the event name.  The RAPL component found the event and returned a code but
  papi_internal.c  exited the enum loop for that component but failed to exit
  the loop that checks all of the components.  This caused it to keep looking
  at other components until it fell out of the outer loop and returned an
  error. In addition to the actual change, some formatting issues were fixed.
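
  A sketch of the nested-loop pattern involved, with the fix applied.  The
  event table, the code numbering, and the function name are hypothetical and
  stand in for the component and enum loops in papi_internal.c.

```c
#include <string.h>

#define NUM_COMPONENTS 3
#define EVENTS_PER_COMPONENT 2

/* Hypothetical event table: one row per component. */
static const char *event_names[NUM_COMPONENTS][EVENTS_PER_COMPONENT] = {
    { "core:::EV_A", "core:::EV_B" },
    { "rapl:::EV_C", "rapl:::EV_D" },
    { "net:::EV_E",  "net:::EV_F"  },
};

/* Return a made-up event code, or -1 if no component knows the name. */
int lookup_event(const char *name)
{
    int code = -1;
    /* The 'code < 0' test is the fix: it stops the outer loop as soon as
     * a component has produced a code. */
    for (int c = 0; c < NUM_COMPONENTS && code < 0; c++) {
        for (int e = 0; e < EVENTS_PER_COMPONENT; e++) {
            if (strcmp(event_names[c][e], name) == 0) {
                code = c * 100 + e;
                break; /* leaves the inner (enum) loop only */
            }
        }
    }
    return code;
}
```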

  * f5835c26 src/components/perf_event/perf_event_lib.h: Bump NUM_MPX_COUNTERS
  for linux-perf  Uncore on SNB and newer systems has enough counters to go
  beyond the 64 array spaces we allocate. This needs a better long term
  solution.  Reported by Gary Mohr --------------------- When running on snbep
  systems with the uncore component enabled, if papi is configured with
  debug=yes then the message "Warning!  num_cntrs is more than num_mpx_cntrs"
  gets written to stderr.  This happens because the snbep uncore pmu's have a
  total of 81 counters and PAPI is set to only accept a maximum of 64 counters.
   This change increases the amount PAPI will accept to 100 (prevents the
  warning message from being printed). ---------------------

  * 07990f85 src/ctests/branches.c src/ctests/calibrate.c
  src/ctests/describe.c...: ctests/ Address coverity reported defects  Thanks
  to Gary Mohr for the patch --------------------------------- The contents of
  this patch file fix defects reported by Coverity in the directory
  'papi/src/ctests'.  The defect reported in branches.c was that a comparison
  between different kinds of data was being done.  The defect reported in
  calibrate.c was that the variable 'papi_event_str' could end up without a
  null terminator.  The defects reported in describe.c, get_event_component.c,
  and krentel_pthreads.c were that return values from function calls were being
  stored in a variable but never being used.  I also did a little clean-up in
  describe.c.  This test had been failing for me on Intel NHM and SNBEP but now
  it runs and reports that it PASSED. ---------------------------------


  * 74cb07df src/testlib/test_utils.c: testlib/test_util.c: Check enum return
  value  Addresses an issue found by Coverity. Thanks Gary Mohr,
  ---------------- The changes in this patch file fix the only defect in the
  src/testlib directory.  The defect reported that the return value from a call
  to PAPI_enum_cmp_event was being ignored.  This call to enum events is to get
  the first event for the component index passed into this function. It turns
  out that the function that contains this code is only ever called by the
  overflow_allcounters ctest and it only calls once and always passes a
  component index of 0 (perf_event).  So I added code to check the return value
  and fail the test if an error was returned. ----------------

  * 74041b3e src/utils/event_info.c: event_info utility: address coverity
  defect  From Gary Mohr -------------- This patch corrects a defect reported
  by Coverity.  The defect reported that the call to PAPI_enum_cmp_event was
  setting retval which was never getting used before it got set again by a call
  to PAPI_get_event_info.  After looking at the code, I decided that we should
  not be trying to get the next event inside a loop that is enumerating masks
  for the current event.  It makes more sense to break out of the loop to get
  masks and let the outer loop that is walking the events get the next event.


  * 62dceb9b src/utils/native_avail.c: Extend 'papi_native_avail --validate' to
  check for umasks.


  * a5c2beb2 src/components/perf_event/pe_libpfm4_events.c
  src/components/perf_event/perf_event.c...: perf_event[_uncore]: switch to
  libpfm4 extended masks  Patch due to Gary Mohr, many thanks.
  ------------------------------------ This patch file contains the changes to
  make the perf_event and perf_event_uncore components in PAPI use the libpfm4
  extended event masks.  This adds a number of new masks that can be entered
  with events supported by these components. They include a mask 'u' which can
  be used to control if counting in the user domain should be enabled, a mask 'k'
  which does the same for the kernel domain, and a mask 'cpu' which will cause
  counting to only occur on a specified cpu.  There are also some other new
  masks which may work but have not been tested yet.


  * e76bbe66 src/components/perf_event/pe_libpfm4_events.c
  .../perf_event_uncore/perf_event_uncore.c...: General code cleanup and
  improved debugging  Thanks to Gary Mohr ------------------- This patch file
  does general code cleanup.  It modifies the code to eliminate compiler
  warnings, remove defects reported by coverity, and improve traces.


  * 8f2a1cee src/utils/error_codes.c: error_codes utility: remove internal bits
   Remove dependency on _papi_hwi_num_errors, just keep calling PAPI_strerror
  until it fails. We shouldn't be using internal symbols anyways.
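
  The "call until it fails" approach can be sketched without any internal
  symbols.  Here sketch_strerror() is a hypothetical stand-in for
  PAPI_strerror, which returns NULL for an unknown code; the three-entry
  message table is a placeholder, and the sketch assumes error codes run
  0, -1, -2, and so on.

```c
#include <stddef.h>

/* Placeholder message table for the stand-in below. */
static const char *sketch_messages[] = {
    "No error", "Invalid argument", "Insufficient memory"
};

/* Hypothetical stand-in for PAPI_strerror: NULL means "no such code". */
const char *sketch_strerror(int code)
{
    int idx = -code;
    if (idx < 0 || idx >= 3)
        return NULL;
    return sketch_messages[idx];
}

/* Walk the codes downward from 0 until the lookup fails, so no
 * separate "number of errors" symbol is needed. */
int count_error_codes(void)
{
    int n = 0;
    while (sketch_strerror(-n) != NULL)
        n++;
    return n;
}
```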


  * a7136edd src/components/nvml/README: Update nvml README  We changed the
  options to simplify the configure line. Bad information is worse than no


  * a37160c1 src/components/perf_event/perf_event.c: perf_event.c: cleanup
  error messages  Thanks to Gary Mohr ------------------- This patch contains
  general cleanup code.  Calls to PAPIERROR pass a string which does not need
  to end with a new line because this function will always add one.  New lines
  at the end of strings passed to this function have been removed.  These
  changes also add some additional debug messages.


  * bf55b6b7 src/papi_events.csv: Update HSW presets  Thanks to Gary Mohr
  ------------------- Previously we sent updates to the PAPI preset event
  definitions to improve the preset cache events on Haswell processors.  In
  checking the latest source, it looks like the L1 cache events changes did not
  get applied quite right.  Here is a patch to the latest source that will make
  it the way we had intended.

  * eeaef9fa src/papi.c: papi.c: Add information to API entry debugging  Thanks
  to Gary Mohr ------------------- This patch contains the results of taking a
  second pass to cleanup the debug prints in the file papi.c.  It adds entry
  traces to more functions that can be called from an application.  It also
  adds lots of additional values to the trace entries so that we can see what
  is being passed to these functions from the application.


  * ee736151 src/ more exclude cleanups  Thanks Gary
  Mohr ---------------- This patch removes an additional check for Makefiles in
  the script.  The exclude files are now used to prevent Makefiles from getting
  run by this script.  I missed this one when providing the previous patch to
  make this change.

  * c37afa23 src/papi_internal.c: papi_internal.c: change SUBDBG to INTDBG 
  Thanks to Gary Mohr ------------------- This patch contains changes to
  replace calls to SUBDBG with calls to INTDBG in this source file.  This
  source file should be using the Internal debug macro rather than the
  Substrate debug macro so that the PAPI debug filters work correctly.  These
  changes also add some new debug calls so that we will get a better picture of
  what is going on in the PAPI internal layer.  There are a few calls to the
  SUBDBG macro that are in code that I have modified to add support for new
  event level masks which are not converted by this patch.  They will be
  corrected when the event level mask patch is provided.

  * e43b1138 src/utils/native_avail.c: native_avail.c: Bug fixes and updates 
  Thanks to Gary Mohr -------------------------------------------------- This
  patch fixes a couple of problems found in the papi_native_avail program. 
  First change fixes a problem introduced when the -validate option was added. 
  This option causes events to get added to an event set but never removes
  them.  This change will remove them if the add works.  This change also fixes
  a coverity detected error where the return value from PAPI_destroy_eventset
  was being ignored.  Second change improves the delimiter check when
  separating the event description from the event mask description.  The
  previous check only looked for a colon but some of the event descriptions
  contain a colon so descriptions would get displayed incorrectly.  The new
  check finds the "masks:" substring which is what papi inserts to separate
  these two descriptions.  Third change adds code to allow the user to enter
  events of the form pmu:::event or pmu::event when using the -e option in the