2016-01-25 * d779d1172a6e4c73b5ece9939c4d067c2b3d7b8d Update libpfm4 current with Jan 25 08:33:02 2016 version. 2016-01-07 * 0d9776b8 src/components/stealtime/linux-stealtime.c: Free allocated memory in the stealtime component when component is shutdown Thanks to William Cohen for contributing this patch and the following explaination: Running examples with "valgrind --leak-check=full ..." showed a number of items allocated by the stealtime component were not freed when PAPI_shutdown() was called. This patch frees those unused memory allocations. 2016-01-06 * de40668c src/papi_preset.c: Fixed memory leak in papi_preset.c by updating the infix_to_postfix function. Thanks to William Cohen for discovering the leak. The infix_to_postfix function was re-written and tested using user defined events. 2015-12-30 * db37e115 src/utils/avail.c src/utils/native_avail.c: Added "-check" flag to papi_avail and papi_native_avail to test counter availability/validity" This patch updates the papi_avail and papi_native_avail utilities to use the "-check" flag to test the actual availability of counters. There were previously two different flags for this capability, papi_native_avail used "-validate" and papi_avail used "-avail_test". Based on a mailing list discussion these flags have been consolidated as "-check". 2015-12-29 * 72e0ffe8 src/components/lmsensors/linux-lmsensors.c: Fixed minor error with multiple initializers for lmsensors_vector .default_granularity. Thanks to William Cohen for the bug report. 2015-12-07 * ec3582d8 src/utils/avail.c: papi_avail to test actual availability of counters using "papi_avail --avail-test" This problem and the associated patch were detected and contributed by Harald Savat. Thanks. On an Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz system with PAPI 5.4.1 installed The papi_avail command indicates that both PAPI_LD_INS and PAPI_SR_INS are available, however the papi_event_chooser does not accept them (see below) and return -1. This problem can occur in kernels from version 3.1 till 4.1. The kernel devs blocked all uses of the MEM_OPS events (including load and store). The patch modifies papi_avail to test the counters to see if they can be added. papi_avail # gets all PAPI counters papi_avail -a # gets all available PAPI counters papi_avail -at # shows all available PAPI counters that can be added [Ptools-perfapi: Oct 14 2015] 2015-11-30 * 1fab922e src/components/libmsr/README src/components/libmsr/configure src/components/libmsr/configure.in...: The libmsr component is updated to match major changes in LLNL libmsr library and the LLNL msr-safe kernel module 2015-11-18 * 242b16d3 src/Makefile.inc src/components/cuda/configure src/components/cuda/configure.in...: Added papi_cuda_sampling utility in /src/components/cuda/sampling changed src/Makefile.inc , src/components/cuda/configure.in to build the utiltiy during PAPI installation Added in /src/components/cuda/tests/Makefile which is -ldl switch because 3.10.0-229.14.1.el7.x86_64 had issues using libpapi.a during compilation of cuda component test programs 2015-10-21 * a10e8331 src/papi_events.csv: papi_events: add Intel Skylake presets This just shares all of teh broadwell events with skylake. Some quick tests show that this probably works. Someone with skylake hardware should validate this at some point. 2015-10-08 * 91736851 src/papi_internal.c: Thanks to David Eberius of ICL for reporting a bug in PAPI_get_event_info() in papi_internal.c, (info->component_index = (unsigned int) cidx) was missing at line 2554, of papi_internal.c 2015-08-27 * 502df070 src/Makefile.inc: Thanks to Steve Kaufmann for reporting about the redundant () paramater in the OBJECTS expression of src/Makefile.inc file. Updated Makefile.inc by removing the redundant paramater 2015-08-24 * 69fdc2e0 src/papi.c: Thanks to Harald Servat for reporting the PAPI_overflow issue for multiple eventsets. The problem was in the PAPI_start() function in the branch at line-2166:papi.c , if(is_dirty). After update_control_state(), it is required to re-initialize the overflow settings using set_overflow() 2015-07-29 * be81dc43 src/components/perf_event/perf_event.c: perf_event: update the ARM domain workaround older ARM processors could not separate out KERNEL vs USER events. ARMv7 starting with the Cortex A15 can, as can all ARMv8 (ARM64). This updates the code with a whitelist to properly allow setting the domains. * 43be2588 src/linux-common.c: linux-common: clean up ARM cpu detection Parsing cpuinfo is always a pain. Extra work because of Raspberry Pi (ARM1176) lying and saying it's ARMv7 rather than ARMv6. * 5a101a50 src/linux-common.c: linux-common: split up x86, power and arm cpuinfo parsing * 0d7772d9 src/linux-common.c: linux-common: clean up and comment the cpuinfo parsing code 2015-07-16 * 59489b1f src/components/libmsr/Makefile.libmsr.in src/components/libmsr/README src/components/libmsr/Rules.libmsr...: Create libmsr component for reading power information and writing power constraints using MSRs on some Intel processsors The PAPI libmsr component supports measuring and capping power usage on recent Intel architectures using the RAPL interface exposed through MSRs (model-specific registers). Lawrence Livermore National Laboratory has released a library (libmsr) designed to provide a simple, safe, consistent interface to several of the model-specific registers (MSRs) in Intel processors. The problem is that permitting open access to the MSRs on a machine can be a safety hazard, so access to MSRs is usually limited. In order to encourage system administrators to give wider access to the MSRs on a machine, LLNL has released a Linux kernel module (msr_safe) which provides safer, white-listed access to the MSRs. PAPI has created a libmsr component that can provide read and write access to the information and controls exposed via the libmsr library. This PAPI component introduces a new ability for PAPI; it is the first case where PAPI is writing information to a counter as well as reading the data from the counter. 2015-07-13 * d326ecc9 src/components/perf_event/perf_event_lib.h src/papi_internal.c: Thanks to Steve Kaufman for providing a patch that increases the PERF_EVENT_MAX_MPX_COUNTERS to 192 from 128 and enhances the corresponding warning message in papi_internal.c 2015-06-29 * e829baa5 src/components/cuda/tests/Makefile src/components/cuda/tests/cuda_ld_preload_example.README src/components/cuda/tests/cuda_ld_preload_example.c: Example of using LD_PRELOAD with the CUDA component. A short example of using LD_PRELOAD on a Linux system to intercept function calls and PAPI-enable an un-instrumented CUDA binary. Several CUDA events (e.g. SM PM counters) require a CUcontext handle to be a provided since they are context switched. This means that we cannot use a PAPI_attach from an external process to measure those events in a preexisting executable. These events can only be measured from within the CUcontext, that is, within the CUDA enabled code we are trying to measure. If the user is unable to change the source code, they may be able to use LD_PRELOAD's ability to trap functions and measure the events for within the executable. See src/components/cuda/tests/cuda_ld_preload_example.README for details. 2015-06-26 * 0829a4f5 src/papi_events.csv: Add future broadwell-ep support. libpfm4 doesn't support it yet, but add it for when it appears. 2015-06-25 * 36c5b5b6 src/papi_events.csv: add broadwell predefined events For now they are the same as Haswell, as that's what the Linux kernel does. * f42eda64 src/papi_events.csv: Added definitions to Power8 for PAPI_SP_OPS, PAPI_DP_OPS. 2015-06-18 * f87542f7 src/components/perf_event/tests/event_name_lib.c: Added [case 63: /*Haswell EP*/] line the src/components/perf_event/tests/event_name_lib.c file to support offcore for haswell EP * fbfc641f src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c src/components/perf_event_uncore/tests/perf_event_uncore_lib.c: Added suuport for Haswell-EP processor with model-63 in src/components/perf_event_uncore/tests/perf_event_uncore_lib.c and src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c files. As a result perf_event_uncore, perf_event_uncore_multiple and perf_event_uncore_cbox tests get passed. Tested and verified on Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz with linux kernel 4.0.4-1.el6.elrepo.x86_64 2015-06-17 * 56698211 src/components/lustre/linux-lustre.c: Thanks to Garry Mohr for the patch that removes the error message (PAPI Error: Error Code -7,Event does not exist) on executing papi_native_avail in PAPI built with lustre component 2015-06-16 * 1b9fd867 src/components/rapl/linux-rapl.c: rapl: allow DRAM to have separate scaling factor from CPU on Haswell-EP the DRAM scaling value is different and cannot be detected. See https://lkml.org/lkml/2015/3/20/582 * 1aa74f85 src/components/rapl/linux-rapl.c: rapl: add support for Broadwell 2015-06-11 * a5ecda79 src/components/rapl/linux-rapl.c: Thanks to William Cohen for the patch which does the following: Checking the cpu family and module number is not sufficient to determine whether RAPL can be used. If the papi is running inside a guest VM, the MSR used by the PAPI RAPL component may not be available. There should be a simple read test to verify the RAPL MSR registers are available. This allows the component to more clearly report that RAPL is unsupported rather than just exiting program when the RAPL 2015-05-19 * 54c45107 src/components/rapl/utils/rapl_plot.c: Updated rapl_plot utility so that the correct values/units are reported (e.g. scaled and fixed value counts should not be converted) 2015-05-04 * a34fbc62 src/papi_events.csv: papi_events.csv: typo in the ARM Cortex A53 definitions 2015-04-30 * caa3af72 src/papi_events.csv: papi_events.csv: add preset events for ARM Cortex A53 This is based purely on the names in the libpfm4 output, these were not validated in any way. 2015-04-20 * 66553715 INSTALL.txt: added compile incantation for compiling programs that offload code to MIC 2015-04-16 * 8914dcfc src/papi_events.csv: Bug reported by William Cohen in papi_events.csv for the event PAPI_L1_TCM 2015-03-31 * 023af5ec src/components/nvml/configure: Updated the NVML configure script which requires an autoconf and an updated configure script * 2385c1b2 src/components/nvml/Makefile.nvml.in src/components/nvml/Rules.nvml src/components/nvml/configure.in: Updated the NVML configure script to allow separate include and library paths 2015-03-30 * 3d509095 src/components/infiniband_umad/linux-infiniband_umad.c: Bugfix linux-infiniband_umad.c to include linux-infiniband_umad.h rather than linux-infiniband.h. Thanks to Aurelien Bouteiller for pointing out this bug. * b865f227 src/components/vmware/vmware.c: Corrected function name in _vmware_vector from _vmware_init to _vmware_init_thread. 2015-03-24 * 2f58a4d8 src/configure: Regenerated configure to match the PAPI_GRN_SYS patch * 12e6ef31 src/components/perf_event/tests/perf_event_system_wide.c: Support PAPI_GRN_SYS granularity for perf component, updating the system wide test (patch 2 of 2). Thanks to William Cohen for this patch and the documentation Make sure that a sane cpu number is selected with PAPI_GRN_SYS Corrections to output and comments of perf_event_system_wide.c test * 42879693 src/components/perf_event/perf_event.c src/components/perf_event/tests/perf_event_system_wide.c src/config.h.in...: Support PAPI_GRN_SYS granularity for perf component, picking a sane CPU number (patch 1 of 2). Thanks to William Cohen for this patch and the documentation The checks in perf_event_open syscall cause a failure when both pid=-1 and cpu=-1. The perf_event component was passing in pid=-1 and cpu=-1 when PAPI_GRN_SYS was selected. If possible, the code should pick the current processor that the command is running so that the permission check works properly when PAPI_GRN_SYS is used. The patch also adds a test fail if PAPI_GRN_SYS unable to add PAPI_TOT_CYC. * 0ab9b0c8 src/ctests/krentel_pthreads.c: Added call to unregister the overflow handler.. plus small code cleanup 2015-03-05 * d886c49c src/papi.c src/papi_libpfm4_events.c src/utils/avail.c: Clean output from papi_avail tools when there are no user defined events Thanks to Gary Mohr for this patch. The changes in this patch improve the output from the papi_avail tool. It was printing the user defined events header and a PAPI Error message when no user defined events existed. These changes add code in the enum call to return an error when trying to fetch the first user defined event if no user events are defined. This allows the application to detect that no user events are known and skip printing the user defined event heading. It also prevents the application from calling PAPI_get_event_info with a user defined event code that does not exist which avoids the PAPI Error message. Also a one line change to modify a debug message type to make the debug output produced by papi_libpfm4_events.c consistent. 2015-03-03 * ee0c58d7 src/components/cuda/linux-cuda.c: Do not generate an error if the CUDA libraries cannot be loaded, just write a debug message * 08bb9bf0 src/configure: Updating the number to 5.4.1 2015-03-02 * 01f742c1 release_procedure.txt: Minor change to specify locations of some files