.TH LIBPFM 3  "July, 2016" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_knl - support for Intel Knights Landing core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: knl
.B PMU desc: Intel Knights Landing
.sp
.SH DESCRIPTION
The library supports the Intel Knights Landing core PMU. This PMU model covers only
each core's PMU, not the socket-level PMU.
.PP
On Knights Landing, the number of generic counters is 4. There is 4-way HyperThreading support.
The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters
in \fBnum_cntrs\fR.
.SH MODIFIERS
The following modifiers are supported on Intel Knights Landing processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR
occurring. This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event
to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater than or equal to one.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles
in which the number of occurrences of the event is greater than or equal to the threshold. This is an integer
modifier with values in the range [0:255].
.TP
.B t
Measure on any of the 4 hyper-threads at the same time, assuming hyper-threading is enabled. This is a boolean modifier.
This modifier is only available on fixed counters (unhalted_reference_cycles, instructions_retired, unhalted_core_cycles).
Depending on the underlying kernel interface, the event may be programmed on a fixed counter or a generic counter, except for
unhalted_reference_cycles, in which case this modifier may be ignored or rejected.
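.PP
The modifiers above can be pictured as control bits of an event select register.
The following C sketch is illustrative only and is not part of the library. It
assumes the architectural IA32_PERFEVTSELx layout (USR bit 16, OS bit 17, edge
detect bit 18, invert bit 23, counter mask in bits 24-31). The names
\fBstruct knl_mods\fR and \fBencode_evtsel()\fR are hypothetical. The \fBt\fR
modifier and the enable bit are left out.
.PP
.nf
```c
#include <assert.h>
#include <stdint.h>

/* Control bits of the architectural IA32_PERFEVTSELx layout (assumed here). */
#define EVTSEL_USR         (1ULL << 16) /* u: privilege levels 1, 2, 3 */
#define EVTSEL_OS          (1ULL << 17) /* k: privilege level 0 */
#define EVTSEL_EDGE        (1ULL << 18) /* e: edge detection */
#define EVTSEL_INV         (1ULL << 23) /* i: invert */
#define EVTSEL_CMASK_SHIFT 24           /* c: counter mask, bits 24-31 */

/* Hypothetical container for the modifiers; not a libpfm type. */
struct knl_mods {
    int u, k, i, e; /* boolean modifiers */
    unsigned c;     /* counter mask, 0 to 255 */
};

/* Combine an event code, a umask and the modifiers into one register value. */
uint64_t encode_evtsel(uint8_t event, uint8_t umask, struct knl_mods m)
{
    uint64_t v = (uint64_t)event | ((uint64_t)umask << 8);

    assert(m.c <= 255);        /* range given for the c modifier */
    assert(!m.e || m.c >= 1);  /* e must be combined with c >= 1 */

    if (m.u) v |= EVTSEL_USR;
    if (m.k) v |= EVTSEL_OS;
    if (m.e) v |= EVTSEL_EDGE;
    if (m.i) v |= EVTSEL_INV;
    v |= (uint64_t)m.c << EVTSEL_CMASK_SHIFT;
    return v;
}
```
.fi
.PP
Under these assumptions, unhalted_core_cycles (architectural event code 0x3c, umask 0x00)
with only the \fBu\fR modifier set encodes as 0x1003c.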
.SH OFFCORE_RESPONSE events
Intel Knights Landing provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.
.PP
These events need special treatment in the performance monitoring infrastructure
because each event uses an extra register to store some settings. Thus, when
multiple offcore_response events are monitored simultaneously, the kernel needs
to manage the sharing of that extra register.
.PP
The offcore_response events are exposed as normal events by the library. The extra
settings are exposed as regular umasks. The library takes care of encoding the
events according to the underlying kernel interface.
.PP
On Intel Knights Landing, the umasks are divided into 4 categories: request, supplier,
snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency.
In the first mode, the two offcore_response events operate independently of each other. The user must provide at
least one umask for each of the first 3 categories: request, supplier, snoop. In the second mode, the two
offcore_response events are combined to compute an average latency per request type.
.PP
For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask
is used, it overrides any supplier and snoop umasks. In other words, users can
specify either \fBANY_RESPONSE\fR \fBOR\fR any combination of supplier and snoop umasks. If no supplier or snoop
is specified, the library defaults to using \fBANY_RESPONSE\fR.
.PP
For instance, the following are valid event selections:
.TP
.B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_REQUEST
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR
.P
But the following is illegal:
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:DDR_NEAR:ANY_RESPONSE
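.PP
The \fBANY_RESPONSE\fR rule can be sketched as a small check in C. This is an
illustration, not the library's implementation: \fBhas_umask()\fR and
\fBnormal_mode_ok()\fR are hypothetical helpers, and the caller must supply the
list of supplier and snoop umask names, which this page does not enumerate.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the colon-separated event string carries the umask `name`. */
int has_umask(const char *event, const char *name)
{
    size_t n;
    const char *p = event;

    assert(event != NULL && name != NULL);
    n = strlen(name);

    while ((p = strchr(p, ':')) != NULL) {
        p++; /* step past the ':' separator */
        if (strncmp(p, name, n) == 0 && (p[n] == ':' || p[n] == 0))
            return 1;
    }
    return 0;
}

/* Hypothetical check: ANY_RESPONSE must not be combined with any of the
 * supplier or snoop umask names passed in by the caller. */
int normal_mode_ok(const char *event,
                   const char *const *supp_snoop, size_t count)
{
    size_t i;

    if (!has_umask(event, "ANY_RESPONSE"))
        return 1;
    for (i = 0; i < count; i++)
        if (has_umask(event, supp_snoop[i]))
            return 0;
    return 1;
}
```
.fi
.PP
With a list containing only \fBDDR_NEAR\fR (assumed here to be a supplier umask, based on the
selections above), the check accepts the three valid selections and rejects the illegal one.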
.P
In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest,
for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others.
The library enforces that restriction as soon as the \fBOUTSTANDING\fR umask is used.
Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask.
Note that the library encodes events independently of each other and therefore cannot verify
that the request types match between the two events.
Example of average latency settings:
.TP
.B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE
.P
The average latency for the request(s) is obtained by dividing the count of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles.
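.PP
The division can be sketched in C as follows, assuming both counts have already
been read through whatever kernel interface is in use. The function name
\fBavg_latency_cycles()\fR is hypothetical, and returning 0.0 when no response
was counted is a choice of this sketch.
.PP
.nf
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average latency in core cycles.
 * outstanding: count read from OFFCORE_RESPONSE_0 with OUTSTANDING
 * responses:   count read from OFFCORE_RESPONSE_1 with ANY_RESPONSE
 * Returns 0.0 when no response was counted (a choice of this sketch). */
double avg_latency_cycles(uint64_t outstanding, uint64_t responses)
{
    if (responses == 0)
        return 0.0;
    return (double)outstanding / (double)responses;
}
```
.fi
.PP
For instance, counts of 1000 and 10 give an average latency of 100 core cycles.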
.SH AUTHORS
.nf
Stephane Eranian <eranian@gmail.com>
.fi
.PP