.TH LIBPFM 3 "July, 2016" "" "Linux Programmer's Manual"
.SH NAME
libpfm_intel_glm - support for Intel Goldmont core PMU
.SH SYNOPSIS
.nf
.B #include <perfmon/pfmlib.h>
.sp
.B PMU name: glm
.B PMU desc: Intel Goldmont
.sp
.SH DESCRIPTION
The library supports the Intel Goldmont core PMU. This PMU model covers only
each core's PMU, not the socket-level PMU.

On Goldmont, the number of generic counters is 4. There is no HyperThreading support.
The \fBpfm_get_pmu_info()\fR function returns the maximum number of generic counters
in \fBnum_cntrs\fR.
.SH MODIFIERS
The following modifiers are supported on Intel Goldmont processors:
.TP
.B u
Measure at user level which includes privilege levels 1, 2, 3. This corresponds to \fBPFM_PLM3\fR.
This is a boolean modifier.
.TP
.B k
Measure at kernel level which includes privilege level 0. This corresponds to \fBPFM_PLM0\fR.
This is a boolean modifier.
.TP
.B i
Invert the meaning of the event. The counter will now count cycles in which the event is \fBnot\fR
occurring. This is a boolean modifier.
.TP
.B e
Enable edge detection, i.e., count only when there is a state transition from no occurrence of the event
to at least one occurrence. This modifier must be combined with a counter mask modifier (c) with a value greater than or equal to one.
This is a boolean modifier.
.TP
.B c
Set the counter mask value. The mask acts as a threshold. The counter will count the number of cycles
in which the number of occurrences of the event is greater than or equal to the threshold. This is an integer
modifier with values in the range [0:255].
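.PP
As an illustration of how the modifiers above relate to the hardware, the following minimal sketch folds them into a raw event-select value.
It assumes the architectural IA32_PERFEVTSELx bit layout (USR bit 16, OS bit 17, edge bit 18, invert bit 23, counter mask bits 24-31).
The function \fBglm_apply_modifiers()\fR is a hypothetical helper and not part of the library, and an OS interface may carry the privilege levels separately from the raw encoding.
.nf
```c
#include <stdint.h>

/* Bit positions assumed from the architectural IA32_PERFEVTSELx layout. */
#define EVTSEL_USR         (1ULL << 16) /* u: privilege levels 1, 2, 3 */
#define EVTSEL_OS          (1ULL << 17) /* k: privilege level 0 */
#define EVTSEL_EDGE        (1ULL << 18) /* e: edge detection */
#define EVTSEL_INV         (1ULL << 23) /* i: invert */
#define EVTSEL_CMASK_SHIFT 24           /* c: counter mask, bits 24-31 */

/* Hypothetical helper, not part of libpfm: apply modifiers to a raw code. */
uint64_t glm_apply_modifiers(uint64_t code, int u, int k, int i, int e,
                             unsigned int c)
{
    if (u)
        code |= EVTSEL_USR;
    if (k)
        code |= EVTSEL_OS;
    if (i)
        code |= EVTSEL_INV;
    if (e)
        code |= EVTSEL_EDGE;
    /* The counter mask is 8 bits wide, hence the [0:255] range. */
    code |= (uint64_t)(c & 0xffu) << EVTSEL_CMASK_SHIFT;
    return code;
}
```
.fi
.PP
For example, under these assumptions, applying \fBu\fR, \fBk\fR, \fBe\fR and \fBc=1\fR to the illustrative raw code 0xc0 would yield 0x10700c0.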
.SH OFFCORE_RESPONSE events
Intel Goldmont provides two offcore_response events. They are called OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1.

These events need special treatment in the performance monitoring infrastructure
because each event uses an extra register to store some settings. Thus, when
multiple offcore_response events are monitored simultaneously, the kernel needs
to manage the sharing of that extra register.

The offcore_response events are exposed as normal events by the library. The extra
settings are exposed as regular umasks. The library takes care of encoding the
events according to the underlying kernel interface.

On Intel Goldmont, the umasks are divided into 4 categories: request, supplier,
snoop, and average latency. The offcore_response events have two modes of operation: normal and average latency.
In the first mode, the two offcore_response events operate independently of each other. The user must provide at
least one umask for each of the first 3 categories: request, supplier, snoop. In the second mode, the two
offcore_response events are combined to compute an average latency per request type.

For the normal mode, there is a special supplier (response) umask called \fBANY_RESPONSE\fR. When this umask
is used, it overrides any supplier and snoop umasks. In other words, users can
specify either \fBANY_RESPONSE\fR \fBOR\fR any combination of supplier + snoop umasks. In case no supplier or snoop
is specified, the library defaults to using \fBANY_RESPONSE\fR.

For instance, the following are valid event selections:
.TP
.B OFFCORE_RESPONSE_0:DMND_DATA_RD:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_REQUEST
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY
.P
But the following are illegal:
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_RFO:LLC_HITM:SNOOP_ANY:ANY_RESPONSE
.P
In average latency mode, \fBOFFCORE_RESPONSE_0\fR must be programmed to select the request types of interest, for instance, \fBDMND_DATA_RD\fR, and the \fBOUTSTANDING\fR umask must be set and no others. The library enforces that restriction as soon as the \fBOUTSTANDING\fR umask is used. Then \fBOFFCORE_RESPONSE_1\fR must be set with the same request types and the \fBANY_RESPONSE\fR umask. Note that the library encodes events independently of each other and therefore cannot verify that the request types match between the two events.
Example of average latency settings:
.TP
.B OFFCORE_RESPONSE_0:DMND_DATA_RD:OUTSTANDING+OFFCORE_RESPONSE_1:DMND_DATA_RD:ANY_RESPONSE
.TP
.B OFFCORE_RESPONSE_0:ANY_REQUEST:OUTSTANDING+OFFCORE_RESPONSE_1:ANY_REQUEST:ANY_RESPONSE
.P
The average latency for the request(s) is obtained by dividing the count of \fBOFFCORE_RESPONSE_0\fR by the count of \fBOFFCORE_RESPONSE_1\fR. The ratio is expressed in core cycles.
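.PP
The division described above can be sketched as follows.
The function \fBglm_avg_latency()\fR is a hypothetical helper and not part of the library. It assumes the two counts were read over the same measurement interval, and it returns zero when no response was counted, a choice made here only to avoid a division by zero.
.nf
```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of libpfm.
 * outstanding_count: value read for OFFCORE_RESPONSE_0 (OUTSTANDING umask)
 * response_count:    value read for OFFCORE_RESPONSE_1 (ANY_RESPONSE umask)
 * Returns the average latency per request in core cycles.
 */
double glm_avg_latency(uint64_t outstanding_count, uint64_t response_count)
{
    /* No completed request in the interval: report 0 rather than divide. */
    if (response_count == 0)
        return 0.0;
    return (double)outstanding_count / (double)response_count;
}
```
.fi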
.SH AUTHORS
.nf
Stephane Eranian <eranian@gmail.com>
.fi
.PP