.TH "PAPI_profil" 3 "Mon Dec 18 2017" "Version 5.6.0.0" "PAPI" \" -*- nroff -*-
.ad l
.nh
.SH NAME
PAPI_profil \- 
.PP
Generate a histogram of hardware counter overflows vs\&. PC addresses\&.  

.SH SYNOPSIS
.br
.PP
.SH "Detailed Description"
.PP 

.PP
\fBC Interface:\fP
.RS 4
#include <\fBpapi\&.h\fP> 
.br
 int \fBPAPI_profil\fP(void *buf, unsigned bufsiz, unsigned long offset, unsigned scale, int EventSet, int EventCode, int threshold, int flags );
.RE
.PP
\fBFortran Interface\fP
.RS 4
The profiling routines have no Fortran interface\&.
.RE
.PP
\fBParameters:\fP
.RS 4
\fI*buf\fP -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'\&. The size of the buckets is determined by values in the flags argument\&. 
.br
\fIbufsiz\fP -- the size of the histogram buffer in bytes\&. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor as discussed below\&. 
.br
\fIoffset\fP -- the start address of the region to be profiled\&. 
.br
\fIscale\fP -- broadly and historically speaking, a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled\&. More precisely, scale is interpreted as an unsigned 16-bit fixed-point fraction with the binary point implied on the left\&. Its value is the reciprocal of the number of addresses in a subdivision, per counter of the histogram buffer\&. A table of representative values for scale appears below\&. 
.br
\fIEventSet\fP -- The PAPI EventSet to profile\&. This EventSet is marked as profiling-ready, but profiling doesn't actually start until a \fBPAPI_start()\fP call is issued\&. 
.br
\fIEventCode\fP -- Code of the Event in the EventSet to profile\&. This event must already be a member of the EventSet\&. 
.br
\fIthreshold\fP -- minimum number of events that must occur before the PC is sampled\&. If hardware overflow is supported for your component, this threshold will trigger an interrupt when reached\&. Otherwise, the counters will be sampled periodically and the PC will be recorded for the first sample that exceeds the threshold\&. If the value of threshold is 0, profiling will be disabled for this event\&. 
.br
\fIflags\fP -- bit pattern to control profiling behavior\&. Defined values are shown in the list below\&.
.RE
.PP
\fBReturn values:\fP
.RS 4
\fIPAPI_OK\fP 
.br
\fIPAPI_EINVAL\fP One or more of the arguments is invalid\&. 
.br
\fIPAPI_ENOMEM\fP Insufficient memory to complete the operation\&. 
.br
\fIPAPI_ENOEVST\fP The EventSet specified does not exist\&. 
.br
\fIPAPI_EISRUN\fP The EventSet is currently counting events\&. 
.br
\fIPAPI_ECNFLCT\fP The underlying counter hardware cannot count this event and other events in the EventSet simultaneously\&. 
.br
\fIPAPI_ENOEVNT\fP The PAPI preset is not available on the underlying hardware\&.
.RE
.PP
\fBPAPI_profil()\fP provides hardware event statistics by profiling the occurrence of specified hardware counter events\&. It is designed to mimic the UNIX SVR4 profil call\&.
.PP
The statistics are generated by creating a histogram of hardware counter event overflows vs\&. program counter addresses for the current process\&. The histogram is defined for a specific region of program code to be profiled, and the identified region is logically broken up into a set of equal size subdivisions, each of which corresponds to a count in the histogram\&.
.PP
With each hardware event overflow, the current subdivision is identified and its corresponding histogram count is incremented\&. These counts establish a relative measure of how many hardware counter events are occurring in each code subdivision\&.
.PP
The resulting histogram counts for a profiled region can be used to identify those program addresses that generate a disproportionately high percentage of the event of interest\&.
.PP
Events to be profiled are specified with the EventSet and EventCode parameters\&. More than one event can be simultaneously profiled by calling \fBPAPI_profil()\fP several times with different EventCode values\&. Profiling can be turned off for a given event by calling \fBPAPI_profil()\fP with a threshold value of 0\&.
.PP
\fBRepresentative values for the scale variable\fP
.RS 4
 
  HEX      DECIMAL  DEFINITION  
  0x20000  131072   Maps precisely one instruction address to a unique bucket in buf.  
  0x10000   65536   Maps precisely two instruction addresses to a unique bucket in buf.  
  0x0FFFF   65535   Maps approximately two instruction addresses to a unique bucket in buf.  
  0x08000   32768   Maps every four instruction addresses to a bucket in buf.  
  0x04000   16384   Maps every eight instruction addresses to a bucket in buf.  
  0x00002       2   Maps all instruction addresses to the same bucket in buf.  
  0x00001       1   Undefined.  
  0x00000       0   Undefined.  
   
.RE
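.PP
The table can be read as a mapping from a sampled PC to a bucket index\&. The sketch below is derived only from the table rows and is not taken from the PAPI source\&. The helper name bucket_index is hypothetical, and the code assumes that each bucket covers 131072/scale addresses\&.
.PP
.nf
```c
#include <stdint.h>

/* Hypothetical helper: index of the bucket that a sampled PC falls into.
 * Assumes addresses-per-bucket = 131072 / scale, as the table suggests.
 * Returns -1 when pc is below offset or scale is 0 or 1 (undefined). */
long long bucket_index(uint64_t pc, uint64_t offset, unsigned scale)
{
    if (pc < offset || scale < 2)
        return -1;
    /* (pc - offset) * scale / 2^17; assumes the product fits in 64 bits */
    return (long long)(((pc - offset) * (uint64_t)scale) >> 17);
}
```
.fi
.PP
Under this assumption a scale of 0x10000 places addresses offset+2 and offset+3 in bucket 1, and a scale of 0x20000 gives every address its own bucket\&.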
.PP
Historically, the scale factor was introduced to allow the allocation of buffers smaller than the code size to be profiled\&. Data and instruction sizes were assumed to be multiples of 16 bits\&. These assumptions are no longer necessarily true\&. \fBPAPI_profil()\fP has preserved the traditional definition of scale where appropriate, but has deprecated the definitions for 0 and 1 (disable scaling) and extended the range of scale to include 65536 and 131072, which allow for exactly two addresses and exactly one address per profiling bucket, respectively\&.
.PP
The value of bufsiz is computed as follows:
.PP
bufsiz = (end - start)*(bucket_size/2)*(scale/65536) where 
.PD 0

.IP "\(bu" 2
bufsiz - the size of the buffer in bytes 
.IP "\(bu" 2
end, start - the ending and starting addresses of the profiled region 
.IP "\(bu" 2
bucket_size - the size of each bucket in bytes; 2, 4, or 8 as defined in flags
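.PP
Evaluating the formula literally in integer arithmetic would truncate bucket_size/2 and scale/65536, so a sketch that multiplies before dividing may be useful\&. The helper name profil_bufsiz is hypothetical\&. Rounding the bucket count up to a whole bucket is an assumption added here, not something stated by the formula\&.
.PP
.nf
```c
#include <stdint.h>

/* Hypothetical helper: evaluate
 *   bufsiz = (end - start) * (bucket_size / 2) * (scale / 65536)
 * by first computing the number of buckets and then the byte count.
 * bucket_size is 2, 4, or 8 bytes. Returns 0 for invalid input. */
uint64_t profil_bufsiz(uint64_t start, uint64_t end,
                       unsigned bucket_size, unsigned scale)
{
    uint64_t nbuckets;

    if (end <= start || scale < 2)
        return 0;
    if (bucket_size != 2 && bucket_size != 4 && bucket_size != 8)
        return 0;
    /* buckets = (end - start) * scale / 131072, rounded up;
     * assumes the product fits in 64 bits */
    nbuckets = ((end - start) * scale + 131071u) / 131072u;
    return nbuckets * bucket_size;
}
```
.fi
.PP
For a 4096-byte region with 16-bit buckets and a scale of 65536, this yields 4096 bytes, which matches the formula\&.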
.PP
\fBDefined bits for the flags variable:\fP
.RS 4

.PD 0

.IP "\(bu" 2
PAPI_PROFIL_POSIX Default type of profiling, similar to profil (3)\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_RANDOM Drop a random 25% of the samples\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_WEIGHTED Weight the samples by their value\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_COMPRESS Ignore samples as values in the hash buckets get big\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_BUCKET_16 Use unsigned short (16 bit) buckets\&. This is the default bucket size\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_BUCKET_32 Use unsigned int (32 bit) buckets\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_BUCKET_64 Use unsigned long long (64 bit) buckets\&.
.br
 
.IP "\(bu" 2
PAPI_PROFIL_FORCE_SW Force software overflow in profiling\&. 
.br
 
.PP
.RE
.PP
\fBExample\fP
.RS 4

.PP
.nf
int retval;
unsigned long start, length;
const PAPI_exe_info_t *prginfo;
unsigned short *profbuf;

/* Locate the text segment of the running executable. */
if ((prginfo = PAPI_get_executable_info()) == NULL)
   handle_error(1);

start = (unsigned long)prginfo->address_info.text_start;
length = (unsigned long)prginfo->address_info.text_end - start;

/* With scale 65536 and 16-bit buckets, bufsiz equals the region length. */
profbuf = (unsigned short *)malloc(length);
if (profbuf == NULL)
   handle_error(1);
memset(profbuf, 0x00, length);

/* Arm profiling; sampling begins only after PAPI_start(). */
if ((retval = PAPI_profil(profbuf, length, start, 65536, EventSet,
    PAPI_FP_INS, 1000000, PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16))
   != PAPI_OK)
   handle_error(retval);

.fi
.PP
.RE
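.PP
Once profiling has stopped, the histogram can be scanned for the subdivision that collected the most overflows\&. The following is a minimal sketch for 16-bit buckets\&. The helper name hottest_bucket is hypothetical, and the address calculation assumes that each bucket covers 131072/scale addresses, as the scale table suggests\&.
.PP
.nf
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: find the 16-bit bucket with the highest count and
 * report the first address it covers. Assumes addresses-per-bucket is
 * 131072 / scale. Returns the bucket index, or -1 if every bucket is
 * zero or the arguments are unusable. Ties go to the lowest index. */
long hottest_bucket(const unsigned short *buf, size_t nbuckets,
                    uint64_t offset, unsigned scale, uint64_t *first_addr)
{
    long best = -1;
    unsigned short best_count = 0;
    size_t i;

    if (buf == NULL || scale < 2)
        return -1;
    for (i = 0; i < nbuckets; i++) {
        if (buf[i] > best_count) {
            best_count = buf[i];
            best = (long)i;
        }
    }
    if (best >= 0 && first_addr != NULL) {
        /* smallest address a with ((a - offset) * scale) >> 17 == best */
        *first_addr = offset + ((uint64_t)best * 131072u + scale - 1) / scale;
    }
    return best;
}
```
.fi
.PP
The returned address is only the start of a subdivision; mapping it to a function or source line requires symbol information that this sketch does not use\&.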
.PP
.PP
\fBSee Also:\fP
.RS 4
\fBPAPI_overflow\fP 
.PP
\fBPAPI_sprofil\fP 
.RE
.PP


.SH "Author"
.PP 
Generated automatically by Doxygen for PAPI from the source code\&.