Decoding Intel(R) Processor Trace Using libipt {#libipt}
========================================================

<!---
 ! Copyright (c) 2013-2017, Intel Corporation
 !
 ! Redistribution and use in source and binary forms, with or without
 ! modification, are permitted provided that the following conditions are met:
 !
 !  * Redistributions of source code must retain the above copyright notice,
 !    this list of conditions and the following disclaimer.
 !  * Redistributions in binary form must reproduce the above copyright notice,
 !    this list of conditions and the following disclaimer in the documentation
 !    and/or other materials provided with the distribution.
 !  * Neither the name of Intel Corporation nor the names of its contributors
 !    may be used to endorse or promote products derived from this software
 !    without specific prior written permission.
 !
 ! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 ! AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 ! IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 ! ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 ! LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 ! CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 ! SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 ! INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 ! CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 ! ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 ! POSSIBILITY OF SUCH DAMAGE.
 !-->

This chapter describes how to use libipt for various tasks around Intel
Processor Trace (Intel PT).  For code examples, refer to the sample tools that
are contained in the source tree:

  * *ptdump*    A packet dumper example.
  * *ptxed*     A control-flow reconstruction example.
  * *pttc*      A packet encoder example.


For detailed information about Intel PT, please refer to chapter 36 of the Intel
Software Developer's Manual at http://www.intel.com/sdm.


## Introduction

The libipt decoder library provides multiple layers of abstraction ranging from
packet encoding and decoding to full execution flow reconstruction.  The layers
are organized as follows:

  * *packets*               This layer deals with raw Intel PT packets.

  * *events*                This layer deals with packet combinations that
                            encode higher-level events.

  * *instruction flow*      This layer deals with the execution flow on the
                            instruction level.

  * *block*                 This layer deals with the execution flow on the
                            instruction level, delivered as blocks of
                            sequential instructions.

                            It is faster than the instruction flow decoder but
                            requires a small amount of post-processing.


Each layer provides its own encoder or decoder struct plus a set of functions
for allocating and freeing encoder or decoder objects and for synchronizing
decoders onto the Intel PT packet stream.  Function names are prefixed with
`pt_<lyr>_` where `<lyr>` is an abbreviation of the layer name.  The following
abbreviations are used:

  * *enc*     Packet encoding (packet layer).
  * *pkt*     Packet decoding (packet layer).
  * *qry*     Event (or query) layer.
  * *insn*    Instruction flow layer.
  * *blk*     Block layer.


Here is some generic example code for working with decoders:

~~~{.c}
    struct pt_<layer>_decoder *decoder;
    struct pt_config config;
    int errcode;

    memset(&config, 0, sizeof(config));
    config.size = sizeof(config);
    config.begin = <pt buffer begin>;
    config.end = <pt buffer end>;
    config.cpu = <cpu identifier>;
    config...

    decoder = pt_<lyr>_alloc_decoder(&config);
    if (!decoder)
        <handle error>(errcode);

    errcode = pt_<lyr>_sync_<where>(decoder);
    if (errcode < 0)
        <handle error>(errcode);

    <use decoder>(decoder);

    pt_<lyr>_free_decoder(decoder);
~~~

First, configure the decoder.  As a minimum, the size of the config struct and
the `begin` and `end` of the buffer containing the Intel PT data need to be set.
Configuration options are discussed in detail later in this chapter.  In the
case of packet encoding, `begin` and `end` are the begin and end address of the
pre-allocated buffer into which Intel PT packets shall be written.

Next, allocate a decoder object for the layer you are interested in.  A return
value of NULL indicates an error.  There is no further information available on
the exact error condition.  Most of the time, however, the error is the result
of an incomplete or inconsistent configuration.

Before the decoder can be used, it needs to be synchronized onto the Intel PT
packet stream specified in the configuration.  The only exception to this is the
packet encoder, which is implicitly synchronized onto the beginning of the Intel
PT buffer.

Depending on the type of decoder, one or more synchronization options are
available.

  * `pt_<lyr>_sync_forward()`     Synchronize onto the next PSB in forward
                                  direction (or the first PSB if not yet
                                  synchronized).

  * `pt_<lyr>_sync_backward()`    Synchronize onto the next PSB in backward
                                  direction (or the last PSB if not yet
                                  synchronized).

  * `pt_<lyr>_sync_set()`         Set the synchronization position to a
                                  user-defined location in the Intel PT packet
                                  stream.
                                  There is no check whether the specified
                                  location makes sense or is valid.


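
To illustrate what synchronizing in forward direction means, here is a minimal,
self-contained sketch that scans a buffer for the next PSB.  It relies on the
PSB packet being 16 bytes long and consisting of the byte pair `0x02 0x82`
repeated eight times, as described in the Intel SDM.  The names `is_psb` and
`find_psb_forward` are made up for this illustration.  This is not how libipt
implements `pt_<lyr>_sync_forward()`, and it leaves out everything a real
decoder does after it has found the PSB.

```c
#include <stddef.h>
#include <stdint.h>

/* A PSB packet is the byte pair 0x02 0x82 repeated eight times. */
#define PSB_SIZE 16

/* Return 1 if a complete PSB pattern starts at pos, 0 otherwise. */
static int is_psb(const uint8_t *pos, const uint8_t *end)
{
    size_t i;

    if ((size_t) (end - pos) < PSB_SIZE)
        return 0;

    for (i = 0; i < PSB_SIZE; i += 2)
        if (pos[i] != 0x02 || pos[i + 1] != 0x82)
            return 0;

    return 1;
}

/* Search forward for the next PSB, starting at begin + offset.
 *
 * Returns the offset of the PSB into the buffer or -1 if there is none.
 */
static long find_psb_forward(const uint8_t *begin, const uint8_t *end,
                             size_t offset)
{
    const uint8_t *pos;

    if (!begin || !end || end < begin)
        return -1;

    if ((size_t) (end - begin) < offset)
        return -1;

    for (pos = begin + offset; pos < end; ++pos)
        if (is_psb(pos, end))
            return (long) (pos - begin);

    return -1;
}
```

Searching again from one byte past a previously found offset yields the next
synchronization point, if there is one.
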
After synchronizing, the decoder can be used.  While decoding, the decoder
stores the location of the last PSB it encountered during normal decode.
Subsequent calls to `pt_<lyr>_sync_forward()` will start searching from that
location.  This is useful for re-synchronizing onto the Intel PT packet stream
in case of errors.  An example of a typical decode loop is given below:

~~~{.c}
    for (;;) {
        int errcode;

        errcode = <use decoder>(decoder);
        if (errcode >= 0)
            continue;

        if (errcode == -pte_eos)
            return;

        <report error>(errcode);

        do {
            errcode = pt_<lyr>_sync_forward(decoder);

            if (errcode == -pte_eos)
                return;
        } while (errcode < 0);
    }
~~~

You can get the current decoder position as an offset into the Intel PT buffer
via:

    pt_<lyr>_get_offset()


You can get the position of the last synchronization point as an offset into the
Intel PT buffer via:

    pt_<lyr>_get_sync_offset()


Each layer will be discussed in detail below.  In the remainder of this section,
general functionality will be considered.


### Version

You can query the library version using:

  * `pt_library_version()`


This function returns a version structure that can be used for compatibility
checks or simply for reporting the version of the decoder library.


### Errors

The library uses a single error enum for all layers.

  * `enum pt_error_code`      An enumeration of encode and decode errors.


Errors are typically represented as negative `pt_error_code` enumeration
constants and returned as an int.  The library provides two functions for
dealing with errors:

  * `pt_errcode()`            Translate an int return value into a pt_error_code
                              enumeration constant.

  * `pt_errstr()`             Return a human-readable error string.


Not all errors may occur on every layer.  Every API function specifies the
errors it may return.


### Configuration

Every encoder or decoder allocation function requires a configuration argument.
Some of its fields have already been discussed in the example above.  Refer to
the `intel-pt.h` header for detailed and up-to-date documentation of each field.

As a minimum, the `size` field needs to be set to `sizeof(struct pt_config)` and
`begin` and `end` need to be set to the Intel PT buffer to use.

The `size` field is used for detecting library version mismatches and for
providing backwards compatibility.  Without the proper `size`, decoder
allocation will fail.

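
To illustrate why the `size` field matters, here is a simplified sketch of how a
library can use such a field to accept configurations from callers that were
compiled against an older, smaller struct layout.  The struct `demo_config` and
the function `demo_config_from_user` are hypothetical and much smaller than the
real `struct pt_config`.  The sketch only shows the general technique under
these assumptions and makes no claim about the library's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A hypothetical configuration struct; NOT the real struct pt_config. */
struct demo_config {
    size_t size;            /* sizeof() as seen by the caller */
    const uint8_t *begin;   /* begin of the trace buffer */
    const uint8_t *end;     /* end of the trace buffer */
    uint32_t flags;         /* imagine this was added in a later version */
};

/* Copy a user-provided config into the library's own layout.
 *
 * Returns 0 on success and -1 if the config cannot be used.
 */
static int demo_config_from_user(struct demo_config *dst,
                                 const struct demo_config *src)
{
    size_t size;

    if (!dst || !src)
        return -1;

    size = src->size;

    /* The caller must provide at least size, begin, and end. */
    if (size < offsetof(struct demo_config, flags))
        return -1;

    /* Ignore trailing fields that this library version does not know. */
    if (sizeof(*dst) < size)
        size = sizeof(*dst);

    /* Fields that the caller does not know about remain zero. */
    memset(dst, 0, sizeof(*dst));
    memcpy(dst, src, size);
    dst->size = size;

    if (!dst->begin || !dst->end || dst->end < dst->begin)
        return -1;

    return 0;
}
```

A caller that forgets to set `size` passes zero after the `memset()` and is
rejected, which matches the allocation failure described above.
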
Although not strictly required, it is recommended to also set the `cpu` field to
the processor on which Intel PT has been collected (for decoders) or for which
Intel PT shall be generated (for encoders).  This allows implementing
processor-specific behavior such as erratum workarounds.


## The Packet Layer

This layer deals with Intel PT packet encoding and decoding.  It can further be
split into three sub-layers: opcodes, encoding, and decoding.


### Opcodes

The opcodes layer provides enumerations for all the bits necessary for Intel PT
encoding and decoding.  The enumeration constants can be used without linking to
the decoder library.  There is no encoder or decoder struct associated with this
layer.  See the `intel-pt.h` header file for details.


### Packet Encoding

The packet encoding layer provides support for encoding Intel PT
packet-by-packet.  Start by configuring and allocating a `pt_encoder` as
shown below:

~~~{.c}
    struct pt_encoder *encoder;
    struct pt_config config;
    int errcode;

    memset(&config, 0, sizeof(config));
    config.size = sizeof(config);
    config.begin = <pt buffer begin>;
    config.end = <pt buffer end>;
    config.cpu = <cpu identifier>;

    encoder = pt_alloc_encoder(&config);
    if (!encoder)
        <handle error>(errcode);
~~~

For packet encoding, only the mandatory config fields need to be filled in.

The allocated encoder object will be implicitly synchronized onto the beginning
of the Intel PT buffer.  You may change the encoder's position at any time by
calling `pt_enc_sync_set()` with the desired buffer offset.

Next, fill in a `pt_packet` object with details about the packet to be encoded.
You do not need to fill in the `size` field.  The needed size is computed by the
encoder.  There is no consistency check with the size specified in the packet
object.  The following example fills in a TIP packet:

~~~{.c}
    struct pt_encoder *encoder = ...;
    struct pt_packet packet;
    int errcode;

    packet.type = ppt_tip;
    packet.payload.ip.ipc = pt_ipc_update_16;
    packet.payload.ip.ip = <ip>;
~~~

For IP packets, for example FUP or TIP.PGE, there is no need to mask out bits in
the `ip` field that will not be encoded in the packet due to the specified IP
compression in the `ipc` field.  The encoder will ignore them.

There are no consistency checks whether the specified IP compression in the
`ipc` field is allowed in the current context or whether decode will result in
the full IP specified in the `ip` field.

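
The two paragraphs above can be made concrete with a small, self-contained
model of IP compression.  It assumes the scheme described in the Intel SDM, in
which a 16-bit or 32-bit update carries only the lower bits of the IP and the
decoder takes the upper bits from the previously decoded IP.  The `demo_` names
are hypothetical stand-ins and not part of the library; only two compression
modes are modeled.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two IP compression modes. */
enum demo_ipc {
    demo_ipc_update_16,     /* payload holds the lower 16 bits of the IP */
    demo_ipc_update_32      /* payload holds the lower 32 bits of the IP */
};

/* Return the bits of ip that end up in the packet payload.
 *
 * Bits outside of the compressed range are dropped; the caller does not
 * need to mask them out.
 */
static uint64_t demo_encode_ip(enum demo_ipc ipc, uint64_t ip)
{
    switch (ipc) {
    case demo_ipc_update_16:
        return ip & 0xffffull;

    case demo_ipc_update_32:
        return ip & 0xffffffffull;
    }

    return 0;
}

/* Reconstruct an IP from the payload and the previously decoded IP. */
static uint64_t demo_decode_ip(enum demo_ipc ipc, uint64_t payload,
                               uint64_t last_ip)
{
    switch (ipc) {
    case demo_ipc_update_16:
        return (last_ip & ~0xffffull) | (payload & 0xffffull);

    case demo_ipc_update_32:
        return (last_ip & ~0xffffffffull) | (payload & 0xffffffffull);
    }

    return 0;
}
```

Decoding yields the original IP only if the upper bits of the previously
decoded IP match those of the encoded IP.  Since the encoder does not check
this, it is up to you to choose a compression that fits the context.
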
Once the packet object has been filled, it can be handed over to the encoder as
shown here:

~~~{.c}
    errcode = pt_enc_next(encoder, &packet);
    if (errcode < 0)
        <handle error>(errcode);
~~~

The encoder will encode the packet, write it into the Intel PT buffer, and
advance its position to the next byte after the packet.  On a successful encode,
it will return the number of bytes that have been written.  In case of errors,
nothing will be written and the encoder returns a negative error code.


### Packet Decoding

The packet decoding layer provides support for decoding Intel PT
packet-by-packet.  Start by configuring and allocating a `pt_packet_decoder` as
shown here:

~~~{.c}
    struct pt_packet_decoder *decoder;
    struct pt_config config;
    int errcode;

    memset(&config, 0, sizeof(config));
    config.size = sizeof(config);
    config.begin = <pt buffer begin>;
    config.end = <pt buffer end>;
    config.cpu = <cpu identifier>;
    config.decode.callback = <decode function>;
    config.decode.context = <decode context>;

    decoder = pt_pkt_alloc_decoder(&config);
    if (!decoder)
        <handle error>(errcode);
~~~

For packet decoding, an optional decode callback function may be specified in
addition to the mandatory config fields.  If specified, the callback function
will be called for packets the decoder does not know about.  If there is no
decode callback specified, the decoder will return `-pte_bad_opc`.  In addition
to the callback function pointer, an optional pointer to user-defined context
information can be specified.  This context will be passed to the decode
callback function.

Before the decoder can be used, it needs to be synchronized onto the Intel PT
packet stream.  Packet decoders offer three synchronization functions.  To
iterate over synchronization points in the Intel PT packet stream in forward or
backward direction, use one of the following two functions respectively:

    pt_pkt_sync_forward()
    pt_pkt_sync_backward()


To manually synchronize the decoder at a particular offset into the Intel PT
packet stream, use the following function:

    pt_pkt_sync_set()


There are no checks to ensure that the specified offset is at the beginning of a
packet.  The example below shows synchronization to the first synchronization
point:

~~~{.c}
    struct pt_packet_decoder *decoder;
    int errcode;

    errcode = pt_pkt_sync_forward(decoder);
    if (errcode < 0)
        <handle error>(errcode);
~~~

The decoder will remember the last synchronization packet it decoded.
Subsequent calls to `pt_pkt_sync_forward()` and `pt_pkt_sync_backward()` will
use this as their starting point.

You can get the current decoder position as an offset into the Intel PT buffer
via:

    pt_pkt_get_offset()


You can get the position of the last synchronization point as an offset into the
Intel PT buffer via:

    pt_pkt_get_sync_offset()


Once the decoder is synchronized, you can iterate over packets by repeated calls
to `pt_pkt_next()` as shown in the following example:

~~~{.c}
    struct pt_packet_decoder *decoder;
    int errcode;

    for (;;) {
        struct pt_packet packet;

        errcode = pt_pkt_next(decoder, &packet, sizeof(packet));
        if (errcode < 0)
            break;

        <process packet>(&packet);
    }
~~~


## The Event Layer

The event layer deals with packet combinations that encode higher-level events.
It is used for reconstructing execution flow by users who need finer-grained
control than the instruction flow layer offers or who want to integrate
execution flow reconstruction with other functionality more tightly than would
otherwise be possible.

This section describes how to use the query decoder for reconstructing execution
flow.  See the instruction flow decoder as an example.  Start by configuring and
allocating a `pt_query_decoder` as shown below:

~~~{.c}
    struct pt_query_decoder *decoder;
    struct pt_config config;
    int errcode;

    memset(&config, 0, sizeof(config));
    config.size = sizeof(config);
    config.begin = <pt buffer begin>;
    config.end = <pt buffer end>;
    config.cpu = <cpu identifier>;
    config.decode.callback = <decode function>;
    config.decode.context = <decode context>;

    decoder = pt_qry_alloc_decoder(&config);
    if (!decoder)
        <handle error>(errcode);
~~~

An optional packet decode callback function may be specified in addition to the
mandatory config fields.  If specified, the callback function will be called for
packets the decoder does not know about.  The query decoder will ignore the
unknown packet except for its size in order to skip it.  If there is no decode
callback specified, the decoder will abort with `-pte_bad_opc`.  In addition to
the callback function pointer, an optional pointer to user-defined context
information can be specified.  This context will be passed to the decode
callback function.

Before the decoder can be used, it needs to be synchronized onto the Intel PT
packet stream.  To iterate over synchronization points in the Intel PT packet
stream in forward or backward direction, the query decoders offer the following
two synchronization functions respectively:

    pt_qry_sync_forward()
    pt_qry_sync_backward()


To manually synchronize the decoder at a synchronization point (i.e. PSB packet)
in the Intel PT packet stream, use the following function:

    pt_qry_sync_set()


After successfully synchronizing, the query decoder will start reading the PSB+
header to initialize its internal state.  If tracing is enabled at this
synchronization point, the IP of the instruction, at which decoding should be
started, is returned.  If tracing is disabled at this synchronization point, it
will be indicated in the returned status bits (see below).  In this example,
synchronization to the first synchronization point is shown:

~~~{.c}
    struct pt_query_decoder *decoder;
    uint64_t ip;
    int status;

    status = pt_qry_sync_forward(decoder, &ip);
    if (status < 0)
        <handle error>(status);
~~~

In addition to a query decoder, you will need an instruction decoder for
decoding and classifying instructions.


#### In A Nutshell

After synchronizing, you begin decoding instructions starting at the returned
IP.  As long as you can determine the next instruction in execution order, you
continue on your own.  Only when the next instruction cannot be determined by
examining the current instruction would you ask the query decoder for guidance:

  * If the current instruction is a conditional branch, the
    `pt_qry_cond_branch()` function will tell whether it was taken.

  * If the current instruction is an indirect branch, the
    `pt_qry_indirect_branch()` function will provide the IP of its destination.


~~~{.c}
    struct pt_query_decoder *decoder;
    uint64_t ip;

    for (;;) {
        struct <instruction> insn;

        insn = <decode instruction>(ip);

        ip += <instruction size>(insn);

        if (<is cond branch>(insn)) {
            int status, taken;

            status = pt_qry_cond_branch(decoder, &taken);
            if (status < 0)
                <handle error>(status);

            if (taken)
                ip += <branch displacement>(insn);
        } else if (<is indirect branch>(insn)) {
            int status;

            status = pt_qry_indirect_branch(decoder, &ip);
            if (status < 0)
                <handle error>(status);
        }
    }
~~~


Certain aspects, for example asynchronous events or synchronizing at a location
where tracing is disabled, have been ignored so far.  Let us consider them now.


#### Queries

The query decoder provides four query functions:

  * `pt_qry_cond_branch()`      Query whether the next conditional branch was
                                taken.

  * `pt_qry_indirect_branch()`  Query for the destination IP of the next
                                indirect branch.

  * `pt_qry_event()`            Query for the next event.

  * `pt_qry_time()`             Query for the current time.


Each function returns either a non-negative vector of status bits or a negative
error code.  For details on status bits and error conditions, please refer to
the `pt_status_flag` and `pt_error_code` enumerations in the `intel-pt.h`
header.

The `pts_ip_suppressed` status bit is used to indicate that no IP is available
at functions that are supposed to return an IP.  Examples are the indirect
branch query function and both synchronization functions.

The `pts_event_pending` status bit is used to indicate that there is an event
pending.  You should query for this event before continuing execution flow
reconstruction.

The `pts_eos` status bit is used to indicate the end of the trace.  Any
subsequent query will return `-pte_eos`.


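
The following sketch shows one way of acting on such a return value.  It is
self-contained so that it compiles without the library: the `demo_sts_` flags
are stand-ins whose values are chosen for this example only, and `demo_classify`
and `demo_has_ip` are hypothetical helpers.  In real code you would test the
`pts_event_pending`, `pts_ip_suppressed`, and `pts_eos` bits from `intel-pt.h`
instead.

```c
/* Stand-ins for the status bits named in the text.  The values are chosen
 * for this example; use the pt_status_flag enumeration in real code.
 */
enum demo_status_flag {
    demo_sts_event_pending  = 1 << 0,
    demo_sts_ip_suppressed  = 1 << 1,
    demo_sts_eos            = 1 << 2
};

/* What to do next after a query or synchronization call. */
enum demo_action {
    demo_act_error,         /* negative return value: handle the error */
    demo_act_query_event,   /* query for pending events first */
    demo_act_stop,          /* the end of the trace has been reached */
    demo_act_continue       /* continue reconstructing execution flow */
};

/* Classify the int returned by a query or synchronization function. */
static enum demo_action demo_classify(int status)
{
    if (status < 0)
        return demo_act_error;

    /* Pending events take priority over everything else. */
    if (status & demo_sts_event_pending)
        return demo_act_query_event;

    if (status & demo_sts_eos)
        return demo_act_stop;

    return demo_act_continue;
}

/* Return 1 if the call provided an IP, 0 otherwise. */
static int demo_has_ip(int status)
{
    return status >= 0 && !(status & demo_sts_ip_suppressed);
}
```

Checking for pending events before checking for the end of the trace ensures
that events signaled together with the last query are not lost.
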
#### Events

Events are signaled ahead of time.  When you query for pending events as soon as
they are indicated, you will be aware of asynchronous events before you reach
the instruction associated with the event.

For example, if tracing is disabled at the synchronization point, the IP will be
suppressed.  In this case, it is very likely that a tracing enabled event is
signaled.  You will also get events for initializing the decoder state after
synchronizing onto the Intel PT packet stream, for example paging or execution
mode events.

See the `enum pt_event_type` and `struct pt_event` in the `intel-pt.h` header
for details on possible events.  This document does not give an example of event
processing.  Refer to the implementation of the instruction flow decoder in
`pt_insn.c` for details.


#### Timing

To be able to signal events, the decoder reads ahead until it arrives at a
query-relevant packet.  Errors encountered during that time will be postponed
until the respective query call.  This reading ahead affects timing.  The
decoder will always be a few packets ahead.  When querying for the current time,
the query will return the time at the decoder's current packet.  This
corresponds to the time at your next query.


#### Return Compression

If Intel PT has been configured to compress returns, a successfully compressed
return is represented as a conditional branch instead of an indirect branch.
For a RET instruction, you first query for a conditional branch.  If the query
succeeds, it should indicate that the branch was taken.  In that case, the
return has been compressed.  A not-taken branch indicates an error.  If the
query fails, the return has not been compressed and you query for an indirect
branch.

There is no guarantee that returns will be compressed.  Even though return
compression has been enabled, returns may still be represented as indirect
branches.

To reconstruct the execution flow for compressed returns, you would maintain a
stack of return addresses.  For each call instruction, push the IP of the
instruction following the call onto the stack.  For compressed returns, pop the
topmost IP from the stack.  See `pt_retstack.h` and `pt_retstack.c` for a sample
implementation.


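
For illustration, here is a minimal, self-contained sketch of such a stack of
return addresses.  The `demo_retstack` names are hypothetical, and both the
capacity of 64 entries and the policy of overwriting the oldest entry when the
stack is full are assumptions of this sketch.  It is not a copy of the sample
implementation referenced above.

```c
#include <stdint.h>

/* The capacity is an assumption of this sketch. */
#define DEMO_RETSTACK_SIZE 64

struct demo_retstack {
    uint64_t stack[DEMO_RETSTACK_SIZE];
    unsigned int top;       /* index of the next free slot */
    unsigned int count;     /* number of valid entries */
};

static void demo_retstack_init(struct demo_retstack *rs)
{
    rs->top = 0;
    rs->count = 0;
}

/* Push the IP of the instruction following a call.
 *
 * If the stack is full, the oldest entry is overwritten.
 */
static void demo_retstack_push(struct demo_retstack *rs, uint64_t ip)
{
    rs->stack[rs->top] = ip;
    rs->top = (rs->top + 1) % DEMO_RETSTACK_SIZE;

    if (rs->count < DEMO_RETSTACK_SIZE)
        rs->count += 1;
}

/* Pop the topmost IP for a compressed return.
 *
 * Returns 0 on success and -1 if the stack is empty.
 */
static int demo_retstack_pop(struct demo_retstack *rs, uint64_t *ip)
{
    if (!rs->count)
        return -1;

    rs->top = (rs->top + DEMO_RETSTACK_SIZE - 1) % DEMO_RETSTACK_SIZE;
    rs->count -= 1;

    *ip = rs->stack[rs->top];
    return 0;
}
```

In a decode loop you would call `demo_retstack_push()` for every call
instruction and `demo_retstack_pop()` only after the conditional branch query
indicated a compressed return.  An empty stack at that point means that the
return address cannot be reconstructed from the stack.
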
## The Instruction Flow Layer

The instruction flow layer provides a simple API for iterating over instructions
in execution order.  Start by configuring and allocating a `pt_insn_decoder` as
shown below:

~~~{.c}
    struct pt_insn_decoder *decoder;
    struct pt_config config;
    int errcode;

    memset(&config, 0, sizeof(config));
    config.size = sizeof(config);
    config.begin = <pt buffer begin>;
    config.end = <pt buffer end>;
    config.cpu = <cpu identifier>;
    config.decode.callback = <decode function>;
    config.decode.context = <decode context>;

    decoder = pt_insn_alloc_decoder(&config);
    if (!decoder)
        <handle error>(errcode);
~~~

An optional packet decode callback function may be specified in addition to the
mandatory config fields.  If specified, the callback function will be called for
packets the decoder does not know about.  The decoder will ignore the unknown
packet except for its size in order to skip it.  If there is no decode callback
specified, the decoder will abort with `-pte_bad_opc`.  In addition to the
callback function pointer, an optional pointer to user-defined context
information can be specified.  This context will be passed to the decode
callback function.

The decoder is allocated without an image argument.  It uses an empty default
image that can be populated later on and that is implicitly destroyed when the
decoder is freed.  See below for more information on this.


#### The Traced Image

In addition to the Intel PT configuration, the instruction flow decoder needs to
know the memory image for which Intel PT has been recorded.  This memory image
is represented by a `pt_image` object.  If decoding fails due to an IP lying
outside of the traced memory image, `pt_insn_next()` will return `-pte_nomap`.

Use `pt_image_alloc()` to allocate and `pt_image_free()` to free an image.
Images may not be shared.  Every decoder must use a different image.  Allocate
images yourself if you want to prepare an image in advance or if you want to
switch between images.

Every decoder provides an empty default image that is used unless a different
image is set via `pt_insn_set_image()`.  The default image is implicitly
destroyed when the decoder is freed.  It can be obtained by calling
`pt_insn_get_image()`.  Use the default image if you only use one decoder and
one image.

An image is a collection of contiguous, non-overlapping memory regions called
`sections`.  Starting with an empty image, it may be populated with repeated
calls to `pt_image_add_file()` or `pt_image_add_cached()`, one for each section,
or with a call to `pt_image_copy()` to add all sections from another image.  If
a newly added section overlaps with an existing section, the existing section
will be truncated or split to make room for the new section.

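The truncate-or-split rule can be illustrated with a small model that tracks
sections as half-open address ranges.  This is a conceptual sketch only and not
how libipt implements images: `struct section`, `struct image`,
`image_append()`, `image_add()`, and the fixed capacity are hypothetical, and
the model ignores address spaces and file offsets.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SECTIONS 16  /* Hypothetical fixed capacity for this sketch. */

/* A section covering the address range [begin, end). */
struct section {
    uint64_t begin;
    uint64_t end;
    int id;
};

struct image {
    struct section sections[MAX_SECTIONS];
    size_t count;
};

/* Append a section.  Returns 0 on success, -1 if the image is full. */
int image_append(struct image *image, uint64_t begin, uint64_t end, int id)
{
    struct section *section;

    if (image->count >= MAX_SECTIONS)
        return -1;

    section = &image->sections[image->count++];
    section->begin = begin;
    section->end = end;
    section->id = id;
    return 0;
}

/* Add [begin, end), truncating or splitting existing sections that overlap
 * it.  The image is left unchanged if the operation fails.
 */
int image_add(struct image *image, uint64_t begin, uint64_t end, int id)
{
    struct image result = { .count = 0 };
    size_t idx;

    if (end <= begin)
        return -1;

    for (idx = 0; idx < image->count; ++idx) {
        const struct section *old = &image->sections[idx];

        /* Keep the part of the old section below the new one. */
        if (old->begin < begin) {
            uint64_t stop = old->end < begin ? old->end : begin;

            if (image_append(&result, old->begin, stop, old->id) < 0)
                return -1;
        }

        /* Keep the part of the old section above the new one. */
        if (end < old->end) {
            uint64_t start = old->begin > end ? old->begin : end;

            if (image_append(&result, start, old->end, old->id) < 0)
                return -1;
        }
    }

    if (image_append(&result, begin, end, id) < 0)
        return -1;

    *image = result;
    return 0;
}
```

An existing section that contains the new one is split into two pieces, one
that is partially covered is truncated, and one that is fully covered is
dropped.
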
In some cases, the memory image may change during the execution.  You can use
the `pt_image_remove_by_filename()` function to remove previously added sections
by their file name and `pt_image_remove_by_asid()` to remove all sections for an
address-space.

In addition to adding sections, you can register a callback function for reading
memory using `pt_image_set_callback()`.  The `context` parameter you pass
together with the callback function pointer will be passed to your callback
function every time it is called.  There can only be one callback at any time.
Adding a new callback will remove any previously added callback.  To remove the
callback function, pass `NULL` to `pt_image_set_callback()`.

A callback and file sections may be combined.  The callback function is used
whenever the memory cannot be found in any of the image's sections.

If more than one process is traced, the memory image may change when the process
context is switched.  To simplify handling this case, an address-space
identifier may be passed to each of the above functions to define separate
images for different processes at the same time.  The decoder will select the
correct image based on context switch information in the Intel PT trace.  If
you want to manage this on your own, you can use `pt_insn_set_image()` to
replace the image a decoder uses.


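Putting the pieces of this section together, preparing an image in advance might
look like the following sketch.  It uses the placeholder style of the other
examples and is not compilable as is.  The image name `"demo-image"` is
arbitrary, the `<...>` placeholders are assumptions, and the argument order
should be checked against the declarations in your copy of intel-pt.h.

```c
    struct pt_insn_decoder *decoder;
    struct pt_image *image;
    struct pt_asid asid;
    int errcode;

    image = pt_image_alloc("demo-image");
    if (!image)
        <handle allocation error>();

    /* Describe the address space of the traced process. */
    pt_asid_init(&asid);
    asid.cr3 = <cr3 of the traced process>;

    /* Map <size> bytes at <offset> in the file to <load address>. */
    errcode = pt_image_add_file(image, <binary file name>, <offset>, <size>,
                                &asid, <load address>);
    if (errcode < 0)
        <handle error>(errcode);

    /* Fall back to a callback for memory not covered by any section. */
    errcode = pt_image_set_callback(image, <read memory function>,
                                    <read memory context>);
    if (errcode < 0)
        <handle error>(errcode);

    /* Use this image instead of the decoder's default image. */
    errcode = pt_insn_set_image(decoder, image);
    if (errcode < 0)
        <handle error>(errcode);
```

An image allocated this way is not the decoder's default image, so it is
presumably still up to the caller to release it with `pt_image_free()` once no
decoder uses it anymore.
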
#### The Traced Image Section Cache

When using multiple decoders that work on related memory images, it is desirable
to share image sections between decoders.  The underlying file sections will be
mapped only once per image section cache.

Use `pt_iscache_alloc()` to allocate and `pt_iscache_free()` to free an image
section cache.  Freeing the cache does not destroy sections added to the cache.
They remain valid until they are no longer used.

Use `pt_iscache_add_file()` to add a file section to an image section cache.
The function returns an image section identifier (ISID) that uniquely identifies
the section in this cache.  Use `pt_image_add_cached()` to add a file section
from an image section cache to an image.

Multiple image section caches may be used at the same time, but it is
recommended not to mix sections from different image section caches in one
image.

A traced image section cache can also be used for reading an instruction's
memory via its IP and ISID as provided in `struct pt_insn`.


#### Synchronizing

Before the decoder can be used, it needs to be synchronized onto the Intel PT
packet stream.  To iterate over synchronization points in the Intel PT packet
stream in forward or backward directions, the instruction flow decoder offers
the following two synchronization functions respectively:

    pt_insn_sync_forward()
    pt_insn_sync_backward()


To manually synchronize the decoder at a synchronization point (i.e. PSB packet)
in the Intel PT packet stream, use the following function:

    pt_insn_sync_set()


The example below shows synchronization to the first synchronization point:

~~~{.c}
    struct pt_insn_decoder *decoder;
    int errcode;

    errcode = pt_insn_sync_forward(decoder);
    if (errcode < 0)
        <handle error>(errcode);
~~~

The decoder will remember the last synchronization packet it decoded.
Subsequent calls to `pt_insn_sync_forward` and `pt_insn_sync_backward` will use
this as their starting point.

You can get the current decoder position as an offset into the Intel PT buffer
via:

    pt_insn_get_offset()


You can get the position of the last synchronization point as an offset into the
Intel PT buffer via:

    pt_insn_get_sync_offset()


#### Iterating

Once the decoder is synchronized, you can iterate over instructions in execution
flow order by repeated calls to `pt_insn_next()` as shown in the following
example:

~~~{.c}
    struct pt_insn_decoder *decoder;
    int errcode;

    for (;;) {
        struct pt_insn insn;

        errcode = pt_insn_next(decoder, &insn, sizeof(insn));

        if (insn.iclass != ptic_error)
            <process instruction>(&insn);

        if (errcode < 0)
            break;
    }
~~~

For each instruction, you get its IP, its size in bytes, the raw memory, an
identifier for the image section that contained it, the current execution mode,
and the speculation state, that is, whether the instruction has been executed
speculatively.  In addition, you get a coarse classification that can be used
for further processing without the need for a full instruction decode.

If a traced image section cache is used, the image section identifier can be
used to trace an instruction back to the binary file that contained it.  This
allows mapping the instruction back to source code using the debug information
contained in or reachable via the binary file.

You also get some information about events that occurred either before or after
executing the instruction, such as tracing being enabled or disabled.  For
detailed information about instructions, see `enum pt_insn_class` and
`struct pt_insn` in the intel-pt.h header file.

Beware that `pt_insn_next()` may indicate errors that occur after the returned
instruction.  The returned instruction is valid if its `iclass` field is set,
that is, if it is not `ptic_error`.


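Synchronizing and iterating are often combined so that decoding resumes at the
next synchronization point after an error.  The following sketch, written in
the placeholder style of the other examples and not compilable as is, shows one
way to do that.  `<report error>` is a hypothetical placeholder, and the sketch
assumes that both functions report the end of the trace as `-pte_eos`.

```c
    struct pt_insn_decoder *decoder;
    int errcode;

    for (;;) {
        /* Find the next synchronization point. */
        errcode = pt_insn_sync_forward(decoder);
        if (errcode < 0)
            break;

        for (;;) {
            struct pt_insn insn;

            errcode = pt_insn_next(decoder, &insn, sizeof(insn));

            /* The instruction may be valid even if an error is indicated. */
            if (insn.iclass != ptic_error)
                <process instruction>(&insn);

            if (errcode < 0)
                break;
        }

        /* Reaching the end of the trace is not a decode error. */
        if (errcode == -pte_eos)
            break;

        <report error>(errcode);
    }
```

Instructions between the point of failure and the next synchronization point
are not reconstructed with this approach.

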
## The Block Layer

The block layer provides a simple API for iterating over blocks of sequential
instructions in execution order.  The instructions in a block are sequential in
the sense that no trace is required for reconstructing the instructions.  The IP
of the first instruction is given in `struct pt_block` and the IP of other
instructions in the block can be determined by decoding and examining the
previous instruction.

Start by configuring and allocating a `pt_block_decoder` as shown below:

~~~{.c}
    struct pt_block_decoder *decoder;
    struct pt_config config;

    memset(&config, 0, sizeof(config));
    config.size = sizeof(config);
    config.begin = <pt buffer begin>;
    config.end = <pt buffer end>;
    config.cpu = <cpu identifier>;
    config.decode.callback = <decode function>;
    config.decode.context = <decode context>;

    decoder = pt_blk_alloc_decoder(&config);
~~~

An optional packet decode callback function may be specified in addition to the
mandatory config fields.  If specified, the callback function will be called for
packets the decoder does not know about.  The decoder will ignore the unknown
packet except for its size in order to skip it.  If there is no decode callback
specified, the decoder will abort with `-pte_bad_opc`.  In addition to the
callback function pointer, an optional pointer to user-defined context
information can be specified.  This context will be passed to the decode
callback function.


#### Synchronizing

Before the decoder can be used, it needs to be synchronized onto the Intel PT
packet stream.  To iterate over synchronization points in the Intel PT packet
stream in forward or backward directions, the block decoder offers the following
two synchronization functions respectively:

    pt_blk_sync_forward()
    pt_blk_sync_backward()


To manually synchronize the decoder at a synchronization point (i.e. PSB packet)
in the Intel PT packet stream, use the following function:

    pt_blk_sync_set()


The example below shows synchronization to the first synchronization point:

~~~{.c}
    struct pt_block_decoder *decoder;
    int errcode;

    errcode = pt_blk_sync_forward(decoder);
    if (errcode < 0)
        <handle error>(errcode);
~~~

The decoder will remember the last synchronization packet it decoded.
Subsequent calls to `pt_blk_sync_forward` and `pt_blk_sync_backward` will use
this as their starting point.

You can get the current decoder position as an offset into the Intel PT buffer
via:

    pt_blk_get_offset()


You can get the position of the last synchronization point as an offset into the
Intel PT buffer via:

    pt_blk_get_sync_offset()


#### Iterating

Once the decoder is synchronized, it can be used to iterate over blocks of
instructions in execution flow order by repeated calls to `pt_blk_next()` as
shown in the following example:

~~~{.c}
    struct pt_block_decoder *decoder;
    int errcode;

    for (;;) {
        struct pt_block block;

        errcode = pt_blk_next(decoder, &block, sizeof(block));

        if (block.ninsn > 0)
            <process block>(&block);

        if (errcode < 0)
            break;
    }
~~~

A block contains enough information to reconstruct the instructions.  See
`struct pt_block` in `intel-pt.h` for details.  Note that errors returned by
`pt_blk_next()` apply after the last instruction in the provided block.

It is recommended to use a traced image section cache so the image section
identifier contained in a block can be used for reading the memory containing
the instructions in the block.  This also allows mapping the instructions back
to source code using the debug information contained in or reachable via the
binary file.

In some cases, the last instruction in a block may cross image section
boundaries.  This can happen when a code segment is split into more than one
image section.  The block is marked truncated in this case and provides the raw
bytes of the last instruction.

The following example shows how instructions can be reconstructed from a block:

~~~{.c}
    struct pt_image_section_cache *iscache;
    struct pt_block *block;
    uint16_t ninsn;
    uint64_t ip;
    int errcode;

    ip = block->ip;
    for (ninsn = 0; ninsn < block->ninsn; ++ninsn) {
        uint8_t raw[pt_max_insn_size];
        <struct insn> insn;
        int size;

        /* A truncated block carries the raw bytes of its last instruction. */
        if (block->truncated && ((ninsn + 1) == block->ninsn)) {
            memcpy(raw, block->raw, block->size);
            size = block->size;
        } else {
            size = pt_iscache_read(iscache, raw, sizeof(raw), block->isid, ip);
            if (size < 0)
                break;
        }

        errcode = <decode instruction>(&insn, raw, size, block->mode);
        if (errcode < 0)
            break;

        <process instruction>(&insn);

        ip = <determine next ip>(&insn);
    }
~~~


## Parallel Decode

Intel PT splits naturally into self-contained PSB segments that can be decoded
independently.  Use the packet or query decoder to search for PSBs using
repeated calls to `pt_pkt_sync_forward()` and `pt_pkt_get_sync_offset()` (or
`pt_qry_sync_forward()` and `pt_qry_get_sync_offset()`).  The following example
shows this using the query decoder, which will already give the IP needed in
the next step.

~~~{.c}
    struct pt_query_decoder *decoder;
    uint64_t offset, ip;
    int status, errcode;

    for (;;) {
        status = pt_qry_sync_forward(decoder, &ip);
        if (status < 0)
            break;

        errcode = pt_qry_get_sync_offset(decoder, &offset);
        if (errcode < 0)
            <handle error>(errcode);

        <split trace>(offset, ip, status);
    }
~~~

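Once the PSB offsets have been collected, they can be turned into buffer ranges
for the individual decoders.  The sketch below is one possible way to do this.
`struct trace_segment` and `split_trace()` are hypothetical names that are not
part of libipt, and the offsets are assumed to be relative to the start of the
trace buffer and strictly ascending.

```c
#include <stddef.h>
#include <stdint.h>

/* One independently decodable piece of the trace buffer: [begin, end). */
struct trace_segment {
    uint64_t begin;
    uint64_t end;
};

/* Turn ascending PSB offsets into segments.  Each segment ends where the
 * next one begins; the last segment ends at buffer_size.
 *
 * Returns the number of segments written, or 0 if the output array is too
 * small or the offsets are not strictly ascending within the buffer.
 */
size_t split_trace(const uint64_t *psb_offsets, size_t count,
                   uint64_t buffer_size, struct trace_segment *segments,
                   size_t capacity)
{
    size_t idx;

    if (count > capacity)
        return 0;

    for (idx = 0; idx < count; ++idx) {
        uint64_t end;

        end = (idx + 1 < count) ? psb_offsets[idx + 1] : buffer_size;
        if (end <= psb_offsets[idx])
            return 0;

        segments[idx].begin = psb_offsets[idx];
        segments[idx].end = end;
    }

    return count;
}
```

Each segment can then be given to its own decoder by pointing `config.begin`
and `config.end` at the corresponding positions in the trace buffer.
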
The individual trace segments can then be decoded using the query, instruction
flow, or block decoder as shown above in the previous examples.

When stitching decoded trace segments together, a sequence of linear (in the
sense that it can be decoded without Intel PT) code has to be filled in.  Use
the `pts_eos` status indication to stop decoding early enough.  Then proceed
until the IP at the start of the succeeding trace segment is reached.  When
using the instruction flow decoder, `pt_insn_next()` may be used for that as
shown in the following example:

~~~{.c}
    struct pt_insn_decoder *decoder;
    struct pt_insn insn;
    int status;

    for (;;) {
        status = pt_insn_next(decoder, &insn, sizeof(insn));
        if (status < 0)
            <handle error>(status);

        if (status & pts_eos)
            break;

        <process instruction>(&insn);
    }

    while (insn.ip != <next segment's start IP>) {
        <process instruction>(&insn);

        status = pt_insn_next(decoder, &insn, sizeof(insn));
        if (status < 0)
            <handle error>(status);
    }
~~~


## Threading

The decoder library API is not thread-safe.  Different threads may allocate and
use different decoder objects at the same time.  Different decoders must not use
the same image object.  Use `pt_image_copy()` to give each decoder its own copy
of a shared master image.