Blame libfreerdp/primitives/README.txt

Packit Service fa4841
The Primitives Library

Introduction
------------
The purpose of the primitives library is to give the FreeRDP code easy
access to *run-time* optimization via SIMD operations.  When the library
is initialized, it checks processor features dynamically (such as
support for SSE3 or NEON) and fills a table of function pointers with
the fastest implementation available for each operation.  Every routine
also has a generic C implementation that serves as the fallback.

Run-time optimization lets a single executable run fast on multiple
platforms with different SIMD capabilities.
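
The selection step described above can be sketched as follows.  This is
a minimal illustration, not the library's code: every demo_* name is
hypothetical, the "SIMD" routine is a placeholder that forwards to the
generic one so the sketch builds on any CPU, and the feature flag is
passed in rather than probed.

```c
#include <stddef.h>
#include <stdint.h>

/* Signature shared by every implementation of the (hypothetical) operation. */
typedef void (*demo_add_fn)(const int16_t *a, const int16_t *b,
                            int16_t *dst, size_t len);

/* Generic C fallback, always available. */
static void demo_add_generic(const int16_t *a, const int16_t *b,
                             int16_t *dst, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (int16_t)(a[i] + b[i]);
}

/* Placeholder for an optimized version.  A real one would use SIMD
 * intrinsics; this one forwards to the generic code. */
static void demo_add_simd(const int16_t *a, const int16_t *b,
                          int16_t *dst, size_t len)
{
    demo_add_generic(a, b, dst, len);
}

/* Pick the implementation once, based on a feature flag that the caller
 * obtained from a CPU probe (assumed here, not shown). */
static demo_add_fn demo_select_add(int has_simd)
{
    return has_simd ? demo_add_simd : demo_add_generic;
}
```

Callers keep the returned pointer and call through it, so the feature
check is paid once rather than on every call.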


Use In Code
-----------
A pointer to a singleton structure containing the function pointers is
returned by primitives_get().  The function pointers can then be called
through that structure, e.g.

    primitives_t *prims = primitives_get();
    /* Shift 256 16-bit signed values in place by a constant amount. */
    prims->shiftC_16s(buffer, shifts, buffer, 256);

Of course, there is some overhead in calling through the function pointer
and setting up the SIMD operations, so it would be counterproductive to
call the primitives library for very small operations, e.g. initializing an
array of eight values to a constant.  The primitives library is intended
for larger-scale operations, e.g. arrays of 64 elements or more.
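
One way a caller might apply this guidance is sketched below.  The
names (DEMO_SMALL_LIMIT, demo_set_fn, demo_fill_16s) are hypothetical,
and the cutoff of 64 is simply the figure mentioned above; the right
value for a given operation should come from measurement.

```c
#include <stdint.h>

/* Hypothetical cutoff, taken from the guidance above. */
#define DEMO_SMALL_LIMIT 64

/* Hypothetical stand-in for an optimized entrypoint reached through a
 * function pointer. */
typedef int (*demo_set_fn)(int16_t val, int16_t *dst, uint32_t len);

/* Plain C fill, used here as the "optimized" routine so the sketch runs. */
static int demo_set_generic(int16_t val, int16_t *dst, uint32_t len)
{
    for (uint32_t i = 0; i < len; i++)
        dst[i] = val;
    return 0;
}

/* Fill dst with val: tiny arrays use an inline loop, larger ones go
 * through the function pointer. */
static void demo_fill_16s(demo_set_fn optimized, int16_t val,
                          int16_t *dst, uint32_t len)
{
    if (len < DEMO_SMALL_LIMIT)
    {
        for (uint32_t i = 0; i < len; i++)
            dst[i] = val;
    }
    else
    {
        optimized(val, dst, len);
    }
}
```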


Initialization and Cleanup
--------------------------
Library initialization happens on the first call to primitives_init() or
primitives_get(), whichever comes first.  Cleanup (if any) is done by
primitives_deinit().
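
The initialize-on-first-use behaviour can be sketched as below.  This
is an illustration under assumptions, not the code in primitives.c: the
demo_* names and the one-member table are hypothetical, and the sketch
is not thread-safe.

```c
#include <stddef.h>

/* Hypothetical table; the real primitives_t has many more members. */
typedef struct
{
    int (*add)(int a, int b);
} demo_primitives_t;

static demo_primitives_t demo_prims;
static int demo_initialized = 0;

static int demo_generic_add(int a, int b)
{
    return a + b;
}

/* Fill the table once; later calls do nothing. */
static void demo_primitives_init(void)
{
    if (demo_initialized)
        return;
    demo_prims.add = demo_generic_add;
    demo_initialized = 1;
}

/* Return the singleton, initializing it on first use. */
static demo_primitives_t *demo_primitives_get(void)
{
    if (!demo_initialized)
        demo_primitives_init();
    return &demo_prims;
}

/* Reset the table; any resources held would be released here. */
static void demo_primitives_deinit(void)
{
    demo_prims.add = NULL;
    demo_initialized = 0;
}
```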


Intel Integrated Performance Primitives (IPP)
---------------------------------------------
If FreeRDP is compiled with IPP support (-DWITH_IPP=ON), the IPP function
calls will be used (where available) to fill the function pointers.
Where possible, function names and parameter lists match the IPP
conventions so that the IPP functions can be plugged into the function
pointers without a wrapper layer.  Use of IPP is completely optional, and
in many cases the SSE operations in the primitives library itself are as
fast as the IPP versions or faster.


Coverage
--------
The primitives library is not meant to be comprehensive; it does not
offer entrypoints for every operation and operand type.  Instead, the
coverage is focused on operations known to be performance bottlenecks in
the code.  For instance, 16-bit signed operations are used widely in the
RemoteFX software, so you'll find 16s versions of several operations, but
there is no attempt to provide (unused) copies of the same code for 8u,
16u, 32s, etc.


New Optimizations
-----------------
As the need arises, new optimizations can be added to the library,
including NEON, AVX, and perhaps OpenCL or other implementations.
CPU feature detection is done in winpr/sysinfo.


Adding Entrypoints
------------------
As the need for new operations or operands arises, new entrypoints can
be added:
  1) Function prototypes and pointers are added to
     include/freerdp/primitives.h
  2) New module initialization and cleanup function prototypes are added
     to prim_internal.h and called in primitives.c (primitives_init()
     and primitives_deinit()).
  3) Operation names and parameter lists should be compatible with IPP.
     IPP manuals are available online at software.intel.com.
  4) A generic C entrypoint must be available as a fallback.
  5) prim_templates.h contains macro-based templates for simple operations,
     such as applying a single SSE operation to arrays of data.
     The template functions can frequently be used to extend the
     operations without writing a lot of new code.
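
The template idea in step 5 can be illustrated with a small generic C
stand-in.  The macro and function names below are hypothetical and are
not the macros in prim_templates.h (which target SSE); the sketch only
shows how one macro can stamp out several element-wise routines with an
IPP-style parameter order (sources, destination, length).

```c
#include <stdint.h>

/* Hypothetical template: expands to a generic C function that applies a
 * binary operator element by element and returns 0 on success. */
#define DEMO_BINARY_OP_TEMPLATE(name, type, op)                     \
    static int name(const type *src1, const type *src2, type *dst, \
                    uint32_t len)                                   \
    {                                                               \
        for (uint32_t i = 0; i < len; i++)                          \
            dst[i] = (type)(src1[i] op src2[i]);                    \
        return 0;                                                   \
    }

/* Two routines generated from the same template. */
DEMO_BINARY_OP_TEMPLATE(demo_and_32u, uint32_t, &)
DEMO_BINARY_OP_TEMPLATE(demo_or_32u, uint32_t, |)
```

A generated function of this kind could serve as the generic C fallback
required by step 4, with an optimized version sharing the same
signature.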

Cache Management
----------------
I haven't found much speed improvement from attempting prefetch, and
in fact it seems to have a negative impact in some cases.  Done correctly,
perhaps the routines could be further accelerated by proper use of
prefetch, fences, etc.


Testing
-------
The test subdirectory contains an executable (prim_test) that tests both
the functionality and the speed of primitives library operations.  Any new
modules should be added to that test, following the conventions already
established in that directory.  The program can be run on various
target hardware to compare generic C, optimized, and IPP performance
across a range of array sizes.
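
The two kinds of check described above (same results, relative speed)
can be sketched as follows.  This is not prim_test's code: the demo_*
names are hypothetical, memcpy() stands in for an optimized routine,
and clock() gives only coarse CPU-time measurements.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define DEMO_MAX_LEN 4096

/* Common signature for the implementations under comparison. */
typedef void (*demo_copy_fn)(const uint8_t *src, uint8_t *dst, size_t len);

/* Generic C reference implementation. */
static void demo_copy_generic(const uint8_t *src, uint8_t *dst, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Stand-in for an optimized implementation. */
static void demo_copy_optimized(const uint8_t *src, uint8_t *dst, size_t len)
{
    memcpy(dst, src, len);
}

/* Functional check: return 1 if both implementations give the same output. */
static int demo_outputs_match(demo_copy_fn a, demo_copy_fn b,
                              const uint8_t *src, size_t len)
{
    static uint8_t out_a[DEMO_MAX_LEN];
    static uint8_t out_b[DEMO_MAX_LEN];
    if (len > DEMO_MAX_LEN)
        return 0;
    a(src, out_a, len);
    b(src, out_b, len);
    return memcmp(out_a, out_b, len) == 0;
}

/* Speed check: CPU seconds spent on the given number of calls. */
static double demo_time_calls(demo_copy_fn fn, const uint8_t *src,
                              uint8_t *dst, size_t len, int iterations)
{
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        fn(src, dst, len);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Running demo_time_calls() for each implementation over several array
sizes gives the kind of comparison the test program reports.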