Blob Blame History Raw
== Hacking ==

In this section we look at coding conventions and some of the design models used in coding graphiteng.

=== Compiling and Integrating ===

To compile the graphite2 library for integration into another application framework, there are some
useful utilities and also various definitions that need to be defined. The file src/files.mk will create
three make variables containing lists of files that someone will want to build in order to build
graphite2 from source as part of an application build. The variable names are controlled by setting
_NS to a prefix that is then applied to the variable names _SOURCES, _PRIVATE_HEADERS and _PUBLIC_HEADERS.
All files are listed relative to the variable _BASE (with its _NS expansion prefix). The _MACHINE variable
with its _NS expansion prefix, must be set to 'call' or 'direct' depending on which kind of virtual
machine to use inside the engine. gcc supports direct call, while all other compilers without a computed
goto should use the call style virtual machine. See src/direct_machine.cpp and src/call_machine.cpp for details.

Various C preprocessor definitions are also used as part of compiling graphite2:

GRAPHITE2_EXPORTING::
    Needs to be set when building the library source if your are build it as a DLL.
    When unset the graphite header API will be marked dllimport for use in client code.

GRAPHITE2_STATIC::
    If set removes all dllimport or dllexport declspecs on the Graphtie2 API functions and makes them 
    exportable for a clean static library build on Windows. 

GRAPHITE2_BUILTINS::
    If set this will use gcc or clang __builtin intrinsics where appropriate when compilling for 
    some architectures (Intel with SSE4.2 or later and ARMv7a with a NEON capable FPU) this
    will cause the compiler to emit specialised instructions where it can or fall back to library code.  
    Currently this is only used for __builtin_popcount and should only be turned when you know the instruction 
    will be emitted as the gcc __builtin_popcount is 2x slower on Intel than Graphite's own version.

GRAPHITE2_NSEGCACHE::
    If set, the segment caching code is not compiled into the library. This can be used to save space
    if the engine is under space constraints both for code and memory. The default is that this is not
    defined.

GRAPHITE2_NFILEFACE::
    If set, the code to support creating a face directly from a font file is not included in the library.
    By default this is not set.

GRAPHITE2_NTRACING::
    If set, the code to support tracing segment creation and logging to a json output file is not built.
    However the API remains, it just won't do anything.

GRAPHITE2_CUSTOM_HEADER::
    If set, then the value of this macro will be included as a header in Main.h (in effect, all source files). See Main.h for details.

=== Floating point maths ===

On some Intel 32 bit processors, gcc on Linux, will attempt to optimise floating point operations by keeping 
floating point values in 80 bit (or larger) float registers for as long as possible. In some of the floating 
point maths performed for collision detection this results in error accumlating which produces different
results between 64 bit and 32 bit native code.  The current work around is to pass either -ffloat-store (the 
slow but widely compatible option) or -mfpmath=sse -msse2 (the faster but not as generic), this causes float 
values to be rounded between every opteration or the use of sse float operations which are more strictly 
specified and therfore more predictable across processor generations.
This has not been observed to be an issue on 32 bit MSVC or Mac 32 bit compilers and is not an issue on ARM 32 
bit either.

=== Thread Safety ===

The Graphite engine has no locking or thread safe storage. But it is possible to use the Graphite engine in a thread safe manner.
Face creation must be done in one thread and if +gr_face_preloadAll+ is set, all font table interaction will occur while the face is being created. That is no calls will be made to the +get_table+ function or the +release_table+ function during segment creation. References to the name table, made during calls to +gr_fref_label+ use an internal copy of that table to ensure that all table interaction is completed after +gr_make_face+ is called. Note that none of this precludes an application handling thread issues around font table querying and releasing and graphite being used in a lazy table query manner.
Font objects must be created without hinted advances otherwise the application is responsible for handling the shared +AppHandle+ resource during segment creation.
Following this, the face is a read only object and can be shared across different threads, and so segment creation is thread safe. Following creation, segments may be shared across threads so long as they are not modified (using +gr_seg_justify+ or +gr_slot_linebreak_before+).

Any use of logging will break thread safety. Face specific logging involves holding a file open for as long as logging is active,
and so segments cannot be made from a shared face across different threads.
Further, if gr_start_logging is called with NULL for the face, a non locked global library variable is written to which will impact across threads and all face creation. Logging should therefore only be used in a single threaded context.

Any other use of the graphite engine in a multithreaded context will involve the application in doing its own locking.

=== Memory Allocation ===

+++While GraphiteNG is written in C++, it is targetted at environments where
libstdc++ is not present. While this can be problematic since functions used in
C++ may be arbitrarily placed in libc and libstdc++, there are general
approaches that can help. To this end we use a mixed memory allocation model.
For graphiteng classes, we declare our own new() methods and friends, allowing
the use of C++ new() and all that it gives us in terms of constructors. For
types that are not classes we use malloc() or the type safe version gralloc().+++

=== Missing Features ===

There are various facilities that silgraphite provides that graphite2 as yet does not. The primary motivation in developing graphite2 is that it be use case driven. Thus only those facilities that have a proven use case in the target applications into which graphite is to be integrated will be implemented.

Ligature Components::
    Graphite has the ability to track ligature components and this feature is used in some fonts. But application support has yet to be proven and this is an issue looking for a use case and appropriate API.

SegmentPainter::
    Silgraphite provides a helper class to do range selection, cursor hitting and
    support text editing within a segment. These facilities were not being used in
    applications. Graphite2 does not preclude the addition of such a helper class
    if it would be of help to different applications, since it would be layered
    on top of graphite2.

Hinted Attachment Points::
    No use has been made of hinted attachment points, and so until a real use case requirement is proven, these are not in the font api. They can be added should a need be proven.

=== Hunting Speed ===

Graphite2 is written based on the experience of using SilGraphite. SilGraphite was written primarily to be feature complete in terms of all features that may be needed. Graphite2 takes the experience of using SilGraphite and starts from a use case requirement before a feature is added to the engine. Thus a number of features that have not been used in SilGraphite have been removed, although the design is such that they can be replaced in future.

In addition, a number of techniques were used to speed up the engine. Some of these are discussed here.

Slot Stream::
    Rather than copying slots from one stream to another, the slot stream is allocated in blocks of slots and the processing is done in place. This means that each pass is executed to completion in sequence rather than using a pull model that SilGraphite uses. In addition, the slot stream is held as a linked list to keep the cost of insertion and deletion down for large segments.

Virtual Machine::
    The virtual machine that executes action and condition code is optimised for speed with different versions of the core engine code being used dependent upon compiler. The interpretted code is also pre-analysed for checking purposes and even some commands are added necessary for the inplace editing of the slot stream.

Design Space Positioning::
    The nature of the new processing model is that all concrete positioning is done in a final finalisation process. This means that all passes can be run in design space and then only at finalisation positioned in pixel space.