Blame docs/usermanual-what-is-harfbuzz.xml

Packit Service 5bcba8
<chapter id="what-is-harfbuzz">
  <title>What is HarfBuzz?</title>
  <para>
    HarfBuzz is a <emphasis>text shaping engine</emphasis>. It solves
    the problem of selecting and positioning glyphs from a font given a
    Unicode string.
  </para>
  <section id="why-do-i-need-it">
    <title>Why do I need it?</title>
    <para>
      Text shaping is an integral part of preparing text for display. It
      is a fairly low-level operation; HarfBuzz is used directly by
      graphics rendering libraries such as Pango, and by the layout
      engines in Firefox, LibreOffice and Chromium. Unless you are
      <emphasis>writing</emphasis> one of these layout engines yourself,
      you will probably not need to use HarfBuzz directly: higher-level
      libraries will normally turn text into glyphs for you.
    </para>
    <para>
      However, if you <emphasis>are</emphasis> writing a layout engine
      or graphics library yourself, you will need to perform text
      shaping, and this is where HarfBuzz can help you. Here are some
      reasons why you need it:
    </para>
    <itemizedlist>
      <listitem>
        <para>
          OpenType fonts contain a set of glyphs, indexed by glyph ID.
          The glyph ID within the font does not necessarily relate to a
          Unicode codepoint. For instance, some fonts have the letter
          "a" as glyph ID 1. To pull the right glyph out of
          the font in order to display it, you need to consult a table
          within the font (the "cmap" table) which maps
          Unicode codepoints to glyph IDs. Text shaping turns codepoints
          into glyph IDs.
        </para>
      </listitem>
      <listitem>
        <para>
          Many OpenType fonts contain ligatures: combinations of
          characters which are rendered as a single glyph. For instance,
          it's common for the <literal>fi</literal> combination to appear
          in print as the single ligature "ﬁ". Whether you should
          render text as <literal>fi</literal> or "ﬁ" does not
          depend on the input text, but on the capabilities of the font
          and the level of ligature application you wish to perform.
          Text shaping involves querying the font's ligature tables and
          determining what substitutions should be made.
        </para>
      </listitem>
      <listitem>
        <para>
          While ligatures like "ﬁ" are typographic
          refinements, some languages <emphasis>require</emphasis> such
          substitutions to be made in order to display text correctly.
          In Tamil, when the letter "TTA" (ட) is followed by the
          vowel sign "U" (ு), the combination should appear
          as the single glyph "டு". The sequence of Unicode
          characters "ட" and "ு" needs to be rendered as a single
          glyph from the font; text shaping chooses the correct glyph
          from the sequence of characters provided.
        </para>
      </listitem>
      <listitem>
        <para>
          Similarly, each Arabic letter can have up to four different
          variants: within a font, there will be glyphs for the initial,
          medial, final, and isolated forms of a letter. Unicode only
          encodes one codepoint per character, and so a Unicode string
          will not tell you which glyph to use. Text shaping chooses the
          correct form of the letter and returns the correct glyph from
          the font that you need to render.
        </para>
      </listitem>
      <listitem>
        <para>
          Other languages have marks and accents which need to be
          rendered in certain positions around a base character. For
          instance, the Moldovan language has the Cyrillic letter
          "zhe" (ж) with a breve accent, like so: ӂ. Some
          fonts will contain this character as an individual glyph,
          whereas other fonts will not contain a zhe-with-breve glyph
          but expect the rendering engine to form the character by
          overlaying the two glyphs ж and ˘. Where you should draw the
          combining breve depends on the height of the preceding glyph.
          Similarly, in Arabic, the correct positioning of vowel marks
          depends on the height of the character on which you are
          placing the mark. Text shaping tells you whether you have a
          precomposed glyph within your font or if you need to compose a
          glyph yourself out of combining marks, and if so, where to
          position those marks.
        </para>
      </listitem>
    </itemizedlist>
    <para>
      If this is something that you need to do, then you need a text
      shaping engine: you could use Uniscribe if you are using Windows;
      you could use CoreText on macOS; or you could use HarfBuzz. In the
      rest of this manual, we are going to assume that you are the
      implementor of a text layout engine.
    </para>
  </section>
  <section id="why-is-it-called-harfbuzz">
    <title>Why is it called HarfBuzz?</title>
    <para>
      HarfBuzz began its life as text shaping code within the FreeType
      project (and you will see references to the FreeType authors
      within the source code copyright declarations), but was then
      split out into its own project. This project is maintained by
      Behdad Esfahbod, and named HarfBuzz. Originally, it was a shaping
      engine for OpenType fonts: "HarfBuzz" is a transliteration of
      the Persian for "open type".
    </para>
  </section>
</chapter>