<chapter id="what-is-harfbuzz">
<title>What is Harfbuzz?</title>
<para>
Harfbuzz is a <emphasis>text shaping engine</emphasis>. It solves
the problem of selecting and positioning glyphs from a font given a
Unicode string.
</para>
<section id="why-do-i-need-it">
<title>Why do I need it?</title>
<para>
Text shaping is an integral part of preparing text for display. It
is a fairly low-level operation; Harfbuzz is used directly by
graphic rendering libraries such as Pango, and the layout engines
in Firefox, LibreOffice and Chromium. Unless you are
<emphasis>writing</emphasis> one of these layout engines yourself,
you will probably not need to use Harfbuzz directly: normally,
higher-level libraries will turn text into glyphs for you.
</para>
<para>
However, if you <emphasis>are</emphasis> writing a layout engine
or graphics library yourself, you will need to perform text
shaping, and this is where Harfbuzz can help you. Here are some
reasons why you need it:
</para>
<itemizedlist>
<listitem>
<para>
OpenType fonts contain a set of glyphs, indexed by glyph ID.
The glyph ID within the font does not necessarily relate to a
Unicode codepoint. For instance, some fonts have the letter
"a" as glyph ID 1. To pull the right glyph out of
the font in order to display it, you need to consult a table
within the font (the "cmap" table) which maps
Unicode codepoints to glyph IDs. Text shaping turns codepoints
into glyph IDs.
</para>
</listitem>
<listitem>
<para>
Many OpenType fonts contain ligatures: combinations of
characters which are rendered together. For instance, it's
common for the <literal>fi</literal> combination to appear in
print as the single ligature "ﬁ". Whether you should
render text as <literal>fi</literal> or "ﬁ" does not
depend on the input text, but on the capabilities of the font
and the level of ligature application you wish to perform.
Text shaping involves querying the font's ligature tables and
determining what substitutions should be made.
</para>
</listitem>
<listitem>
<para>
While ligatures like "ﬁ" are typographic
refinements, some languages <emphasis>require</emphasis> such
substitutions to be made in order to display text correctly.
In Tamil, when the letter "TTA" (ட) is
followed by the vowel sign "U" (ு), the combination should appear
as the single glyph "டு". The sequence of Unicode
characters "ட" and "ு" needs to be rendered as a single
glyph from the font; text shaping chooses the correct glyph
from the sequence of characters provided.
</para>
</listitem>
<listitem>
<para>
Similarly, an Arabic letter can take up to four different forms:
within a font, there may be glyphs for the initial, medial,
final, and isolated forms of each letter. Unicode only encodes
one codepoint per character, and so a Unicode string will not
tell you which glyph to use. Text shaping chooses the correct
form of the letter and returns the correct glyph from the font
that you need to render.
</para>
</listitem>
<listitem>
<para>
Other languages have marks and accents which need to be
rendered in certain positions around a base character. For
instance, the Moldovan language has the Cyrillic letter
"zhe" (ж) with a breve accent, like so: ӂ. Some
fonts will contain this character as an individual glyph,
whereas other fonts will not contain a zhe-with-breve glyph
but expect the rendering engine to form the character by
overlaying the two glyphs ж and ˘. Where you should draw the
combining breve depends on the height of the preceding glyph.
Again, for Arabic, the correct positioning of vowel marks
depends on the height of the character on which you are
placing the mark. Text shaping tells you whether you have a
precomposed glyph within your font or if you need to compose a
glyph yourself out of combining marks, and if so, where to
position those marks.
</para>
</listitem>
</itemizedlist>
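<para>
As a rough illustration of the first two points in the list above,
the following Python sketch models codepoint-to-glyph mapping
followed by ligature substitution. The tables, glyph IDs, and
function names are invented for this example; they are not part of
the Harfbuzz API or of any real font, and a real shaping engine
reads the equivalent data from the font and handles many more cases.
</para>
<programlisting language="python"><![CDATA[
```python
# Toy model of two shaping steps: cmap lookup, then ligature substitution.
# Every table and glyph ID here is hypothetical and chosen for illustration.

# Hypothetical "cmap": Unicode codepoint -> glyph ID.
CMAP = {ord("a"): 1, ord("f"): 2, ord("i"): 3}

# Hypothetical ligature table: pair of glyph IDs -> single glyph ID.
LIGATURES = {(2, 3): 4}  # f + i -> the "fi" ligature glyph

# Glyph ID 0 is conventionally the "missing glyph" (.notdef).
NOTDEF = 0


def map_codepoints(text):
    """Turn each character into a glyph ID using the toy cmap."""
    return [CMAP.get(ord(ch), NOTDEF) for ch in text]


def apply_ligatures(glyphs, enabled=True):
    """Replace adjacent glyph pairs that have an entry in LIGATURES."""
    if not enabled:
        return list(glyphs)
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2  # both input glyphs are consumed by the ligature
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(text, ligatures=True):
    """Map characters to glyph IDs, then optionally form ligatures."""
    return apply_ligatures(map_codepoints(text), enabled=ligatures)
```
]]></programlisting>
<para>
With these toy tables, <literal>shape("fia")</literal> returns the
ligature glyph followed by the glyph for "a", while
<literal>shape("fia", ligatures=False)</literal> returns one glyph
per character. The same input text yields different glyph sequences
depending on the font data and on the level of ligature application
requested.
</para>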
<para>
If this is something that you need to do, then you need a text
shaping engine: you could use Uniscribe if you are using Windows;
you could use CoreText on OS X; or you could use Harfbuzz. In the
rest of this manual, we are going to assume that you are the
implementor of a text layout engine.
</para>
</section>
<section id="why-is-it-called-harfbuzz">
<title>Why is it called Harfbuzz?</title>
<para>
Harfbuzz began its life as text shaping code within the FreeType
project (and you will see references to the FreeType authors
within the source code copyright declarations), but was then
abstracted out to its own project. This project is maintained by
Behdad Esfahbod, and named Harfbuzz. Originally, it was a shaping
engine for OpenType fonts; "Harfbuzz" is Persian
for "open type".
</para>
</section>
</chapter>