BY THE LIZARDTECH SPECIFICATION "DJVU3SPEC.DJVU".>

This file summarizes the file format changes
between DjVu2 and DjVu3.


------------------------------------------------------------
1 - DJVU3 FILE STRUCTURE OVERVIEW
------------------------------------------------------------

DjVu files are organized according to the ``EA IFF 85'' layout. Pointers to
the appropriate reference document are provided in section
\Ref{IFFByteStream.h}. IFF files are logically composed of a sequence of
data \emph{chunks}. Each chunk comes with a four-character \emph{chunk
identifier} describing the type of the data stored in the chunk. A few
special chunk identifiers, for instance #"FORM"#, are reserved for
so-called \emph{composite chunks} containing a sequence of data chunks. This
convention effectively provides IFF files with a hierarchical structure.
Composite chunks are further identified by a \emph{secondary chunk
identifier}. For convenience, both identifiers are gathered as an
extended chunk identifier such as #"FORM:DJVU"#.

The four octets #0x41,0x54,0x26,0x54# may be inserted in front of the
IFF-compliant byte stream. The decoder simply ignores these four octets when
they are present. These four octets are not part of the IFF format and
are not required components of a valid DjVu file. Certain versions of MSIE
incorrectly recognize any IFF file as a Microsoft AIFF sound file. The
presence of these four octets prevents this incorrect identification.
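
The prefix handling described above can be sketched in a few lines of
Python. The names #MAGIC# and #strip_magic# are illustrative only and do
not come from the reference library.

```python
# The four optional octets 0x41,0x54,0x26,0x54 (ASCII "AT&T").
MAGIC = bytes([0x41, 0x54, 0x26, 0x54])


def strip_magic(data: bytes) -> bytes:
    """Return the IFF byte stream, dropping the optional prefix if present."""
    if data.startswith(MAGIC):
        return data[len(MAGIC):]
    return data
```

A decoder would call #strip_magic# once on the raw file contents and hand
the result to its IFF parser, so that files with and without the prefix
follow the same code path.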

The DjVu specification mandates that the decoder should silently
skip chunks whose identifier is not recognized. This mechanism
provides a backward-compatible way to extend the initial format by
allocating new chunk identifiers.
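
The following Python sketch shows one way a reader could walk the chunks of
a composite chunk and skip the identifiers it does not know. It assumes the
usual EA IFF 85 chunk layout (four-character identifier, 32-bit big-endian
payload length, payload padded to an even length), which this file does not
spell out. The set #KNOWN_IDS# and all function names are hypothetical.

```python
import struct

# Illustrative subset of identifiers this toy decoder understands.
KNOWN_IDS = {"INFO", "Sjbz", "BG44"}


def make_chunk(chunk_id: str, payload: bytes) -> bytes:
    """Serialize one chunk: identifier, big-endian size, payload, pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    header = chunk_id.encode("ascii") + struct.pack(">I", len(payload))
    return header + payload + pad


def iter_chunks(buf: bytes, start: int = 0):
    """Yield (identifier, payload) for each chunk stored in buf[start:]."""
    pos = start
    while pos + 8 <= len(buf):
        chunk_id = buf[pos:pos + 4].decode("ascii")
        (size,) = struct.unpack(">I", buf[pos + 4:pos + 8])
        yield chunk_id, buf[pos + 8:pos + 8 + size]
        pos += 8 + size + (size % 2)  # skip the pad byte after odd sizes


def recognized_chunks(form_payload: bytes):
    """Split a FORM payload into its secondary identifier and known chunks."""
    secondary = form_payload[:4].decode("ascii")
    kept = [(cid, data) for cid, data in iter_chunks(form_payload, 4)
            if cid in KNOWN_IDS]  # unknown identifiers are silently skipped
    return secondary, kept
```

Because unknown identifiers are dropped rather than rejected, a file that
carries a chunk type allocated later still decodes with this reader.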

------------------------------------------------------------
1.1 - DJVU3 IMAGE FILES
------------------------------------------------------------

\textbf{Photo DjVu Image} ---

Photo DjVu Image files are best used for
encoding photographic images in color or in shades of gray. The data
compression model relies on the IW44 wavelet representation. This format
is designed such that the IW44 decoder is able to quickly perform
progressive rendering of any image segment using only a small amount of
memory. Photo DjVu files are composed of a single #"FORM:DJVU"# composite
chunk. This composite chunk always begins with one #"INFO"# chunk
describing the image size and resolution (see \Ref{DjVuInfo.h}). One or
more additional #"BG44"# chunks contain the image data encoded with the
IW44 representation (see \Ref{IW44Image.h}). The image size specified in
the #"INFO"# chunk and the image size specified in the IW44 data must be
equal.

\textbf{Bilevel DjVu Image} ---

Bilevel DjVu Image files are used to compress
black-and-white images representing text and simple drawings. The
JB2 data compression model uses the soft pattern matching technique, which
essentially consists of encoding each character by describing how it
differs from a well-chosen, already encoded character. Bilevel DjVu Files
are composed of a single #"FORM:DJVU"# composite chunk. This composite
chunk always begins with one #"INFO"# chunk describing the image size and
resolution (see \Ref{DjVuInfo.h}). An additional #"Sjbz"# chunk contains
the bilevel data encoded with the JB2 representation (see
\Ref{JB2Image.h}). The image size specified in the #"INFO"# chunk and the
image size specified in the JB2 data must be equal.

\textbf{Compound DjVu Image} ---

Compound DjVu Files are an extremely
efficient way to compress high-resolution compound document images
containing both pictures and text, such as a page of a magazine. Compound
DjVu Files represent the document images using two layers. The
\emph{background layer} is used for encoding the pictures and the
paper texture.
The \emph{foreground layer} is used for encoding the text and the drawings.
Compound DjVu Files are composed of a single #"FORM:DJVU"# composite
chunk. This composite chunk always begins with one #"INFO"# chunk
describing the size and the resolution of the image (see \Ref{DjVuInfo.h}).
Additional chunks hold the components of either the foreground or the
background layers.

The main component of the foreground layer is a bilevel image named the
\emph{foreground mask}. The pixel size of the foreground mask is equal to
the size of the DjVu image. It contains a black-on-white representation
of the text and the drawings. This image is encoded by a #"Sjbz"# chunk
using the JB2 representation. There may also be a companion chunk
#"Djbz"# containing a \emph{shape dictionary} that defines bilevel shapes
referenced by the #"Sjbz"# chunk.

The \emph{foreground colors} can be encoded according to two models:
\begin{itemize}
\item
The foreground colors may be encoded using a small color image,
the \emph{foreground color image}, encoded as a single #"FG44"#
chunk using the
IW44 representation (see \Ref{IW44Image.h}). Such compound DjVu images
are rendered by painting the foreground color image on top of the
background color image using the foreground mask as a stencil. The
pixel size of the foreground color image is computed by rounding up the
quotient of the mask size by an integer sub-sampling factor ranging from
1 to 12. Most Compound DjVu Images use a foreground color sub-sampling
factor of 12. Smaller sub-sampling factors produce very slightly better
images.
\item
The foreground colors may be encoded by specifying one solid color per
object described by the JB2-encoded mask. These \emph{JB2 colors} are
color-quantized and stored in a single #"FGbz"# chunk (see
\Ref{DjVuPalette.h}). Such compound DjVu images are rendered by
painting each foreground object on top of the background color image
using the solid color specified by the #"FGbz"# chunk.
\end{itemize}

The background layer is a color image, the \emph{background color image},
encoded by an arbitrary number of #"BG44"# chunks containing successive
IW44 refinements (see \Ref{IW44Image.h}). The size of this image is
computed by rounding up the quotient of the mask size by an integer
sub-sampling factor ranging from 1 to 12. Most Compound DjVu Images use a
background sub-sampling factor equal to 3. Smaller sub-sampling factors
are adequate for images with a very rich paper texture. Larger
sub-sampling factors are adequate for images containing no pictures.
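
The size rule shared by the foreground color image and the background color
image (mask size divided by the sub-sampling factor, rounded up) can be
written as follows. This is a small Python sketch; the function name
#subsampled_size# is an assumption made for illustration.

```python
def subsampled_size(mask_width: int, mask_height: int, factor: int):
    """Size of a layer image: mask size divided by factor, rounded up."""
    if not 1 <= factor <= 12:
        raise ValueError("sub-sampling factor must range from 1 to 12")
    width = (mask_width + factor - 1) // factor
    height = (mask_height + factor - 1) // factor
    return width, height
```

For a 2550 by 3300 pixel mask, a background factor of 3 gives an 850 by 1100
image, and a foreground factor of 12 gives a 213 by 275 image.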

There are no ordering or interleaving constraints on these chunks except
that (a) the #"INFO"# chunk must appear first, and (b) the successive
#"BG44"# refinements must appear in their natural order. The chunk
order simply affects the progressive rendering of DjVu images on a web
browser.
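
A reader can check constraint (a) from the chunk identifiers alone, and can
respect constraint (b) by feeding the #"BG44"# payloads to the IW44 decoder
in file order. The Python sketch below assumes the chunks are already
available as (identifier, payload) pairs; both helper names are
hypothetical.

```python
def info_comes_first(chunk_ids):
    """Check constraint (a): the first chunk of the FORM:DJVU is INFO."""
    return len(chunk_ids) > 0 and chunk_ids[0] == "INFO"


def background_refinements(chunks):
    """Collect BG44 payloads in file order, the order required by (b)."""
    return [data for chunk_id, data in chunks if chunk_id == "BG44"]
```

Other chunks may be interleaved freely between the refinements, so the
second helper filters by identifier instead of assuming that the #"BG44"#
chunks are adjacent.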

\textbf{IW44 Image Files} ---

The IW44 Image file format is the native format for the IW44 wavelet
representation. These files are deprecated in favor of Photo DjVu
Images.

\textbf{Alternative encodings} ---

Besides the JB2 and IW44 encoding schemes,
the DjVu format supports alternative encoding methods for its components.

\begin{itemize}
\item
The foreground mask may be represented by a single #"Smmr"# chunk
instead of #"Sjbz"#. The #"Smmr"# chunk contains a bilevel image
encoded with the Fax-G4/MMR method. Although the resulting files
are typically six times larger, this capability can be useful when
DjVu is used as a front-end for fax machines and scanners with
embedded Fax-G4/MMR capabilities.
\item
The background color image may be represented by a single #"BGjp"#
chunk instead of several #"BG44"# chunks. The #"BGjp"# chunk contains
a JPEG-encoded color image. The resulting files are significantly
larger and lack the progressivity of the usual DjVu files.
This capability is nevertheless useful because some scanners have
embedded JPEG capabilities.
\item
The foreground color image may be represented by a single #"FGjp"#
chunk instead of a single #"FG44"# chunk. This is useful because
some scanners have embedded JPEG capabilities.
\end{itemize}

In addition, the chunk names #"BG2k"# and #"FG2k"# have been reserved for
encoding the background color image and the foreground color image using
the forthcoming JPEG-2000 standard. This capability is not implemented at
the moment. The JPEG-2000 standard may even become the preferred encoding
method for color images in DjVu.

\textbf{Annotations and Textual Information} ---

All types of DjVu images may contain
annotation chunks. Annotation chunks are currently used to describe
hyperlinks, to specify more closely the behavior of the viewers,
and to hold metadata information. Annotations are contained in #"ANTa"#
or #"ANTz"# chunks. The #"ANTa"# chunks contain the annotation in
plain text. The #"ANTz"# chunks contain the same information compressed
with the BZZ encoder (cf. \Ref{BSByteStream.h}).

All types of DjVu image files may also contain a
computer-readable description of the text appearing on the page. This
information is contained in either a #"TXTa"# chunk or a #"TXTz"# chunk.
The #"TXTa"# chunk contains uncompressed data. The #"TXTz"# chunk
contains the same data compressed with the \Ref{bzz} compressor
(cf. \Ref{BSByteStream.h}). The #"TXTa"# chunk begins with a 24-bit
integer (most significant byte first) describing the length of the text in
bytes. Then comes the ISO10646/UTF8 text. Additional information
indicates the position of each column/region/paragraph/line/word in the
document. More information about the capabilities of the chunk can be
found in section \Ref{DjVuTXT}. More information about the encoding of
textual information can be found in file #"DjVuAnno.cpp"#.
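
As an illustration of the #"TXTa"# layout described above, the Python sketch
below reads the 24-bit length and returns the text. It ignores the position
information that follows the text, and it would have to be preceded by a BZZ
decompression step for a #"TXTz"# chunk. The function name #read_txta_text#
is hypothetical.

```python
def read_txta_text(payload: bytes) -> str:
    """Extract the UTF-8 text stored at the start of a TXTa payload."""
    if len(payload) < 3:
        raise ValueError("TXTa payload is shorter than its 3-byte length field")
    length = int.from_bytes(payload[:3], "big")  # most significant byte first
    raw = payload[3:3 + length]
    if len(raw) != length:
        raise ValueError("TXTa payload is truncated")
    return raw.decode("utf-8")
```

The length counts bytes, not characters, so the slice is taken before the
UTF-8 decoding.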


------------------------------------------------------------
1.2 - DJVU3 MULTIPAGE DOCUMENTS
------------------------------------------------------------

The DjVu3 system supports two models for multi-page documents:
\emph{bundled} multi-page documents and \emph{indirect} multi-page documents.

\textbf{Bundled multi-page documents} ---

A \emph{bundled} multi-page DjVu
document uses a single file to represent the entire document. This single
file contains all the pages as well as ancillary information (e.g. the
page directory, data shared by several pages, thumbnails, etc.). Using a
single file format is very convenient for storing documents or for sending
email attachments.

A bundled multi-page document is composed of a single #"FORM:DJVM"#
composite chunk. This composite chunk always begins with a #"DIRM"# chunk
containing the document directory (see \Ref{DjVmDir.h}) which represents
the list of the \emph{component files} that compose the document. The
component files themselves are then encoded as IFF85 composite chunks
following the #"DIRM"# chunk.

\begin{itemize}
\item
Component files may be any valid DjVu image (see \Ref{DjVu Image Files})
or IW44 image (see \Ref{IW44 Image Files}). These component files
always represent a page of a document. The corresponding IFF85 chunk ids are
#"FORM:DJVU"#, #"FORM:PM44"#, or #"FORM:BM44"#.
\item
Component files may contain shared information indirectly referenced by
some document pages. These \emph{shared component files} are always composed
of a single #"FORM:DJVI"# chunk containing an arbitrary collection of
chunks.
\item
Thumbnail files contain optional thumbnail images for a few consecutive
pages of the document. Thumbnail files consist of a single
#"FORM:THUM"# composite chunk containing several #"TH44"# chunks
holding IW44-encoded thumbnail images (see \Ref{IW44Image.h}). These
thumbnails always pertain to the first few page files following the
thumbnail file in the document directory.
\end{itemize}
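
The three kinds of component files listed above can be told apart by their
extended chunk identifier. The Python table below is a sketch of that
classification; the role names and helper functions are assumptions made for
illustration and are not taken from the reference library.

```python
# Extended chunk identifier of a component file -> its role in the document.
COMPONENT_ROLES = {
    "FORM:DJVU": "page",
    "FORM:PM44": "page",
    "FORM:BM44": "page",
    "FORM:DJVI": "shared",
    "FORM:THUM": "thumbnails",
}


def component_role(extended_id: str) -> str:
    """Return the role of a component file, or "unknown" if not listed."""
    return COMPONENT_ROLES.get(extended_id, "unknown")


def count_pages(extended_ids) -> int:
    """Count the component files that represent a page of the document."""
    return sum(1 for ident in extended_ids if component_role(ident) == "page")
```

Only components with the "page" role contribute to the page count. Shared
and thumbnail components are ancillary.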

\textbf{Including shared information} ---

Any DjVu image file contained in a multipage file may contain an #"INCL"#
chunk containing the ID of a shared component file. The decoder processes
the chunks contained in the shared component file as if they were
contained by the DjVu image file.

A shared component file is composed of a single #"FORM:DJVI"# chunk potentially
containing any information otherwise allowed in a DjVu image file (except
for the #"INFO"# chunk of course).
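
The inclusion mechanism can be pictured as a substitution: each #"INCL"#
chunk is replaced by the chunks of the shared component file it names. The
Python sketch below works on already parsed (identifier, payload) pairs. It
assumes that the #"INCL"# payload is the component ID stored as text, that
#shared_files# maps each ID to the chunk list of its #"FORM:DJVI"#, and that
nested inclusions are followed with a guard against cycles. None of these
details are stated in this file.

```python
def expand_includes(chunks, shared_files, active=()):
    """Replace each INCL chunk by the chunks of the file it references."""
    expanded = []
    for chunk_id, data in chunks:
        if chunk_id != "INCL":
            expanded.append((chunk_id, data))
            continue
        file_id = data.decode("utf-8")  # assumption: the ID is plain text
        if file_id in active:
            raise ValueError("circular INCL reference: " + file_id)
        expanded.extend(
            expand_includes(shared_files[file_id], shared_files,
                            active + (file_id,)))
    return expanded
```

After expansion, the page can be decoded as if the shared chunks had been
stored in the page itself, which is the behavior described above.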

There are many benefits associated with storing such shared information in
separate files. A well-designed browser may keep pre-decoded copies of
these files in a cache. This procedure would reduce the size of the data
transferred over the Internet and also increase the display speed. The
multipage DjVu compressor, for instance, identifies similar object shapes
occurring in several pages. These shapes are encoded in a shape dictionary
(chunk #"Djbz"#) placed in a shared component file. All relevant pages
include this shared component file. Although they appear in several
pages, these shared shapes are encoded only once in the document.

\textbf{Browsing a multi-page document} ---

You can view the pages using the DjVu plugin and a web browser. When you
type the URL of a multi-page document, the browser starts downloading the
whole file, but displays the first page as soon as it is available. You
can immediately navigate to other pages using the DjVu toolbar. Suppose
however that the document is stored on a remote web server. You can
easily access the first page and see that this is not the document you
wanted. Although you will never display the other pages, the browser is
transferring data for these pages and is wasting the bandwidth of your
server (and the bandwidth of the Internet too). You could also see the
summary of the document on the first page and jump to page 100. But page
100 cannot be displayed until data for pages 1 to 99 has been received.
You may have to wait for the transmission of unnecessary page data. This
second problem (the unnecessary wait) can be solved using the ``byte
serving'' option of the HTTP/1.1 protocol. This option has to be
supported by the web server, the proxies, the caches and the browser. We
are getting there, but not quite yet. Byte serving however does not solve
the first problem (the waste of bandwidth).

\textbf{Indirect multi-page documents} ---

DjVu solves both problems using a
special multi-page format named the \emph{indirect} model. An indirect
multi-page DjVu document is composed of several files. The main file is
named the \emph{index file}. You can browse a document using the URL of
the index file, just like you do with a bundled multi-page document. The
index file however is very small. It simply contains the document
directory and the URLs of secondary files containing the page data. When
you browse an indirect multi-page document, the browser only accesses data
for the pages you are viewing. This can be done at a reasonable speed
because the browser maintains a cache of pages and sometimes pre-fetches a
few pages ahead of the current page. This model uses the web serving
bandwidth much more effectively. It also eliminates unnecessary delays
when jumping ahead to pages located anywhere in a long document.

\textbf{Obsolete Formats} ---

The library also supports two other multipage
formats which are now obsolete. These formats are technologically
inferior and should no longer be used.


------------------------------------------------------------
2 - CHUNK ENCODING
------------------------------------------------------------

This section describes
- the encoding of new chunks introduced with DjVu3
- the encoding changes of chunks already present in DjVu2


------------------------------------------------------------
2.1 - CHANGES TO JB2 ( "Sjbz" AND "Djbz" CHUNKS )
------------------------------------------------------------

Two extensions of the JB2 encoding format have been introduced
with DjVu files version 21. Both extensions maintain significant
backward compatibility with previous versions of the JB2 format.
These extensions are described below by reference to the DjVu2 spec
dated August 1999. Both extensions make use of the unused record
type value #9# (cf. ICFDD page 24) which has been renamed
#REQUIRED_DICT_OR_RESET#.

\textbf{Shared Shape Dictionaries} --- This extension provides
support for sharing symbol definitions between the pages of a
document. To achieve this objective, the JB2 image data chunk
must be able to address symbols defined elsewhere by a JB2
dictionary data chunk shared by all the pages of a document.

The arithmetically encoded JB2 image data logically consist of a
sequence of records. The decoder processes these records in
sequence and maintains a library of symbols which can be addressed
by the following records. The first record usually is a ``Start
Of Image'' record describing the size of the image.

Starting with version 21, a #REQUIRED_DICT_OR_RESET# (9) record
type can appear \emph{before} the #START_OF_DATA# (0) record. The
record type field is followed by a single number arithmetically
encoded (cf. ICFDD page 26) using a sixteenth context (cf. ICFDD
page 25). This record appears when the JB2 data chunk requires
symbols encoded in a separate JB2 dictionary data chunk. The
number (the \textbf{dictionary size}) indicates how many symbols
should have been defined by the JB2 dictionary data chunk. The
decoder should simply load these symbols in the symbol library and
proceed as usual. New symbols potentially defined by the
subsequent JB2 image data records will therefore be numbered with
integers greater than or equal to the dictionary size.
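
The numbering rule can be illustrated with a toy symbol library in Python.
The sketch assumes that the dictionary symbols have already been decoded
into a list and that symbols are numbered from zero. The function names are
hypothetical.

```python
def start_symbol_library(dictionary_symbols, dictionary_size):
    """Preload the library with the symbols of the required dictionary."""
    if len(dictionary_symbols) != dictionary_size:
        raise ValueError("dictionary does not define the announced symbols")
    return list(dictionary_symbols)


def add_symbol(library, shape):
    """Append a symbol defined by the image data and return its number."""
    library.append(shape)
    return len(library) - 1
```

With a dictionary size of 3, the shared symbols occupy numbers 0 to 2 and
the first symbol defined by the image data receives number 3.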

The JB2 dictionary data format is a pure subset of the JB2 image
data format. The #START_OF_DATA# (0) record always specifies an
image width of zero and an image height of zero. The only allowed
record types are those defining library symbols only
(#NEW_SYMBOL_LIBRARY_ONLY# (2) and #MATCHED_REFINE_LIBRARY_ONLY#
(5) cf. ICFDD page 24) followed by a final #END_OF_DATA# (11)
record.
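
The restrictions above translate into a simple check on the sequence of
record types. The Python sketch below follows only the rules stated in this
section and operates on record types that have already been arithmetically
decoded, which is a simplification. The function name is hypothetical.

```python
START_OF_DATA = 0
NEW_SYMBOL_LIBRARY_ONLY = 2
MATCHED_REFINE_LIBRARY_ONLY = 5
END_OF_DATA = 11

LIBRARY_ONLY = {NEW_SYMBOL_LIBRARY_ONLY, MATCHED_REFINE_LIBRARY_ONLY}


def is_dictionary_stream(record_types, width, height):
    """Check a decoded record sequence against the dictionary subset."""
    if width != 0 or height != 0:
        return False  # START_OF_DATA must announce a 0 x 0 image
    if len(record_types) < 2:
        return False
    if record_types[0] != START_OF_DATA or record_types[-1] != END_OF_DATA:
        return False
    # Everything in between must define library symbols only.
    return all(t in LIBRARY_ONLY for t in record_types[1:-1])
```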

The JB2 dictionary data is usually located in a \textbf{Djbz} chunk.
Each page \textbf{FORM:DJVU} may directly contain a \textbf{Djbz} chunk,
or may indirectly point to such a chunk using an \textbf{INCL} chunk
(cf. \Ref{Multipage DjVu documents.}).

\textbf{Numcoder Reset} --- This extension addresses a problem for
hardware implementations. The encoding of numbers (cf. ICFDD page
26) potentially uses an unbounded number of binary coding
contexts. These contexts are normally allocated when they are used
for the first time (cf. ICFDD informative note, page 27).

Starting with version 21, a #REQUIRED_DICT_OR_RESET# (9) record
type can appear \emph{after} the #START_OF_DATA# (0) record. The
decoder should proceed with the next record after \emph{clearing
all binary contexts used for coding numbers}. This operation
implies that all binary contexts previously allocated for coding
numbers can be deallocated.

Starting with version 21, the JB2 encoder should insert a
#REQUIRED_DICT_OR_RESET# record type whenever the number of these
allocated binary contexts exceeds #20000#. Only very large
documents ever reach such a large number of allocated binary
contexts (e.g. large maps). Hardware implementations, however, can
benefit greatly from a hard bound on the total number of binary
coding contexts. Old JB2 decoders will treat this record type as
an #END_OF_DATA# record and cleanly stop decoding (cf. ICFDD page
30, Image refinement data).
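
Record type 9 therefore has two meanings that depend only on its position
relative to the #START_OF_DATA# record. The Python sketch below summarizes
that rule together with the encoder threshold. It works on a list of already
decoded record types, and all names except the two record type constants are
hypothetical.

```python
START_OF_DATA = 0
REQUIRED_DICT_OR_RESET = 9
CONTEXT_LIMIT = 20000  # threshold quoted in the text above


def meaning_of_record_9(seen_start_of_data: bool) -> str:
    """Interpret record type 9 according to its position in the stream."""
    return "numcoder reset" if seen_start_of_data else "required dictionary"


def classify_record_9(record_types):
    """List the meaning of every type 9 record in a decoded sequence."""
    seen_start = False
    meanings = []
    for record_type in record_types:
        if record_type == START_OF_DATA:
            seen_start = True
        elif record_type == REQUIRED_DICT_OR_RESET:
            meanings.append(meaning_of_record_9(seen_start))
    return meanings


def encoder_should_reset(allocated_contexts: int) -> bool:
    """True when the encoder should emit a reset record."""
    return allocated_contexts > CONTEXT_LIMIT
```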


------------------------------------------------------------
2.2 - JB2 COLORS ( "FGbz" CHUNK )
------------------------------------------------------------

To be documented.

The #"FGbz"# chunk contains BZZ-compressed data
(cf. \Ref{BSByteStream.h}).

The uncompressed data can be decoded using function
#DjVuPalette::decode# defined in file #"DjVuPalette.cpp"#.


------------------------------------------------------------
2.3 - ANNOTATIONS ( "ANTa" AND "ANTz" CHUNKS )
------------------------------------------------------------

[MAY 19TH, 2005:
New annotation types have been defined by Lizardtech.
The authoritative documentation is now the djvused man page.]

Annotations are contained in #"ANTa"#
or #"ANTz"# chunks. The #"ANTa"# chunks contain the annotation in
plain text. The #"ANTz"# chunks contain the same information compressed
with the BZZ encoder (cf. \Ref{BSByteStream.h}).

The complete annotation text is obtained by concatenating all annotation
chunks present in the page. Pages can share annotations using an #"INCL"#
chunk as explained in section \Ref{Including shared information}.
A restriction of the current reference library implementation
limits the number of shared annotation files to one.
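
The concatenation step can be sketched as follows in Python. No BZZ decoder
is assumed to be available here: the caller passes one in as the
hypothetical #bzz_decompress# argument. Decoding the result as UTF-8 is also
an assumption, as is the function name #annotation_text#.

```python
def annotation_text(chunks, bzz_decompress):
    """Concatenate the annotation chunks of a page, in file order."""
    parts = []
    for chunk_id, data in chunks:
        if chunk_id == "ANTa":
            parts.append(data)  # stored as plain text
        elif chunk_id == "ANTz":
            parts.append(bzz_decompress(data))  # caller-supplied decoder
    return b"".join(parts).decode("utf-8")
```

Chunks pulled in through #"INCL"# would be merged into the chunk list
beforehand, so that shared and page-specific annotations end up in one text.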
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The syntax of the annotation text uses a simple
|
|
Packit |
df99a1 |
parenthesized notation. Erroneous and unrecognized constructs are silently
|
|
Packit |
df99a1 |
ignored. The following constructs are recognized:
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
\begin{description}
|
|
Packit |
df99a1 |
\item[(background <color>)]
|
|
Packit |
df99a1 |
Sets the color of the viewer area surrounding the DjVu image.
|
|
Packit |
df99a1 |
The color argument #color# are always represented using X11
|
|
Packit |
df99a1 |
syntax \##RRGGBB#. For instance \##000000# is black
|
|
Packit |
df99a1 |
and \##FFFFFF# is white.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
\item[(zoom <zoom-value>)]
|
|
Packit |
df99a1 |
Sets the initial zoom factor of the image. Argument #zoom-value# may
|
|
Packit |
df99a1 |
be #stretch#, #one2one#, #width#, #page#, or composed of the letter
|
|
Packit |
df99a1 |
#"d"# followed by a number between #1# and #999# (such as in #d300# for
|
|
Packit |
df99a1 |
instance.)
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
\item[(mode <mode-value>)]
|
|
Packit |
df99a1 |
Sets the display mode for the image. Argument #mode-value# may
|
|
Packit |
df99a1 |
be #color#, #bw#, #fore# or #back#.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
\item[(align <horz-align> <vert-align>)]
|
|
Packit |
df99a1 |
Specifies how the image should be aligned on the viewer surface.
|
|
Packit |
df99a1 |
By default the image is located in the center. Argument #horz-align#
|
|
Packit |
df99a1 |
may be #left#, #center#, or #right#. Argument #vert-align# may be
|
|
Packit |
df99a1 |
#top#, #center#, or #bottom#.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
\item[(maparea <url> <comment> <area> <...options...>]
|
|
Packit |
df99a1 |
Defines an hyperlink for the URL specified by argument #url#.
|
|
Packit |
df99a1 |
|
Argument #url# may have one of the following two forms:
\begin{verbatim}
"<href>"
(url "<href>" "<target>")
\end{verbatim}
where #href# is a string representing the URL and #target# is a string
representing the target frame for the hyperlink (cf. documentation for
the HTML tag #<A>#). Both strings are surrounded with double quotes.
Argument #comment# is a string surrounded by double quotes.
This string may be displayed as a tooltip when the user
moves the mouse over the hyperlink.
Argument #area# defines the shape of the hyperlink.
The following forms are supported for representing
rectangles, ovals, or polygons.
\begin{verbatim}
(rect <xmin> <ymin> <width> <height>)
(oval <xmin> <ymin> <width> <height>)
(polygon <x0> <y0> <x1> <y1> ....)
\end{verbatim}
All parameters are numbers representing coordinates measured in image
pixels with the origin set at the bottom left corner of the image. The
remaining arguments describe options regarding the hyperlink borders.
A first set of options defines the type of the borders:
\begin{verbatim}
(xor)
(border <color>)
(shadow_in [<thickness>])
(shadow_out [<thickness>])
(shadow_ein [<thickness>])
(shadow_eout [<thickness>])
\end{verbatim}
where parameter #color# has syntax \##RRGGBB# (as above) and parameter
#thickness# is a number from 1 to 32. The last four border modes are
only supported with rectangular areas. The border becomes visible when
the user moves the mouse over the hyperlink. The border may be made
always visible by using the following option:
\begin{verbatim}
(border_avis)
\end{verbatim}
Finally the following option may be used with rectangular areas only.
The complete area will be highlighted using the specified color (given
with syntax \##RRGGBB# as usual).
\begin{verbatim}
(hilite <color>)
\end{verbatim}
This is often used with an empty URL for simply emphasizing a specific
segment of an image.
|
\item[(metadata <...entries...>)]
Defines multiple metadata entries.
|
Each metadata entry has the form
\begin{verbatim}
(<key> "<value>")
\end{verbatim}
where parameter #<key># is a symbolic attribute name such as #year#,
#booktitle#, #editor#, or #author#, and parameter #<value>#
is a UTF-8 encoded string representing the attribute value.
Common C escape sequences are recognized.
It is suggested to use the same key names as
the BibTeX bibliography system.
|
Metadata pertaining to the entire document should be placed
in a shared annotation file (and are therefore seen in all pages).
Metadata pertaining to a particular page are usually placed
inside an #"ANTz"# chunk in this particular page.
|
\end{description}
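As an illustration of the #maparea# syntax described above, the following Python sketch formats a rectangular hyperlink with an optional #border# option. It is only a sketch: the helper names #check_color#, #flip_y#, #quote# and #maparea_rect# are hypothetical and do not belong to DjVuLibre. Two assumptions are made: the caller starts from a rectangle measured from the top left corner (as many image tools report it) and must convert it to the bottom left origin used here, and double quotes inside strings are escaped with a backslash in the C style.

```python
import re

def check_color(color):
    """Return the color unchanged if it uses the #RRGGBB syntax, else raise."""
    if not re.fullmatch(r"#[0-9A-Fa-f]{6}", color):
        raise ValueError("color must use the #RRGGBB syntax: %r" % (color,))
    return color

def flip_y(y_top, height, image_height):
    """Convert a y coordinate measured from the top of the image into the
    ymin value measured from the bottom left corner."""
    return image_height - y_top - height

def quote(text):
    """Surround a string with double quotes.
    Assumption: backslashes and double quotes are escaped C style."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

def maparea_rect(href, comment, xmin, ymin, width, height, border=None):
    """Format a (maparea ...) annotation for a rectangular area."""
    parts = ["(maparea", quote(href), quote(comment),
             "(rect %d %d %d %d)" % (xmin, ymin, width, height)]
    if border is not None:
        parts.append("(border %s)" % check_color(border))
    return " ".join(parts) + ")"

# A 30x40 rectangle whose top edge is at the top of a 100 pixel high image.
ymin = flip_y(0, 40, 100)
print(maparea_rect("http://example.org/", "Home", 10, ymin, 30, 40,
                   border="#FF0000"))
```

The resulting text is what would be stored in the annotation chunk; writing the chunk itself is outside the scope of this sketch.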
|
------------------------------------------------------------
2.4 - HIDDEN TEXT ( "TXTa" AND "TXTz" CHUNKS )
------------------------------------------------------------
|
To be documented.
|
The #"TXTa"# chunk contains uncompressed data.
The #"TXTz"# chunk contains BZZ compressed data (cf. \Ref{BSByteStream.h}).
|
The uncompressed data can be decoded using function #DjVuText::decode#
defined in file #"DjVuText.cpp"#. Program #djvused# can display the content
of the text chunk using a lisp syntax, and can create a text chunk from
this lisp syntax.
|
------------------------------------------------------------
2.5 - MULTIPAGE DIRECTORY CHUNK ( "DIRM" CHUNK )
------------------------------------------------------------
|
Multipage DjVu documents follow the EA
IFF85 format (cf. \Ref{IFFByteStream.h}). A document is composed of a
#"FORM:DJVM"# whose first chunk is a #"DIRM"# chunk containing the
\emph{document directory}. This directory lists all the component
files composing the given document. It helps to access every component
file and to identify the pages of the document.
|
\begin{itemize}
\item
In a \emph{bundled} multipage file, the component files
are stored immediately after the #"DIRM"# chunk,
within the #"FORM:DJVM"# composite chunk.
\item
In an \emph{indirect} multipage file, the component files are
stored in different files whose URLs are composed using information
stored in the #"DIRM"# chunk.
\end{itemize}
|
Most of the component files represent pages of a document. Some files
however represent data shared by several pages. The pages refer to these
supporting files by means of inclusion chunks (#"INCL"# chunks)
identifying the supporting file. Every directory record describes a
component file. Each component file is identified by a small string
named the identifier (ID). Each component file also has a
file name and a title.
|
Theoretically, IDs are used to uniquely identify each component file in
#"INCL"# chunks, names are used to compose the URLs of the component
files in an indirect multipage DjVu file, and titles are cosmetic names
possibly displayed when viewing a page of a document. There are however
many problems with this scheme, and we \emph{strongly suggest}, with the
current implementation, to always make the file ID, the file name and the
file title identical.
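
The way a name may be turned into a URL for an indirect document can be sketched as follows. This Python fragment is an illustration only: it assumes that component names are resolved relative to the URL of the index file holding the #"DIRM"# chunk, and the function name #component_url# is hypothetical.

```python
from urllib.parse import quote, urljoin

def component_url(index_url, name):
    """Compose the URL of a component file from the URL of the index
    document and the file name recorded in the directory.
    Assumption: names are resolved relative to the index document."""
    # Percent-encode characters that are not safe inside a URL path.
    return urljoin(index_url, quote(name))

print(component_url("http://example.org/docs/book/index.djvu", "p0001.djvu"))
```

Keeping the ID, name and title identical, as suggested above, means the same string can be used both in #"INCL"# chunks and in this URL composition.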
|
\textbf{Variants} --- There are two versions of the #"DIRM"# chunk format.
The version number is identified by the seven low bits of the first byte
of the chunk. Version \textbf{0} is obsolete and should never be used. This
section describes version \textbf{1}. There are two major multipage DjVu
formats supported: \emph{bundled} and \emph{indirect}. The #"DIRM"# chunk
indicates which format is used in the most significant bit of the first
byte of the chunk. The document is bundled when this bit is set.
Otherwise the document is indirect.
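
For illustration, the first byte can be split into its two fields with bit masks. The Python sketch below is not taken from DjVuLibre, and both function names are hypothetical.

```python
def decode_dirm_flags(first_byte):
    """Split the first byte of a DIRM chunk into (bundled, version)."""
    bundled = bool(first_byte & 0x80)   # most significant bit
    version = first_byte & 0x7F         # seven low bits
    return bundled, version

def encode_dirm_flags(bundled, version=1):
    """Build the first byte of a DIRM chunk from its two fields."""
    if not 0 <= version <= 0x7F:
        raise ValueError("version must fit in seven bits")
    return (0x80 if bundled else 0x00) | version

print(decode_dirm_flags(0x81))
```

A reader would typically reject any version other than 1, since version 0 is obsolete.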
|
\textbf{Unencoded data} ---

The #"DIRM"# chunk is composed of some unencoded
data followed by \Ref{bzz} encoded data. The unencoded data starts with
the version byte and a 16 bit integer representing the number of component
files. All integers are encoded with the most significant byte first.
\begin{verbatim}
BYTE:  Flags/Version: 0b<bundled>0000001
INT16: Number of component files.
\end{verbatim}
When the document is a bundled document (i.e. the flag #bundled# is set),
this header is followed by the offsets of each of the component files within
the #"FORM:DJVM"#. These offsets allow for random component file access.
\begin{verbatim}
INT32: Offset of first component file.
INT32: Offset of second component file.
...
INT32: Offset of last component file.
\end{verbatim}
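The unencoded header described above can be read with a few lines of Python. This is a minimal sketch rather than the DjVuLibre implementation: the function name #parse_dirm_header# is hypothetical, the integers are assumed to be unsigned, and the argument is assumed to be the raw content of the #"DIRM"# chunk.

```python
import struct

def parse_dirm_header(data):
    """Parse the unencoded part of a DIRM chunk.

    Returns (bundled, version, count, offsets, rest), where rest is the
    remaining BZZ encoded data, left untouched by this function.
    """
    # One flag/version byte, then a big-endian 16 bit component count.
    flags, count = struct.unpack_from(">BH", data, 0)
    bundled = bool(flags & 0x80)
    version = flags & 0x7F
    pos = 3
    offsets = []
    if bundled:
        # One big-endian 32 bit offset per component file.
        offsets = list(struct.unpack_from(">%dI" % count, data, pos))
        pos += 4 * count
    return bundled, version, count, offsets, data[pos:]

sample = (bytes([0x81]) + (2).to_bytes(2, "big")
          + (100).to_bytes(4, "big") + (2000).to_bytes(4, "big") + b"BZZ")
print(parse_dirm_header(sample))
```

For an indirect document the offset table is absent, so the BZZ encoded data starts right after the three header bytes.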
|
\textbf{BZZ encoded data} ---

The rest of the chunk is entirely compressed
with the BZZ general purpose compressor. We now describe the data fed
into (or retrieved from) the BZZ codec (cf. \Ref{BSByteStream}). First
come the sizes and the flags associated with each component file.
\begin{verbatim}
INT24: Size of the first component file.
INT24: Size of the second component file.
...
INT24: Size of the last component file.
BYTE:  Flag byte for the first component file.
BYTE:  Flag byte for the second component file.
...
BYTE:  Flag byte for the last component file.
\end{verbatim}
The flag bytes have the following format:
\begin{verbatim}
0b<hasname><hastitle>000000 for a file included by other files.
0b<hasname><hastitle>000001 for a file representing a page.
0b<hasname><hastitle>000010 for a file containing thumbnails.
\end{verbatim}
Flag #hasname# is set when the name of the file is different from the file
ID. Flag #hastitle# is set when the title of the file is different from
the file ID. These flags are used to avoid encoding the same string three
times. Then comes a sequence of zero terminated strings. There are one to
three such strings per component file. The first string contains the ID
of the component file. The second string contains the name of the
component file. It is only present when the flag #hasname# is set. The third
one contains the title of the component file. It is only present when the
flag #hastitle# is set. The \Ref{bzz} encoding system makes sure that
all these strings will be encoded efficiently despite their possible
redundancies.
\begin{verbatim}
ZSTR: ID of the first component file.
ZSTR: Name of the first component file (only if #hasname# is set.)
ZSTR: Title of the first component file (only if #hastitle# is set.)
...
ZSTR: ID of the last component file.
ZSTR: Name of the last component file (only if #hasname# is set.)
ZSTR: Title of the last component file (only if #hastitle# is set.)
\end{verbatim}
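The layout above can be walked through with the following Python sketch. It is not the DjVuLibre code: the function name #parse_dirm_records# and the record field names are hypothetical, the input is assumed to be data that has already been BZZ decoded (the BZZ codec is not shown), and the strings are assumed to be UTF-8 encoded.

```python
def parse_dirm_records(payload, count):
    """Parse BZZ-decoded DIRM data into one record per component file."""
    if len(payload) < 4 * count:
        raise ValueError("payload too short for %d component files" % count)
    pos = 0
    sizes = []
    for _ in range(count):
        # INT24, most significant byte first.
        sizes.append(int.from_bytes(payload[pos:pos + 3], "big"))
        pos += 3
    flags = list(payload[pos:pos + count])
    pos += count

    def read_zstr():
        # Read one zero terminated string and advance the position.
        nonlocal pos
        end = payload.index(b"\0", pos)
        text = payload[pos:end].decode("utf-8")  # assumption: UTF-8
        pos = end + 1
        return text

    kinds = {0: "include", 1: "page", 2: "thumbnails"}
    records = []
    for size, flag in zip(sizes, flags):
        file_id = read_zstr()
        # Name and title default to the ID when their flag is not set.
        name = read_zstr() if flag & 0x80 else file_id
        title = read_zstr() if flag & 0x40 else file_id
        records.append({"id": file_id, "name": name, "title": title,
                        "size": size,
                        "kind": kinds.get(flag & 0x3F, "unknown")})
    return records

sample = ((1000).to_bytes(3, "big") + (2000).to_bytes(3, "big")
          + bytes([0x00, 0xC1]) + b"shared.iff\0p1\0p1.djvu\0Cover\0")
for record in parse_dirm_records(sample, 2):
    print(record)
```

The component count passed to the function is the 16 bit integer read from the unencoded header of the chunk.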
|
------------------------------------------------------------
2.6 - INCLUDES ( "INCL" CHUNK )
------------------------------------------------------------
|
The chunk simply contains the ASCII encoded ID
of the included component file.
|
------------------------------------------------------------
2.7 - THUMBNAILS
------------------------------------------------------------
|
A multipage document file can optionally contain thumbnails for some or all
pages. These thumbnails are stored in special component files, each
containing thumbnails for a number of consecutive pages.
|
The thumbnail component file is composed of a single #"FORM:THUM"#
containing one or more #"TH44"# chunks. Each #"TH44"# chunk contains one
IW44 encoded thumbnail image for one page (cf. \Ref{IW44Image.h}).
|
------------------------------------------------------------
2.8 - OUTLINES/BOOKMARKS
------------------------------------------------------------
|
[MAY 19th, 2005
Multipage files (FORM:DJVM) can contain an
additional chunk "NAVM" located after the "DIRM" chunk.
The NAVM chunk contains outlines and bookmarks.
See the files libdjvu/DjVmNav.h and libdjvu/DjVmNav.cpp]
|