.\" Copyright (c) 2001-2003 Leon Bottou, Yann Le Cun, Patrick Haffner,
.\" Copyright (c) 2001 AT&T Corp., and Lizardtech, Inc.
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual. Otherwise check the web site
.\" of the Free Software Foundation at http://www.fsf.org.
.TH CSEPDJVU 1 "10/11/2001" "DjVuLibre-3.5" "DjVuLibre-3.5"
.SH NAME
csepdjvu \- DjVu encoder for separated data files.

.SH SYNOPSIS
.BI "csepdjvu  [" "options" "] [" "sepfiles" "]... " "outputdjvufile"
.SH DESCRIPTION
This program creates a DjVuDocument file
.I outputdjvufile
from separated data files
.IR sepfiles .
It can read separated data from the standard input when given
a single dash instead of the separated data file names.
This feature is intended for pre-processing programs that
push separated data into
.B csepdjvu
via a pipe.

Each separated data file represents one or more page images.  When the program
arguments specify multiple pages, all the pages are encoded and saved as a
bundled multi-page document.  When the program arguments specify a single
page, the page is encoded and saved as a single page file.
.SH OPTIONS
.TP
.BI "-d " "n"
Specify the resolution information encoded into the output file, expressed in
dots per inch.  The resolution information encoded in DjVu files determines how
the decoder scales the image on a particular display.  Meaningful resolutions
range from 25 to 6000.  The default value is 300 dpi.
.TP
.BI "-q " "n" ",...," "n"
.TP
.BI "-q " "n" "+...+" "n"
Specify the encoding quality of the IW44 encoded background layer.
The option argument contains several integers (one per chunk) separated by
either commas or pluses.  This option is similar to option
.B -slice
of program
.BR c44 .
Please refer to the
.BR c44 (1)
man page for additional details.
The default quality specification is
.BR "-q 72,83,93,103" .

This option does not apply to uniformly white backgrounds that were not specified
by the separated data but are called for by the DjVu specification.  Such
background images always come at the lowest possible resolution and with a
standard quality setting that ensures the color uniformity.
.TP
.B "-t"
Program
.B csepdjvu
interprets certain comments in the separated file to
construct a hidden text layer in the DjVu file.  By default, this layer
records the location of each word for highlighting purposes.
This option reduces the file size by recording only the
location of each line.
.TP
.B "-v"
Display a brief message describing each page.
.TP
.B "-vv"
Display extensive informational messages during encoding.
.SH SEPARATED DATA FILE FORMAT
Each separated data file contains a concatenation of one or more separated
page images.  Each page is logically represented by a foreground image with a
transparent color and by a background image visible through the transparent
pixels.  The data for each separated page image is the concatenation of the
following data blocks:
.IP "*" 3
A foreground image encoded using either
the "Color RLE format" or the "Bitonal RLE format".
These formats are described later in this section.
.IP "*" 3
An optional background image encoded as a "Portable Pixmap" (\s-1PPM\s0).
This well-known format is summarized later in this section.  The absence
of a background image simply indicates that a uniformly white background
should be assumed.
.IP "*" 3
An arbitrary number of comment lines starting with character "#" and
terminated by a linefeed character.  Comment lines whose first word starts
with a capital letter have special meanings documented later in this section.
.PP
The dimensions (width and height) of the background image must be obtained by
dividing the foreground image dimensions by an integer reduction factor
ranging from 1 to 12 and rounding the quotient up.  Assume, for instance,
that the width of the foreground is 2507 and the reduction factor is 3.
The width of the background image will be the integer ratio (2507+2)/3,
that is, 836.
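.PP
The rounding rule above can be checked with a short Python sketch.
The function name and the page height of 3300 are illustrative assumptions;
they are not part of
.BR csepdjvu .

```python
def background_size(fg_width, fg_height, reduction):
    """Return the (width, height) expected for the background image.

    Hypothetical helper: it applies the rounding-up rule described above
    for a reduction factor in range 1 to 12.
    """
    if not 1 <= reduction <= 12:
        raise ValueError("reduction factor must be in range 1 to 12")
    # Ceiling division: add (reduction - 1) before the integer division.
    width = (fg_width + reduction - 1) // reduction
    height = (fg_height + reduction - 1) // reduction
    return width, height


# The example from the text: width 2507 with reduction factor 3.
print(background_size(2507, 3300, 3))
```

With a reduction factor of 3 the added constant is 2, which matches the
(2507+2)/3 expression given in the text.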
.SS Color RLE format
The Color RLE format is a simple run-length encoding scheme for color images
with a limited number of distinct colors.  The data always begin with a text
header composed of the two characters "R6", the number of columns, the number
of rows, and the number of color palette entries.  All numbers are expressed
in decimal
.SM ASCII.
These four items are separated by blank characters (space, tab, carriage
return, or linefeed) or by comment lines introduced by character "#".  The
last number is followed by exactly one character which usually is a linefeed
character.

The header is followed by the color palette containing three bytes per color
entry.  The bytes represent the red, green, and blue components of the color.

The palette is followed by a collection of four-byte integers (most
significant byte first) representing runs of pixels with an identical color.
The twelve upper bits of each integer indicate the index of the run color in
the palette.  The twenty lower bits of the integer indicate the run
length.  Color indices greater than 0xff0 are reserved.  Color index 0xfff is
used for transparent runs.  Each row is represented by a sequence of runs
whose lengths add up to the image width.  Rows are encoded starting with the
top row and progressing toward the bottom row.
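.PP
As an illustration of the layout described above, the following Python sketch
builds the header, the palette, and the run words.
All function and constant names are hypothetical, and the sketch assumes the
four-byte integers are stored most significant byte first.

```python
import struct

LINEFEED = bytes([10])      # the single character that ends the header
TRANSPARENT = 0xFFF         # color index used for transparent runs
MAX_RUN = (1 << 20) - 1     # largest length that fits in twenty bits


def color_rle_header(width, height, palette):
    """Build the "R6" text header followed by the RGB palette bytes."""
    header = "R6 {} {} {}".format(width, height, len(palette))
    data = bytearray(header.encode("ascii") + LINEFEED)
    for red, green, blue in palette:
        data += bytes((red, green, blue))
    return bytes(data)


def pack_run(color_index, length):
    """Pack one run: twelve upper bits of index, twenty lower bits of length."""
    if not 0 <= color_index <= 0xFFF:
        raise ValueError("color index does not fit in twelve bits")
    if not 0 <= length <= MAX_RUN:
        raise ValueError("run length does not fit in twenty bits")
    return struct.pack(">I", (color_index << 20) | length)


def unpack_run(word):
    """Inverse of pack_run: return (color_index, length)."""
    (value,) = struct.unpack(">I", word)
    return value >> 20, value & MAX_RUN


# One row of width 4: one transparent pixel, then three pixels of color 0.
page = color_rle_header(4, 1, [(255, 0, 0)])
page += pack_run(TRANSPARENT, 1) + pack_run(0, 3)
```

The run lengths of the single row (1 and 3) add up to the image width of 4,
as the format requires.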
.SS Bitonal RLE format
The Bitonal RLE format is a simple run-length encoding scheme for bitonal
images.  The data always begin with a text header composed of the two
characters "R4", the number of columns, and the number of rows.  All numbers
are expressed in decimal
.SM ASCII.
These three items are separated by blank characters (space, tab, carriage
return, or linefeed) or by comment lines introduced by character "#".  The
last number is followed by exactly one character which usually is a linefeed
character.

The rest of the file encodes a sequence of numbers representing the lengths of
alternating runs of transparent and black pixels.  Lines are encoded starting
with the top line and progressing toward the bottom line.  Each line starts
with a transparent (white) run.  The decoder knows that a line is finished
when the sum of the run lengths for that line is equal to the number of
columns in the image.
Numbers in range 0 to 191 are represented by a single byte in range 0x00 to
0xbf.  Numbers in range 192 to 16383 are represented by a two-byte sequence:
the first byte, in range 0xc0 to 0xff, encodes the six most significant bits
of the number, and the second byte encodes the remaining eight bits of the
number.  This scheme allows for runs of length zero, which are useful when a
line starts with a black pixel, and when a very long run (whose length exceeds
16383) must be split into smaller runs.
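.PP
The number encoding described above can be sketched in Python as follows.
The function names are hypothetical.
Long runs are split by inserting a zero-length run of the other color,
which is one way to apply the rule stated in the text.

```python
MAX_RUN = 16383  # largest run length that fits in the two-byte form


def encode_run(length):
    """Encode one run length as one or two bytes."""
    if not 0 <= length <= MAX_RUN:
        raise ValueError("run length out of range")
    if length < 192:
        return bytes([length])
    # The first byte carries the six upper bits, offset by 0xc0.
    return bytes([0xC0 + (length >> 8), length & 0xFF])


def encode_row(runs):
    """Encode alternating run lengths; the first run is transparent."""
    out = bytearray()
    for length in runs:
        while length > MAX_RUN:
            out += encode_run(MAX_RUN)
            out += encode_run(0)    # zero-length run of the other color
            length -= MAX_RUN
        out += encode_run(length)
    return bytes(out)


def decode_runs(data):
    """Decode a byte string back into a list of run lengths."""
    runs, i = [], 0
    while i < len(data):
        value = data[i]
        i += 1
        if value >= 0xC0:
            value = ((value - 0xC0) << 8) | data[i]
            i += 1
        runs.append(value)
    return runs


# A row of width 200 that starts with a black pixel:
# zero transparent pixels, 5 black pixels, 195 transparent pixels.
row = encode_row([0, 5, 195])
```

Because the color alternates after every number, the zero-length run emitted
while splitting keeps the following number on the same color as the long run.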
.SS Portable Pixmap (PPM) format
The Portable Pixmap format is a well-known format for representing color
images.  Check the
.BR ppm (5)
man page for complete information.

The data always begin with a text header composed of the two characters "P6",
the number of columns, the number of rows, and the maximal value of
a color component (usually 255).  All numbers are expressed in
decimal
.SM ASCII.
These four items are separated by blank characters (space, tab, carriage
return, or linefeed) or by comment lines introduced by character "#".  The
last number is followed by exactly one character which usually is a linefeed
character.

The rest of the file encodes all the pixels.  Each pixel is represented by
three bytes representing the red, green, and blue components of the pixel.
Pixels are ordered left to right, top to bottom.
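.PP
A minimal Python sketch of the layout summarized above is shown below.
The function name is hypothetical, and the sketch assumes a maximal component
value of at most 255 so that each component fits in one byte.

```python
LINEFEED = bytes([10])  # the single character that ends the header


def ppm_bytes(width, height, pixels, maxval=255):
    """Serialize (red, green, blue) tuples as raw "P6" data.

    The pixels must be given left to right, top to bottom.
    """
    if not 0 < maxval <= 255:
        raise ValueError("this sketch handles one byte per component only")
    header = "P6 {} {} {}".format(width, height, maxval)
    body = bytearray()
    for red, green, blue in pixels:
        body += bytes((red, green, blue))
    if len(body) != 3 * width * height:
        raise ValueError("pixel count does not match the dimensions")
    return header.encode("ascii") + LINEFEED + bytes(body)


# A uniformly white 3x2 background image.
white = ppm_bytes(3, 2, [(255, 255, 255)] * 6)
```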
.SS Comments in separated files
Each page is followed by an arbitrary number of comment lines
starting with character "#" and terminated by a linefeed character.
Comment lines whose first word starts with a capital letter have
special meanings.  The following constructs are currently defined:
.IP "*" 3
.BI "# T " px ":" py " " dx ":" dy " " w "x" h "+" x "+" y " (" string ")"
.br
This construct indicates that the piece of text
.I string
must be associated with an area of size
.IR w "x" h
at position
.IR x "," y
relative to the lower left corner of the page.
The string is UTF-8 encoded.  Special characters
can be escaped as in PostScript using the backslash character.
Integers
.IR px " and " py
represent the position of the current point on the text baseline
before the text was drawn.  The drawing operation then moves the
current point by
.IR dx " and " dy
pixels.
When such comments are present,
.BR csepdjvu
produces a hidden text layer for the
corresponding pages.
.IP "*" 3
.BI "# L " w "x" h "+" x "+" y " (" url ")"
.br
This construct indicates that a hyperlink to
.I url
should be associated with an area of size
.IR w "x" h
at position
.IR x "," y "."
When such comments are present,
.BR csepdjvu
produces pages with an annotation chunk
containing the specified hyperlinks.
.IP "*" 3
.BI "# B " count " (" string ") (#" pageno ")"
.br
This construct provides outline information for the document.
An outline entry entitled
.I string
is associated with page
.IR pageno .
Integer
.I count
indicates how many of the following outline entries must
be attached to the current entry as subentries.
When such comments are present in the first page,
.BR csepdjvu
produces a navigation chunk with
the specified outline.
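.PP
The three comment constructs above can be produced with a small Python
sketch such as the one below.
All function names are hypothetical.
The sketch assumes that escaping the backslash and the two parentheses is
sufficient; the exact set of characters that need escaping is not stated in
this page.

```python
BACKSLASH = chr(92)


def ps_escape(text):
    """Escape backslashes and parentheses with a backslash (assumed set)."""
    out = []
    for ch in text:
        if ch in (BACKSLASH, "(", ")"):
            out.append(BACKSLASH)
        out.append(ch)
    return "".join(out)


def text_comment(px, py, dx, dy, w, h, x, y, string):
    """Format a "# T" comment for one piece of hidden text."""
    return "# T {}:{} {}:{} {}x{}+{}+{} ({})".format(
        px, py, dx, dy, w, h, x, y, ps_escape(string))


def link_comment(w, h, x, y, url):
    """Format a "# L" comment for one hyperlink area."""
    return "# L {}x{}+{}+{} ({})".format(w, h, x, y, ps_escape(url))


def outline_comment(count, title, pageno):
    """Format a "# B" comment for one outline entry."""
    return "# B {} ({}) (#{})".format(count, ps_escape(title), pageno)


# An outline entry with one subentry, both pointing into the document.
lines = [
    outline_comment(1, "Chapter 1", 1),
    outline_comment(0, "Section 1.1", 2),
]
```

Each returned string still needs a terminating linefeed and must be encoded
as UTF-8 before it is appended to the page data.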
.SH CREDITS
This program was initially written by L\('eon Bottou
<leonb@users.sourceforge.net> and was improved by Bill Riemers
<docbill@sourceforge.net> and many others.

.SH SEE ALSO
.BR djvu (1),
.BR ppm (5),
.BR c44 (1)