.\" Copyright (c) 2001-2003 Leon Bottou, Yann Le Cun, Patrick Haffner,
.\" Copyright (c) 2001 AT&T Corp., and Lizardtech, Inc.
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual. Otherwise check the web site
.\" of the Free Software Foundation at http://www.fsf.org.
.TH BZZ 1 "10/11/2001" "DjVuLibre-3.5" "DjVuLibre-3.5"
.de SS
.SH \\0\\0\\0\\$*
..
.SH NAME
bzz \- DjVu general purpose compression utility.
.SH SYNOPSIS
.SS Encoding:
.BI "bzz \-e" "[blocksize]" " " "inputfile" " " "outputfile"
.SS Decoding:
.BI "bzz \-d " "inputfile" " " "outputfile"
.PP
.SH DESCRIPTION
The first form of the command line (option
.BR \-e )
compresses the data from file
.I inputfile
and writes the compressed data into
.IR outputfile .
The second form of the command line (option
.BR \-d )
decompresses the file
.I inputfile
and writes the output to
.IR outputfile .
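.PP
The two command forms above can be sketched as argument lists in Python.
This is only an illustration: the helper names
.B encode_command
and
.B decode_command
and the file names in the usage note are hypothetical, and the block size
is appended directly to
.B \-e
as the synopsis suggests.
.PP
.nf
```python
from typing import List, Optional


def encode_command(inputfile: str, outputfile: str,
                   blocksize: Optional[int] = None) -> List[str]:
    """Build the argument list for the encoding form of the command."""
    # The optional block size (in kilobytes) follows -e without a space.
    flag = "-e" if blocksize is None else "-e%d" % blocksize
    return ["bzz", flag, inputfile, outputfile]


def decode_command(inputfile: str, outputfile: str) -> List[str]:
    """Build the argument list for the decoding form of the command."""
    return ["bzz", "-d", inputfile, outputfile]
```
.fi
.PP
Passing such a list to
.BR subprocess.run ,
for instance the one returned by encode_command("page.txt", "page.bzz", 4096),
would then invoke
.BR bzz ,
assuming it is installed and on the search path.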
.SH OPTIONS
.TP
.B "\-d"
Decoding mode.
.TP
.BI "\-e" "[blocksize]"
Encoding mode.
The optional argument
.I blocksize
specifies the size, in kilobytes, of the input file blocks processed by the
Burrows-Wheeler transform. The default block size is 2048
.SM KB.
The maximal block size is 4096
.SM KB.
Specifying a larger block size usually produces higher compression ratios
and increases the memory requirements of both the encoder and decoder.
A block size larger than the input file brings no additional benefit.
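.PP
As a rough illustration of the block size remarks above, the following Python
sketch counts how many blocks an input of a given size would be split into.
It assumes that one kilobyte means 1024 bytes, which this page does not state,
and the function name
.B block_count
is hypothetical.
.PP
.nf
```python
import math

DEFAULT_BLOCK_KB = 2048  # default block size stated in this page
MAX_BLOCK_KB = 4096      # maximal block size stated in this page


def block_count(file_size: int, block_kb: int = DEFAULT_BLOCK_KB) -> int:
    """Number of blocks needed to cover file_size bytes.

    Assumption of this sketch: one kilobyte is 1024 bytes.
    """
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if not 0 < block_kb <= MAX_BLOCK_KB:
        raise ValueError("block_kb must be between 1 and %d" % MAX_BLOCK_KB)
    return math.ceil(file_size / (block_kb * 1024))
```
.fi
.PP
Once the block size exceeds the input size the count stays at one,
so raising it further changes nothing.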
.SH ALGORITHMS
The Burrows-Wheeler transform is performed using a combination of the
Karp-Miller-Rosenberg and the Bentley-Sedgewick algorithms. This is comparable
to (Sadakane, DCC 98) with a slightly more flexible ranking scheme. Symbols
are then ordered according to a running estimate of their occurrence
frequencies. The symbol ranks are then coded using a simple fixed tree and
the ZP binary adaptive coder (Bottou, DCC 98).
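.PP
To illustrate what the Burrows-Wheeler transform produces, the following
Python sketch computes it for a small block by sorting all rotations, and
inverts it again. This is not the method used by
.BR bzz ,
which relies on the suffix sorting algorithms named above; the naive version
is only practical for tiny inputs. The function names and the use of a zero
byte as an end marker are assumptions of this sketch.
.PP
.nf
```python
def bwt(block: bytes, sentinel: int = 0) -> bytes:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if sentinel in block:
        raise ValueError("sentinel byte must not occur in the block")
    data = block + bytes([sentinel])
    rotations = sorted(data[i:] + data[:i] for i in range(len(data)))
    return bytes(rotation[-1] for rotation in rotations)


def inverse_bwt(last: bytes, sentinel: int = 0) -> bytes:
    """Invert bwt() by repeatedly prepending the last column and sorting."""
    table = [b""] * len(last)
    for _ in range(len(last)):
        table = sorted(bytes([c]) + row for c, row in zip(last, table))
    # The row that ends with the sentinel is the original data.
    original = next(row for row in table if row[-1] == sentinel)
    return original[:-1]
```
.fi
.PP
The transform tends to group identical symbols together, which is what makes
the later ranking and coding stages effective.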
.PP
The Burrows-Wheeler transform is also used in the well-known compressor
.BR bzip2 .
The originality of
.B bzz
lies in its use of the ZP adaptive coder.
The adaptation noise can cost up to 5 percent in
file size, but this penalty is usually offset by the benefits of
adaptation.
.SH PERFORMANCE
The following table shows comparative results (in bits per character)
on the Canterbury Corpus
.RB ( http://corpus.canterbury.ac.nz ).
The very good
.B bzz
performance on the spreadsheet file
.I excl
puts the weighted average ahead of much more sophisticated
compressors such as
.BR fsmx .
.ps -2

.TS
center,box,tab(@);
c s s s s s s s s s s s s s
l c c c c c c c c c c c c c
l n n n n n n n n n n n n n
l n n n n n n n n n n n n n
l n n n n n n n n n n n n n
l n n n n n n n n n n n n n
l nfB n nfB n nfB nfB nfB nfB nfB nfB nfB n nfB
lfB n nfB n nfB n n n n n n n nfB n
.
Compression performance
@text@fax@csrc@excl@sprc@tech@poem@html@lisp@man@play@Weighted@Average
=
\0compress\0@3.27@0.97@3.56@2.41@4.21@3.06@3.38@3.68@3.90@4.43@3.51@2.55@3.31
\0gzip \-9\0@2.85@0.82@2.24@1.63@2.67@2.71@3.23@2.59@2.65@3.31@3.12@2.08@2.53
\0bzip2 \-9\0@2.27@0.78@2.18@1.01@2.70@2.02@2.42@2.48@2.79@3.33@2.53@1.54@2.23
\0ppmd\0@2.31@0.99@2.11@1.08@2.68@2.19@2.48@2.38@2.43@3.00@2.53@1.65@2.20
\0fsmx\0@2.10@0.79@1.89@1.48@2.52@1.84@2.21@2.24@2.29@2.91@2.35@1.63@2.06
\0bzz\0@2.25@0.76@2.13@0.78@2.67@2.00@2.40@2.52@2.60@3.19@2.52@1.44@2.16
.TE
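.PP
The table entries are expressed in bits per character. A minimal Python
sketch of that measure follows, assuming it means eight times the compressed
size divided by the original size, both counted in bytes. The function name
.B bits_per_character
is hypothetical.
.PP
.nf
```python
def bits_per_character(original_size: int, compressed_size: int) -> float:
    """Compressed bits spent per byte of original input (lower is better)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    if compressed_size < 0:
        raise ValueError("compressed_size must not be negative")
    return 8.0 * compressed_size / original_size
```
.fi
.PP
Under this definition an uncompressed copy scores 8.0, and a file reduced to
a quarter of its size scores 2.0.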
.PP
Note that DjVu contributors have several
entries in this table. Program
.B compress
was written some time ago by Joe Orost.
Program
.B ppmd
is an improvement of the
.SM PPM-C
method invented by Paul Howard.
.SH CREDITS
Program
.B bzz
was written by L\('eon Bottou <leonb@users.sourceforge.net> and
was then improved by Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers
<docbill@sourceforge.net> and many others.
.SH SEE ALSO
.BR djvu (1),
.BR compress (1),
.BR gzip (1),
.BR bzip2 (1)