|
Packit |
df99a1 |
//C- -*- C++ -*-
|
|
Packit |
df99a1 |
//C- -------------------------------------------------------------------
|
|
Packit |
df99a1 |
//C- DjVuLibre-3.5
|
|
Packit |
df99a1 |
//C- Copyright (c) 2002 Leon Bottou and Yann Le Cun.
|
|
Packit |
df99a1 |
//C- Copyright (c) 2001 AT&T
|
|
Packit |
df99a1 |
//C-
|
|
Packit |
df99a1 |
//C- This software is subject to, and may be distributed under, the
|
|
Packit |
df99a1 |
//C- GNU General Public License, either Version 2 of the license,
|
|
Packit |
df99a1 |
//C- or (at your option) any later version. The license should have
|
|
Packit |
df99a1 |
//C- accompanied the software or you may obtain a copy of the license
|
|
Packit |
df99a1 |
//C- from the Free Software Foundation at http://www.fsf.org .
|
|
Packit |
df99a1 |
//C-
|
|
Packit |
df99a1 |
//C- This program is distributed in the hope that it will be useful,
|
|
Packit |
df99a1 |
//C- but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
Packit |
df99a1 |
//C- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
Packit |
df99a1 |
//C- GNU General Public License for more details.
|
|
Packit |
df99a1 |
//C-
|
|
Packit |
df99a1 |
//C- DjVuLibre-3.5 is derived from the DjVu(r) Reference Library from
|
|
Packit |
df99a1 |
//C- Lizardtech Software. Lizardtech Software has authorized us to
|
|
Packit |
df99a1 |
//C- replace the original DjVu(r) Reference Library notice by the following
|
|
Packit |
df99a1 |
//C- text (see doc/lizard2002.djvu and doc/lizardtech2007.djvu):
|
|
Packit |
df99a1 |
//C-
|
|
Packit |
df99a1 |
//C- ------------------------------------------------------------------
|
|
Packit |
df99a1 |
//C- | DjVu (r) Reference Library (v. 3.5)
|
|
Packit |
df99a1 |
//C- | Copyright (c) 1999-2001 LizardTech, Inc. All Rights Reserved.
|
|
Packit |
df99a1 |
//C- | The DjVu Reference Library is protected by U.S. Pat. No.
|
|
Packit |
df99a1 |
//C- | 6,058,214 and patents pending.
|
|
Packit |
df99a1 |
//C- |
|
|
Packit |
df99a1 |
//C- | This software is subject to, and may be distributed under, the
|
|
Packit |
df99a1 |
//C- | GNU General Public License, either Version 2 of the license,
|
|
Packit |
df99a1 |
//C- | or (at your option) any later version. The license should have
|
|
Packit |
df99a1 |
//C- | accompanied the software or you may obtain a copy of the license
|
|
Packit |
df99a1 |
//C- | from the Free Software Foundation at http://www.fsf.org .
|
|
Packit |
df99a1 |
//C- |
|
|
Packit |
df99a1 |
//C- | The computer code originally released by LizardTech under this
|
|
Packit |
df99a1 |
//C- | license and unmodified by other parties is deemed "the LIZARDTECH
|
|
Packit |
df99a1 |
//C- | ORIGINAL CODE." Subject to any third party intellectual property
|
|
Packit |
df99a1 |
//C- | claims, LizardTech grants recipient a worldwide, royalty-free,
|
|
Packit |
df99a1 |
//C- | non-exclusive license to make, use, sell, or otherwise dispose of
|
|
Packit |
df99a1 |
//C- | the LIZARDTECH ORIGINAL CODE or of programs derived from the
|
|
Packit |
df99a1 |
//C- | LIZARDTECH ORIGINAL CODE in compliance with the terms of the GNU
|
|
Packit |
df99a1 |
//C- | General Public License. This grant only confers the right to
|
|
Packit |
df99a1 |
//C- | infringe patent claims underlying the LIZARDTECH ORIGINAL CODE to
|
|
Packit |
df99a1 |
//C- | the extent such infringement is reasonably necessary to enable
|
|
Packit |
df99a1 |
//C- | recipient to make, have made, practice, sell, or otherwise dispose
|
|
Packit |
df99a1 |
//C- | of the LIZARDTECH ORIGINAL CODE (or portions thereof) and not to
|
|
Packit |
df99a1 |
//C- | any greater extent that may be necessary to utilize further
|
|
Packit |
df99a1 |
//C- | modifications or combinations.
|
|
Packit |
df99a1 |
//C- |
|
|
Packit |
df99a1 |
//C- | The LIZARDTECH ORIGINAL CODE is provided "AS IS" WITHOUT WARRANTY
|
|
Packit |
df99a1 |
//C- | OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
|
|
Packit |
df99a1 |
//C- | TO ANY WARRANTY OF NON-INFRINGEMENT, OR ANY IMPLIED WARRANTY OF
|
|
Packit |
df99a1 |
//C- | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
|
|
Packit |
df99a1 |
//C- +------------------------------------------------------------------
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
#ifndef _ZPCODEC_H
|
|
Packit |
df99a1 |
#define _ZPCODEC_H
|
|
Packit |
df99a1 |
#ifdef HAVE_CONFIG_H
|
|
Packit |
df99a1 |
#include "config.h"
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
#if NEED_GNUG_PRAGMAS
|
|
Packit |
df99a1 |
# pragma interface
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
// From: Leon Bottou, 1/31/2002
|
|
Packit |
df99a1 |
// Almost equal to my initial code.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
#include "GContainer.h"
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
#ifdef HAVE_NAMESPACES
|
|
Packit |
df99a1 |
namespace DJVU {
|
|
Packit |
df99a1 |
# ifdef NOT_DEFINED // Just to fool emacs c++ mode
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
class ByteStream;
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** @name ZPCodec.h
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Files #"ZPCodec.h"# and #"ZPCodec.cpp"# implement a fast binary adaptive
|
|
Packit |
df99a1 |
quasi-arithmetic coder named ZP-Coder. Because of its speed and
|
|
Packit |
df99a1 |
convenience, the ZP-Coder is used in several parts of the DjVu reference
|
|
Packit |
df99a1 |
library (See \Ref{BSByteStream.h}, \Ref{JB2Image.h}, \Ref{IW44Image.h}).
|
|
Packit |
df99a1 |
The following comments avoid the theory (see the historical remarks for
|
|
Packit |
df99a1 |
useful pointers) and concentrate on the user perspective on the ZP-Coder.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Introduction} ---
|
|
Packit |
df99a1 |
Encoding consists of transforming a sequence of {\em message bits} into a
|
|
Packit |
df99a1 |
sequence of {\em code bits}. Decoding consists of retrieving the message
|
|
Packit |
df99a1 |
bits using only the code bits. We can make the code smaller than the
|
|
Packit |
df99a1 |
message as soon as we can predict a message bit on the basis of a {\em
|
|
Packit |
df99a1 |
coding context} composed of previously encoded or decoded bits. If the
|
|
Packit |
df99a1 |
prediction is always correct, we do not even need to encode the message
|
|
Packit |
df99a1 |
bit. If the prediction is totally unreliable, we need to generate one code
|
|
Packit |
df99a1 |
bit in order to unambiguously specify the message bit. In other words,
|
|
Packit |
df99a1 |
the more reliable the prediction, the more compression we get.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The ZP-Coder handles prediction by means of {\em context variables} (see
|
|
Packit |
df99a1 |
\Ref{BitContext}). There must be a context variable for each possible
|
|
Packit |
df99a1 |
combination of context bits. Both the encoder and the decoder use same
|
|
Packit |
df99a1 |
context variable for coding each message bit. For instance, we can code a
|
|
Packit |
df99a1 |
binary image by successively coding all the pixels (the message bits) in
|
|
Packit |
df99a1 |
row and column order. It is reasonable to assume that each pixel can be
|
|
Packit |
df99a1 |
reasonably well predicted by looking at a few (say 10) neighboring pixels
|
|
Packit |
df99a1 |
located above and to the left of the current pixel. Since these 10 pixels
|
|
Packit |
df99a1 |
make 1024 combinations, we need 1024 context variables. Each pixel is
|
|
Packit |
df99a1 |
encoded using the context variable corresponding to the values of the 10
|
|
Packit |
df99a1 |
neighboring pixels. Each pixel will be decoded by specifying the same
|
|
Packit |
df99a1 |
context variable corresponding to the values of these 10 pixels. This is
|
|
Packit |
df99a1 |
possible because these 10 pixels (located above and to the left) have
|
|
Packit |
df99a1 |
already been decoded and therefore are known by the decoder program.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The context variables are initially set to zero, which mean that we do not
|
|
Packit |
df99a1 |
know yet how to predict the current message bit on the basis of the
|
|
Packit |
df99a1 |
context bits. While coding the message bits, the ZP-Coder automatically
|
|
Packit |
df99a1 |
estimates the frequencies of #0#s and #1#s coded using each context
|
|
Packit |
df99a1 |
variable. These frequencies actually provide a prediction (the most
|
|
Packit |
df99a1 |
probable bit value) and an estimation of the prediction reliability (how
|
|
Packit |
df99a1 |
often the prediction was correct in the past). All this statistical
|
|
Packit |
df99a1 |
information is stored into the context variable after coding each bit. In
|
|
Packit |
df99a1 |
other words, the more we code bits within a particular context, the better
|
|
Packit |
df99a1 |
the ZP-Coder adapts its prediction model, and the more compression we can
|
|
Packit |
df99a1 |
obtain.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
All this adaptation works indeed because both the encoder program and the
|
|
Packit |
df99a1 |
decoder program are always synchronized. Both the encoder and the decoder
|
|
Packit |
df99a1 |
see the same message bits encoded (or decoded) with the same context
|
|
Packit |
df99a1 |
variables. Both the encoder and the decoder apply the same rules to
|
|
Packit |
df99a1 |
update the context variables and improve the predictors. Both the encoder
|
|
Packit |
df99a1 |
and the decoder programs use the same predictors for any given message
|
|
Packit |
df99a1 |
bit. The decoder could not work if this was not the case.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Just before encoding a message bit, all the context variables in the
|
|
Packit |
df99a1 |
encoder program contain certain values. Just before decoding this message
|
|
Packit |
df99a1 |
bit, all the context variables in the decoder program must contain the same
|
|
Packit |
df99a1 |
values as for the encoder program. This is guaranteed as long as
|
|
Packit |
df99a1 |
each prediction only depends on already coded bits: {\em the coding context,
|
|
Packit |
df99a1 |
on which the each prediction is based, must be composed of message bits which
|
|
Packit |
df99a1 |
have already been coded. }
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Usage} ---
|
|
Packit |
df99a1 |
Once you know how to organize the predictions (i.e. which coding context
|
|
Packit |
df99a1 |
to use, how many context variables to initialize, etc.), using the
|
|
Packit |
df99a1 |
ZP-Coder is straightforward (see \Ref{ZPCodec Examples}):
|
|
Packit |
df99a1 |
\begin{itemize}
|
|
Packit |
df99a1 |
\item The {\em encoder program} allocates context variables and
|
|
Packit |
df99a1 |
initializes them to zero. It then constructs a \Ref{ZPCodec} object for
|
|
Packit |
df99a1 |
encoding. For each message bit, the encoder program retrieves the context
|
|
Packit |
df99a1 |
bits, selects a context variable on the basis of the context bits and
|
|
Packit |
df99a1 |
calls member function \Ref{ZPCodec::encoder} with the message bit and a
|
|
Packit |
df99a1 |
reference to the context variable.
|
|
Packit |
df99a1 |
\item The {\em decoder program} allocates context variables and
|
|
Packit |
df99a1 |
initializes them to zero. It then constructs a \Ref{ZPCodec} object for
|
|
Packit |
df99a1 |
decoding. For each message bit, the decoder program retrieves the context
|
|
Packit |
df99a1 |
bits, selects a context variable on the basis of the context bits and
|
|
Packit |
df99a1 |
calls member function \Ref{ZPCodec::decoder} with a reference to the
|
|
Packit |
df99a1 |
context variable. This function returns the message bit.
|
|
Packit |
df99a1 |
\end{itemize}
|
|
Packit |
df99a1 |
Functions #encoder# and #decoder# only require a few machine cycles to
|
|
Packit |
df99a1 |
perform two essential tasks, namely {\em coding} and {\em context
|
|
Packit |
df99a1 |
adaptation}. Function #decoder# often returns after two arithmetic
|
|
Packit |
df99a1 |
operations only. To make your program fast, you just need to feed message
|
|
Packit |
df99a1 |
bits and context variables fast enough.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf History} --- The ZP-Coder is similar in function and performance to
|
|
Packit |
df99a1 |
the seminal Q-Coder (Pennebaker, Mitchell, Langdon, Arps, IBM J. Res
|
|
Packit |
df99a1 |
Dev. 32, 1988). An improved version of the Q-Coder, named QM-Coder, has
|
|
Packit |
df99a1 |
been described in certain parts of the JPEG standard. Unfortunate patent
|
|
Packit |
df99a1 |
policies have made these coders very difficult to use in general purpose
|
|
Packit |
df99a1 |
applications. The Z-Coder is constructed using a new approach based on an
|
|
Packit |
df99a1 |
extension of the Golomb codes (Bottou, Howard, Bengio, IEEE DCC 98, 1998
|
|
Packit |
df99a1 |
\URL[DjVu]{http://www.research.att.com/~leonb/DJVU/bottou-howard-bengio/}
|
|
Packit |
df99a1 |
\URL[PostScript]{http://www.research.att.com/~leonb/PS/bottou-howard-bengio.ps.gz})
|
|
Packit |
df99a1 |
This new approach does not infringe the QM-Coder patents. Unfortunately
|
|
Packit |
df99a1 |
the Z-Coder is dangerously close to the patented Arithmetic MEL Coder.
|
|
Packit |
df99a1 |
Therefore we wrote the ZP-Coder (pronounce Zee-Prime Coder) which we
|
|
Packit |
df99a1 |
believe is clear of legal problems. Needless to say, AT&T has patents
|
|
Packit |
df99a1 |
pending for both the Z-Coder and the ZP-Coder, licenced to LizardTech.
|
|
Packit |
df99a1 |
The good news however is that we can grant a license to use the ZP-Coder
|
|
Packit |
df99a1 |
in ``free software'' without further complication. See the Copyright
|
|
Packit |
df99a1 |
for more information.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
@memo
|
|
Packit |
df99a1 |
Binary adaptive quasi-arithmetic coder.
|
|
Packit |
df99a1 |
@author
|
|
Packit |
df99a1 |
L\'eon Bottou <leonb@research.att.com> */
|
|
Packit |
df99a1 |
//@{
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** Context variable.
|
|
Packit |
df99a1 |
Variables of type #BitContext# hold a single byte describing how to encode
|
|
Packit |
df99a1 |
or decode message bits with similar statistical properties. This single
|
|
Packit |
df99a1 |
byte simultaneously represents the current estimate of the bit probability
|
|
Packit |
df99a1 |
distribution (which is determined by the frequencies of #1#s and #0#s
|
|
Packit |
df99a1 |
already coded with this context) and the confidence in this estimate
|
|
Packit |
df99a1 |
(which determines how fast the estimate can change.)
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
A coding program typically allocates hundreds of context variables. Each
|
|
Packit |
df99a1 |
coding context is initialized to zero before encoding or decoding. Value
|
|
Packit |
df99a1 |
zero represents equal probabilities for #1#s and #0#s with a minimal
|
|
Packit |
df99a1 |
confidence and therefore a maximum adaptation speed. Each message bit is
|
|
Packit |
df99a1 |
encoded using a coding context determined as a function of previously
|
|
Packit |
df99a1 |
encoded message bits. The decoder therefore can examine the previously
|
|
Packit |
df99a1 |
decoded message bits and decode the current bit using the same context as
|
|
Packit |
df99a1 |
the encoder. This is critical for proper decoding.
|
|
Packit |
df99a1 |
*/
|
|
Packit |
df99a1 |
typedef unsigned char BitContext;
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** Performs ZP-Coder encoding and decoding. A ZPCodec object must either
|
|
Packit |
df99a1 |
constructed for encoding or for decoding. The ZPCodec object is connected
|
|
Packit |
df99a1 |
with a \Ref{ByteStream} object specified at construction time. A ZPCodec
|
|
Packit |
df99a1 |
object constructed for decoding reads code bits from the ByteStream and
|
|
Packit |
df99a1 |
returns a message bit whenever function \Ref{decoder} is called. A
|
|
Packit |
df99a1 |
ZPCodec constructed for encoding processes the message bits provided by
|
|
Packit |
df99a1 |
function \Ref{encoder} and writes the corresponding code bits to
|
|
Packit |
df99a1 |
ByteStream #bs#.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
You should never directly access a ByteStream object connected to a valid
|
|
Packit |
df99a1 |
ZPCodec object. The most direct way to access the ByteStream object
|
|
Packit |
df99a1 |
consists of using the "pass-thru" versions of functions \Ref{encoder} and
|
|
Packit |
df99a1 |
\Ref{decoder}.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The ByteStream object can be accessed again after the destruction of the
|
|
Packit |
df99a1 |
ZPCodec object. Note that the encoder always flushes its internal buffers
|
|
Packit |
df99a1 |
and writes a few final code bytes when the ZPCodec object is destroyed.
|
|
Packit |
df99a1 |
Note also that the decoder often reads a few bytes beyond the last code byte
|
|
Packit |
df99a1 |
written by the encoder. This lag means that you must reposition the
|
|
Packit |
df99a1 |
ByteStream after the destruction of the ZPCodec object and before re-using
|
|
Packit |
df99a1 |
the ByteStream object (see \Ref{IFFByteStream}.)
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Please note also that the decoder has no way to reliably indicate the end
|
|
Packit |
df99a1 |
of the message bit sequence. The content of the message must be designed
|
|
Packit |
df99a1 |
in a way which indicates when to stop decoding. Simple ways to achieve
|
|
Packit |
df99a1 |
this consists of announcing the message length at the beginning (like a
|
|
Packit |
df99a1 |
pascal style string), or of defining a termination code (like a null
|
|
Packit |
df99a1 |
terminated string). */
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
class ZPCodec : public GPEnabled {
|
|
Packit |
df99a1 |
protected:
|
|
Packit |
df99a1 |
ZPCodec (GP<ByteStream> gbs, const bool encoding, const bool djvucompat=false);
|
|
Packit |
df99a1 |
public:
|
|
Packit |
df99a1 |
class Encode;
|
|
Packit |
df99a1 |
class Decode;
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/// Non-virtual destructor.
|
|
Packit |
df99a1 |
~ZPCodec();
|
|
Packit |
df99a1 |
/** Constructs a ZP-Coder. If argument #encoding# is zero, the ZP-Coder
|
|
Packit |
df99a1 |
object will read code bits from the ByteStream #bs# and return a message
|
|
Packit |
df99a1 |
bit whenever function #decoder# is called. If flag #encoding# is set
|
|
Packit |
df99a1 |
the ZP-Coder object will process the message bits provided by function
|
|
Packit |
df99a1 |
#encoder# and write code bits to ByteStream #bs#. Optional flag
|
|
Packit |
df99a1 |
#djvucompat# selects a slightly less efficient adaptation table which is
|
|
Packit |
df99a1 |
used by the DjVu project. This is required in order to ensure the
|
|
Packit |
df99a1 |
bitstream compatibility. You should not use this flag unless you want
|
|
Packit |
df99a1 |
to decode JB2, IW44 or BZZ encoded data. */
|
|
Packit |
df99a1 |
static GP<ZPCodec> create(
|
|
Packit |
df99a1 |
GP<ByteStream> gbs, const bool encoding, const bool djvucompat=false);
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** Encodes bit #bit# using context variable #ctx#. Argument #bit# must be
|
|
Packit |
df99a1 |
#0# or #1#. This function should only be used with ZP-Coder objects
|
|
Packit |
df99a1 |
created for encoding. It may modify the contents of variable #ctx# in
|
|
Packit |
df99a1 |
order to perform context adaptation. */
|
|
Packit |
df99a1 |
void encoder(int bit, BitContext &ctx;;
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** Decodes a bit using context variable #ctx#. This function should only be
|
|
Packit |
df99a1 |
used with ZP-Coder objects created for decoding. It may modify the
|
|
Packit |
df99a1 |
contents of variable #ctx# in order to perform context adaptation. */
|
|
Packit |
df99a1 |
int decoder(BitContext &ctx;;
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** Encodes bit #bit# without compression (pass-thru encoder). Argument
|
|
Packit |
df99a1 |
#bit# must be #0# or #1#. No compression will be applied. Calling this
|
|
Packit |
df99a1 |
function always increases the length of the code bit sequence by one
|
|
Packit |
df99a1 |
bit. */
|
|
Packit |
df99a1 |
void encoder(int bit);
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** Decodes a bit without compression (pass-thru decoder). This function
|
|
Packit |
df99a1 |
retrieves bits encoded with the pass-thru encoder. */
|
|
Packit |
df99a1 |
int decoder(void);
|
|
Packit |
df99a1 |
#ifdef ZPCODEC_BITCOUNT
|
|
Packit |
df99a1 |
/** Counter for code bits (requires #-DZPCODEC_BITCOUNT#). This member
|
|
Packit |
df99a1 |
variable is available when the ZP-Coder is compiled with option
|
|
Packit |
df99a1 |
#-DZPCODEC_BITCOUNT#. Variable #bitcount# counts the number of code
|
|
Packit |
df99a1 |
bits processed by the coder since the construction of the object. This
|
|
Packit |
df99a1 |
variable can be used to evaluate how many code bits are spent on various
|
|
Packit |
df99a1 |
components of the message. */
|
|
Packit |
df99a1 |
int bitcount;
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
// Table management (advanced stuff)
|
|
Packit |
df99a1 |
struct Table {
|
|
Packit |
df99a1 |
unsigned short p;
|
|
Packit |
df99a1 |
unsigned short m;
|
|
Packit |
df99a1 |
BitContext up;
|
|
Packit |
df99a1 |
BitContext dn;
|
|
Packit |
df99a1 |
};
|
|
Packit |
df99a1 |
void newtable(ZPCodec::Table *table);
|
|
Packit |
df99a1 |
BitContext state(float prob1);
|
|
Packit |
df99a1 |
// Non-adaptive encoder/decoder
|
|
Packit |
df99a1 |
void encoder_nolearn(int pix, BitContext &ctx;;
|
|
Packit |
df99a1 |
int decoder_nolearn(BitContext &ctx;;
|
|
Packit |
df99a1 |
inline int IWdecoder(void);
|
|
Packit |
df99a1 |
inline void IWencoder(const bool bit);
|
|
Packit |
df99a1 |
protected:
|
|
Packit |
df99a1 |
// coder status
|
|
Packit |
df99a1 |
GP<ByteStream> gbs; // Where the data goes/comes from
|
|
Packit |
df99a1 |
ByteStream *bs; // Where the data goes/comes from
|
|
Packit |
df99a1 |
const bool encoding; // Direction (0=decoding, 1=encoding)
|
|
Packit |
df99a1 |
unsigned char byte;
|
|
Packit |
df99a1 |
unsigned char scount;
|
|
Packit |
df99a1 |
unsigned char delay;
|
|
Packit |
df99a1 |
unsigned int a;
|
|
Packit |
df99a1 |
unsigned int code;
|
|
Packit |
df99a1 |
unsigned int fence;
|
|
Packit |
df99a1 |
unsigned int subend;
|
|
Packit |
df99a1 |
unsigned int buffer;
|
|
Packit |
df99a1 |
unsigned int nrun;
|
|
Packit |
df99a1 |
// table
|
|
Packit |
df99a1 |
unsigned int p[256];
|
|
Packit |
df99a1 |
unsigned int m[256];
|
|
Packit |
df99a1 |
BitContext up[256];
|
|
Packit |
df99a1 |
BitContext dn[256];
|
|
Packit |
df99a1 |
// machine independent ffz
|
|
Packit |
df99a1 |
char ffzt[256];
|
|
Packit |
df99a1 |
// encoder private
|
|
Packit |
df99a1 |
void einit (void);
|
|
Packit |
df99a1 |
void eflush (void);
|
|
Packit |
df99a1 |
void outbit(int bit);
|
|
Packit |
df99a1 |
void zemit(int b);
|
|
Packit |
df99a1 |
void encode_mps(BitContext &ctx, unsigned int z);
|
|
Packit |
df99a1 |
void encode_lps(BitContext &ctx, unsigned int z);
|
|
Packit |
df99a1 |
void encode_mps_simple(unsigned int z);
|
|
Packit |
df99a1 |
void encode_lps_simple(unsigned int z);
|
|
Packit |
df99a1 |
void encode_mps_nolearn(unsigned int z);
|
|
Packit |
df99a1 |
void encode_lps_nolearn(unsigned int z);
|
|
Packit |
df99a1 |
// decoder private
|
|
Packit |
df99a1 |
void dinit(void);
|
|
Packit |
df99a1 |
void preload(void);
|
|
Packit |
df99a1 |
int ffz(unsigned int x);
|
|
Packit |
df99a1 |
int decode_sub(BitContext &ctx, unsigned int z);
|
|
Packit |
df99a1 |
int decode_sub_simple(int mps, unsigned int z);
|
|
Packit |
df99a1 |
int decode_sub_nolearn(int mps, unsigned int z);
|
|
Packit |
df99a1 |
private:
|
|
Packit |
df99a1 |
// no copy allowed (hate c++)
|
|
Packit |
df99a1 |
ZPCodec(const ZPCodec&);
|
|
Packit |
df99a1 |
ZPCodec& operator=(const ZPCodec&);
|
|
Packit |
df99a1 |
#ifdef ZPCODEC_FRIEND
|
|
Packit |
df99a1 |
friend ZPCODEC_FRIEND;
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
};
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
// INLINE CODE
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline void
|
|
Packit |
df99a1 |
ZPCodec::encoder(int bit, BitContext &ctx)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
unsigned int z = a + p[ctx];
|
|
Packit |
df99a1 |
if (bit != (ctx & 1))
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
encode_lps(ctx, z);
|
|
Packit |
df99a1 |
}else if (z >= 0x8000)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
encode_mps(ctx, z);
|
|
Packit |
df99a1 |
}else
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
a = z;
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline int
|
|
Packit |
df99a1 |
ZPCodec::IWdecoder(void)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
return decode_sub_simple(0,0x8000 + ((a+a+a) >> 3));
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline int
|
|
Packit |
df99a1 |
ZPCodec::decoder(BitContext &ctx)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
unsigned int z = a + p[ctx];
|
|
Packit |
df99a1 |
if (z <= fence)
|
|
Packit |
df99a1 |
{ a = z; return (ctx&1;; }
|
|
Packit |
df99a1 |
return decode_sub(ctx, z);
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline void
|
|
Packit |
df99a1 |
ZPCodec::encoder_nolearn(int bit, BitContext &ctx)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
unsigned int z = a + p[ctx];
|
|
Packit |
df99a1 |
if (bit != (ctx & 1))
|
|
Packit |
df99a1 |
encode_lps_nolearn(z);
|
|
Packit |
df99a1 |
else if (z >= 0x8000)
|
|
Packit |
df99a1 |
encode_mps_nolearn(z);
|
|
Packit |
df99a1 |
else
|
|
Packit |
df99a1 |
a = z;
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline int
|
|
Packit |
df99a1 |
ZPCodec::decoder_nolearn(BitContext &ctx)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
unsigned int z = a + p[ctx];
|
|
Packit |
df99a1 |
if (z <= fence)
|
|
Packit |
df99a1 |
{ a = z; return (ctx&1;; }
|
|
Packit |
df99a1 |
return decode_sub_nolearn( (ctx&1), z);
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline void
|
|
Packit |
df99a1 |
ZPCodec::encoder(int bit)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
if (bit)
|
|
Packit |
df99a1 |
encode_lps_simple(0x8000 + (a>>1));
|
|
Packit |
df99a1 |
else
|
|
Packit |
df99a1 |
encode_mps_simple(0x8000 + (a>>1));
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline int
|
|
Packit |
df99a1 |
ZPCodec::decoder(void)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
return decode_sub_simple(0, 0x8000 + (a>>1));
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
inline void
|
|
Packit |
df99a1 |
ZPCodec::IWencoder(const bool bit)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
const int z = 0x8000 + ((a+a+a) >> 3);
|
|
Packit |
df99a1 |
if (bit)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
encode_lps_simple(z);
|
|
Packit |
df99a1 |
}else
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
encode_mps_simple(z);
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
// ------------ ADDITIONAL DOCUMENTATION
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
/** @name ZPCodec Examples
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Binary adaptive coders are efficient and very flexible. Unfortunate
|
|
Packit |
df99a1 |
intellectual property issues however have limited their popularity. As a
|
|
Packit |
df99a1 |
consequence, few programmers have a direct experience of using such a
|
|
Packit |
df99a1 |
coding device. The few examples provided in this section demonstrate how
|
|
Packit |
df99a1 |
we think the ZP-Coder should be used.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Encoding Multivalued Symbols} ---
|
|
Packit |
df99a1 |
Since the ZP-Coder is a strictly binary coder, every message must be
|
|
Packit |
df99a1 |
reduced to a sequence of bits (#0#s or #1#s). It is often convenient to
|
|
Packit |
df99a1 |
consider that a message is a sequence of symbols taking more than two
|
|
Packit |
df99a1 |
values. For instance, a character string may be a sequence of bytes, and
|
|
Packit |
df99a1 |
each byte can take 256 values. Each byte of course is composed of eight
|
|
Packit |
df99a1 |
bits that we can encode in sequence. The real issue however consists of
|
|
Packit |
df99a1 |
deciding how we will use context variables in order to let the ZP-Coder
|
|
Packit |
df99a1 |
learn the probability distribution of the byte values.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The most significant bit #b0# decides whether the byte is in range 0..127
|
|
Packit |
df99a1 |
or in range 128..255. We let the ZP-Coder learn how to predict this bit
|
|
Packit |
df99a1 |
by allocating one context variable for it. The second most significant
|
|
Packit |
df99a1 |
byte #b1# has two distinct meanings depending of bit #b0#. If bit #b0# is
|
|
Packit |
df99a1 |
#0#, bit #b1# decides whether the byte is in range 0..63 or 64..127. If
|
|
Packit |
df99a1 |
bit #b0# is #1#, bit #b1# decides whether the byte is in range 128..191 or
|
|
Packit |
df99a1 |
192..255. The prediction for bit #b1# must therefore depend on the value
|
|
Packit |
df99a1 |
of #b0#. This is why we will allocate two context variables for this bit.
|
|
Packit |
df99a1 |
If bit #b0# is #0#, we will use the first variable; if bit #b0# is #1#, we
|
|
Packit |
df99a1 |
will use the second variable. The next bit #b2# has four meanings and
|
|
Packit |
df99a1 |
therefore we will use four context variables, etc. This analysis leads to
|
|
Packit |
df99a1 |
a total of #1+2+4+...+128# = #255# context variables for encoding one
|
|
Packit |
df99a1 |
byte. This encoding procedure can be understood as a binary decision
|
|
Packit |
df99a1 |
tree with a dedicated context variable for predicting each decision.
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
[>=128]----n---[>=64?]----n----[>31?] ...
|
|
Packit |
df99a1 |
\ `---y----[>95?] ...
|
|
Packit |
df99a1 |
\
|
|
Packit |
df99a1 |
`--y---[>=192?]----n---[>=160?] ...
|
|
Packit |
df99a1 |
`---y---[>=224?] ...
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
The following decoding function illustrates a very compact way to
|
|
Packit |
df99a1 |
implement such a decision tree. Argument #ctx# points to an array of 255
|
|
Packit |
df99a1 |
#BitContext# variables. Macro #REPEAT8# is a shorthand notation for eight
|
|
Packit |
df99a1 |
repetitions of its argument.
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
int decode_8_bits(ZPCodec &zp, BitContext *ctx )
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
int n = 1;
|
|
Packit |
df99a1 |
REPEAT8( { n = (n<<1) | (zp.decoder(ctx[n-1])); } );
|
|
Packit |
df99a1 |
return n & 0xff;
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
The binary representation of variable #n# is always composed of a #1#
|
|
Packit |
df99a1 |
followed by whichever bits have been decoded so far. This extra bit #1# in
|
|
Packit |
df99a1 |
fact is a nice trick to flatten out the tree structure and directly
|
|
Packit |
df99a1 |
address the array of context variables. Bit #b0# is decoded using the
|
|
Packit |
df99a1 |
first context variable since #n# is initially #1#. Bit #b1# is decoded
|
|
Packit |
df99a1 |
using one of the next two variables in the array, since #n# is either #2#
|
|
Packit |
df99a1 |
(#10# in binary) or #3# (#11# in binary). Bit #b2# will be decoded using
|
|
Packit |
df99a1 |
one of the next four variables, since #n# ranges from #4# (#100# in
|
|
Packit |
df99a1 |
binary) to #7# (#111# in binary). The final result is given by removing
|
|
Packit |
df99a1 |
the extra #1# in variable #n#.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The corresponding encoding function is almost as compact. Argument #ctx#
|
|
Packit |
df99a1 |
again is an array of 255 #BitContext# variables. Each bit of byte #x# is
|
|
Packit |
df99a1 |
encoded and shifted into variable #n# as in the decoding function.
|
|
Packit |
df99a1 |
Variable #x# in fact contains the bits to be encoded. Variable #n#
|
|
Packit |
df99a1 |
contains a #1# followed by the already encoded bits.
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
void encode_8_bits(ZPCodec &zp, int x, BitContext *ctx )
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
int n = 1;
|
|
Packit |
df99a1 |
REPEAT8( { int b=((x&0x80)?1:0); x=(x<<1);
|
|
Packit |
df99a1 |
zp.encoder(b,ctx[n-1]); n=(n<<1)|(b); } );
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
The ZP-Coder automatically adjusts the content of the context variables
|
|
Packit |
df99a1 |
while coding (recall the context variable argument is passed to functions
|
|
Packit |
df99a1 |
#encoder# and #decoder# by reference). The whole array of 255 context
|
|
Packit |
df99a1 |
variables can be understood as a "byte context variable". The estimated
|
|
Packit |
df99a1 |
probability of each byte value is indeed the product of the estimated
|
|
Packit |
df99a1 |
probabilities of the eight binary decisions that lead to that value in the
|
|
Packit |
df99a1 |
decision tree. All these probabilities are adapted by the underlying
|
|
Packit |
df99a1 |
adaptation algorithm of the ZP-Coder.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Application} ---
|
|
Packit |
df99a1 |
We consider now a simple applications consisting of encoding the
|
|
Packit |
df99a1 |
horizontal and vertical coordinates of a cloud of points. Each coordinate
|
|
Packit |
df99a1 |
requires one byte. The following function illustrates a possible
|
|
Packit |
df99a1 |
implementation:
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
void encode_points(const char *filename, int n, int *x, int *y)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
StdioByteStream bs(filename, "wb");
|
|
Packit |
df99a1 |
bs.write32(n); // Write number of points.
|
|
Packit |
df99a1 |
ZPCodec zp(bs, 1); // Construct encoder and context vars.
|
|
Packit |
df99a1 |
BitContext ctxX[255], ctxY[255];
|
|
Packit |
df99a1 |
memset(ctxX, 0, sizeof(ctxX));
|
|
Packit |
df99a1 |
memset(ctxY, 0, sizeof(ctxY));
|
|
Packit |
df99a1 |
for (int i=0; i
|
|
Packit |
df99a1 |
encode_8_bits(zp, x[i], ctxX);
|
|
Packit |
df99a1 |
encode_8_bits(zp, y[i], ctxY);
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
The decoding function is very similar to the encoding function:
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
int decode_points(const char *filename, int *x, int *y)
|
|
Packit |
df99a1 |
{
|
|
Packit |
df99a1 |
StdioByteStream bs(filename,"rb");
|
|
Packit |
df99a1 |
int n = bs.read32(); // Read number of points.
|
|
Packit |
df99a1 |
ZPCodec zp(bs, 0); // Construct decoder and context vars.
|
|
Packit |
df99a1 |
BitContext ctxX[255], ctxY[255];
|
|
Packit |
df99a1 |
memset(ctxX, 0, sizeof(ctxX));
|
|
Packit |
df99a1 |
memset(ctxY, 0, sizeof(ctxY));
|
|
Packit |
df99a1 |
for (int i=0; i
|
|
Packit |
df99a1 |
x[i] = decode_8_bits(zp, ctxX);
|
|
Packit |
df99a1 |
y[i] = decode_8_bits(zp, ctxY);
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
return n; // Return number of points.
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
The ZP-Coder automatically estimates the probability distributions of both
|
|
Packit |
df99a1 |
the horizontal and vertical coordinates. These estimates are used to
|
|
Packit |
df99a1 |
efficiently encode the point coordinates. This particular implementation
|
|
Packit |
df99a1 |
is a good option if we assume that the order of the points is significant
|
|
Packit |
df99a1 |
and that successive points are independent. It would be much smarter
|
|
Packit |
df99a1 |
otherwise to sort the points and encode relative displacements between
|
|
Packit |
df99a1 |
successive points.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Huffman Coding Tricks} ---
|
|
Packit |
df99a1 |
Programmers with experience in Huffman codes can see the similarity in the
|
|
Packit |
df99a1 |
ZP-Coder. Huffman codes also organize the symbol values as a decision
|
|
Packit |
df99a1 |
tree. The tree is balanced in such a way that each decision is as
|
|
Packit |
df99a1 |
unpredictable as possible (i.e. both branches must be equally probable).
|
|
Packit |
df99a1 |
This is very close to the ZP-Coder technique described above. Since we
|
|
Packit |
df99a1 |
allocate one context variable for each decision, our tree need not be
|
|
Packit |
df99a1 |
balanced: the context variable will track the decision statistics and the
|
|
Packit |
df99a1 |
ZP-Coder will compensate optimally.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
There are good reasons however to avoid unbalanced trees with the ZP-Coder.
|
|
Packit |
df99a1 |
Frequent symbol values may be located quite deep in a poorly balanced
|
|
Packit |
df99a1 |
tree. This increases the average number of message bits (the number of
|
|
Packit |
df99a1 |
decisions) required to code a symbol. The ZP-Coder will be called more
|
|
Packit |
df99a1 |
often, making the coding program slower. Furthermore, each message
|
|
Packit |
df99a1 |
bit is encoded using an estimated distribution. All these useless message
|
|
Packit |
df99a1 |
bits mean that the ZP-Coder has more distributions to adapt. This
|
|
Packit |
df99a1 |
extra adaptation work will probably increase the file size.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Huffman codes are very fast when the tree structure is fixed beforehand.
|
|
Packit |
df99a1 |
Such {\em static Huffman codes} are unfortunately not very efficient
|
|
Packit |
df99a1 |
because the tree never matches the actual data distribution. This is why
|
|
Packit |
df99a1 |
such programs almost always define a data dependent tree structure. This
|
|
Packit |
df99a1 |
structure must then be encoded in the file since the decoder must know it
|
|
Packit |
df99a1 |
before decoding the symbols. Static Huffman codes however become very
|
|
Packit |
df99a1 |
efficient when decisions are encoded with the ZP-Coder. The tree
|
|
Packit |
df99a1 |
structure represents a priori knowledge about the distribution of the
|
|
Packit |
df99a1 |
symbol values. Small data discrepancies will be addressed transparently
|
|
Packit |
df99a1 |
by the ZP-Coder.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Encoding Numbers} ---
|
|
Packit |
df99a1 |
This technique is illustrated with the following number encoding example.
|
|
Packit |
df99a1 |
The multivalued technique described above is not practical with large
|
|
Packit |
df99a1 |
numbers because the decision tree has too many nodes and requires too many
|
|
Packit |
df99a1 |
context variables. This problem can be solved by using a priori knowledge
|
|
Packit |
df99a1 |
about the probability distribution of our numbers.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Assume for instance that the distribution is symmetrical and that small
|
|
Packit |
df99a1 |
numbers are much more probable than large numbers. We will first group
|
|
Packit |
df99a1 |
our numbers into several sets. Each number is coded by first coding which
|
|
Packit |
df99a1 |
set contains the number and then coding a position within the set. Each
|
|
Packit |
df99a1 |
set contains #2^n# numbers that we consider roughly equiprobable. Since
|
|
Packit |
df99a1 |
the most probable values occur much more often, we want to model their
|
|
Packit |
df99a1 |
probability more precisely. Therefore we use small sets for the most
|
|
Packit |
df99a1 |
probable values and large sets for the least probable values, as
|
|
Packit |
df99a1 |
demonstrated below.
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
A---------------- {0} (size=1)
|
|
Packit |
df99a1 |
`------B---C---- {1} or {-1} (size=1)
|
|
Packit |
df99a1 |
\ `--- {2,3} or {-2,-3} (size=2)
|
|
Packit |
df99a1 |
D------ {4...131} or {-4...-131} (size=128)
|
|
Packit |
df99a1 |
`----- {132...32899} or {-132...-32899} (size=32768)
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
We then organize a decision tree for coding the set identifier. This
|
|
Packit |
df99a1 |
decision tree is balanced using whatever a priori knowledge we have about
|
|
Packit |
df99a1 |
the probability distribution of the number values, just like a static
|
|
Packit |
df99a1 |
Huffman tree. Each decision (except the sign decision) is then coded
|
|
Packit |
df99a1 |
using a dedicated context variable.
|
|
Packit |
df99a1 |
\begin{verbatim}
|
|
Packit |
df99a1 |
if (! zp.decoder(ctx_A)) { // decision A
|
|
Packit |
df99a1 |
return 0;
|
|
Packit |
df99a1 |
} else {
|
|
Packit |
df99a1 |
if (! zp.decoder(ctx_B)) { // + decision B
|
|
Packit |
df99a1 |
if (! zp.decoder(ctx_C)) { // ++ decision C
|
|
Packit |
df99a1 |
if (! zp.decoder()) // +++ sign decision
|
|
Packit |
df99a1 |
return +1;
|
|
Packit |
df99a1 |
else
|
|
Packit |
df99a1 |
return -1;
|
|
Packit |
df99a1 |
} else {
|
|
Packit |
df99a1 |
if (! zp.decoder()) // +++ sign decision
|
|
Packit |
df99a1 |
return + 2 + zp.decoder();
|
|
Packit |
df99a1 |
else
|
|
Packit |
df99a1 |
return - 2 - zp.decoder();
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
} else {
|
|
Packit |
df99a1 |
if (! zp.decoder(ctx_D)) { // ++ decision D
|
|
Packit |
df99a1 |
if (! zp.decoder()) // +++ sign decision
|
|
Packit |
df99a1 |
return + 4 + decode_7_bits(zp);
|
|
Packit |
df99a1 |
else
|
|
Packit |
df99a1 |
return - 4 - decode_7_bits(zp);
|
|
Packit |
df99a1 |
} else {
|
|
Packit |
df99a1 |
if (! zp.decoder()) // +++ sign decision
|
|
Packit |
df99a1 |
return + 132 + decode_15_bits(zp);
|
|
Packit |
df99a1 |
else
|
|
Packit |
df99a1 |
return - 132 - decode_15_bits(zp);
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
\end{verbatim}
|
|
Packit |
df99a1 |
Note that the call #zp.decoder()# for coding the sign decision does not use
|
|
Packit |
df99a1 |
a context variable. This is a "pass-thru" variant of \Ref{decoder} which
|
|
Packit |
df99a1 |
bypasses the ZP-Coder and just reads a bit from the code sequence. There
|
|
Packit |
df99a1 |
is a corresponding "pass-thru" version of \Ref{encoder} for encoding such
|
|
Packit |
df99a1 |
bits. Similarly, functions #decode_7_bits# and #decode_15_bits# do not
|
|
Packit |
df99a1 |
take an array of context variables because, unlike function #decode_8_bits#
|
|
Packit |
df99a1 |
listed above, they are based on the pass-thru decoder instead of the
|
|
Packit |
df99a1 |
regular decoder.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
The ZP-Coder will not learn the probabilities of the numbers within a set
|
|
Packit |
df99a1 |
since no context variables have been allocated for that purpose. This
|
|
Packit |
df99a1 |
could be improved by allocating additional context variables for encoding
|
|
Packit |
df99a1 |
the position within the smaller sets and using the regular decoding
|
|
Packit |
df99a1 |
functions instead of the pass-thru variants. Only experimentation can tell
|
|
Packit |
df99a1 |
what works best for your particular encoding problem.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Understanding Adaptation} ---
|
|
Packit |
df99a1 |
We have so far explained that the ZP-Coder adaptation algorithm is able to
|
|
Packit |
df99a1 |
quickly estimate of the probability distribution of the message bits coded
|
|
Packit |
df99a1 |
using a particular context variable. It is also able to track slow
|
|
Packit |
df99a1 |
variations when the actual probabilities change while coding.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Let us consider the ``cloud of points'' application presented above.
|
|
Packit |
df99a1 |
Suppose that we first code points located towards the left side and then
|
|
Packit |
df99a1 |
slowly move towards points located on the right side. The ZP-Coder will
|
|
Packit |
df99a1 |
first estimate that the X coordinates are rather on the left side. This
|
|
Packit |
df99a1 |
estimation will be progressively revised after seeing more points on the
|
|
Packit |
df99a1 |
right side. Such an ordering of the points obviously violates the point
|
|
Packit |
df99a1 |
independence assumption on which our code is based. Despite our inexact
|
|
Packit |
df99a1 |
assumptions, the tracking mechanism allows for better prediction of the X
|
|
Packit |
df99a1 |
coordinates and therefore better compression.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
However, this is not a perfect solution. The ZP-Coder tracks the changes
|
|
Packit |
df99a1 |
because every point seems to be a little bit more on the right side than
|
|
Packit |
df99a1 |
suggested by the previous points. The ZP-Coder coding algorithm is always
|
|
Packit |
df99a1 |
slightly misadjusted and we always lose a little on possible compression
|
|
Packit |
df99a1 |
ratio. This is not much of a problem when the probabilities drift slowly.
|
|
Packit |
df99a1 |
On the other hand, this can be very significant if the probabilities change
|
|
Packit |
df99a1 |
drastically.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Adaptation is always associated with a small loss of efficiency. The
|
|
Packit |
df99a1 |
ZP-Coder updates the probability model whenever it suspects, {\em after
|
|
Packit |
df99a1 |
coding}, that the current settings were not optimal. The model will be
|
|
Packit |
df99a1 |
better next time, but a slight loss in compression has occurred. The
|
|
Packit |
df99a1 |
design of ZP-Coder of course minimizes this effect as much as possible.
|
|
Packit |
df99a1 |
Yet you will pay a price if you ask too much to the adaptation algorithm.
|
|
Packit |
df99a1 |
If you have millions of context variables, it will be difficult to train
|
|
Packit |
df99a1 |
them all. If the probability distributions change drastically while
|
|
Packit |
df99a1 |
coding, it will be difficult to track the changes fast enough.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
Adaptation on the other hand is a great simplification. A good data
|
|
Packit |
df99a1 |
compression program must (a) represent the data in order to make its
|
|
Packit |
df99a1 |
predictability apparent, and (b) perform the predictions and generate the
|
|
Packit |
df99a1 |
code bits. The ZP-Coder is an efficient and effortless solution for
|
|
Packit |
df99a1 |
implementing task (b).
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
{\bf Practical Debugging Tricks} ---
|
|
Packit |
df99a1 |
Sometimes you write an encoding program and a decoding program.
|
|
Packit |
df99a1 |
Unfortunately there is a bug: the decoding program decodes half the file
|
|
Packit |
df99a1 |
and then just outputs garbage. There is a simple way to locate the
|
|
Packit |
df99a1 |
problem. In the encoding program, after each call to #encoder#, print the
|
|
Packit |
df99a1 |
encoded bit and the value of the context variable. In the decoding
|
|
Packit |
df99a1 |
program, after each call to #decoder#, print the decoded bit and the value
|
|
Packit |
df99a1 |
of the context variable. Both program should print exactly the same thing.
|
|
Packit |
df99a1 |
When you find the difference, you find the bug.
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
@memo Suggestions for efficiently using the ZP-Coder. */
|
|
Packit |
df99a1 |
//@}
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
// ------------ THE END
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
#ifdef HAVE_NAMESPACES
|
|
Packit |
df99a1 |
}
|
|
Packit |
df99a1 |
# ifndef NOT_USING_DJVU_NAMESPACE
|
|
Packit |
df99a1 |
using namespace DJVU;
|
|
Packit |
df99a1 |
# endif
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
#endif
|
|
Packit |
df99a1 |
|
|
Packit |
df99a1 |
|