Blame libdjvu/ZPCodec.h

Packit df99a1
//C-  -*- C++ -*-
Packit df99a1
//C- -------------------------------------------------------------------
Packit df99a1
//C- DjVuLibre-3.5
Packit df99a1
//C- Copyright (c) 2002  Leon Bottou and Yann Le Cun.
Packit df99a1
//C- Copyright (c) 2001  AT&T
Packit df99a1
//C-
Packit df99a1
//C- This software is subject to, and may be distributed under, the
Packit df99a1
//C- GNU General Public License, either Version 2 of the license,
Packit df99a1
//C- or (at your option) any later version. The license should have
Packit df99a1
//C- accompanied the software or you may obtain a copy of the license
Packit df99a1
//C- from the Free Software Foundation at http://www.fsf.org .
Packit df99a1
//C-
Packit df99a1
//C- This program is distributed in the hope that it will be useful,
Packit df99a1
//C- but WITHOUT ANY WARRANTY; without even the implied warranty of
Packit df99a1
//C- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
Packit df99a1
//C- GNU General Public License for more details.
Packit df99a1
//C- 
Packit df99a1
//C- DjVuLibre-3.5 is derived from the DjVu(r) Reference Library from
Packit df99a1
//C- Lizardtech Software.  Lizardtech Software has authorized us to
Packit df99a1
//C- replace the original DjVu(r) Reference Library notice by the following
Packit df99a1
//C- text (see doc/lizard2002.djvu and doc/lizardtech2007.djvu):
Packit df99a1
//C-
Packit df99a1
//C-  ------------------------------------------------------------------
Packit df99a1
//C- | DjVu (r) Reference Library (v. 3.5)
Packit df99a1
//C- | Copyright (c) 1999-2001 LizardTech, Inc. All Rights Reserved.
Packit df99a1
//C- | The DjVu Reference Library is protected by U.S. Pat. No.
Packit df99a1
//C- | 6,058,214 and patents pending.
Packit df99a1
//C- |
Packit df99a1
//C- | This software is subject to, and may be distributed under, the
Packit df99a1
//C- | GNU General Public License, either Version 2 of the license,
Packit df99a1
//C- | or (at your option) any later version. The license should have
Packit df99a1
//C- | accompanied the software or you may obtain a copy of the license
Packit df99a1
//C- | from the Free Software Foundation at http://www.fsf.org .
Packit df99a1
//C- |
Packit df99a1
//C- | The computer code originally released by LizardTech under this
Packit df99a1
//C- | license and unmodified by other parties is deemed "the LIZARDTECH
Packit df99a1
//C- | ORIGINAL CODE."  Subject to any third party intellectual property
Packit df99a1
//C- | claims, LizardTech grants recipient a worldwide, royalty-free, 
Packit df99a1
//C- | non-exclusive license to make, use, sell, or otherwise dispose of 
Packit df99a1
//C- | the LIZARDTECH ORIGINAL CODE or of programs derived from the 
Packit df99a1
//C- | LIZARDTECH ORIGINAL CODE in compliance with the terms of the GNU 
Packit df99a1
//C- | General Public License.   This grant only confers the right to 
Packit df99a1
//C- | infringe patent claims underlying the LIZARDTECH ORIGINAL CODE to 
Packit df99a1
//C- | the extent such infringement is reasonably necessary to enable 
Packit df99a1
//C- | recipient to make, have made, practice, sell, or otherwise dispose 
Packit df99a1
//C- | of the LIZARDTECH ORIGINAL CODE (or portions thereof) and not to 
Packit df99a1
//C- | any greater extent that may be necessary to utilize further 
Packit df99a1
//C- | modifications or combinations.
Packit df99a1
//C- |
Packit df99a1
//C- | The LIZARDTECH ORIGINAL CODE is provided "AS IS" WITHOUT WARRANTY
Packit df99a1
//C- | OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
Packit df99a1
//C- | TO ANY WARRANTY OF NON-INFRINGEMENT, OR ANY IMPLIED WARRANTY OF
Packit df99a1
//C- | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Packit df99a1
//C- +------------------------------------------------------------------
Packit df99a1
Packit df99a1
#ifndef _ZPCODEC_H
Packit df99a1
#define _ZPCODEC_H
Packit df99a1
#ifdef HAVE_CONFIG_H
Packit df99a1
#include "config.h"
Packit df99a1
#endif
Packit df99a1
#if NEED_GNUG_PRAGMAS
Packit df99a1
# pragma interface
Packit df99a1
#endif
Packit df99a1
Packit df99a1
// From: Leon Bottou, 1/31/2002
Packit df99a1
// Almost equal to my initial code.
Packit df99a1
Packit df99a1
#include "GContainer.h"
Packit df99a1
Packit df99a1
#ifdef HAVE_NAMESPACES
Packit df99a1
namespace DJVU {
Packit df99a1
# ifdef NOT_DEFINED // Just to fool emacs c++ mode
Packit df99a1
}
Packit df99a1
#endif
Packit df99a1
#endif
Packit df99a1
Packit df99a1
class ByteStream;
Packit df99a1
Packit df99a1
Packit df99a1
Packit df99a1
/** @name ZPCodec.h
Packit df99a1
    
Packit df99a1
    Files #"ZPCodec.h"# and #"ZPCodec.cpp"# implement a fast binary adaptive
Packit df99a1
    quasi-arithmetic coder named ZP-Coder.  Because of its speed and
Packit df99a1
    convenience, the ZP-Coder is used in several parts of the DjVu reference
Packit df99a1
    library (See \Ref{BSByteStream.h}, \Ref{JB2Image.h}, \Ref{IW44Image.h}).
Packit df99a1
    The following comments avoid the theory (see the historical remarks for
Packit df99a1
    useful pointers) and concentrate on the user perspective on the ZP-Coder.
Packit df99a1
Packit df99a1
    {\bf Introduction} ---
Packit df99a1
    Encoding consists of transforming a sequence of {\em message bits} into a
Packit df99a1
    sequence of {\em code bits}. Decoding consists of retrieving the message
Packit df99a1
    bits using only the code bits.  We can make the code smaller than the
Packit df99a1
    message as soon as we can predict a message bit on the basis of a {\em
Packit df99a1
    coding context} composed of previously encoded or decoded bits. If the
Packit df99a1
    prediction is always correct, we do not even need to encode the message
Packit df99a1
    bit. If the prediction is totally unreliable, we need to generate one code
Packit df99a1
    bit in order to unambiguously specify the message bit.  In other words,
Packit df99a1
    the more reliable the prediction, the more compression we get.
Packit df99a1
Packit df99a1
    The ZP-Coder handles prediction by means of {\em context variables} (see
Packit df99a1
    \Ref{BitContext}).  There must be a context variable for each possible
Packit df99a1
    combination of context bits.  Both the encoder and the decoder use same
Packit df99a1
    context variable for coding each message bit.  For instance, we can code a
Packit df99a1
    binary image by successively coding all the pixels (the message bits) in
Packit df99a1
    row and column order.  It is reasonable to assume that each pixel can be
Packit df99a1
    reasonably well predicted by looking at a few (say 10) neighboring pixels
Packit df99a1
    located above and to the left of the current pixel.  Since these 10 pixels
Packit df99a1
    make 1024 combinations, we need 1024 context variables. Each pixel is
Packit df99a1
    encoded using the context variable corresponding to the values of the 10
Packit df99a1
    neighboring pixels.  Each pixel will be decoded by specifying the same
Packit df99a1
    context variable corresponding to the values of these 10 pixels. This is
Packit df99a1
    possible because these 10 pixels (located above and to the left) have
Packit df99a1
    already been decoded and therefore are known by the decoder program.
Packit df99a1
Packit df99a1
    The context variables are initially set to zero, which mean that we do not
Packit df99a1
    know yet how to predict the current message bit on the basis of the
Packit df99a1
    context bits. While coding the message bits, the ZP-Coder automatically
Packit df99a1
    estimates the frequencies of #0#s and #1#s coded using each context
Packit df99a1
    variable.  These frequencies actually provide a prediction (the most
Packit df99a1
    probable bit value) and an estimation of the prediction reliability (how
Packit df99a1
    often the prediction was correct in the past).  All this statistical
Packit df99a1
    information is stored into the context variable after coding each bit.  In
Packit df99a1
    other words, the more we code bits within a particular context, the better
Packit df99a1
    the ZP-Coder adapts its prediction model, and the more compression we can
Packit df99a1
    obtain.
Packit df99a1
Packit df99a1
    All this adaptation works indeed because both the encoder program and the
Packit df99a1
    decoder program are always synchronized. Both the encoder and the decoder
Packit df99a1
    see the same message bits encoded (or decoded) with the same context
Packit df99a1
    variables.  Both the encoder and the decoder apply the same rules to
Packit df99a1
    update the context variables and improve the predictors.  Both the encoder
Packit df99a1
    and the decoder programs use the same predictors for any given message
Packit df99a1
    bit.  The decoder could not work if this was not the case.
Packit df99a1
    
Packit df99a1
    Just before encoding a message bit, all the context variables in the
Packit df99a1
    encoder program contain certain values. Just before decoding this message
Packit df99a1
    bit, all the context variables in the decoder program must contain the same
Packit df99a1
    values as for the encoder program.  This is guaranteed as long as
Packit df99a1
    each prediction only depends on already coded bits: {\em the coding context,
Packit df99a1
    on which the each prediction is based, must be composed of message bits which
Packit df99a1
    have already been coded. }
Packit df99a1
Packit df99a1
    {\bf Usage} ---
Packit df99a1
    Once you know how to organize the predictions (i.e. which coding context
Packit df99a1
    to use, how many context variables to initialize, etc.), using the
Packit df99a1
    ZP-Coder is straightforward (see \Ref{ZPCodec Examples}):
Packit df99a1
    \begin{itemize}
Packit df99a1
    \item The {\em encoder program} allocates context variables and
Packit df99a1
    initializes them to zero. It then constructs a \Ref{ZPCodec} object for
Packit df99a1
    encoding. For each message bit, the encoder program retrieves the context
Packit df99a1
    bits, selects a context variable on the basis of the context bits and
Packit df99a1
    calls member function \Ref{ZPCodec::encoder} with the message bit and a
Packit df99a1
    reference to the context variable.
Packit df99a1
    \item The {\em decoder program} allocates context variables and
Packit df99a1
    initializes them to zero. It then constructs a \Ref{ZPCodec} object for
Packit df99a1
    decoding. For each message bit, the decoder program retrieves the context
Packit df99a1
    bits, selects a context variable on the basis of the context bits and
Packit df99a1
    calls member function \Ref{ZPCodec::decoder} with a reference to the
Packit df99a1
    context variable. This function returns the message bit.
Packit df99a1
    \end{itemize}
Packit df99a1
    Functions #encoder# and #decoder# only require a few machine cycles to
Packit df99a1
    perform two essential tasks, namely {\em coding} and {\em context
Packit df99a1
    adaptation}.  Function #decoder# often returns after two arithmetic
Packit df99a1
    operations only.  To make your program fast, you just need to feed message
Packit df99a1
    bits and context variables fast enough.
Packit df99a1
Packit df99a1
    {\bf History} --- The ZP-Coder is similar in function and performance to
Packit df99a1
    the seminal Q-Coder (Pennebaker, Mitchell, Langdon, Arps, IBM J. Res
Packit df99a1
    Dev. 32, 1988). An improved version of the Q-Coder, named QM-Coder, has
Packit df99a1
    been described in certain parts of the JPEG standard.  Unfortunate patent
Packit df99a1
    policies have made these coders very difficult to use in general purpose
Packit df99a1
    applications.  The Z-Coder is constructed using a new approach based on an
Packit df99a1
    extension of the Golomb codes (Bottou, Howard, Bengio, IEEE DCC 98, 1998
Packit df99a1
    \URL[DjVu]{http://www.research.att.com/~leonb/DJVU/bottou-howard-bengio/}
Packit df99a1
    \URL[PostScript]{http://www.research.att.com/~leonb/PS/bottou-howard-bengio.ps.gz})
Packit df99a1
    This new approach does not infringe the QM-Coder patents.  Unfortunately
Packit df99a1
    the Z-Coder is dangerously close to the patented Arithmetic MEL Coder.
Packit df99a1
    Therefore we wrote the ZP-Coder (pronounce Zee-Prime Coder) which we
Packit df99a1
    believe is clear of legal problems.  Needless to say, AT&T has patents
Packit df99a1
    pending for both the Z-Coder and the ZP-Coder, licenced to LizardTech.
Packit df99a1
    The good news however is that we can grant a license to use the ZP-Coder
Packit df99a1
    in ``free software'' without further complication. See the Copyright
Packit df99a1
    for more information.
Packit df99a1
    
Packit df99a1
    @memo
Packit df99a1
    Binary adaptive quasi-arithmetic coder.
Packit df99a1
    @author
Packit df99a1
    L\'eon Bottou <leonb@research.att.com> */
Packit df99a1
//@{
Packit df99a1
Packit df99a1
Packit df99a1
/** Context variable.  
Packit df99a1
    Variables of type #BitContext# hold a single byte describing how to encode
Packit df99a1
    or decode message bits with similar statistical properties.  This single
Packit df99a1
    byte simultaneously represents the current estimate of the bit probability
Packit df99a1
    distribution (which is determined by the frequencies of #1#s and #0#s
Packit df99a1
    already coded with this context) and the confidence in this estimate
Packit df99a1
    (which determines how fast the estimate can change.)
Packit df99a1
Packit df99a1
    A coding program typically allocates hundreds of context variables.  Each
Packit df99a1
    coding context is initialized to zero before encoding or decoding.  Value
Packit df99a1
    zero represents equal probabilities for #1#s and #0#s with a minimal
Packit df99a1
    confidence and therefore a maximum adaptation speed.  Each message bit is
Packit df99a1
    encoded using a coding context determined as a function of previously
Packit df99a1
    encoded message bits.  The decoder therefore can examine the previously
Packit df99a1
    decoded message bits and decode the current bit using the same context as
Packit df99a1
    the encoder.  This is critical for proper decoding.  
Packit df99a1
*/
Packit df99a1
typedef unsigned char  BitContext;
Packit df99a1
Packit df99a1
Packit df99a1
/** Performs ZP-Coder encoding and decoding.  A ZPCodec object must either
Packit df99a1
    constructed for encoding or for decoding.  The ZPCodec object is connected
Packit df99a1
    with a \Ref{ByteStream} object specified at construction time.  A ZPCodec
Packit df99a1
    object constructed for decoding reads code bits from the ByteStream and
Packit df99a1
    returns a message bit whenever function \Ref{decoder} is called.  A
Packit df99a1
    ZPCodec constructed for encoding processes the message bits provided by
Packit df99a1
    function \Ref{encoder} and writes the corresponding code bits to
Packit df99a1
    ByteStream #bs#.
Packit df99a1
Packit df99a1
    You should never directly access a ByteStream object connected to a valid
Packit df99a1
    ZPCodec object. The most direct way to access the ByteStream object
Packit df99a1
    consists of using the "pass-thru" versions of functions \Ref{encoder} and
Packit df99a1
    \Ref{decoder}.
Packit df99a1
Packit df99a1
    The ByteStream object can be accessed again after the destruction of the
Packit df99a1
    ZPCodec object.  Note that the encoder always flushes its internal buffers
Packit df99a1
    and writes a few final code bytes when the ZPCodec object is destroyed.
Packit df99a1
    Note also that the decoder often reads a few bytes beyond the last code byte
Packit df99a1
    written by the encoder.  This lag means that you must reposition the
Packit df99a1
    ByteStream after the destruction of the ZPCodec object and before re-using
Packit df99a1
    the ByteStream object (see \Ref{IFFByteStream}.)
Packit df99a1
Packit df99a1
    Please note also that the decoder has no way to reliably indicate the end
Packit df99a1
    of the message bit sequence.  The content of the message must be designed
Packit df99a1
    in a way which indicates when to stop decoding.  Simple ways to achieve
Packit df99a1
    this consists of announcing the message length at the beginning (like a
Packit df99a1
    pascal style string), or of defining a termination code (like a null
Packit df99a1
    terminated string).  */
Packit df99a1
Packit df99a1
class ZPCodec : public GPEnabled {
Packit df99a1
protected:
Packit df99a1
  ZPCodec (GP<ByteStream> gbs, const bool encoding, const bool djvucompat=false);
Packit df99a1
public:
Packit df99a1
  class Encode;
Packit df99a1
  class Decode;
Packit df99a1
Packit df99a1
  /// Non-virtual destructor.
Packit df99a1
  ~ZPCodec();
Packit df99a1
  /** Constructs a ZP-Coder.  If argument #encoding# is zero, the ZP-Coder
Packit df99a1
      object will read code bits from the ByteStream #bs# and return a message
Packit df99a1
      bit whenever function #decoder# is called.  If flag #encoding# is set
Packit df99a1
      the ZP-Coder object will process the message bits provided by function
Packit df99a1
      #encoder# and write code bits to ByteStream #bs#.  Optional flag
Packit df99a1
      #djvucompat# selects a slightly less efficient adaptation table which is
Packit df99a1
      used by the DjVu project.  This is required in order to ensure the
Packit df99a1
      bitstream compatibility.  You should not use this flag unless you want
Packit df99a1
      to decode JB2, IW44 or BZZ encoded data. */
Packit df99a1
  static GP<ZPCodec> create(
Packit df99a1
     GP<ByteStream> gbs, const bool encoding, const bool djvucompat=false);
Packit df99a1
Packit df99a1
  /** Encodes bit #bit# using context variable #ctx#.  Argument #bit# must be
Packit df99a1
      #0# or #1#. This function should only be used with ZP-Coder objects
Packit df99a1
      created for encoding. It may modify the contents of variable #ctx# in
Packit df99a1
      order to perform context adaptation. */
Packit df99a1
  void encoder(int bit, BitContext &ctx;;
Packit df99a1
Packit df99a1
  /** Decodes a bit using context variable #ctx#. This function should only be
Packit df99a1
      used with ZP-Coder objects created for decoding. It may modify the
Packit df99a1
      contents of variable #ctx# in order to perform context adaptation. */
Packit df99a1
  int  decoder(BitContext &ctx;;
Packit df99a1
Packit df99a1
  /** Encodes bit #bit# without compression (pass-thru encoder).  Argument
Packit df99a1
      #bit# must be #0# or #1#. No compression will be applied. Calling this
Packit df99a1
      function always increases the length of the code bit sequence by one
Packit df99a1
      bit. */
Packit df99a1
  void encoder(int bit);
Packit df99a1
Packit df99a1
  /** Decodes a bit without compression (pass-thru decoder).  This function
Packit df99a1
      retrieves bits encoded with the pass-thru encoder. */
Packit df99a1
  int  decoder(void);
Packit df99a1
#ifdef ZPCODEC_BITCOUNT
Packit df99a1
  /** Counter for code bits (requires #-DZPCODEC_BITCOUNT#). This member
Packit df99a1
      variable is available when the ZP-Coder is compiled with option
Packit df99a1
      #-DZPCODEC_BITCOUNT#.  Variable #bitcount# counts the number of code
Packit df99a1
      bits processed by the coder since the construction of the object.  This
Packit df99a1
      variable can be used to evaluate how many code bits are spent on various
Packit df99a1
      components of the message. */
Packit df99a1
  int bitcount;
Packit df99a1
#endif
Packit df99a1
  // Table management (advanced stuff)
Packit df99a1
  struct Table { 
Packit df99a1
    unsigned short p;
Packit df99a1
    unsigned short m;
Packit df99a1
    BitContext     up;
Packit df99a1
    BitContext     dn;
Packit df99a1
  };
Packit df99a1
  void newtable(ZPCodec::Table *table);
Packit df99a1
  BitContext state(float prob1);
Packit df99a1
  // Non-adaptive encoder/decoder
Packit df99a1
  void encoder_nolearn(int pix, BitContext &ctx;;
Packit df99a1
  int  decoder_nolearn(BitContext &ctx;;
Packit df99a1
  inline int  IWdecoder(void);
Packit df99a1
  inline void IWencoder(const bool bit);
Packit df99a1
protected:
Packit df99a1
  // coder status
Packit df99a1
  GP<ByteStream> gbs;           // Where the data goes/comes from
Packit df99a1
  ByteStream *bs;               // Where the data goes/comes from
Packit df99a1
  const bool encoding;          // Direction (0=decoding, 1=encoding)
Packit df99a1
  unsigned char byte;
Packit df99a1
  unsigned char scount;
Packit df99a1
  unsigned char delay;
Packit df99a1
  unsigned int  a;
Packit df99a1
  unsigned int  code;
Packit df99a1
  unsigned int  fence;
Packit df99a1
  unsigned int  subend;
Packit df99a1
  unsigned int  buffer;
Packit df99a1
  unsigned int  nrun;
Packit df99a1
  // table
Packit df99a1
  unsigned int  p[256];
Packit df99a1
  unsigned int  m[256];
Packit df99a1
  BitContext    up[256];
Packit df99a1
  BitContext    dn[256];
Packit df99a1
  // machine independent ffz
Packit df99a1
  char          ffzt[256];
Packit df99a1
  // encoder private
Packit df99a1
  void einit (void);
Packit df99a1
  void eflush (void);
Packit df99a1
  void outbit(int bit);
Packit df99a1
  void zemit(int b);
Packit df99a1
  void encode_mps(BitContext &ctx, unsigned int z);
Packit df99a1
  void encode_lps(BitContext &ctx, unsigned int z);
Packit df99a1
  void encode_mps_simple(unsigned int z);
Packit df99a1
  void encode_lps_simple(unsigned int z);
Packit df99a1
  void encode_mps_nolearn(unsigned int z);
Packit df99a1
  void encode_lps_nolearn(unsigned int z);
Packit df99a1
  // decoder private
Packit df99a1
  void dinit(void);
Packit df99a1
  void preload(void);
Packit df99a1
  int  ffz(unsigned int x);
Packit df99a1
  int  decode_sub(BitContext &ctx, unsigned int z);
Packit df99a1
  int  decode_sub_simple(int mps, unsigned int z);
Packit df99a1
  int  decode_sub_nolearn(int mps, unsigned int z);
Packit df99a1
private:
Packit df99a1
  // no copy allowed (hate c++)
Packit df99a1
  ZPCodec(const ZPCodec&);
Packit df99a1
  ZPCodec& operator=(const ZPCodec&);
Packit df99a1
#ifdef ZPCODEC_FRIEND
Packit df99a1
  friend ZPCODEC_FRIEND;
Packit df99a1
#endif
Packit df99a1
};
Packit df99a1
Packit df99a1
Packit df99a1
Packit df99a1
Packit df99a1
Packit df99a1
Packit df99a1
// INLINE CODE
Packit df99a1
Packit df99a1
inline void 
Packit df99a1
ZPCodec::encoder(int bit, BitContext &ctx) 
Packit df99a1
{
Packit df99a1
  unsigned int z = a + p[ctx];
Packit df99a1
  if (bit != (ctx & 1))
Packit df99a1
  {
Packit df99a1
    encode_lps(ctx, z);
Packit df99a1
  }else if (z >= 0x8000)
Packit df99a1
  {
Packit df99a1
    encode_mps(ctx, z);
Packit df99a1
  }else
Packit df99a1
  {
Packit df99a1
    a = z;
Packit df99a1
  }
Packit df99a1
}
Packit df99a1
Packit df99a1
inline int
Packit df99a1
ZPCodec::IWdecoder(void)
Packit df99a1
{
Packit df99a1
  return decode_sub_simple(0,0x8000 + ((a+a+a) >> 3));
Packit df99a1
}
Packit df99a1
Packit df99a1
inline int
Packit df99a1
ZPCodec::decoder(BitContext &ctx) 
Packit df99a1
{
Packit df99a1
  unsigned int z = a + p[ctx];
Packit df99a1
  if (z <= fence) 
Packit df99a1
    { a = z; return (ctx&1;; } 
Packit df99a1
  return decode_sub(ctx, z);
Packit df99a1
}
Packit df99a1
Packit df99a1
inline void 
Packit df99a1
ZPCodec::encoder_nolearn(int bit, BitContext &ctx) 
Packit df99a1
{
Packit df99a1
  unsigned int z = a + p[ctx];
Packit df99a1
  if (bit != (ctx & 1))
Packit df99a1
    encode_lps_nolearn(z);
Packit df99a1
  else if (z >= 0x8000)
Packit df99a1
    encode_mps_nolearn(z);
Packit df99a1
  else
Packit df99a1
    a = z;
Packit df99a1
}
Packit df99a1
Packit df99a1
inline int
Packit df99a1
ZPCodec::decoder_nolearn(BitContext &ctx) 
Packit df99a1
{
Packit df99a1
  unsigned int z = a + p[ctx];
Packit df99a1
  if (z <= fence) 
Packit df99a1
    { a = z; return (ctx&1;; } 
Packit df99a1
  return decode_sub_nolearn( (ctx&1), z);
Packit df99a1
}
Packit df99a1
Packit df99a1
inline void 
Packit df99a1
ZPCodec::encoder(int bit)
Packit df99a1
{
Packit df99a1
  if (bit)
Packit df99a1
    encode_lps_simple(0x8000 + (a>>1));
Packit df99a1
  else
Packit df99a1
    encode_mps_simple(0x8000 + (a>>1));
Packit df99a1
}
Packit df99a1
Packit df99a1
inline int
Packit df99a1
ZPCodec::decoder(void)
Packit df99a1
{
Packit df99a1
  return decode_sub_simple(0, 0x8000 + (a>>1));
Packit df99a1
}
Packit df99a1
Packit df99a1
inline void
Packit df99a1
ZPCodec::IWencoder(const bool bit)
Packit df99a1
{
Packit df99a1
  const int z = 0x8000 + ((a+a+a) >> 3);
Packit df99a1
  if (bit)
Packit df99a1
  {
Packit df99a1
    encode_lps_simple(z);
Packit df99a1
  }else
Packit df99a1
  {
Packit df99a1
    encode_mps_simple(z);
Packit df99a1
  }
Packit df99a1
}
Packit df99a1
Packit df99a1
// ------------ ADDITIONAL DOCUMENTATION
Packit df99a1
Packit df99a1
/** @name ZPCodec Examples
Packit df99a1
    
Packit df99a1
    Binary adaptive coders are efficient and very flexible.  Unfortunate
Packit df99a1
    intellectual property issues however have limited their popularity.  As a
Packit df99a1
    consequence, few programmers have a direct experience of using such a
Packit df99a1
    coding device.  The few examples provided in this section demonstrate how
Packit df99a1
    we think the ZP-Coder should be used.
Packit df99a1
    
Packit df99a1
    {\bf Encoding Multivalued Symbols} ---
Packit df99a1
    Since the ZP-Coder is a strictly binary coder, every message must be
Packit df99a1
    reduced to a sequence of bits (#0#s or #1#s).  It is often convenient to
Packit df99a1
    consider that a message is a sequence of symbols taking more than two
Packit df99a1
    values.  For instance, a character string may be a sequence of bytes, and
Packit df99a1
    each byte can take 256 values.  Each byte of course is composed of eight
Packit df99a1
    bits that we can encode in sequence.  The real issue however consists of
Packit df99a1
    deciding how we will use context variables in order to let the ZP-Coder
Packit df99a1
    learn the probability distribution of the byte values.
Packit df99a1
Packit df99a1
    The most significant bit #b0# decides whether the byte is in range 0..127
Packit df99a1
    or in range 128..255.  We let the ZP-Coder learn how to predict this bit
Packit df99a1
    by allocating one context variable for it.  The second most significant
Packit df99a1
    byte #b1# has two distinct meanings depending of bit #b0#.  If bit #b0# is
Packit df99a1
    #0#, bit #b1# decides whether the byte is in range 0..63 or 64..127.  If
Packit df99a1
    bit #b0# is #1#, bit #b1# decides whether the byte is in range 128..191 or
Packit df99a1
    192..255.  The prediction for bit #b1# must therefore depend on the value
Packit df99a1
    of #b0#.  This is why we will allocate two context variables for this bit.
Packit df99a1
    If bit #b0# is #0#, we will use the first variable; if bit #b0# is #1#, we
Packit df99a1
    will use the second variable.  The next bit #b2# has four meanings and
Packit df99a1
    therefore we will use four context variables, etc.  This analysis leads to
Packit df99a1
    a total of #1+2+4+...+128# = #255# context variables for encoding one
Packit df99a1
    byte.  This encoding procedure can be understood as a binary decision
Packit df99a1
    tree with a dedicated context variable for predicting each decision.
Packit df99a1
    \begin{verbatim}
Packit df99a1
    [>=128]----n---[>=64?]----n----[>31?]  ... 
Packit df99a1
           \              `---y----[>95?]  ...
Packit df99a1
            \
Packit df99a1
             `--y---[>=192?]----n---[>=160?] ...
Packit df99a1
                            `---y---[>=224?] ...
Packit df99a1
    \end{verbatim}
Packit df99a1
    The following decoding function illustrates a very compact way to
Packit df99a1
    implement such a decision tree.  Argument #ctx# points to an array of 255
Packit df99a1
    #BitContext# variables.  Macro #REPEAT8# is a shorthand notation for eight
Packit df99a1
    repetitions of its argument.  
Packit df99a1
    \begin{verbatim}
Packit df99a1
    int decode_8_bits(ZPCodec &zp, BitContext *ctx )
Packit df99a1
    {
Packit df99a1
      int n = 1;
Packit df99a1
      REPEAT8( { n = (n<<1) | (zp.decoder(ctx[n-1])); } );
Packit df99a1
      return n & 0xff;
Packit df99a1
    }
Packit df99a1
    \end{verbatim}
Packit df99a1
    The binary representation of variable #n# is always composed of a #1#
Packit df99a1
    followed by whichever bits have been decoded so far. This extra bit #1# in
Packit df99a1
    fact is a nice trick to flatten out the tree structure and directly
Packit df99a1
    address the array of context variables.  Bit #b0# is decoded using the
Packit df99a1
    first context variable since #n# is initially #1#.  Bit #b1# is decoded
Packit df99a1
    using one of the next two variables in the array, since #n# is either #2#
Packit df99a1
    (#10# in binary) or #3# (#11# in binary).  Bit #b2# will be decoded using
Packit df99a1
    one of the next four variables, since #n# ranges from #4# (#100# in
Packit df99a1
    binary) to #7# (#111# in binary).  The final result is given by removing
Packit df99a1
    the extra #1# in variable #n#.
Packit df99a1
Packit df99a1
    The corresponding encoding function is almost as compact. Argument #ctx#
Packit df99a1
    again is an array of 255 #BitContext# variables.  Each bit of byte #x# is
Packit df99a1
    encoded and shifted into variable #n# as in the decoding function.
Packit df99a1
    Variable #x# in fact contains the bits to be encoded. Variable #n#
Packit df99a1
    contains a #1# followed by the already encoded bits.
Packit df99a1
    \begin{verbatim}
Packit df99a1
    void encode_8_bits(ZPCodec &zp, int x, BitContext *ctx )
Packit df99a1
    {
Packit df99a1
      int n = 1;
Packit df99a1
      REPEAT8( { int b=((x&0x80)?1:0);  x=(x<<1);
Packit df99a1
                 zp.encoder(b,ctx[n-1]);  n=(n<<1)|(b); } );
Packit df99a1
    }
Packit df99a1
    \end{verbatim}
Packit df99a1
    The ZP-Coder automatically adjusts the content of the context variables
Packit df99a1
    while coding (recall the context variable argument is passed to functions
Packit df99a1
    #encoder# and #decoder# by reference).  The whole array of 255 context
Packit df99a1
    variables can be understood as a "byte context variable".  The estimated
Packit df99a1
    probability of each byte value is indeed the product of the estimated
Packit df99a1
    probabilities of the eight binary decisions that lead to that value in the
Packit df99a1
    decision tree.  All these probabilities are adapted by the underlying
Packit df99a1
    adaptation algorithm of the ZP-Coder.
Packit df99a1
Packit df99a1
    {\bf Application} ---
Packit df99a1
    We consider now a simple applications consisting of encoding the
Packit df99a1
    horizontal and vertical coordinates of a cloud of points. Each coordinate
Packit df99a1
    requires one byte.  The following function illustrates a possible
Packit df99a1
    implementation:
Packit df99a1
    \begin{verbatim}
Packit df99a1
    void encode_points(const char *filename, int n, int *x, int *y)
Packit df99a1
    {
Packit df99a1
       StdioByteStream bs(filename, "wb");
Packit df99a1
       bs.write32(n);             // Write number of points.
Packit df99a1
       ZPCodec zp(bs, 1);         // Construct encoder and context vars.
Packit df99a1
       BitContext ctxX[255], ctxY[255];
Packit df99a1
       memset(ctxX, 0, sizeof(ctxX));
Packit df99a1
       memset(ctxY, 0, sizeof(ctxY));
Packit df99a1
       for (int i=0; i
Packit df99a1
          encode_8_bits(zp, x[i], ctxX);
Packit df99a1
          encode_8_bits(zp, y[i], ctxY);
Packit df99a1
       }
Packit df99a1
    }
Packit df99a1
    \end{verbatim}
Packit df99a1
    The decoding function is very similar to the encoding function:
Packit df99a1
    \begin{verbatim}
Packit df99a1
    int decode_points(const char *filename, int *x, int *y)
Packit df99a1
    {
Packit df99a1
       StdioByteStream bs(filename,"rb");
Packit df99a1
       int n = bs.read32();      // Read number of points.
Packit df99a1
       ZPCodec zp(bs, 0);        // Construct decoder and context vars.
Packit df99a1
       BitContext ctxX[255], ctxY[255];
Packit df99a1
       memset(ctxX, 0, sizeof(ctxX));
Packit df99a1
       memset(ctxY, 0, sizeof(ctxY));
Packit df99a1
       for (int i=0; i
Packit df99a1
         x[i] = decode_8_bits(zp, ctxX);
Packit df99a1
         y[i] = decode_8_bits(zp, ctxY);
Packit df99a1
       }
Packit df99a1
       return n;                 // Return number of points.
Packit df99a1
    }
Packit df99a1
    \end{verbatim}
Packit df99a1
    The ZP-Coder automatically estimates the probability distributions of both
Packit df99a1
    the horizontal and vertical coordinates. These estimates are used to
Packit df99a1
    efficiently encode the point coordinates.  This particular implementation
Packit df99a1
    is a good option if we assume that the order of the points is significant
Packit df99a1
    and that successive points are independent.  It would be much smarter
Packit df99a1
    otherwise to sort the points and encode relative displacements between
Packit df99a1
    successive points.
Packit df99a1
Packit df99a1
Packit df99a1
    {\bf Huffman Coding Tricks} --- 
Packit df99a1
    Programmers with experience in Huffman codes can see the similarity in the
Packit df99a1
    ZP-Coder.  Huffman codes also organize the symbol values as a decision
Packit df99a1
    tree. The tree is balanced in such a way that each decision is as
Packit df99a1
    unpredictable as possible (i.e. both branches must be equally probable).
Packit df99a1
    This is very close to the ZP-Coder technique described above.  Since we
Packit df99a1
    allocate one context variable for each decision, our tree need not be
Packit df99a1
    balanced: the context variable will track the decision statistics and the
Packit df99a1
    ZP-Coder will compensate optimally.
Packit df99a1
Packit df99a1
    There are good reasons however to avoid unbalanced trees with the ZP-Coder.
Packit df99a1
    Frequent symbol values may be located quite deep in a poorly balanced
Packit df99a1
    tree.  This increases the average number of message bits (the number of
Packit df99a1
    decisions) required to code a symbol.  The ZP-Coder will be called more
Packit df99a1
    often, making the coding program slower.  Furthermore, each message
Packit df99a1
    bit is encoded using an estimated distribution.  All these useless message
Packit df99a1
    bits mean that the ZP-Coder has more distributions to adapt.  This
Packit df99a1
    extra adaptation work will probably increase the file size.
Packit df99a1
Packit df99a1
    Huffman codes are very fast when the tree structure is fixed beforehand.
Packit df99a1
    Such {\em static Huffman codes} are unfortunately not very efficient
Packit df99a1
    because the tree never matches the actual data distribution.  This is why
Packit df99a1
    such programs almost always define a data dependent tree structure.  This
Packit df99a1
    structure must then be encoded in the file since the decoder must know it
Packit df99a1
    before decoding the symbols.  Static Huffman codes however become very
Packit df99a1
    efficient when decisions are encoded with the ZP-Coder.  The tree
Packit df99a1
    structure represents a priori knowledge about the distribution of the
Packit df99a1
    symbol values.  Small data discrepancies will be addressed transparently
Packit df99a1
    by the ZP-Coder.
Packit df99a1
Packit df99a1
Packit df99a1
    {\bf Encoding Numbers} ---
Packit df99a1
    This technique is illustrated with the following number encoding example.
Packit df99a1
    The multivalued technique described above is not practical with large
Packit df99a1
    numbers because the decision tree has too many nodes and requires too many
Packit df99a1
    context variables.  This problem can be solved by using a priori knowledge
Packit df99a1
    about the probability distribution of our numbers.
Packit df99a1
Packit df99a1
    Assume for instance that the distribution is symmetrical and that small
Packit df99a1
    numbers are much more probable than large numbers.  We will first group
Packit df99a1
    our numbers into several sets.  Each number is coded by first coding which
Packit df99a1
    set contains the number and then coding a position within the set.  Each
Packit df99a1
    set contains #2^n# numbers that we consider roughly equiprobable.  Since
Packit df99a1
    the most probable values occur much more often, we want to model their
Packit df99a1
    probability more precisely. Therefore we use small sets for the most
Packit df99a1
    probable values and large sets for the least probable values, as
Packit df99a1
    demonstrated below.
Packit df99a1
    \begin{verbatim} 
Packit df99a1
    A---------------- {0}                                 (size=1)
Packit df99a1
     `------B---C---- {1}            or {-1}              (size=1)
Packit df99a1
             \   `--- {2,3}          or {-2,-3}           (size=2)
Packit df99a1
              D------ {4...131}      or {-4...-131}       (size=128)
Packit df99a1
               `----- {132...32899}  or {-132...-32899}   (size=32768)
Packit df99a1
    \end{verbatim}
Packit df99a1
    We then organize a decision tree for coding the set identifier.  This
Packit df99a1
    decision tree is balanced using whatever a priori knowledge we have about
Packit df99a1
    the probability distribution of the number values, just like a static
Packit df99a1
    Huffman tree.  Each decision (except the sign decision) is then coded
Packit df99a1
    using a dedicated context variable.
Packit df99a1
    \begin{verbatim}
Packit df99a1
        if (! zp.decoder(ctx_A)) {             // decision A
Packit df99a1
           return 0;
Packit df99a1
        } else {
Packit df99a1
           if (! zp.decoder(ctx_B)) {          // + decision B
Packit df99a1
             if (! zp.decoder(ctx_C)) {        // ++ decision C
Packit df99a1
               if (! zp.decoder())             // +++ sign decision
Packit df99a1
                 return +1;
Packit df99a1
               else
Packit df99a1
                 return -1;
Packit df99a1
             } else {
Packit df99a1
               if (! zp.decoder())             // +++ sign decision
Packit df99a1
                 return + 2 + zp.decoder();
Packit df99a1
               else
Packit df99a1
                 return - 2 - zp.decoder();
Packit df99a1
             }
Packit df99a1
           } else {
Packit df99a1
             if (! zp.decoder(ctx_D)) {        // ++ decision D
Packit df99a1
               if (! zp.decoder())             // +++ sign decision
Packit df99a1
                 return + 4 + decode_7_bits(zp);
Packit df99a1
               else
Packit df99a1
                 return - 4 - decode_7_bits(zp);
Packit df99a1
             } else {
Packit df99a1
               if (! zp.decoder())             // +++ sign decision
Packit df99a1
                 return + 132 + decode_15_bits(zp);
Packit df99a1
               else
Packit df99a1
                 return - 132 - decode_15_bits(zp);
Packit df99a1
             }
Packit df99a1
           }
Packit df99a1
        } 
Packit df99a1
   \end{verbatim}
Packit df99a1
   Note that the call #zp.decoder()# for coding the sign decision does not use
Packit df99a1
   a context variable.  This is a "pass-thru" variant of \Ref{decoder} which
Packit df99a1
   bypasses the ZP-Coder and just reads a bit from the code sequence.  There
Packit df99a1
   is a corresponding "pass-thru" version of \Ref{encoder} for encoding such
Packit df99a1
   bits.  Similarly, functions #decode_7_bits# and #decode_15_bits# do not
Packit df99a1
   take an array of context variables because, unlike function #decode_8_bits#
Packit df99a1
   listed above, they are based on the pass-thru decoder instead of the
Packit df99a1
   regular decoder.
Packit df99a1
Packit df99a1
   The ZP-Coder will not learn the probabilities of the numbers within a set
Packit df99a1
   since no context variables have been allocated for that purpose.  This
Packit df99a1
   could be improved by allocating additional context variables for encoding
Packit df99a1
   the position within the smaller sets and using the regular decoding
Packit df99a1
   functions instead of the pass-thru variants.  Only experimentation can tell
Packit df99a1
   what works best for your particular encoding problem.
Packit df99a1
Packit df99a1
Packit df99a1
   {\bf Understanding Adaptation} ---
Packit df99a1
   We have so far explained that the ZP-Coder adaptation algorithm is able to
Packit df99a1
   quickly estimate of the probability distribution of the message bits coded
Packit df99a1
   using a particular context variable.  It is also able to track slow
Packit df99a1
   variations when the actual probabilities change while coding.
Packit df99a1
   
Packit df99a1
   Let us consider the ``cloud of points'' application presented above.
Packit df99a1
   Suppose that we first code points located towards the left side and then
Packit df99a1
   slowly move towards points located on the right side.  The ZP-Coder will
Packit df99a1
   first estimate that the X coordinates are rather on the left side. This
Packit df99a1
   estimation will be progressively revised after seeing more points on the
Packit df99a1
   right side.  Such an ordering of the points obviously violates the point
Packit df99a1
   independence assumption on which our code is based.  Despite our inexact
Packit df99a1
   assumptions, the tracking mechanism allows for better prediction of the X
Packit df99a1
   coordinates and therefore better compression.
Packit df99a1
Packit df99a1
   However, this is not a perfect solution. The ZP-Coder tracks the changes
Packit df99a1
   because every point seems to be a little bit more on the right side than
Packit df99a1
   suggested by the previous points.  The ZP-Coder coding algorithm is always
Packit df99a1
   slightly misadjusted and we always lose a little on possible compression
Packit df99a1
   ratio.  This is not much of a problem when the probabilities drift slowly.
Packit df99a1
   On the other hand, this can be very significant if the probabilities change
Packit df99a1
   drastically.
Packit df99a1
Packit df99a1
   Adaptation is always associated with a small loss of efficiency.  The
Packit df99a1
   ZP-Coder updates the probability model whenever it suspects, {\em after
Packit df99a1
   coding}, that the current settings were not optimal.  The model will be
Packit df99a1
   better next time, but a slight loss in compression has occurred.  The
Packit df99a1
   design of ZP-Coder of course minimizes this effect as much as possible.
Packit df99a1
   Yet you will pay a price if you ask too much to the adaptation algorithm.
Packit df99a1
   If you have millions of context variables, it will be difficult to train
Packit df99a1
   them all.  If the probability distributions change drastically while
Packit df99a1
   coding, it will be difficult to track the changes fast enough.
Packit df99a1
Packit df99a1
   Adaptation on the other hand is a great simplification.  A good data
Packit df99a1
   compression program must (a) represent the data in order to make its
Packit df99a1
   predictability apparent, and (b) perform the predictions and generate the
Packit df99a1
   code bits.  The ZP-Coder is an efficient and effortless solution for
Packit df99a1
   implementing task (b).
Packit df99a1
Packit df99a1
Packit df99a1
   {\bf Practical Debugging Tricks} ---
Packit df99a1
   Sometimes you write an encoding program and a decoding program.
Packit df99a1
   Unfortunately there is a bug: the decoding program decodes half the file
Packit df99a1
   and then just outputs garbage.  There is a simple way to locate the
Packit df99a1
   problem.  In the encoding program, after each call to #encoder#, print the
Packit df99a1
   encoded bit and the value of the context variable.  In the decoding
Packit df99a1
   program, after each call to #decoder#, print the decoded bit and the value
Packit df99a1
   of the context variable.  Both program should print exactly the same thing.
Packit df99a1
   When you find the difference, you find the bug.
Packit df99a1
   
Packit df99a1
   @memo Suggestions for efficiently using the ZP-Coder.  */
Packit df99a1
//@}
Packit df99a1
Packit df99a1
// ------------ THE END
Packit df99a1
Packit df99a1
#ifdef HAVE_NAMESPACES
Packit df99a1
}
Packit df99a1
# ifndef NOT_USING_DJVU_NAMESPACE
Packit df99a1
using namespace DJVU;
Packit df99a1
# endif
Packit df99a1
#endif
Packit df99a1
#endif
Packit df99a1
Packit df99a1