The code in this directory implements optimized, filtered scaling
for pixmap data.

This code is copyright Red Hat, Inc, 2000 and licensed under the terms
of the GNU Lesser General Public License (LGPL).

(If you want to use it in a project where that license is not
appropriate, please contact me, and most likely something can be
worked out.)

Owen Taylor <otaylor@redhat.com>

PRINCIPLES
==========

The general principle of this code is that it first computes a filter
matrix for the given filtering mode, and then calls a general driver
routine, passing in functions to composite pixels and lines.

(The pixel functions are used for handling edge cases, and the line
functions are simply used for the middle parts of the image.)

The system is designed so that the line functions can be simple,
don't have to worry about special cases, and can be selected to
be specific to the particular formats involved. This allows them
to be hyper-optimized. Since most of the computation time is
spent in these functions, this results in an overall fast design.
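
The split between pixel functions and line functions can be sketched
roughly as below. This is only an illustration of the idea, not the
actual driver: the callback types, process_row(), the 'margin'
parameter, and the copy callbacks are all hypothetical names, and the
sketch works on a single-channel row with no filter weights.

```c
#include <stddef.h>

/* Hypothetical callback types for this sketch; the real pixops
 * signatures carry filter weights, channel counts, and more. */
typedef void (*pixel_func)(unsigned char *dest, const unsigned char *src,
                           size_t x);
typedef void (*line_func)(unsigned char *dest, const unsigned char *src,
                          size_t x_start, size_t x_end);

/* Process one row of single-channel pixels. The first and last
 * 'margin' pixels take the careful per-pixel path; the middle run is
 * handed to the line function in a single call. */
static void
process_row(unsigned char *dest, const unsigned char *src, size_t width,
            size_t margin, pixel_func do_pixel, line_func do_line)
{
    size_t x;

    if (2 * margin >= width) {
        /* Row too narrow to have a middle part: all pixels are edges. */
        for (x = 0; x < width; x++)
            do_pixel(dest, src, x);
        return;
    }

    for (x = 0; x < margin; x++)
        do_pixel(dest, src, x);

    do_line(dest, src, margin, width - margin);

    for (x = width - margin; x < width; x++)
        do_pixel(dest, src, x);
}

/* Trivial example callbacks that just copy; they count calls so the
 * split between the two paths is visible. */
static int pixel_calls = 0;
static int line_calls = 0;

static void
copy_pixel(unsigned char *dest, const unsigned char *src, size_t x)
{
    dest[x] = src[x];
    pixel_calls++;
}

static void
copy_line(unsigned char *dest, const unsigned char *src,
          size_t x_start, size_t x_end)
{
    size_t x;

    for (x = x_start; x < x_end; x++)
        dest[x] = src[x];
    line_calls++;
}
```

Calling process_row(dest, src, 8, 2, copy_pixel, copy_line) on an
8 pixel row uses the pixel function four times and the line function
once. Because the line function never sees an edge, it can be replaced
by a format-specific version without touching the edge handling.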

MMX assembly code for Intel (and compatible) processors is included
for a number of the most common special cases:

 scaling from RGB to RGB
 compositing from RGBA to RGBx
 compositing against a color from RGBA and storing in an RGBx buffer

Alpha compositing 8 bit RGBA (color Ca, alpha Aa) onto RGB (color Cb) is
defined in terms of rounding the exact result. With lowercase names for
real values in [0,1] and uppercase names for 8 bit values in [0,255]:

 cc = ca * aa + (1 - aa) * cb

 Cc = ROUND [255. * (Ca/255. * Aa/255. + (1 - Aa/255.) * Cb/255.)]

ROUND(i / 255.) can be computed exactly for i in [0,255*255] as:

 t = i + 0x80; result = (t + (t >> 8)) >> 8;  [ call this To8(i) ]

So,

 t = Ca * Aa + (255 - Aa) * Cb + 0x80;
 Cc = (t + (t >> 8)) >> 8;
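
As a minimal sketch, the two formulas above translate to C as follows.
The function names to8() and composite_onto_rgb() are made up for this
example and are not taken from the pixops sources.

```c
/* ROUND(i / 255.) for i in [0, 255*255], using only shifts and adds. */
static unsigned int
to8(unsigned int i)
{
    unsigned int t = i + 0x80;

    return (t + (t >> 8)) >> 8;
}

/* Composite one 8 bit channel Ca with alpha Aa onto an opaque
 * background channel Cb. The argument to to8() never exceeds 255*255,
 * since Ca * Aa + (255 - Aa) * Cb <= 255 * Aa + 255 * (255 - Aa). */
static unsigned char
composite_onto_rgb(unsigned char Ca, unsigned char Aa, unsigned char Cb)
{
    return (unsigned char) to8((unsigned int) Ca * Aa +
                               (255u - Aa) * Cb);
}
```

The exactness claim is cheap to check by brute force: comparing to8(i)
against (2 * i + 255) / 510 in integer arithmetic for every i in
[0,255*255] covers the whole input range.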

Alpha compositing 8 bit RaGaBaAa onto RbGbBbAb is a little harder for
non-premultiplied alpha. The premultiplied result is simple:

 ac = aa + (1 - aa) * ab
 cc = ca + (1 - aa) * cb

This can be computed in integer terms as:

 Cc = Ca + To8 ((255 - Aa) * Cb)
 Ac = Aa + To8 ((255 - Aa) * Ab)
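
A rough C version of the premultiplied case might look like this. The
premul_pixel type and the function names are hypothetical, and the
sketch assumes valid premultiplied input (each color channel no larger
than its alpha), which keeps every sum within 8 bits.

```c
/* One premultiplied 8 bit RGBA pixel: each color channel is already
 * multiplied by alpha, so r, g, b <= a. (Hypothetical type.) */
typedef struct {
    unsigned char r, g, b, a;
} premul_pixel;

/* ROUND(i / 255.) for i in [0, 255*255]. */
static unsigned int
to8(unsigned int i)
{
    unsigned int t = i + 0x80;

    return (t + (t >> 8)) >> 8;
}

/* Composite premultiplied src over premultiplied dst:
 * Cc = Ca + To8((255 - Aa) * Cb), and likewise for alpha. */
static premul_pixel
composite_premul(premul_pixel src, premul_pixel dst)
{
    unsigned int inv = 255u - src.a;
    premul_pixel out;

    out.r = (unsigned char) (src.r + to8(inv * dst.r));
    out.g = (unsigned char) (src.g + to8(inv * dst.g));
    out.b = (unsigned char) (src.b + to8(inv * dst.b));
    out.a = (unsigned char) (src.a + to8(inv * dst.a));
    return out;
}
```

An opaque source replaces the destination and a fully transparent
source leaves it unchanged, which is a quick sanity check on the
arithmetic.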

For non-premultiplied alpha, we need to divide the color components by
the alpha:

       +- (ca * aa + (1 - aa) * ab * cb) / ac; aa != 0
  cc = |
       +- cb; aa == 0

To calculate this in integers, we note the alternate form:

 cc = cb + aa * (ca - cb) / ac

[ 'cc = ca + (ac - aa) * (cb - ca) / ac' can also be useful numerically,
  but isn't important here ]

We can express this in integers as:

 Ac_tmp = Aa * 255 + (255 - Aa) * Ab;

      +- Cb + (255 * Aa * (Ca - Cb) + Ac_tmp / 2) / Ac_tmp ; Ca > Cb
 Cc = |
      +- Cb - (255 * Aa * (Cb - Ca) + Ac_tmp / 2) / Ac_tmp ; Ca <= Cb

Or, playing bit tricks to avoid the conditional:

 Cc = Cb + (255 * Aa * (Ca - Cb) + (((Ca - Cb) >> 8) ^ (Ac_tmp / 2)) ) / Ac_tmp
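
The conditional form (not the bit-trick variant) can be sketched in C
as below. The function name is made up for this example. The guard for
Ac_tmp == 0 is an addition needed to avoid dividing by zero; it can
only trigger when Aa == 0, so returning Cb matches the aa == 0 case of
the real-valued definition.

```c
/* Composite one non-premultiplied 8 bit channel Ca (alpha Aa) onto
 * Cb (alpha Ab), using the conditional integer form. */
static unsigned char
composite_nonpremul_channel(unsigned char Ca, unsigned char Aa,
                            unsigned char Cb, unsigned char Ab)
{
    unsigned int ac_tmp = Aa * 255u + (255u - Aa) * Ab;
    unsigned int delta;

    if (ac_tmp == 0)
        return Cb;  /* Aa == 0 and Ab == 0: nothing to divide by */

    if (Ca > Cb) {
        delta = (255u * Aa * (unsigned int) (Ca - Cb) + ac_tmp / 2)
                / ac_tmp;
        return (unsigned char) (Cb + delta);
    }

    delta = (255u * Aa * (unsigned int) (Cb - Ca) + ac_tmp / 2) / ac_tmp;
    return (unsigned char) (Cb - delta);
}
```

Since Ac_tmp >= 255 * Aa, delta never exceeds the distance between Ca
and Cb, so the result stays between the two input colors. For example,
a half-transparent source over a fully transparent destination yields
the source color unchanged.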

TODO
====

* ART_FILTER_HYPER is not correctly implemented. It is currently
  implemented as a filter that is derived by doing linear interpolation
  on the source image and then averaging that with a box filter.

  It should be defined as follows (see art_filterlevel.h):

   "HYPER is the highest quality reconstruction function. It is derived
    from the hyperbolic filters in Wolberg's "Digital Image Warping,"
    and is formally defined as the hyperbolic-filter sampling the ideal
    hyperbolic-filter interpolated image (the filter is designed to be
    idempotent for 1:1 pixel mapping). It is the slowest and highest
    quality."

  The current HYPER is probably as slow, but lower quality. Also, there
  are some subtle errors in the calculation of the current HYPER that show
  up as dark stripes if you scale a constant-color image.

* There are some roundoff errors in the compositing routines.
  The _nearest() variants do it right; most of the other code
  is wrong to some degree or another.

  For instance, in composite_line_22_4a4(), we have:

    dest[0] = ((0xff0000 - a) * dest[0] + r) >> 24;

   If a is 0 (which implies r == 0), then we have:

    (0xff0000 * dest[0]) >> 24

   which gives results which are 1 too low:

       255 => 254,   1 => 0.

   So, this should be something like:

     ((0xff0000 - a) * dest[0] + r + 0xffffff) >> 24;

   (Not checked, caveat emptor)

   An alternative formulation of this as:

     dest[0] + ((r - a * dest[0] + 0xffffff) >> 24)

   may be better numerically, but would need consideration for overflow.

* The generic functions could be sped up considerably by
  switching around conditionals and inner loops in various
  places.

* Right now, in several of the most common cases, there are
  optimized MMX routines, but no optimized C routines.

  For instance, there is a

    pixops_composite_line_22_4a4_mmx()

  but no

    pixops_composite_line_22_4a4()

  Also, it may be desirable to include a few more special cases, in particular:

    pixops_composite_line_22_4a3()

* Scaling down images by large scale factors is _slow_ since huge filter
  matrices are computed (e.g., to scale down by a factor of 100, we compute
  101x101 filter matrices). At some point, it would be more efficient to
  switch over to subsampling when scaling down - one should never need a filter
  matrix bigger than 16x16.
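
The roundoff problem noted above for composite_line_22_4a4() is easy to
reproduce in isolation. The sketch below wraps the quoted expression
and the proposed fix in two hypothetical helper functions. The scaling
of 'a' and 'r' is an assumption inferred from the 0xff0000 constant;
only the a == 0, r == 0 case described in the text is exercised here,
and other inputs remain unchecked.

```c
#include <stdint.h>

/* The expression as quoted from composite_line_22_4a4(). 'a' is
 * assumed to lie in [0, 0xff0000] and 'r' to be small enough that
 * the sum fits in 32 bits (assumptions, not taken from the code). */
static uint8_t
blend_truncating(uint8_t dest, uint32_t a, uint32_t r)
{
    return (uint8_t) (((0xff0000u - a) * dest + r) >> 24);
}

/* The proposed fix: add 0xffffff before shifting so that the result
 * no longer lands one below the intended value. */
static uint8_t
blend_biased(uint8_t dest, uint32_t a, uint32_t r)
{
    return (uint8_t) (((0xff0000u - a) * dest + r + 0xffffffu) >> 24);
}
```

With a == 0 and r == 0, blend_truncating() turns 255 into 254 and 1
into 0, while blend_biased() returns every dest value in [0,255]
unchanged.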