General ideas of Pixops
=======================

 - Gain speed by special-casing the common case, and using
   generic code to handle the uncommon case.

 - Most of the time in scaling an image is spent in the center;
   however, code that can handle edges properly is slow
   because it needs to deal with the possibility of running
   off the edge. So make the fast-case code handle only
   the centers, and use generic, slow code for the edges.

Structure of Pixops
===================

The code of pixops can roughly be grouped into four parts:

 - Filter computation functions

 - Functions for scaling or compositing lines and pixels
   using precomputed filters

 - pixops_process, the central driver that iterates through
   the image calling pixel or line functions as necessary

 - Wrapper functions (pixops_scale/composite/composite_color)
   that compute the filter, choose the line and pixel functions,
   and then call pixops_process with the filter, line,
   and pixel functions.


pixops_process is a pretty scary-looking function:

static void
pixops_process (guchar         *dest_buf,
                int             render_x0,
                int             render_y0,
                int             render_x1,
                int             render_y1,
                int             dest_rowstride,
                int             dest_channels,
                gboolean        dest_has_alpha,
                const guchar   *src_buf,
                int             src_width,
                int             src_height,
                int             src_rowstride,
                int             src_channels,
                gboolean        src_has_alpha,
                double          scale_x,
                double          scale_y,
                int             check_x,
                int             check_y,
                int             check_size,
                guint32         color1,
                guint32         color2,
                PixopsFilter   *filter,
                PixopsLineFunc  line_func,
                PixopsPixelFunc pixel_func)

(Some of the arguments should be moved into structures. It's basically
"all the arguments to pixops_composite_color plus three more".) The
arguments can be divided up into:


Information about the destination buffer

   guchar *dest_buf, int dest_rowstride, int dest_channels, gboolean dest_has_alpha,

Information about the source buffer

   const guchar *src_buf, int src_rowstride, int src_channels, gboolean src_has_alpha,
   int src_width, int src_height,

Information on how to scale the source buffer and the region of the scaled source
to render onto the destination buffer

   int render_x0, int render_y0, int render_x1, int render_y1,
   double scale_x, double scale_y

Information about a constant color or check pattern onto which to composite

   int check_x, int check_y, int check_size, guint32 color1, guint32 color2

Information precomputed to use during the scale operation

   PixopsFilter *filter, PixopsLineFunc line_func, PixopsPixelFunc pixel_func


Filter computation
==================

The PixopsFilter structure looks like:

struct _PixopsFilter
{
  int *weights;
  int n_x;
  int n_y;
  double x_offset;
  double y_offset;
};


'weights' is an array of size:

 weights[SUBSAMPLE][SUBSAMPLE][n_x][n_y]

SUBSAMPLE is a constant - currently 16 in pixops.c.


In order to compute a scaled destination pixel we convolve
an array of n_x by n_y source pixels with one of
the SUBSAMPLE * SUBSAMPLE filter matrices stored
in weights. The choice of filter matrix is determined
by the fractional part of the source location.

To compute dest[i,j] we do the following:

 x = i * scale_x + x_offset;
 y = j * scale_y + y_offset;
 x_int = floor(x)
 y_int = floor(y)

 C = weights[SUBSAMPLE*(x - x_int)][SUBSAMPLE*(y - y_int)]
 total = sum[l=0..n_x-1, m=0..n_y-1] (C[l,m] * src[x_int + l, y_int + m])

The filter weights are integers scaled so that the total of the
weights in each filter matrix C is equal to 65536.

When the source does not have alpha, we simply compute each channel
as above, so total is in the range [0,255*65536]

 dest = total / 65536
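
The steps above can be sketched in C for a single-channel source
without alpha. This is only an illustration, not the code in pixops.c:
the name sample_pixel, the flattened weights layout and the
single-channel source layout are assumptions, and the caller is assumed
to guarantee that the n_x by n_y footprint lies inside the source.

```c
#include <assert.h>

#define SUBSAMPLE 16

/* Hypothetical helper, not part of pixops.c.
 * weights is weights[SUBSAMPLE][SUBSAMPLE][n_x][n_y], flattened.
 * src is a single-channel image with src_rowstride bytes per row.
 * x and y are the source location, assumed non-negative so that
 * truncation to int behaves like floor(). */
static unsigned char
sample_pixel (const int *weights, int n_x, int n_y,
              const unsigned char *src, int src_rowstride,
              double x, double y)
{
  int x_int, y_int, x_phase, y_phase, l, m;
  const int *C;
  unsigned int total = 0;

  assert (x >= 0.0 && y >= 0.0);

  x_int = (int) x;
  y_int = (int) y;

  /* The fractional part of the source location selects the matrix. */
  x_phase = (int) (SUBSAMPLE * (x - x_int));
  y_phase = (int) (SUBSAMPLE * (y - y_int));
  C = weights + (x_phase * SUBSAMPLE + y_phase) * n_x * n_y;

  for (l = 0; l < n_x; l++)
    for (m = 0; m < n_y; m++)
      total += (unsigned int) C[l * n_y + m]
               * src[(y_int + m) * src_rowstride + (x_int + l)];

  /* Each matrix sums to 65536, so total is in [0,255*65536]. */
  return (unsigned char) (total / 65536);
}
```

With a 2x2 box filter (every weight 16384) the result is the plain
average of the four source pixels under the footprint.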

When the source does have alpha, then we need to compute using
"pre-multiplied alpha":

 a_total = sum (C[l,m] * src_a[x_int + l, y_int + m])
 c_total = sum (C[l,m] * src_a[x_int + l, y_int + m] * src_c[x_int + l, y_int + m])

This gives us a result for c_total in the range of [0,255*a_total]

 c_dest = c_total / a_total
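
A minimal sketch of the pre-multiplied computation for one color
channel follows. The helper name and the flat arrays (one entry per
pixel of the filter footprint) are assumptions made for illustration;
the guard for a_total == 0 is also an addition, since the formula above
is undefined when every pixel under the filter is fully transparent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of pixops.c.
 * weights[k], alpha[k] and color[k] are the filter weight, source alpha
 * and one source color channel for the k-th pixel of the n_x * n_y
 * footprint (n entries in total).  The weights are assumed to be
 * non-negative and to sum to 65536. */
static unsigned char
filter_channel_with_alpha (const int *weights,
                           const unsigned char *alpha,
                           const unsigned char *color,
                           int n)
{
  uint32_t a_total = 0;
  uint32_t c_total = 0;
  int k;

  assert (n > 0);

  for (k = 0; k < n; k++)
    {
      uint32_t wa = (uint32_t) weights[k] * alpha[k];

      a_total += wa;              /* at most 65536 * 255 */
      c_total += wa * color[k];   /* at most 65536 * 255 * 255 < 2^32 */
    }

  if (a_total == 0)
    return 0;                     /* fully transparent: color is arbitrary */

  return (unsigned char) (c_total / a_total);
}
```

Weighting each color by its alpha means that a fully transparent
neighbour contributes nothing to the result, whatever its color.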


Mathematical aside:

The process of producing a destination filter consists
of:

 - Producing a continuous approximation to the source
   image via interpolation.

 - Sampling that continuous approximation with a filter.

This is representable as:

 S(x,y) = sum[i=-inf,inf; j=-inf,inf] A(frac(x),frac(y))[i,j] * S[floor(x)+i,floor(y)+j]

 D[i,j] = Integral(s=-inf,inf; t=-inf,inf) B(i+s,j+t) S((i+s)/scale_x,(j+t)/scale_y)

By reordering the sums and integrals, you get something of the form:

 D[i,j] = sum[l=-inf,inf; m=-inf,inf] C[l,m] S[i+l,j+m]

The arrays in weights are the C[l,m] above, and are thus
determined by the interpolating algorithm in use and the
sampling filter:

                                    INTERPOLATE          SAMPLE
 ART_FILTER_NEAREST                 nearest neighbour    point
 ART_FILTER_TILES                   nearest neighbour    box
 ART_FILTER_BILINEAR (scale < 1)    nearest neighbour    box
 ART_FILTER_BILINEAR (scale > 1)    bilinear             point
 ART_FILTER_HYPER                   bilinear             box


Pixel Functions
===============

typedef void (*PixopsPixelFunc) (guchar *dest, int dest_x, int dest_channels, int dest_has_alpha,
                                 int src_has_alpha,
                                 int check_size, guint32 color1, guint32 color2,
                                 int r, int g, int b, int a);

The arguments here are:

 dest: location to store the output pixel
 dest_x: x coordinate of destination (for handling checks)
 dest_has_alpha, dest_channels: Information about the destination pixbuf
 src_has_alpha: Information about the source pixbuf

 check_size, color1, color2: Information for color background for composite_color variant

 r, g, b, a: scaled red, green, blue and alpha

r, g, b are premultiplied by alpha.

 a is in [0,65536*255]
 r is in [0,255*a]
 g is in [0,255*a]
 b is in [0,255*a]

If src_has_alpha is false, then a will be 65536*255, allowing optimization.
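
To make the ranges concrete, here is a sketch of what a plain "scale"
pixel function could do with these values. It is an illustration under
assumptions, not the function in pixops.c: the name store_scaled_pixel
is made up, the check arguments are dropped, and r, g, b, a are taken
as unsigned 32-bit values because 255 * 65536 * 255 does not fit in a
signed 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of pixops.c.
 * a is in [0,65536*255]; r, g, b are in [0,255*a] (premultiplied).
 * dest receives 3 bytes, or 4 if dest_has_alpha is non-zero. */
static void
store_scaled_pixel (unsigned char *dest, int dest_has_alpha,
                    uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
  assert (dest != 0);

  if (a == 0)
    {
      /* Fully transparent: there is no color to recover. */
      dest[0] = dest[1] = dest[2] = 0;
    }
  else
    {
      /* Undo the premultiplication; each quotient is in [0,255]. */
      dest[0] = (unsigned char) (r / a);
      dest[1] = (unsigned char) (g / a);
      dest[2] = (unsigned char) (b / a);
    }

  if (dest_has_alpha)
    dest[3] = (unsigned char) (a >> 16);   /* [0,65536*255] -> [0,255] */
}
```

When src_has_alpha is false, a is the constant 65536*255, so the three
divisions have a fixed divisor; that is the optimization the text
refers to.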


Line functions
==============

typedef guchar *(*PixopsLineFunc) (int *weights, int n_x, int n_y,
                                   guchar *dest, int dest_x, guchar *dest_end, int dest_channels, int dest_has_alpha,
                                   guchar **src, int src_channels, gboolean src_has_alpha,
                                   int x_init, int x_step, int src_width,
                                   int check_size, guint32 color1, guint32 color2);

The arguments are:

 weights, n_x, n_y

   Filter weights for this row - dimensions weights[SUBSAMPLE][n_x][n_y]

 dest, dest_x, dest_end, dest_channels, dest_has_alpha

   The destination buffer. The function will start writing into *dest and
   increment by dest_channels, until dest == dest_end. Reading from
   src for these pixels is guaranteed not to go outside of the
   buffer bounds.

 src, src_channels, src_has_alpha

   src[n_y] - an array of pointers to the start of the source rows
   for each filter coordinate.

 x_init, x_step

   Information about x positions in source image.

 src_width - unused

 check_size, color1, color2: Information for color background for composite_color variant

 The total for the destination pixel at dest + i is given by

   SUM (l=0..n_x - 1, m=0..n_y - 1)
     src[m][((x_init + i * x_step) >> SCALE_SHIFT) + l] * weights[m][l]
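
The sum above can be written out as a small C function. This sketch
makes several simplifying assumptions that are not in the original: a
single-channel source, one n_y by n_x weight matrix (the selection of a
matrix by subsample phase is left out, as in the formula), and a
SCALE_SHIFT of 16. Note that in C the shift binds less tightly than the
addition, so the parentheses around the shift are required.

```c
#include <assert.h>

#define SCALE_SHIFT 16   /* assumed value for this sketch */

/* Hypothetical helper, not part of pixops.c.
 * weights is a single matrix weights[n_y][n_x], flattened.
 * src[m] points to the start of the m-th source row (one byte/pixel).
 * x_init and x_step are fixed-point with SCALE_SHIFT fractional bits. */
static unsigned int
line_total (const int *weights, int n_x, int n_y,
            const unsigned char *const *src,
            int x_init, int x_step, int i)
{
  int x = (x_init + i * x_step) >> SCALE_SHIFT;   /* integer source x */
  unsigned int total = 0;
  int l, m;

  assert (x >= 0);

  for (m = 0; m < n_y; m++)
    for (l = 0; l < n_x; l++)
      total += (unsigned int) weights[m * n_x + l] * src[m][x + l];

  return total;
}
```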


Algorithms for compositing
==========================

Compositing alpha on non alpha:

 R = As * Rs + (1 - As) * Rd
 G = As * Gs + (1 - As) * Gd
 B = As * Bs + (1 - As) * Bd

This can be regrouped, for each channel C, as:

 Cd + As * (Cs - Cd)
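
As a sketch, the regrouped form needs one multiplication per channel
instead of two. The function below assumes 8-bit channels with As
stored as 0..255 and uses a plain division by 255; the name is
hypothetical and this is not the integer arithmetic used in pixops.c.

```c
#include <assert.h>

/* Hypothetical helper, not part of pixops.c.
 * Computes Cd + As * (Cs - Cd) with As given as a byte in [0,255]. */
static unsigned char
composite_channel (unsigned char c_s, unsigned char a_s, unsigned char c_d)
{
  int delta = (int) c_s - (int) c_d;        /* may be negative */
  int result = c_d + (a_s * delta) / 255;   /* truncates toward zero */

  assert (result >= 0 && result <= 255);
  return (unsigned char) result;
}
```

With a_s == 255 the source replaces the destination exactly, and with
a_s == 0 the destination is left unchanged.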

Compositing alpha on alpha:

 A = As + (1 - As) * Ad
 R = (As * Rs + (1 - As) * Rd * Ad)  / A
 G = (As * Gs + (1 - As) * Gd * Ad)  / A
 B = (As * Bs + (1 - As) * Bd * Ad)  / A

The way to think of this is in terms of the "area":

The final pixel is composed of area As of the source pixel
and (1 - As) * Ad of the target pixel. So the final pixel
is a weighted average with those weights.

Note that the weights do not add up to one - hence the
non-constant division.
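
The "area" view can be sketched directly in floating point. The RGBA
type and the function name are assumptions made for this example, all
components are non-premultiplied values in [0,1], and the guard for
A == 0 is an addition, since the formulas above divide by A.

```c
#include <assert.h>

/* Hypothetical type for this sketch: non-premultiplied, all in [0,1]. */
typedef struct { double r, g, b, a; } RGBA;

/* Hypothetical helper, not part of pixops.c: composite s over d. */
static RGBA
composite_over (RGBA s, RGBA d)
{
  RGBA out;
  double w_s = s.a;                 /* area covered by the source */
  double w_d = (1.0 - s.a) * d.a;   /* remaining area covered by the dest */

  assert (s.a >= 0.0 && s.a <= 1.0 && d.a >= 0.0 && d.a <= 1.0);

  out.a = w_s + w_d;
  if (out.a == 0.0)
    {
      out.r = out.g = out.b = 0.0;  /* nothing covers the pixel */
      return out;
    }

  /* Weighted average; the weights sum to A, not to one. */
  out.r = (w_s * s.r + w_d * d.r) / out.a;
  out.g = (w_s * s.g + w_d * d.g) / out.a;
  out.b = (w_s * s.b + w_d * d.b) / out.a;
  return out;
}
```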


Integer tricks for compositing
==============================



MMX Code
========

MMX line functions are provided for a few special
cases:

 n_x = n_y = 2

   src_channels = 3 dest_channels = 3    op = scale
   src_channels = 4 with alpha dest_channels = 4 no alpha  op = composite
   src_channels = 4 with alpha dest_channels = 4 no alpha  op = composite_color

For the case n_x = n_y = 2 - primarily hit when scaling up with bilinear
scaling - we can take advantage of the fact that multiple destination
pixels will be composed from the same source pixels.

That is, a destination pixel is a linear combination of the source
pixels around it:


  S0                     S1


       D  D' D'' ...


  S2                     S3

Each MMX register is 64 bits wide, so we can unpack a source pixel
into the low 8 bits of four 16-bit words, and store it into an MMX
register.

For each destination pixel, we first make sure that we have pixels S0
... S3 loaded into registers mm0 ... mm3. (This will often involve not
doing anything, or moving mm1 and mm3 into mm0 and mm2 and then reloading
mm1 and mm3 with new values.)

Then we load up the appropriate weights for the 4 corner pixels
based on the offsets of the destination pixel within the source
pixels.

We have preexpanded the weights to 64 bits wide and truncated the
range to 8 bits, so an original filter value of

 0x5321 would be expanded to

 0x0053005300530053
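
The expansion can be sketched in portable C as follows. The function
name is hypothetical; the sketch only reproduces the bit layout shown
above and says nothing about how pixops.c builds its MMX weight tables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of pixops.c.
 * Keeps the top 8 bits of a 16-bit filter weight and replicates them
 * into each of the four 16-bit words of a 64-bit value. */
static uint64_t
expand_weight (uint16_t w)
{
  uint64_t w8 = (uint64_t) (w >> 8);   /* truncate the range to 8 bits */

  assert (w8 <= 0xff);
  return w8 | (w8 << 16) | (w8 << 32) | (w8 << 48);
}
```

Having the same 8-bit weight in all four words lets a single packed
multiply apply it to every channel of an unpacked pixel at once.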

For source buffers without alpha, we simply do a multiply-add
of the weights, giving us a 16 bit quantity for the result
that we shift right by 8 and store in the destination buffer.

When the source buffer has alpha, then things become more
complicated - when we load up mm0 ... mm3, we premultiply
the alpha, so they contain:

 (a*0xff >> 8) (r*a >> 8) (g*a >> 8) (b*a >> 8)

Then when we multiply by the weights and add, we end up
with premultiplied r,g,b,a in the range of 0 .. 0xff * 0xff;
call them A,R,G,B.

We then need to composite with the dest pixels - which
we do by:

 r_dest = (R + ((0xff * 0xff - A) >> 8) * r_dest) >> 8

(0xff * 0xff)