General ideas of Pixops
=======================

 - Gain speed by special-casing the common case, and using
   generic code to handle the uncommon case.

 - Most of the time in scaling an image is spent in the center;
   however, code that can handle edges properly is slow
   because it needs to deal with the possibility of running
   off the edge. So make the fast-case code handle only
   the centers, and use generic, slow code for the edges.

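The center/edge split described above can be sketched with a toy
one-dimensional 3-tap blur. This is not code from pixops.c: the names
blur3_generic and blur3_row, and the policy of clamping at the edges,
are assumptions made only for illustration.

```c
#include <stddef.h>

/* Slow, generic path: every tap is clamped to the valid range, so this
 * is safe anywhere in the row, including the first and last pixel. */
static unsigned char
blur3_generic (const unsigned char *src, size_t width, size_t x)
{
  size_t left  = (x == 0) ? 0 : x - 1;
  size_t right = (x + 1 >= width) ? width - 1 : x + 1;

  return (unsigned char) ((src[left] + src[x] + src[right]) / 3);
}

/* Driver: generic code for the two edge pixels, and a tight loop
 * without any bounds checks for the center of the row. */
static void
blur3_row (const unsigned char *src, unsigned char *dest, size_t width)
{
  size_t x;

  if (width == 0)
    return;

  dest[0] = blur3_generic (src, width, 0);
  if (width == 1)
    return;

  for (x = 1; x + 1 < width; x++)
    dest[x] = (unsigned char) ((src[x - 1] + src[x] + src[x + 1]) / 3);

  dest[width - 1] = blur3_generic (src, width, width - 1);
}
```

The center loop never tests for the edge, which is the whole point:
the cost of the checks is paid only for the few pixels that need them.
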
Structure of Pixops
===================

The code of pixops can roughly be grouped into four parts:

 - Filter computation functions

 - Functions for scaling or compositing lines and pixels
   using precomputed filters

 - pixops_process, the central driver that iterates through
   the image calling pixel or line functions as necessary

 - Wrapper functions (pixops_scale/composite/composite_color)
   that compute the filter, choose the line and pixel functions
   and then call pixops_process with the filter, line,
   and pixel functions.


pixops_process is a pretty scary looking function:

static void
pixops_process (guchar         *dest_buf,
                int             render_x0,
                int             render_y0,
                int             render_x1,
                int             render_y1,
                int             dest_rowstride,
                int             dest_channels,
                gboolean        dest_has_alpha,
                const guchar   *src_buf,
                int             src_width,
                int             src_height,
                int             src_rowstride,
                int             src_channels,
                gboolean        src_has_alpha,
                double          scale_x,
                double          scale_y,
                int             check_x,
                int             check_y,
                int             check_size,
                guint32         color1,
                guint32         color2,
                PixopsFilter   *filter,
                PixopsLineFunc  line_func,
                PixopsPixelFunc pixel_func)

(Some of the arguments should be moved into structures. It's basically
"all the arguments to pixops_composite_color plus three more".) The
arguments can be divided up into:


Information about the destination buffer

   guchar *dest_buf, int dest_rowstride, int dest_channels, gboolean dest_has_alpha

Information about the source buffer

   guchar *src_buf, int src_rowstride, int src_channels, gboolean src_has_alpha,
   int src_width, int src_height

Information on how to scale the source buf and the region of the scaled source
to render onto the destination buffer

   int render_x0, int render_y0, int render_x1, int render_y1
   double scale_x, double scale_y

Information about a constant color or check pattern onto which to composite

   int check_x, int check_y, int check_size, guint32 color1, guint32 color2

Information precomputed to use during the scale operation

   PixopsFilter *filter, PixopsLineFunc line_func, PixopsPixelFunc pixel_func


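One possible way to move these groups into structures is sketched
below. None of these types exist in pixops.c; the Sketch* names and the
helper sketch_render_row_bytes are hypothetical, and plain C types
stand in for guchar, gboolean and guint32 so that the fragment compiles
without GLib.

```c
#include <stdint.h>

/* Hypothetical grouping of the destination buffer arguments. */
typedef struct
{
  unsigned char *buf;
  int            rowstride;
  int            channels;
  int            has_alpha;
} SketchDestBuffer;

/* Hypothetical grouping of the source buffer arguments. */
typedef struct
{
  const unsigned char *buf;
  int                  width, height;
  int                  rowstride, channels, has_alpha;
} SketchSrcBuffer;

/* Hypothetical grouping of the scale factors and render region. */
typedef struct
{
  int    x0, y0, x1, y1;
  double scale_x, scale_y;
} SketchRenderRegion;

/* Hypothetical grouping of the check pattern / constant color. */
typedef struct
{
  int      check_x, check_y, check_size;
  uint32_t color1, color2;
} SketchCheckPattern;

/* Example helper: bytes of one destination row covered by the region. */
static int
sketch_render_row_bytes (const SketchDestBuffer   *dest,
                         const SketchRenderRegion *region)
{
  return (region->x1 - region->x0) * dest->channels;
}
```
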
Filter computation
==================

The PixopsFilter structure looks like:

struct _PixopsFilter
{
  int *weights;
  int n_x;
  int n_y;
  double x_offset;
  double y_offset;
};


'weights' is an array of size:

  weights[SUBSAMPLE][SUBSAMPLE][n_x][n_y]

SUBSAMPLE is a constant - currently 16 in pixops.c.


In order to compute a scaled destination pixel we convolve
an array of n_x by n_y source pixels with one of
the SUBSAMPLE * SUBSAMPLE filter matrices stored
in weights. The choice of filter matrix is determined
by the fractional part of the source location.

To compute dest[i,j] we do the following:

  x = i * scale_x + x_offset;
  y = j * scale_y + y_offset;
  x_int = floor(x)
  y_int = floor(y)

  C = weights[SUBSAMPLE*(x - x_int)][SUBSAMPLE*(y - y_int)]
  total = sum[l=0..n_x-1, m=0..n_y-1] (C[l,m] * src[x_int + l, y_int + m])

The filter weights are integers scaled so that the total of the
weights in each filter matrix is equal to 65536.

When the source does not have alpha, we simply compute each channel
as above, so total is in the range [0,255*65536]

  dest = total / 65536

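The computation above can be sketched for a single-channel source
without alpha. This is an illustration, not the pixops.c code: the
function name sketch_scale_sample is hypothetical, the weights are
assumed to be stored flat in the [SUBSAMPLE][SUBSAMPLE][n_x][n_y] order
given in this README, the source coordinates x and y are passed in
directly, and every tap is assumed to fall inside the source image.

```c
#define SUBSAMPLE 16

/* Floor for doubles without pulling in the math library. */
static int
sketch_floor (double v)
{
  int i = (int) v;
  return (v < i) ? i - 1 : i;
}

/* Compute one destination sample from the source location (x, y). */
static unsigned char
sketch_scale_sample (const unsigned char *src, int src_rowstride,
                     const int *weights, int n_x, int n_y,
                     double x, double y)
{
  int x_int = sketch_floor (x);
  int y_int = sketch_floor (y);
  /* The fractional part of the location selects the filter matrix. */
  int sub_x = (int) (SUBSAMPLE * (x - x_int));
  int sub_y = (int) (SUBSAMPLE * (y - y_int));
  const int *C = weights + (sub_x * SUBSAMPLE + sub_y) * n_x * n_y;
  unsigned int total = 0;
  int l, m;

  for (l = 0; l < n_x; l++)
    for (m = 0; m < n_y; m++)
      total += (unsigned int) C[l * n_y + m]
               * src[(y_int + m) * src_rowstride + (x_int + l)];

  /* Each filter matrix sums to 65536, so this lands in [0,255]. */
  return (unsigned char) (total / 65536);
}
```
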
When the source does have alpha, then we need to compute using
"pre-multiplied alpha":

  a_total = sum (C[l,m] * src_a[x_int + l, y_int + m])
  c_total = sum (C[l,m] * src_a[x_int + l, y_int + m] * src_c[x_int + l, y_int + m])

This gives us a result for c_total in the range of [0,255*a_total]

  c_dest = c_total / a_total


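A small sketch of the pre-multiplied computation, with the n_x by n_y
neighbourhood flattened into a list of n taps. The function name
sketch_filter_with_alpha and the choice to return 0 when a_total is 0
are assumptions for illustration, not taken from pixops.c.

```c
#include <stdint.h>

/* Filter one color channel of n taps whose weights sum to 65536.
 * Stores a_total in *a_total_out and returns c_total / a_total. */
static unsigned char
sketch_filter_with_alpha (const int           *weights,
                          const unsigned char *color,
                          const unsigned char *alpha,
                          int                  n,
                          uint32_t            *a_total_out)
{
  uint32_t a_total = 0;
  uint32_t c_total = 0;   /* at most 65536 * 255 * 255, fits in 32 bits */
  int k;

  for (k = 0; k < n; k++)
    {
      a_total += (uint32_t) weights[k] * alpha[k];
      c_total += (uint32_t) weights[k] * alpha[k] * color[k];
    }

  *a_total_out = a_total;

  /* Fully transparent result: the color is arbitrary, use 0. */
  if (a_total == 0)
    return 0;

  return (unsigned char) (c_total / a_total);
}
```

With two equally weighted taps, one opaque with color 200 and one fully
transparent with color 100, the result is 200 rather than the naive
average 150: the transparent tap contributes nothing to the color.
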
Mathematical aside:

The process of producing a destination image consists
of:

 - Producing a continuous approximation to the source
   image via interpolation.

 - Sampling that continuous approximation with a filter.

This is representable as:

 S(x,y) = sum[i=-inf,inf; j=-inf,inf] A(frac(x),frac(y))[i,j] * S[floor(x)+i,floor(y)+j]

 D[i,j] = Integral(x=-inf,inf; y=-inf,inf) B(i+x,j+y) S((i+x)/scale_x,(j+y)/scale_y)

By reordering the sums and integrals, you get something of the form:

 D[i,j] = sum[l=-inf,inf; m=-inf,inf] C[l,m] S[i+l,j+m]

The arrays in weights are the C[l,m] above, and are thus
determined by the interpolating algorithm in use and the
sampling filter:

                                    INTERPOLATE          SAMPLE
 ART_FILTER_NEAREST                 nearest neighbour    point
 ART_FILTER_TILES                   nearest neighbour    box
 ART_FILTER_BILINEAR (scale < 1)    nearest neighbour    box   (scale < 1)
 ART_FILTER_BILINEAR (scale > 1)    bilinear             point (scale > 1)
 ART_FILTER_HYPER                   bilinear             box


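As an illustration of one row of the table, the sketch below builds a
2x2 matrix C[l,m] for bilinear interpolation with point sampling, using
the textbook bilinear weights. It is an assumption that this matches
the idea behind the table; it is not the weight computation from
pixops.c, and the name sketch_bilinear_weights is hypothetical. The
per-axis weights are quantized to 1/256 so that the four entries always
sum to exactly 65536.

```c
/* Fill C[l][m] (l = x tap, m = y tap) for fractional source offsets
 * fx and fy, both assumed to be in [0,1). */
static void
sketch_bilinear_weights (double fx, double fy, int C[2][2])
{
  int wx1 = (int) (256 * fx + 0.5);   /* weight of the right column */
  int wx0 = 256 - wx1;                /* weight of the left column  */
  int wy1 = (int) (256 * fy + 0.5);   /* weight of the bottom row   */
  int wy0 = 256 - wy1;                /* weight of the top row      */

  /* (wx0 + wx1) * (wy0 + wy1) = 256 * 256 = 65536 */
  C[0][0] = wx0 * wy0;
  C[1][0] = wx1 * wy0;
  C[0][1] = wx0 * wy1;
  C[1][1] = wx1 * wy1;
}
```
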
Pixel Functions
===============

typedef void (*PixopsPixelFunc) (guchar *dest, int dest_x, int dest_channels, int dest_has_alpha,
                                 int src_has_alpha,
                                 int check_size, guint32 color1, guint32 color2,
                                 int r, int g, int b, int a);

The arguments here are:

 dest: location to store the output pixel
 dest_x: x coordinate of destination (for handling checks)
 dest_has_alpha, dest_channels: Information about the destination pixbuf
 src_has_alpha: Information about the source pixbuf

 check_size, color1, color2: Information for color background for composite_color variant

 r,g,b,a - scaled red, green, blue and alpha

 r,g,b are premultiplied by alpha.

 a is in [0,65536*255]
 r is in [0,255*a]
 g is in [0,255*a]
 b is in [0,255*a]

 If src_has_alpha is false, then a will be 65536*255, allowing optimization.


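A sketch of what a pixel function for the plain scale operation might
do with these ranges. The signature is simplified and the name
sketch_scale_pixel is hypothetical; unsigned 32-bit totals are used here
(an assumption, the typedef above uses int) because 255 * 65536 * 255
does not fit in a signed 32-bit int.

```c
#include <stdint.h>

/* Store one RGB(A) pixel from the scaled, alpha-premultiplied totals. */
static void
sketch_scale_pixel (unsigned char *dest, int dest_has_alpha, int src_has_alpha,
                    uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
  if (!src_has_alpha)
    {
      /* a is known to be 65536*255, so the divisor is a constant and
       * the compiler can avoid a real division. */
      const uint32_t opaque = 65536u * 255u;

      dest[0] = (unsigned char) (r / opaque);
      dest[1] = (unsigned char) (g / opaque);
      dest[2] = (unsigned char) (b / opaque);
      if (dest_has_alpha)
        dest[3] = 0xff;
      return;
    }

  if (a == 0)
    {
      /* Fully transparent: avoid dividing by zero. */
      dest[0] = dest[1] = dest[2] = 0;
      if (dest_has_alpha)
        dest[3] = 0;
      return;
    }

  /* Undo the premultiplication, then bring alpha back to [0,255]. */
  dest[0] = (unsigned char) (r / a);
  dest[1] = (unsigned char) (g / a);
  dest[2] = (unsigned char) (b / a);
  if (dest_has_alpha)
    dest[3] = (unsigned char) (a >> 16);
}
```
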
Line functions
==============

typedef guchar *(*PixopsLineFunc) (int *weights, int n_x, int n_y,
                                   guchar *dest, int dest_x, guchar *dest_end, int dest_channels, int dest_has_alpha,
                                   guchar **src, int src_channels, gboolean src_has_alpha,
                                   int x_init, int x_step, int src_width,
                                   int check_size, guint32 color1, guint32 color2);

The arguments are:

 weights, n_x, n_y

   Filter weights for this row - dimensions weights[SUBSAMPLE][n_x][n_y]

 dest, dest_x, dest_end, dest_channels, dest_has_alpha

   The destination buffer. The function will start writing into *dest and
   increment by dest_channels, until dest == dest_end. Reading from
   src for these pixels is guaranteed not to go outside of the
   buffer bounds.

 src, src_channels, src_has_alpha

   src[n_y] - an array of pointers to the start of the source rows
   for each filter coordinate.

 x_init, x_step

   Information about x positions in source image.

 src_width - unused

 check_size, color1, color2: Information for color background for composite_color variant

 The total for the destination pixel at dest + i is given by

   SUM (l=0..n_x - 1, m=0..n_y - 1)
     src[m][((x_init + i * x_step) >> SCALE_SHIFT) + l] * weights[m][l]


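The sum above can be sketched as a generic line function for a
single-channel source without alpha. This is a simplified illustration
rather than the pixops.c code: the name sketch_scale_line is
hypothetical, SCALE_SHIFT = 16 and SUBSAMPLE_BITS = 4 are assumed
values, and the weights for each of the SUBSAMPLE x phases are assumed
to be stored as n_y rows of n_x entries, matching weights[m][l] in the
formula.

```c
#define SCALE_SHIFT    16
#define SUBSAMPLE_BITS 4
#define SUBSAMPLE      (1 << SUBSAMPLE_BITS)
#define SUBSAMPLE_MASK (SUBSAMPLE - 1)

/* Write pixels from dest up to dest_end, stepping through the source
 * in 16.16 fixed point. Returns the new dest pointer. */
static unsigned char *
sketch_scale_line (const int *weights, int n_x, int n_y,
                   unsigned char *dest, unsigned char *dest_end,
                   const unsigned char **src, int x_init, int x_step)
{
  int x = x_init;

  while (dest < dest_end)
    {
      /* Integer part: first source column covered by the filter. */
      int x_scaled = x >> SCALE_SHIFT;
      /* Top SUBSAMPLE_BITS of the fraction: which x phase to use. */
      int phase = (x >> (SCALE_SHIFT - SUBSAMPLE_BITS)) & SUBSAMPLE_MASK;
      const int *pixel_weights = weights + phase * n_x * n_y;
      unsigned int total = 0;
      int l, m;

      for (m = 0; m < n_y; m++)
        for (l = 0; l < n_x; l++)
          total += (unsigned int) pixel_weights[m * n_x + l]
                   * src[m][x_scaled + l];

      *dest++ = (unsigned char) (total >> 16);
      x += x_step;
    }

  return dest;
}
```
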
Algorithms for compositing
==========================

Compositing alpha on non alpha:

 R = As * Rs + (1 - As) * Rd
 G = As * Gs + (1 - As) * Gd
 B = As * Bs + (1 - As) * Bd

This can be regrouped, for each channel C, as:

 Cd + As * (Cs - Cd)

Compositing alpha on alpha:

 A = As + (1 - As) * Ad
 R = (As * Rs + (1 - As) * Rd * Ad) / A
 G = (As * Gs + (1 - As) * Gd * Ad) / A
 B = (As * Bs + (1 - As) * Bd * Ad) / A

The way to think of this is in terms of the "area":

The final pixel is composed of area As of the source pixel
and (1 - As) * Ad of the target pixel. So the final pixel
is a weighted average with those weights.

Note that the weights do not add up to one - hence the
non-constant division.


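Both cases can be sketched in a few lines. The names below are
hypothetical and the code is only an illustration of the formulas: the
first function uses 8-bit integers with alpha in [0,255], the second
uses doubles in [0,1] to keep the weighted average easy to follow.

```c
/* Alpha on non-alpha, regrouped form Cd + As * (Cs - Cd).
 * as is the source alpha in [0,255]; one multiply per channel. */
static unsigned char
sketch_over_opaque (unsigned char cs, unsigned char as, unsigned char cd)
{
  return (unsigned char) (cd + (as * (cs - cd)) / 255);
}

typedef struct
{
  double r, g, b, a;   /* non-premultiplied, all in [0,1] */
} SketchRGBA;

/* Alpha on alpha: weighted average with weights As and (1 - As) * Ad. */
static SketchRGBA
sketch_over_alpha (SketchRGBA s, SketchRGBA d)
{
  SketchRGBA out;
  double wd = (1.0 - s.a) * d.a;   /* "area" left to the destination */

  out.a = s.a + wd;
  if (out.a == 0.0)
    {
      /* Nothing visible at all: color is arbitrary, use 0. */
      out.r = out.g = out.b = 0.0;
      return out;
    }

  /* The weights do not sum to one, so divide by their sum. */
  out.r = (s.a * s.r + wd * d.r) / out.a;
  out.g = (s.a * s.g + wd * d.g) / out.a;
  out.b = (s.a * s.b + wd * d.b) / out.a;
  return out;
}
```
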
Integer tricks for compositing
==============================



MMX Code
========

Line functions are provided in MMX for a few special
cases:

 n_x = n_y = 2

   src_channels = 3               dest_channels = 3             op = scale
   src_channels = 4 with alpha    dest_channels = 4 no alpha    op = composite
   src_channels = 4 with alpha    dest_channels = 4 no alpha    op = composite_color

For the case n_x = n_y = 2 - primarily hit when scaling up with bilinear
scaling - we can take advantage of the fact that multiple destination
pixels will be composed from the same source pixels.

That is, a destination pixel is a linear combination of the source
pixels around it:


  S0                     S1



       D  D' D'' ...



  S2                     S3

Each MMX register is 64 bits wide, so we can unpack a source pixel
into the low 8 bits of four 16-bit words, and store it in an MMX
register.

For each destination pixel, we first make sure that we have pixels S0
... S3 loaded into registers mm0 ... mm3. (This will often involve not
doing anything, or moving mm1 and mm3 into mm0 and mm2 and then reloading
mm1 and mm3 with new values.)

Then we load up the appropriate weights for the 4 corner pixels
based on the offsets of the destination pixel within the source
pixels.

We have preexpanded the weights to 64 bits wide and truncated the
range to 8 bits, so an original filter value of

 0x5321 would be expanded to

 0x0053005300530053

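The expansion of a single weight can be written in portable C as
follows. The function name sketch_expand_weight is hypothetical, and
the weight is assumed to fit in 16 bits.

```c
#include <stdint.h>

/* Keep the top 8 bits of a 16-bit weight and replicate them into the
 * low byte of each of the four 16-bit lanes of a 64-bit word. */
static uint64_t
sketch_expand_weight (uint16_t weight)
{
  uint64_t w8 = (uint64_t) (weight >> 8);

  return w8 | (w8 << 16) | (w8 << 32) | (w8 << 48);
}
```

Each lane then lines up with one unpacked channel of a source pixel, so
a single packed multiply applies the same weight to all four channels.
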
For source buffers without alpha, we simply do a multiply-add
of the weights, giving us a 16 bit quantity for the result
that we shift right by 8 and store in the destination buffer.

When the source buffer has alpha, then things become more
complicated - when we load up mm0 ... mm3, we premultiply
the alpha, so they contain:

 (a*0xff >> 8) (r*a >> 8) (g*a >> 8) (b*a >> 8)

Then when we multiply by the weights, and add, we end up
with premultiplied r,g,b,a in the range of 0 .. 0xff * 0xff,
call them A,R,G,B

We then need to composite with the dest pixels - which
we do by:

 r_dest = (R + ((0xff * 0xff - A) >> 8) * r_dest) >> 8

 (0xff * 0xff)
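
The destination update for one channel can be emulated in scalar C as
below. This only restates the r_dest formula; it is not the MMX code
itself, the name sketch_mmx_composite_channel is hypothetical, and R
and A are assumed to be consistent premultiplied totals with
A <= 0xff * 0xff.

```c
/* One channel of the composite step: R and A are the filtered,
 * premultiplied color and alpha totals in [0, 0xff * 0xff]. */
static unsigned char
sketch_mmx_composite_channel (unsigned int R, unsigned int A,
                              unsigned char r_dest)
{
  /* (0xff * 0xff - A) >> 8 is the remaining coverage in 8 bits. */
  return (unsigned char) ((R + ((0xff * 0xff - A) >> 8) * r_dest) >> 8);
}
```

Because everything is truncated to 8 bits, the result is slightly
darker than an exact computation: a fully transparent source over a
destination value of 255 gives 253, and a fully opaque source with
R = 0xff * 0xff gives 254.
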