|
Packit |
67cb25 |
.. index:: sorting, heapsort
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
*******
|
|
Packit |
67cb25 |
Sorting
|
|
Packit |
67cb25 |
*******
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This chapter describes functions for sorting data, both directly and
|
|
Packit |
67cb25 |
indirectly (using an index). All the functions use the *heapsort*
|
|
Packit |
67cb25 |
algorithm. Heapsort is an :math:`O(N \log N)` algorithm which operates
|
|
Packit |
67cb25 |
in-place and does not require any additional storage. It also provides
|
|
Packit |
67cb25 |
consistent performance, the running time for its worst-case (ordered
|
|
Packit |
67cb25 |
data) being not significantly longer than the average and best cases.
|
|
Packit |
67cb25 |
Note that the heapsort algorithm does not preserve the relative ordering
|
|
Packit |
67cb25 |
of equal elements---it is an *unstable* sort. However the resulting
|
|
Packit |
67cb25 |
order of equal elements will be consistent across different platforms
|
|
Packit |
67cb25 |
when using these functions.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Sorting objects
|
|
Packit |
67cb25 |
===============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following function provides a simple alternative to the standard
|
|
Packit |
67cb25 |
library function :func:`qsort`. It is intended for systems lacking
|
|
Packit |
67cb25 |
:func:`qsort`, not as a replacement for it. The function :func:`qsort`
|
|
Packit |
67cb25 |
should be used whenever possible, as it will be faster and can provide
|
|
Packit |
67cb25 |
stable ordering of equal elements. Documentation for :func:`qsort` is
|
|
Packit |
67cb25 |
available in the GNU C Library Reference Manual.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The functions described in this section are defined in the header file
|
|
Packit |
67cb25 |
:file:`gsl_heapsort.h`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: comparison functions, definition
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_heapsort (void * array, size_t count, size_t size, gsl_comparison_fn_t compare)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sorts the :data:`count` elements of the array :data:`array`,
|
|
Packit |
67cb25 |
each of size :data:`size`, into ascending order using the comparison
|
|
Packit |
67cb25 |
function :data:`compare`. The type of the comparison function is defined by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_comparison_fn_t
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
int (*gsl_comparison_fn_t) (const void * a, const void * b)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A comparison function should return a negative integer if the first
|
|
Packit |
67cb25 |
argument is less than the second argument, :code:`0` if the two arguments
|
|
Packit |
67cb25 |
are equal and a positive integer if the first argument is greater than
|
|
Packit |
67cb25 |
the second argument.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For example, the following function can be used to sort doubles into
|
|
Packit |
67cb25 |
ascending numerical order.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
int
|
|
Packit |
67cb25 |
compare_doubles (const double * a, const double * b)
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
if (*a > *b)
|
|
Packit |
67cb25 |
return 1;
|
|
Packit |
67cb25 |
else if (*a < *b)
|
|
Packit |
67cb25 |
return -1;
|
|
Packit |
67cb25 |
else
|
|
Packit |
67cb25 |
return 0;
|
|
Packit |
67cb25 |
}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The appropriate function call to perform the sort is::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
gsl_heapsort (array, count, sizeof(double), compare_doubles);
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Note that unlike :func:`qsort` the heapsort algorithm cannot be made into
|
|
Packit |
67cb25 |
a stable sort by pointer arithmetic. The trick of comparing pointers for
|
|
Packit |
67cb25 |
equal elements in the comparison function does not work for the heapsort
|
|
Packit |
67cb25 |
algorithm. The heapsort algorithm performs an internal rearrangement of
|
|
Packit |
67cb25 |
the data which destroys its initial ordering.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index:: indirect sorting
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_heapsort_index (size_t * p, const void * array, size_t count, size_t size, gsl_comparison_fn_t compare)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function indirectly sorts the :data:`count` elements of the array
|
|
Packit |
67cb25 |
:data:`array`, each of size :data:`size`, into ascending order using the
|
|
Packit |
67cb25 |
comparison function :data:`compare`. The resulting permutation is stored
|
|
Packit |
67cb25 |
in :data:`p`, an array of length :data:`n`. The elements of :data:`p` give the
|
|
Packit |
67cb25 |
index of the array element which would have been stored in that position
|
|
Packit |
67cb25 |
if the array had been sorted in place. The first element of :data:`p`
|
|
Packit |
67cb25 |
gives the index of the least element in :data:`array`, and the last
|
|
Packit |
67cb25 |
element of :data:`p` gives the index of the greatest element in
|
|
Packit |
67cb25 |
:data:`array`. The array itself is not changed.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Sorting vectors
|
|
Packit |
67cb25 |
===============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following functions will sort the elements of an array or vector,
|
|
Packit |
67cb25 |
either directly or indirectly. They are defined for all real and integer
|
|
Packit |
67cb25 |
types using the normal suffix rules. For example, the :code:`float`
|
|
Packit |
67cb25 |
versions of the array functions are :func:`gsl_sort_float` and
|
|
Packit |
67cb25 |
:func:`gsl_sort_float_index`. The corresponding vector functions are
|
|
Packit |
67cb25 |
:func:`gsl_sort_vector_float` and :func:`gsl_sort_vector_float_index`. The
|
|
Packit |
67cb25 |
prototypes are available in the header files :file:`gsl_sort_float.h`
|
|
Packit |
67cb25 |
:file:`gsl_sort_vector_float.h`. The complete set of prototypes can be
|
|
Packit |
67cb25 |
included using the header files :file:`gsl_sort.h` and
|
|
Packit |
67cb25 |
:file:`gsl_sort_vector.h`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
There are no functions for sorting complex arrays or vectors, since the
|
|
Packit |
67cb25 |
ordering of complex numbers is not uniquely defined. To sort a complex
|
|
Packit |
67cb25 |
vector by magnitude compute a real vector containing the magnitudes
|
|
Packit |
67cb25 |
of the complex elements, and sort this vector indirectly. The resulting
|
|
Packit |
67cb25 |
index gives the appropriate ordering of the original complex vector.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: sorting vector elements
|
|
Packit |
67cb25 |
single: vector, sorting elements of
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_sort (double * data, const size_t stride, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sorts the :data:`n` elements of the array :data:`data` with
|
|
Packit |
67cb25 |
stride :data:`stride` into ascending numerical order.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_sort2 (double * data1, const size_t stride1, double * data2, const size_t stride2, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sorts the :data:`n` elements of the array :data:`data1` with
|
|
Packit |
67cb25 |
stride :data:`stride1` into ascending numerical order, while making the
|
|
Packit |
67cb25 |
same rearrangement of the array :data:`data2` with stride :data:`stride2`,
|
|
Packit |
67cb25 |
also of size :data:`n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_sort_vector (gsl_vector * v)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sorts the elements of the vector :data:`v` into ascending
|
|
Packit |
67cb25 |
numerical order.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_sort_vector2 (gsl_vector * v1, gsl_vector * v2)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sorts the elements of the vector :data:`v1` into ascending
|
|
Packit |
67cb25 |
numerical order, while making the same rearrangement of the vector :data:`v2`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: indirect sorting, of vector elements
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_sort_index (size_t * p, const double * data, size_t stride, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function indirectly sorts the :data:`n` elements of the array
|
|
Packit |
67cb25 |
:data:`data` with stride :data:`stride` into ascending order, storing the
|
|
Packit |
67cb25 |
resulting permutation in :data:`p`. The array :data:`p` must be allocated with
|
|
Packit |
67cb25 |
a sufficient length to store the :data:`n` elements of the permutation.
|
|
Packit |
67cb25 |
The elements of :data:`p` give the index of the array element which would
|
|
Packit |
67cb25 |
have been stored in that position if the array had been sorted in place.
|
|
Packit |
67cb25 |
The array :data:`data` is not changed.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_vector_index (gsl_permutation * p, const gsl_vector * v)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function indirectly sorts the elements of the vector :data:`v` into
|
|
Packit |
67cb25 |
ascending order, storing the resulting permutation in :data:`p`. The
|
|
Packit |
67cb25 |
elements of :data:`p` give the index of the vector element which would
|
|
Packit |
67cb25 |
have been stored in that position if the vector had been sorted in
|
|
Packit |
67cb25 |
place. The first element of :data:`p` gives the index of the least element
|
|
Packit |
67cb25 |
in :data:`v`, and the last element of :data:`p` gives the index of the
|
|
Packit |
67cb25 |
greatest element in :data:`v`. The vector :data:`v` is not changed.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Selecting the k smallest or largest elements
|
|
Packit |
67cb25 |
============================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The functions described in this section select the :math:`k` smallest
|
|
Packit |
67cb25 |
or largest elements of a data set of size :math:`N`. The routines use an
|
|
Packit |
67cb25 |
:math:`O(kN)` direct insertion algorithm which is suited to subsets that
|
|
Packit |
67cb25 |
are small compared with the total size of the dataset. For example, the
|
|
Packit |
67cb25 |
routines are useful for selecting the 10 largest values from one million
|
|
Packit |
67cb25 |
data points, but not for selecting the largest 100,000 values. If the
|
|
Packit |
67cb25 |
subset is a significant part of the total dataset it may be faster
|
|
Packit |
67cb25 |
to sort all the elements of the dataset directly with an :math:`O(N \log N)`
|
|
Packit |
67cb25 |
algorithm and obtain the smallest or largest values that way.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_smallest (double * dest, size_t k, const double * src, size_t stride, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function copies the :data:`k` smallest elements of the array
|
|
Packit |
67cb25 |
:data:`src`, of size :data:`n` and stride :data:`stride`, in ascending
|
|
Packit |
67cb25 |
numerical order into the array :data:`dest`. The size :data:`k` of the subset must be
|
|
Packit |
67cb25 |
less than or equal to :data:`n`. The data :data:`src` is not modified by
|
|
Packit |
67cb25 |
this operation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_largest (double * dest, size_t k, const double * src, size_t stride, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function copies the :data:`k` largest elements of the array
|
|
Packit |
67cb25 |
:data:`src`, of size :data:`n` and stride :data:`stride`, in descending
|
|
Packit |
67cb25 |
numerical order into the array :data:`dest`. :data:`k` must be
|
|
Packit |
67cb25 |
less than or equal to :data:`n`. The data :data:`src` is not modified by
|
|
Packit |
67cb25 |
this operation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_vector_smallest (double * dest, size_t k, const gsl_vector * v)
|
|
Packit |
67cb25 |
int gsl_sort_vector_largest (double * dest, size_t k, const gsl_vector * v)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions copy the :data:`k` smallest or largest elements of the
|
|
Packit |
67cb25 |
vector :data:`v` into the array :data:`dest`. :data:`k`
|
|
Packit |
67cb25 |
must be less than or equal to the length of the vector :data:`v`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following functions find the indices of the :math:`k` smallest or
|
|
Packit |
67cb25 |
largest elements of a dataset.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_smallest_index (size_t * p, size_t k, const double * src, size_t stride, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function stores the indices of the :data:`k` smallest elements of
|
|
Packit |
67cb25 |
the array :data:`src`, of size :data:`n` and stride :data:`stride`, in the
|
|
Packit |
67cb25 |
array :data:`p`. The indices are chosen so that the corresponding data is
|
|
Packit |
67cb25 |
in ascending numerical order. :data:`k` must be
|
|
Packit |
67cb25 |
less than or equal to :data:`n`. The data :data:`src` is not modified by
|
|
Packit |
67cb25 |
this operation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_largest_index (size_t * p, size_t k, const double * src, size_t stride, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function stores the indices of the :data:`k` largest elements of
|
|
Packit |
67cb25 |
the array :data:`src`, of size :data:`n` and stride :data:`stride`, in the
|
|
Packit |
67cb25 |
array :data:`p`. The indices are chosen so that the corresponding data is
|
|
Packit |
67cb25 |
in descending numerical order. :data:`k` must be
|
|
Packit |
67cb25 |
less than or equal to :data:`n`. The data :data:`src` is not modified by
|
|
Packit |
67cb25 |
this operation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_sort_vector_smallest_index (size_t * p, size_t k, const gsl_vector * v)
|
|
Packit |
67cb25 |
int gsl_sort_vector_largest_index (size_t * p, size_t k, const gsl_vector * v)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions store the indices of the :data:`k` smallest or largest
|
|
Packit |
67cb25 |
elements of the vector :data:`v` in the array :data:`p`. :data:`k` must be less than or equal to the length of the vector
|
|
Packit |
67cb25 |
:data:`v`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Computing the rank
|
|
Packit |
67cb25 |
==================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The *rank* of an element is its order in the sorted data. The rank
|
|
Packit |
67cb25 |
is the inverse of the index permutation, :math:`p`. It can be computed
|
|
Packit |
67cb25 |
using the following algorithm::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
for (i = 0; i < p->size; i++)
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
size_t pi = p->data[i];
|
|
Packit |
67cb25 |
rank->data[pi] = i;
|
|
Packit |
67cb25 |
}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This can be computed directly from the function
|
|
Packit |
67cb25 |
:code:`gsl_permutation_inverse(rank,p)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following function will print the rank of each element of the vector
|
|
Packit |
67cb25 |
:math:`v`::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
void
|
|
Packit |
67cb25 |
print_rank (gsl_vector * v)
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
size_t i;
|
|
Packit |
67cb25 |
size_t n = v->size;
|
|
Packit |
67cb25 |
gsl_permutation * perm = gsl_permutation_alloc(n);
|
|
Packit |
67cb25 |
gsl_permutation * rank = gsl_permutation_alloc(n);
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
gsl_sort_vector_index (perm, v);
|
|
Packit |
67cb25 |
gsl_permutation_inverse (rank, perm);
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
for (i = 0; i < n; i++)
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
double vi = gsl_vector_get(v, i);
|
|
Packit |
67cb25 |
printf ("element = %d, value = %g, rank = %d\n",
|
|
Packit |
67cb25 |
i, vi, rank->data[i]);
|
|
Packit |
67cb25 |
}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
gsl_permutation_free (perm);
|
|
Packit |
67cb25 |
gsl_permutation_free (rank);
|
|
Packit |
67cb25 |
}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Examples
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example shows how to use the permutation :math:`p` to print
|
|
Packit |
67cb25 |
the elements of the vector :math:`v` in ascending order::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
gsl_sort_vector_index (p, v);
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
for (i = 0; i < v->size; i++)
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
double vpi = gsl_vector_get (v, p->data[i]);
|
|
Packit |
67cb25 |
printf ("order = %d, value = %g\n", i, vpi);
|
|
Packit |
67cb25 |
}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The next example uses the function :func:`gsl_sort_smallest` to select
|
|
Packit |
67cb25 |
the 5 smallest numbers from 100000 uniform random variates stored in an
|
|
Packit |
67cb25 |
array,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/sortsmall.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The output lists the 5 smallest values, in ascending order,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/sortsmall.txt
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
References and Further Reading
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The subject of sorting is covered extensively in the following,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Donald E. Knuth, The Art of Computer Programming: Sorting and
|
|
Packit |
67cb25 |
Searching (Vol 3, 3rd Ed, 1997), Addison-Wesley, ISBN 0201896850.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The Heapsort algorithm is described in the following book,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Robert Sedgewick, Algorithms in C, Addison-Wesley,
|
|
Packit |
67cb25 |
ISBN 0201514257.
|