|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: minimization, multidimensional
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
*****************************
|
|
Packit |
67cb25 |
Multidimensional Minimization
|
|
Packit |
67cb25 |
*****************************
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This chapter describes routines for finding minima of arbitrary
|
|
Packit |
67cb25 |
multidimensional functions. The library provides low level components
|
|
Packit |
67cb25 |
for a variety of iterative minimizers and convergence tests. These can
|
|
Packit |
67cb25 |
be combined by the user to achieve the desired solution, while providing
|
|
Packit |
67cb25 |
full access to the intermediate steps of the algorithms. Each class of
|
|
Packit |
67cb25 |
methods uses the same framework, so that you can switch between
|
|
Packit |
67cb25 |
minimizers at runtime without needing to recompile your program. Each
|
|
Packit |
67cb25 |
instance of a minimizer keeps track of its own state, allowing the
|
|
Packit |
67cb25 |
minimizers to be used in multi-threaded programs. The minimization
|
|
Packit |
67cb25 |
algorithms can be used to maximize a function by inverting its sign.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The header file :file:`gsl_multimin.h` contains prototypes for the
|
|
Packit |
67cb25 |
minimization functions and related declarations.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Overview
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The problem of multidimensional minimization requires finding a point
|
|
Packit |
67cb25 |
:math:`x` such that the scalar function,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: f(x_1, \dots, x_n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
takes a value which is lower than at any neighboring point. For smooth
|
|
Packit |
67cb25 |
functions the gradient :math:`g = \nabla f` vanishes at the minimum. In
|
|
Packit |
67cb25 |
general there are no bracketing methods available for the
|
|
Packit |
67cb25 |
minimization of :math:`n`-dimensional functions. The algorithms
|
|
Packit |
67cb25 |
proceed from an initial guess using a search algorithm which attempts
|
|
Packit |
67cb25 |
to move in a downhill direction.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Algorithms making use of the gradient of the function perform a
|
|
Packit |
67cb25 |
one-dimensional line minimisation along this direction until the lowest
|
|
Packit |
67cb25 |
point is found to a suitable tolerance. The search direction is then
|
|
Packit |
67cb25 |
updated with local information from the function and its derivatives,
|
|
Packit |
67cb25 |
and the whole process repeated until the true :math:`n`-dimensional
|
|
Packit |
67cb25 |
minimum is found.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Algorithms which do not require the gradient of the function use
|
|
Packit |
67cb25 |
different strategies. For example, the Nelder-Mead Simplex algorithm
|
|
Packit |
67cb25 |
maintains :math:`n+1` trial parameter vectors as the vertices of a
|
|
Packit |
67cb25 |
:math:`n`-dimensional simplex. On each iteration it tries to improve
|
|
Packit |
67cb25 |
the worst vertex of the simplex by geometrical transformations. The
|
|
Packit |
67cb25 |
iterations are continued until the overall size of the simplex has
|
|
Packit |
67cb25 |
decreased sufficiently.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Both types of algorithms use a standard framework. The user provides a
|
|
Packit |
67cb25 |
high-level driver for the algorithms, and the library provides the
|
|
Packit |
67cb25 |
individual functions necessary for each of the steps. There are three
|
|
Packit |
67cb25 |
main phases of the iteration. The steps are,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* initialize minimizer state, :data:`s`, for algorithm :data:`T`
|
|
Packit |
67cb25 |
* update :data:`s` using the iteration :data:`T`
|
|
Packit |
67cb25 |
* test :data:`s` for convergence, and repeat iteration if necessary
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Each iteration step consists either of an improvement to the
|
|
Packit |
67cb25 |
line-minimisation in the current direction or an update to the search
|
|
Packit |
67cb25 |
direction itself. The state for the minimizers is held in a
|
|
Packit |
67cb25 |
:type:`gsl_multimin_fdfminimizer` struct or a
|
|
Packit |
67cb25 |
:type:`gsl_multimin_fminimizer` struct.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Multimin, caveats
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Caveats
|
|
Packit |
67cb25 |
=======
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Note that the minimization algorithms can only search for one local
|
|
Packit |
67cb25 |
minimum at a time. When there are several local minima in the search
|
|
Packit |
67cb25 |
area, the first minimum to be found will be returned; however it is
|
|
Packit |
67cb25 |
difficult to predict which of the minima this will be. In most cases,
|
|
Packit |
67cb25 |
no error will be reported if you try to find a local minimum in an area
|
|
Packit |
67cb25 |
where there is more than one.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
It is also important to note that the minimization algorithms find local
|
|
Packit |
67cb25 |
minima; there is no way to determine whether a minimum is a global
|
|
Packit |
67cb25 |
minimum of the function in question.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Initializing the Multidimensional Minimizer
|
|
Packit |
67cb25 |
===========================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following function initializes a multidimensional minimizer. The
|
|
Packit |
67cb25 |
minimizer itself depends only on the dimension of the problem and the
|
|
Packit |
67cb25 |
algorithm and can be reused for different problems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multimin_fdfminimizer
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is a workspace for minimizing functions using derivatives.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multimin_fminimizer
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is a workspace for minimizing functions without derivatives.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multimin_fdfminimizer * gsl_multimin_fdfminimizer_alloc (const gsl_multimin_fdfminimizer_type * T, size_t n)
|
|
Packit |
67cb25 |
gsl_multimin_fminimizer * gsl_multimin_fminimizer_alloc (const gsl_multimin_fminimizer_type * T, size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns a pointer to a newly allocated instance of a
|
|
Packit |
67cb25 |
minimizer of type :data:`T` for an :data:`n`-dimension function. If there
|
|
Packit |
67cb25 |
is insufficient memory to create the minimizer then the function returns
|
|
Packit |
67cb25 |
a null pointer and the error handler is invoked with an error code of
|
|
Packit |
67cb25 |
:macro:`GSL_ENOMEM`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multimin_fdfminimizer_set (gsl_multimin_fdfminimizer * s, gsl_multimin_function_fdf * fdf, const gsl_vector * x, double step_size, double tol)
|
|
Packit |
67cb25 |
int gsl_multimin_fminimizer_set (gsl_multimin_fminimizer * s, gsl_multimin_function * f, const gsl_vector * x, const gsl_vector * step_size)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The function :func:`gsl_multimin_fdfminimizer_set` initializes the minimizer :data:`s` to minimize the function
|
|
Packit |
67cb25 |
:data:`fdf` starting from the initial point :data:`x`. The size of the
|
|
Packit |
67cb25 |
first trial step is given by :data:`step_size`. The accuracy of the line
|
|
Packit |
67cb25 |
minimization is specified by :data:`tol`. The precise meaning of this
|
|
Packit |
67cb25 |
parameter depends on the method used. Typically the line minimization
|
|
Packit |
67cb25 |
is considered successful if the gradient of the function :math:`g` is
|
|
Packit |
67cb25 |
orthogonal to the current search direction :math:`p` to a relative
|
|
Packit |
67cb25 |
accuracy of :data:`tol`, where :math:`p \cdot g < tol |p| |g|`.
|
|
Packit |
67cb25 |
A :data:`tol` value of 0.1 is
|
|
Packit |
67cb25 |
suitable for most purposes, since line minimization only needs to
|
|
Packit |
67cb25 |
be carried out approximately. Note that setting :data:`tol` to zero will
|
|
Packit |
67cb25 |
force the use of "exact" line-searches, which are extremely expensive.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The function :func:`gsl_multimin_fminimizer_set` initializes the minimizer :data:`s` to minimize the function
|
|
Packit |
67cb25 |
:data:`f`, starting from the initial point
|
|
Packit |
67cb25 |
:data:`x`. The size of the initial trial steps is given in vector
|
|
Packit |
67cb25 |
:data:`step_size`. The precise meaning of this parameter depends on the
|
|
Packit |
67cb25 |
method used.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_multimin_fdfminimizer_free (gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
void gsl_multimin_fminimizer_free (gsl_multimin_fminimizer * s)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function frees all the memory associated with the minimizer
|
|
Packit |
67cb25 |
:data:`s`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: const char * gsl_multimin_fdfminimizer_name (const gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
const char * gsl_multimin_fminimizer_name (const gsl_multimin_fminimizer * s)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns a pointer to the name of the minimizer. For example::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
printf ("s is a '%s' minimizer\n", gsl_multimin_fdfminimizer_name (s));
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
would print something like :code:`s is a 'conjugate_pr' minimizer`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Providing a function to minimize
|
|
Packit |
67cb25 |
================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
You must provide a parametric function of :math:`n` variables for the
|
|
Packit |
67cb25 |
minimizers to operate on. You may also need to provide a routine which
|
|
Packit |
67cb25 |
calculates the gradient of the function and a third routine which
|
|
Packit |
67cb25 |
calculates both the function value and the gradient together. In order
|
|
Packit |
67cb25 |
to allow for general parameters the functions are defined by the
|
|
Packit |
67cb25 |
following data types:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multimin_function_fdf
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This data type defines a general function of :math:`n` variables with
|
|
Packit |
67cb25 |
parameters and the corresponding gradient vector of derivatives,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double (* f) (const gsl_vector * x, void * params)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
this function should return the result
|
|
Packit |
67cb25 |
:math:`f(x,params)` for argument :data:`x` and parameters :data:`params`.
|
|
Packit |
67cb25 |
If the function cannot be computed, an error value of :macro:`GSL_NAN`
|
|
Packit |
67cb25 |
should be returned.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`void (* df) (const gsl_vector * x, void * params, gsl_vector * g)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
this function should store the :data:`n`-dimensional gradient
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: g_i = \partial f(x,\hbox{\it params}) / \partial x_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
g_i = d f(x,params) / d x_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
in the vector :data:`g` for argument :data:`x`
|
|
Packit |
67cb25 |
and parameters :data:`params`, returning an appropriate error code if the
|
|
Packit |
67cb25 |
function cannot be computed.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`void (* fdf) (const gsl_vector * x, void * params, double * f, gsl_vector * g)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function should set the values of the :data:`f` and :data:`g` as above,
|
|
Packit |
67cb25 |
for arguments :data:`x` and parameters :data:`params`. This function
|
|
Packit |
67cb25 |
provides an optimization of the separate functions for :math:`f(x)` and
|
|
Packit |
67cb25 |
:math:`g(x)`---it is always faster to compute the function and its
|
|
Packit |
67cb25 |
derivative at the same time.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t n`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
the dimension of the system, i.e. the number of components of the
|
|
Packit |
67cb25 |
vectors :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`void * params`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
a pointer to the parameters of the function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multimin_function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This data type defines a general function of :math:`n` variables with
|
|
Packit |
67cb25 |
parameters,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double (* f) (const gsl_vector * x, void * params)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
this function should return the result
|
|
Packit |
67cb25 |
:math:`f(x,params)` for argument :data:`x` and parameters :data:`params`.
|
|
Packit |
67cb25 |
If the function cannot be computed, an error value of :macro:`GSL_NAN`
|
|
Packit |
67cb25 |
should be returned.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t n`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
the dimension of the system, i.e. the number of components of the
|
|
Packit |
67cb25 |
vectors :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`void * params`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
a pointer to the parameters of the function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _multimin-paraboloid:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example function defines a simple two-dimensional
|
|
Packit |
67cb25 |
paraboloid with five parameters,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/multiminfn.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The function can be initialized using the following code::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
gsl_multimin_function_fdf my_func;
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
/* Paraboloid center at (1,2), scale factors (10, 20),
|
|
Packit |
67cb25 |
minimum value 30 */
|
|
Packit |
67cb25 |
double p[5] = { 1.0, 2.0, 10.0, 20.0, 30.0 };
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
my_func.n = 2; /* number of function components */
|
|
Packit |
67cb25 |
my_func.f = &my_f;
|
|
Packit |
67cb25 |
my_func.df = &my_df;
|
|
Packit |
67cb25 |
my_func.fdf = &my_fdf;
|
|
Packit |
67cb25 |
my_func.params = (void *)p;
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Iteration
|
|
Packit |
67cb25 |
=========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following function drives the iteration of each algorithm. The
|
|
Packit |
67cb25 |
function performs one iteration to update the state of the minimizer.
|
|
Packit |
67cb25 |
The same function works for all minimizers so that different methods can
|
|
Packit |
67cb25 |
be substituted at runtime without modifications to the code.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multimin_fdfminimizer_iterate (gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
int gsl_multimin_fminimizer_iterate (gsl_multimin_fminimizer * s)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions perform a single iteration of the minimizer :data:`s`.
|
|
Packit |
67cb25 |
If the iteration encounters an unexpected problem then an error code
|
|
Packit |
67cb25 |
will be returned. The error code :macro:`GSL_ENOPROG` signifies that
|
|
Packit |
67cb25 |
the minimizer is unable to improve on its current estimate, either due
|
|
Packit |
67cb25 |
to numerical difficulty or because a genuine local minimum has been
|
|
Packit |
67cb25 |
reached.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The minimizer maintains a current best estimate of the minimum at all
|
|
Packit |
67cb25 |
times. This information can be accessed with the following auxiliary
|
|
Packit |
67cb25 |
functions,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_vector * gsl_multimin_fdfminimizer_x (const gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
gsl_vector * gsl_multimin_fminimizer_x (const gsl_multimin_fminimizer * s)
|
|
Packit |
67cb25 |
double gsl_multimin_fdfminimizer_minimum (const gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
double gsl_multimin_fminimizer_minimum (const gsl_multimin_fminimizer * s)
|
|
Packit |
67cb25 |
gsl_vector * gsl_multimin_fdfminimizer_gradient (const gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
gsl_vector * gsl_multimin_fdfminimizer_dx (const gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
double gsl_multimin_fminimizer_size (const gsl_multimin_fminimizer * s)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return the current best estimate of the location of the
|
|
Packit |
67cb25 |
minimum, the value of the function at that point, its gradient, the last
|
|
Packit |
67cb25 |
step increment of the estimate, and minimizer specific characteristic size for the minimizer :data:`s`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multimin_fdfminimizer_restart (gsl_multimin_fdfminimizer * s)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function resets the minimizer :data:`s` to use the current point as a
|
|
Packit |
67cb25 |
new starting point.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Stopping Criteria
|
|
Packit |
67cb25 |
=================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A minimization procedure should stop when one of the following
|
|
Packit |
67cb25 |
conditions is true:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* A minimum has been found to within the user-specified precision.
|
|
Packit |
67cb25 |
* A user-specified maximum number of iterations has been reached.
|
|
Packit |
67cb25 |
* An error has occurred.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The handling of these conditions is under user control. The functions
|
|
Packit |
67cb25 |
below allow the user to test the precision of the current result.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multimin_test_gradient (const gsl_vector * g, double epsabs)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function tests the norm of the gradient :data:`g` against the
|
|
Packit |
67cb25 |
absolute tolerance :data:`epsabs`. The gradient of a multidimensional
|
|
Packit |
67cb25 |
function goes to zero at a minimum. The test returns :macro:`GSL_SUCCESS`
|
|
Packit |
67cb25 |
if the following condition is achieved,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: |g| < \hbox{\it epsabs}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
|g| < epsabs
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and returns :macro:`GSL_CONTINUE` otherwise. A suitable choice of
|
|
Packit |
67cb25 |
:data:`epsabs` can be made from the desired accuracy in the function for
|
|
Packit |
67cb25 |
small variations in :math:`x`. The relationship between these quantities
|
|
Packit |
67cb25 |
is given by :math:`\delta{f} = g\,\delta{x}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multimin_test_size (const double size, double epsabs)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function tests the minimizer specific characteristic
|
|
Packit |
67cb25 |
size (if applicable to the used minimizer) against absolute tolerance :data:`epsabs`.
|
|
Packit |
67cb25 |
The test returns :macro:`GSL_SUCCESS` if the size is smaller than tolerance,
|
|
Packit |
67cb25 |
otherwise :macro:`GSL_CONTINUE` is returned.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Algorithms with Derivatives
|
|
Packit |
67cb25 |
===========================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
There are several minimization methods available. The best choice of
|
|
Packit |
67cb25 |
algorithm depends on the problem. The algorithms described in this
|
|
Packit |
67cb25 |
section use the value of the function and its gradient at each
|
|
Packit |
67cb25 |
evaluation point.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multimin_fdfminimizer_type
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This type specifies a minimization algorithm using gradients.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Fletcher-Reeves conjugate gradient algorithm, minimization
|
|
Packit |
67cb25 |
single: Conjugate gradient algorithm, minimization
|
|
Packit |
67cb25 |
single: minimization, conjugate gradient algorithm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multimin_fdfminimizer_conjugate_fr
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is the Fletcher-Reeves conjugate gradient algorithm. The conjugate
|
|
Packit |
67cb25 |
gradient algorithm proceeds as a succession of line minimizations. The
|
|
Packit |
67cb25 |
sequence of search directions is used to build up an approximation to the
|
|
Packit |
67cb25 |
curvature of the function in the neighborhood of the minimum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
An initial search direction :data:`p` is chosen using the gradient, and line
|
|
Packit |
67cb25 |
minimization is carried out in that direction. The accuracy of the line
|
|
Packit |
67cb25 |
minimization is specified by the parameter :data:`tol`. The minimum
|
|
Packit |
67cb25 |
along this line occurs when the function gradient :data:`g` and the search direction
|
|
Packit |
67cb25 |
:data:`p` are orthogonal. The line minimization terminates when
|
|
Packit |
67cb25 |
:math:`p\cdot g < tol |p| |g|`. The
|
|
Packit |
67cb25 |
search direction is updated using the Fletcher-Reeves formula
|
|
Packit |
67cb25 |
:math:`p' = g' - \beta g` where :math:`\beta=-|g'|^2/|g|^2`, and
|
|
Packit |
67cb25 |
the line minimization is then repeated for the new search
|
|
Packit |
67cb25 |
direction.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Polak-Ribiere algorithm, minimization
|
|
Packit |
67cb25 |
single: minimization, Polak-Ribiere algorithm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multimin_fdfminimizer_conjugate_pr
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is the Polak-Ribiere conjugate gradient algorithm. It is similar
|
|
Packit |
67cb25 |
to the Fletcher-Reeves method, differing only in the choice of the
|
|
Packit |
67cb25 |
coefficient :math:`\beta`. Both methods work well when the evaluation
|
|
Packit |
67cb25 |
point is close enough to the minimum of the objective function that it
|
|
Packit |
67cb25 |
is well approximated by a quadratic hypersurface.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: BFGS algorithm, minimization
|
|
Packit |
67cb25 |
single: minimization, BFGS algorithm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multimin_fdfminimizer_vector_bfgs2
|
|
Packit |
67cb25 |
gsl_multimin_fdfminimizer_vector_bfgs
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These methods use the vector Broyden-Fletcher-Goldfarb-Shanno (BFGS)
|
|
Packit |
67cb25 |
algorithm. This is a quasi-Newton method which builds up an approximation
|
|
Packit |
67cb25 |
to the second derivatives of the function :math:`f` using the difference
|
|
Packit |
67cb25 |
between successive gradient vectors. By combining the first and second
|
|
Packit |
67cb25 |
derivatives the algorithm is able to take Newton-type steps towards the
|
|
Packit |
67cb25 |
function minimum, assuming quadratic behavior in that region.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The :code:`bfgs2` version of this minimizer is the most efficient
|
|
Packit |
67cb25 |
version available, and is a faithful implementation of the line
|
|
Packit |
67cb25 |
minimization scheme described in Fletcher's *Practical Methods of
|
|
Packit |
67cb25 |
Optimization*, Algorithms 2.6.2 and 2.6.4. It supersedes the original
|
|
Packit |
67cb25 |
:code:`bfgs` routine and requires substantially fewer function and
|
|
Packit |
67cb25 |
gradient evaluations. The user-supplied tolerance :data:`tol`
|
|
Packit |
67cb25 |
corresponds to the parameter :math:`\sigma` used by Fletcher. A value
|
|
Packit |
67cb25 |
of 0.1 is recommended for typical use (larger values correspond to
|
|
Packit |
67cb25 |
less accurate line searches).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: steepest descent algorithm, minimization
|
|
Packit |
67cb25 |
single: minimization, steepest descent algorithm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multimin_fdfminimizer_steepest_descent
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The steepest descent algorithm follows the downhill gradient of the
|
|
Packit |
67cb25 |
function at each step. When a downhill step is successful the step-size
|
|
Packit |
67cb25 |
is increased by a factor of two. If the downhill step leads to a higher
|
|
Packit |
67cb25 |
function value then the algorithm backtracks and the step size is
|
|
Packit |
67cb25 |
decreased using the parameter :data:`tol`. A suitable value of :data:`tol`
|
|
Packit |
67cb25 |
for most applications is 0.1. The steepest descent method is
|
|
Packit |
67cb25 |
inefficient and is included only for demonstration purposes.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Algorithms without Derivatives
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The algorithms described in this section use only the value of the function
|
|
Packit |
67cb25 |
at each evaluation point.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multimin_fminimizer_type
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This type specifies minimization algorithms which do not use gradients.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Nelder-Mead simplex algorithm for minimization
|
|
Packit |
67cb25 |
single: simplex algorithm, minimization
|
|
Packit |
67cb25 |
single: minimization, simplex algorithm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multimin_fminimizer_nmsimplex2
|
|
Packit |
67cb25 |
gsl_multimin_fminimizer_nmsimplex
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These methods use the Simplex algorithm of Nelder and Mead.
|
|
Packit |
67cb25 |
Starting from the initial vector :math:`x = p_0`, the algorithm
|
|
Packit |
67cb25 |
constructs an additional :math:`n` vectors :math:`p_i`
|
|
Packit |
67cb25 |
using the step size vector :math:`s = step\_size`
|
|
Packit |
67cb25 |
as follows:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
p_0 & = (x_0, x_1, \cdots , x_n) \\
|
|
Packit |
67cb25 |
p_1 & = (x_0 + s_0, x_1, \cdots , x_n) \\
|
|
Packit |
67cb25 |
p_2 & = (x_0, x_1 + s_1, \cdots , x_n) \\
|
|
Packit |
67cb25 |
\dots &= \dots \\
|
|
Packit |
67cb25 |
p_n & = (x_0, x_1, \cdots , x_n + s_n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
p_0 = (x_0, x_1, ... , x_n)
|
|
Packit |
67cb25 |
p_1 = (x_0 + s_0, x_1, ... , x_n)
|
|
Packit |
67cb25 |
p_2 = (x_0, x_1 + s_1, ... , x_n)
|
|
Packit |
67cb25 |
... = ...
|
|
Packit |
67cb25 |
p_n = (x_0, x_1, ... , x_n + s_n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These vectors form the :math:`n+1` vertices of a simplex in :math:`n`
|
|
Packit |
67cb25 |
dimensions. On each iteration the algorithm uses simple geometrical
|
|
Packit |
67cb25 |
transformations to update the vector corresponding to the highest
|
|
Packit |
67cb25 |
function value. The geometric transformations are reflection,
|
|
Packit |
67cb25 |
reflection followed by expansion, contraction and multiple
|
|
Packit |
67cb25 |
contraction. Using these transformations the simplex moves through
|
|
Packit |
67cb25 |
the space towards the minimum, where it contracts itself.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
After each iteration, the best vertex is returned. Note, that due to
|
|
Packit |
67cb25 |
the nature of the algorithm not every step improves the current
|
|
Packit |
67cb25 |
best parameter vector. Usually several iterations are required.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The minimizer-specific characteristic size is calculated as the
|
|
Packit |
67cb25 |
average distance from the geometrical center of the simplex to all its
|
|
Packit |
67cb25 |
vertices. This size can be used as a stopping criteria, as the
|
|
Packit |
67cb25 |
simplex contracts itself near the minimum. The size is returned by the
|
|
Packit |
67cb25 |
function :func:`gsl_multimin_fminimizer_size`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The :type:`gsl_multimin_fminimizer_nmsimplex2` version of this minimiser is
|
|
Packit |
67cb25 |
a new :math:`O(N)` operations
|
|
Packit |
67cb25 |
implementation of the earlier :math:`O(N^2)` operations
|
|
Packit |
67cb25 |
:type:`gsl_multimin_fminimizer_nmsimplex`
|
|
Packit |
67cb25 |
minimiser. It uses the same underlying algorithm, but the simplex
|
|
Packit |
67cb25 |
updates are computed more efficiently for high-dimensional problems.
|
|
Packit |
67cb25 |
In addition, the size of simplex is calculated as the RMS
|
|
Packit |
67cb25 |
distance of each vertex from the center rather than the mean distance,
|
|
Packit |
67cb25 |
allowing a linear update of this quantity on each step. The memory usage is
|
|
Packit |
67cb25 |
:math:`O(N^2)` for both algorithms.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multimin_fminimizer_nmsimplex2rand
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This method is a variant of :type:`gsl_multimin_fminimizer_nmsimplex2` which initialises the
|
|
Packit |
67cb25 |
simplex around the starting point :data:`x` using a randomly-oriented
|
|
Packit |
67cb25 |
set of basis vectors instead of the fixed coordinate axes. The
|
|
Packit |
67cb25 |
final dimensions of the simplex are scaled along the coordinate axes by the
|
|
Packit |
67cb25 |
vector :data:`step_size`. The randomisation uses a simple deterministic
|
|
Packit |
67cb25 |
generator so that repeated calls to :func:`gsl_multimin_fminimizer_set` for
|
|
Packit |
67cb25 |
a given solver object will vary the orientation in a well-defined way.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Examples
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This example program finds the minimum of the :ref:`paraboloid function <multimin-paraboloid>`
|
|
Packit |
67cb25 |
defined earlier. The location of the minimum is offset from the origin
|
|
Packit |
67cb25 |
in :math:`x` and :math:`y`, and the function value at the minimum is
|
|
Packit |
67cb25 |
non-zero. The main program is given below, it requires the example
|
|
Packit |
67cb25 |
function given earlier in this chapter.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/multimin.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The initial step-size is chosen as 0.01, a conservative estimate in this
|
|
Packit |
67cb25 |
case, and the line minimization parameter is set at 0.0001. The program
|
|
Packit |
67cb25 |
terminates when the norm of the gradient has been reduced below
|
|
Packit |
67cb25 |
0.001. The output of the program is shown below,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/multimin.txt
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Note that the algorithm gradually increases the step size as it
|
|
Packit |
67cb25 |
successfully moves downhill, as can be seen by plotting the successive
|
|
Packit |
67cb25 |
points in :numref:`fig-multimin`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig-multimin:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/multimin.png
|
|
Packit |
67cb25 |
:scale: 60%
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Function contours with path taken by minimization algorithm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The conjugate gradient algorithm finds the minimum on its second
|
|
Packit |
67cb25 |
direction because the function is purely quadratic. Additional
|
|
Packit |
67cb25 |
iterations would be needed for a more complicated function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Here is another example using the Nelder-Mead Simplex algorithm to
|
|
Packit |
67cb25 |
minimize the same example object function, as above.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nmsimplex.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The minimum search stops when the Simplex size drops to 0.01. The output is
|
|
Packit |
67cb25 |
shown below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nmsimplex.txt
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The simplex size first increases, while the simplex moves towards the
|
|
Packit |
67cb25 |
minimum. After a while the size begins to decrease as the simplex
|
|
Packit |
67cb25 |
contracts around the minimum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
References and Further Reading
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The conjugate gradient and BFGS methods are described in detail in the
|
|
Packit |
67cb25 |
following book,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* R. Fletcher,
|
|
Packit |
67cb25 |
*Practical Methods of Optimization (Second Edition)* Wiley
|
|
Packit |
67cb25 |
(1987), ISBN 0471915475.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A brief description of multidimensional minimization algorithms and
|
|
Packit |
67cb25 |
more recent references can be found in,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* C.W. Ueberhuber,
|
|
Packit |
67cb25 |
*Numerical Computation (Volume 2)*, Chapter 14, Section 4.4
|
|
Packit |
67cb25 |
"Minimization Methods", p.: 325--335, Springer (1997), ISBN
|
|
Packit |
67cb25 |
3-540-62057-5.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The simplex algorithm is described in the following paper,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* J.A. Nelder and R. Mead,
|
|
Packit |
67cb25 |
*A simplex method for function minimization*, Computer Journal
|
|
Packit |
67cb25 |
vol.: 7 (1965), 308--313.
|