|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: nonlinear least squares
|
|
Packit |
67cb25 |
single: least squares, nonlinear
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
*******************************
|
|
Packit |
67cb25 |
Nonlinear Least-Squares Fitting
|
|
Packit |
67cb25 |
*******************************
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: include.rst
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This chapter describes functions for multidimensional nonlinear
|
|
Packit |
67cb25 |
least-squares fitting. There are generally two classes of
|
|
Packit |
67cb25 |
algorithms for solving nonlinear least squares problems, which
|
|
Packit |
67cb25 |
fall under line search methods and trust region methods.
|
|
Packit |
67cb25 |
GSL currently implements only trust region methods and
|
|
Packit |
67cb25 |
provides the user with
|
|
Packit |
67cb25 |
full access to intermediate steps of the iteration. The user
|
|
Packit |
67cb25 |
also has the ability to tune a number of parameters which affect
|
|
Packit |
67cb25 |
low-level aspects of the algorithm which can help to accelerate
|
|
Packit |
67cb25 |
convergence for the specific problem at hand. GSL provides
|
|
Packit |
67cb25 |
two separate interfaces for nonlinear least squares fitting. The
|
|
Packit |
67cb25 |
first is designed for small to moderate sized problems, and the
|
|
Packit |
67cb25 |
second is designed for very large problems, which may or may not
|
|
Packit |
67cb25 |
have significant sparse structure.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The header file :file:`gsl_multifit_nlinear.h` contains prototypes for the
|
|
Packit |
67cb25 |
multidimensional nonlinear fitting functions and related declarations
|
|
Packit |
67cb25 |
relating to the small to moderate sized systems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The header file :file:`gsl_multilarge_nlinear.h` contains prototypes for the
|
|
Packit |
67cb25 |
multidimensional nonlinear fitting functions and related declarations
|
|
Packit |
67cb25 |
relating to large systems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: nonlinear least squares, overview
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Overview
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The problem of multidimensional nonlinear least-squares fitting requires
|
|
Packit |
67cb25 |
the minimization of the squared residuals of :math:`n` functions,
|
|
Packit |
67cb25 |
:math:`f_i`, in :math:`p` parameters, :math:`x_i`,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= {1 \over 2} || f(x) ||^2 \\
|
|
Packit |
67cb25 |
&= {1 \over 2} \sum_{i=1}^{n} f_i (x_1, \dots, x_p)^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) = (1/2) || f(x) ||^2
|
|
Packit |
67cb25 |
= (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
In trust region methods, the objective (or cost) function :math:`\Phi(x)` is approximated
|
|
Packit |
67cb25 |
by a model function :math:`m_k(\delta)` in the vicinity of some point :math:`x_k`. The
|
|
Packit |
67cb25 |
model function is often simply a second order Taylor series expansion around the
|
|
Packit |
67cb25 |
point :math:`x_k`, ie:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \Phi(x_k + \delta) \approx m_k(\delta) = \Phi(x_k) + g_k^T \delta + {1 \over 2} \delta^T B_k \delta
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x_k + \delta) ~=~ m_k(\delta) = \Phi(x_k) + g_k^T \delta + 1/2 \delta^T B_k \delta
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`g_k = \nabla \Phi(x_k) = J^T f` is the gradient vector at the point :math:`x_k`,
|
|
Packit |
67cb25 |
:math:`B_k = \nabla^2 \Phi(x_k)` is the Hessian matrix at :math:`x_k`, or some
|
|
Packit |
67cb25 |
approximation to it, and :math:`J` is the :math:`n`-by-:math:`p` Jacobian matrix
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J_{ij} = \partial f_i / \partial x_j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_{ij} = d f_i / d x_j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
In order to find the next step :math:`\delta`, we minimize the model function
|
|
Packit |
67cb25 |
:math:`m_k(\delta)`, but search for solutions only within a region where
|
|
Packit |
67cb25 |
we trust that :math:`m_k(\delta)` is a good approximation to the objective
|
|
Packit |
67cb25 |
function :math:`\Phi(x_k + \delta)`. In other words,
|
|
Packit |
67cb25 |
we seek a solution of the trust region subproblem (TRS)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \min_{\delta \in R^p} m_k(\delta) = \Phi(x_k) + g_k^T \delta + {1 \over 2} \delta^T B_k \delta, \qquad\hbox{s.t.}\quad || D_k \delta || \le \Delta_k
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\min_(\delta \in R^p) m_k(\delta), s.t. || D_k \delta || <= \Delta_k
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`\Delta_k > 0` is the trust region radius and :math:`D_k` is
|
|
Packit |
67cb25 |
a scaling matrix. If :math:`D_k = I`, then the trust region is a ball
|
|
Packit |
67cb25 |
of radius :math:`\Delta_k` centered at :math:`x_k`. In some applications,
|
|
Packit |
67cb25 |
the parameter vector :math:`x` may have widely different scales. For
|
|
Packit |
67cb25 |
example, one parameter might be a temperature on the order of
|
|
Packit |
67cb25 |
:math:`10^3` K, while another might be a length on the order of
|
|
Packit |
67cb25 |
:math:`10^{-6}` m. In such cases, a spherical trust region may not
|
|
Packit |
67cb25 |
be the best choice, since if :math:`\Phi` changes rapidly along
|
|
Packit |
67cb25 |
directions with one scale, and more slowly along directions with
|
|
Packit |
67cb25 |
a different scale, the model function :math:`m_k` may be a poor
|
|
Packit |
67cb25 |
approximation to :math:`\Phi` along the rapidly changing directions.
|
|
Packit |
67cb25 |
In such problems, it may be best to use an elliptical trust region,
|
|
Packit |
67cb25 |
by setting :math:`D_k` to a diagonal matrix whose entries are designed
|
|
Packit |
67cb25 |
so that the scaled step :math:`D_k \delta` has entries of approximately the same
|
|
Packit |
67cb25 |
order of magnitude.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The trust region subproblem above normally amounts to solving a
|
|
Packit |
67cb25 |
linear least squares system (or multiple systems) for the step
|
|
Packit |
67cb25 |
:math:`\delta`. Once :math:`\delta` is computed, it is checked whether
|
|
Packit |
67cb25 |
or not it reduces the objective function :math:`\Phi(x)`. A useful
|
|
Packit |
67cb25 |
statistic for this is to look at the ratio
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \rho_k = { \Phi(x_k) - \Phi(x_k + \delta_k) \over m_k(0) - m_k(\delta_k) }
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\rho_k = ( \Phi(x_k) - \Phi(x_k + \delta_k) / ( m_k(0) - m_k(\delta_k) )
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where the numerator is the actual reduction of the objective function
|
|
Packit |
67cb25 |
due to the step :math:`\delta_k`, and the denominator is the predicted
|
|
Packit |
67cb25 |
reduction due to the model :math:`m_k`. If :math:`\rho_k` is negative,
|
|
Packit |
67cb25 |
it means that the step :math:`\delta_k` increased the objective function
|
|
Packit |
67cb25 |
and so it is rejected. If :math:`\rho_k` is positive,
|
|
Packit |
67cb25 |
then we have found a step which reduced the objective function and
|
|
Packit |
67cb25 |
it is accepted. Furthermore, if :math:`\rho_k` is close to 1,
|
|
Packit |
67cb25 |
then this indicates that the model function is a good approximation
|
|
Packit |
67cb25 |
to the objective function in the trust region, and so on the next
|
|
Packit |
67cb25 |
iteration the trust region is enlarged in order to take more ambitious
|
|
Packit |
67cb25 |
steps. When a step is rejected, the trust region is made smaller and
|
|
Packit |
67cb25 |
the TRS is solved again. An outline for the general trust region method
|
|
Packit |
67cb25 |
used by GSL can now be given.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
**Trust Region Algorithm**
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
1. Initialize: given :math:`x_0`, construct :math:`m_0(\delta)`, :math:`D_0` and :math:`\Delta_0 > 0`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
2. For k = 0, 1, 2, ...
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
a. If converged, then stop
|
|
Packit |
67cb25 |
b. Solve TRS for trial step :math:`\delta_k`
|
|
Packit |
67cb25 |
c. Evaluate trial step by computing :math:`\rho_k`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
1). if step is accepted, set :math:`x_{k+1} = x_k + \delta_k` and increase radius,
|
|
Packit |
67cb25 |
:math:`\Delta_{k+1} = \alpha \Delta_k`
|
|
Packit |
67cb25 |
2). if step is rejected, set :math:`x_{k+1} = x_k` and decrease radius,
|
|
Packit |
67cb25 |
:math:`\Delta_{k+1} = {\Delta_k \over \beta}`; goto 2(b)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
d. Construct :math:`m_{k+1}(\delta)` and :math:`D_{k+1}`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
GSL offers the user a number of different algorithms for solving the trust
|
|
Packit |
67cb25 |
region subproblem in 2(b), as well as different choices of scaling matrices
|
|
Packit |
67cb25 |
:math:`D_k` and different methods of updating the trust region radius
|
|
Packit |
67cb25 |
:math:`\Delta_k`. Therefore, while reasonable default methods are provided,
|
|
Packit |
67cb25 |
the user has a lot of control to fine-tune the various steps of the
|
|
Packit |
67cb25 |
algorithm for their specific problem.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Solving the Trust Region Subproblem (TRS)
|
|
Packit |
67cb25 |
=========================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Below we describe the methods available for solving the trust region
|
|
Packit |
67cb25 |
subproblem. The methods available provide either exact or approximate
|
|
Packit |
67cb25 |
solutions to the trust region subproblem. In all algorithms below,
|
|
Packit |
67cb25 |
the Hessian matrix :math:`B_k` is approximated as :math:`B_k \approx J_k^T J_k`,
|
|
Packit |
67cb25 |
where :math:`J_k = J(x_k)`. In all methods, the solution of the TRS
|
|
Packit |
67cb25 |
involves solving a linear least squares system involving the Jacobian
|
|
Packit |
67cb25 |
matrix. For small to moderate sized problems (:code:`gsl_multifit_nlinear` interface),
|
|
Packit |
67cb25 |
this is accomplished by factoring the full Jacobian matrix, which is provided
|
|
Packit |
67cb25 |
by the user, with the Cholesky, QR, or SVD decompositions. For large systems
|
|
Packit |
67cb25 |
(:code:`gsl_multilarge_nlinear` interface), the user has two choices. One
|
|
Packit |
67cb25 |
is to solve the system iteratively, without needing to store the full
|
|
Packit |
67cb25 |
Jacobian matrix in memory. With this method, the user must provide a routine
|
|
Packit |
67cb25 |
to calculate the matrix-vector products :math:`J u` or :math:`J^T u` for a given vector :math:`u`.
|
|
Packit |
67cb25 |
This iterative method is particularly useful for systems where the Jacobian has
|
|
Packit |
67cb25 |
sparse structure, since forming matrix-vector products can be done cheaply. The
|
|
Packit |
67cb25 |
second option for large systems involves forming the normal equations matrix
|
|
Packit |
67cb25 |
:math:`J^T J` and then factoring it using a Cholesky decomposition. The normal
|
|
Packit |
67cb25 |
equations matrix is :math:`p`-by-:math:`p`, typically much smaller than the full
|
|
Packit |
67cb25 |
:math:`n`-by-:math:`p` Jacobian, and can usually be stored in memory even if the full
|
|
Packit |
67cb25 |
Jacobian matrix cannot. This option is useful for large, dense systems, or if the
|
|
Packit |
67cb25 |
iterative method has difficulty converging.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Levenberg-Marquardt algorithm
|
|
Packit |
67cb25 |
single: nonlinear least squares, levenberg-marquardt
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Levenberg-Marquardt
|
|
Packit |
67cb25 |
-------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
There is a theorem which states that if :math:`\delta_k` is a solution
|
|
Packit |
67cb25 |
to the trust region subproblem given above, then there exists
|
|
Packit |
67cb25 |
:math:`\mu_k \ge 0` such that
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \left( B_k + \mu_k D_k^T D_k \right) \delta_k = -g_k
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
( B_k + \mu_k D_k^T D_k ) \delta_k = -g_k
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
with :math:`\mu_k (\Delta_k - ||D_k \delta_k||) = 0`. This
|
|
Packit |
67cb25 |
forms the basis of the Levenberg-Marquardt algorithm, which controls
|
|
Packit |
67cb25 |
the trust region size by adjusting the parameter :math:`\mu_k`
|
|
Packit |
67cb25 |
rather than the radius :math:`\Delta_k` directly. For each radius
|
|
Packit |
67cb25 |
:math:`\Delta_k`, there is a unique parameter :math:`\mu_k` which
|
|
Packit |
67cb25 |
solves the TRS, and they have an inverse relationship, so that large values of
|
|
Packit |
67cb25 |
:math:`\mu_k` correspond to smaller trust regions, while small
|
|
Packit |
67cb25 |
values of :math:`\mu_k` correspond to larger trust regions.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
With the approximation :math:`B_k \approx J_k^T J_k`, on each iteration,
|
|
Packit |
67cb25 |
in order to calculate the step :math:`\delta_k`,
|
|
Packit |
67cb25 |
the following linear least squares problem is solved:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
J_k \\
|
|
Packit |
67cb25 |
\sqrt{\mu_k} D_k
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
\delta_k =
|
|
Packit |
67cb25 |
-
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
f_k \cr
|
|
Packit |
67cb25 |
0
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[J_k; sqrt(mu_k) D_k] \delta_k = - [f_k; 0]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If the step :math:`\delta_k` is accepted, then
|
|
Packit |
67cb25 |
:math:`\mu_k` is decreased on the next iteration in order
|
|
Packit |
67cb25 |
to take a larger step, otherwise it is increased to take
|
|
Packit |
67cb25 |
a smaller step. The Levenberg-Marquardt algorithm provides
|
|
Packit |
67cb25 |
an exact solution of the trust region subproblem, but
|
|
Packit |
67cb25 |
typically has a higher computational cost per iteration
|
|
Packit |
67cb25 |
than the approximate methods discussed below, since it
|
|
Packit |
67cb25 |
may need to solve the least squares system above several
|
|
Packit |
67cb25 |
times for different values of :math:`\mu_k`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Levenberg-Marquardt algorithm, geodesic acceleration
|
|
Packit |
67cb25 |
single: nonlinear least squares, levenberg-marquardt, geodesic acceleration
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Levenberg-Marquardt with Geodesic Acceleration
|
|
Packit |
67cb25 |
----------------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This method applies a so-called geodesic acceleration correction to
|
|
Packit |
67cb25 |
the standard Levenberg-Marquardt step :math:`\delta_k` (Transtrum et al, 2011).
|
|
Packit |
67cb25 |
By interpreting :math:`\delta_k` as a first order step along a geodesic in the
|
|
Packit |
67cb25 |
model parameter space (ie: a velocity :math:`\delta_k = v_k`), the geodesic
|
|
Packit |
67cb25 |
acceleration :math:`a_k` is a second order correction along the
|
|
Packit |
67cb25 |
geodesic which is determined by solving the linear least squares system
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
J_k \\
|
|
Packit |
67cb25 |
\sqrt{\mu_k} D_k
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
a_k =
|
|
Packit |
67cb25 |
-
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
f_{vv}(x_k) \\
|
|
Packit |
67cb25 |
0
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[J_k; sqrt(mu_k) D_k] a_k = - [f_vv(x_k); 0]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`f_{vv}` is the second directional derivative of
|
|
Packit |
67cb25 |
the residual vector in the velocity direction :math:`v`,
|
|
Packit |
67cb25 |
:math:`f_{vv}(x) = D_v^2 f = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f(x)`,
|
|
Packit |
67cb25 |
where :math:`\alpha` and :math:`\beta` are summed over the :math:`p`
|
|
Packit |
67cb25 |
parameters. The new total step is then :math:`\delta_k' = v_k + {1 \over 2}a_k`.
|
|
Packit |
67cb25 |
The second order correction :math:`a_k` can be calculated with a modest additional
|
|
Packit |
67cb25 |
cost, and has been shown to dramatically reduce the number of iterations
|
|
Packit |
67cb25 |
(and expensive Jacobian evaluations) required to reach convergence on a variety
|
|
Packit |
67cb25 |
of different problems. In order to utilize the geodesic acceleration, the user must supply a
|
|
Packit |
67cb25 |
function which provides the second directional derivative vector
|
|
Packit |
67cb25 |
:math:`f_{vv}(x)`, or alternatively the library can use a finite
|
|
Packit |
67cb25 |
difference method to estimate this vector with one additional function
|
|
Packit |
67cb25 |
evaluation of :math:`f(x + h v)` where :math:`h` is a tunable step size
|
|
Packit |
67cb25 |
(see the :code:`h_fvv` parameter description).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: Dogleg algorithm
|
|
Packit |
67cb25 |
single: nonlinear least squares, dogleg
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Dogleg
|
|
Packit |
67cb25 |
------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is Powell's dogleg method, which finds an approximate
|
|
Packit |
67cb25 |
solution to the trust region subproblem, by restricting
|
|
Packit |
67cb25 |
its search to a piecewise linear "dogleg" path,
|
|
Packit |
67cb25 |
composed of the origin, the Cauchy point which represents
|
|
Packit |
67cb25 |
the model minimizer along the steepest descent direction,
|
|
Packit |
67cb25 |
and the Gauss-Newton point, which is the overall minimizer
|
|
Packit |
67cb25 |
of the unconstrained model. The Gauss-Newton step is calculated by
|
|
Packit |
67cb25 |
solving
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J_k \delta_{gn} = -f_k
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
which is the main computational task for each iteration,
|
|
Packit |
67cb25 |
but only needs to be performed once per iteration. If
|
|
Packit |
67cb25 |
the Gauss-Newton point is inside the trust region, it is
|
|
Packit |
67cb25 |
selected as the step. If it is outside, the method then
|
|
Packit |
67cb25 |
calculates the Cauchy point, which is located along the
|
|
Packit |
67cb25 |
gradient direction. If the Cauchy point is also outside
|
|
Packit |
67cb25 |
the trust region, the method assumes that it is still far
|
|
Packit |
67cb25 |
from the minimum and so proceeds along the gradient
|
|
Packit |
67cb25 |
direction, truncating the step at the trust region
|
|
Packit |
67cb25 |
boundary. If the Cauchy point is inside the trust region,
|
|
Packit |
67cb25 |
with the Gauss-Newton point outside, the method
|
|
Packit |
67cb25 |
uses a dogleg step, which is a linear combination of the
|
|
Packit |
67cb25 |
gradient direction and the Gauss-Newton direction, stopping at the trust
|
|
Packit |
67cb25 |
region boundary.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: double Dogleg algorithm
|
|
Packit |
67cb25 |
single: Dogleg algorithm, double
|
|
Packit |
67cb25 |
single: nonlinear least squares, double dogleg
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Double Dogleg
|
|
Packit |
67cb25 |
-------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This method is an improvement over the classical dogleg
|
|
Packit |
67cb25 |
algorithm, which attempts to include information about
|
|
Packit |
67cb25 |
the Gauss-Newton step while the iteration is still far from
|
|
Packit |
67cb25 |
the minimum. When the Cauchy point is inside the trust region
|
|
Packit |
67cb25 |
and the Gauss-Newton point is outside, the method computes
|
|
Packit |
67cb25 |
a scaled Gauss-Newton point and then takes a dogleg step
|
|
Packit |
67cb25 |
between the Cauchy point and the scaled Gauss-Newton point.
|
|
Packit |
67cb25 |
The scaling is calculated to ensure that the reduction
|
|
Packit |
67cb25 |
in the model :math:`m_k` is about the same as the reduction
|
|
Packit |
67cb25 |
provided by the Cauchy point.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Two Dimensional Subspace
|
|
Packit |
67cb25 |
------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The dogleg methods restrict the search for the TRS solution
|
|
Packit |
67cb25 |
to a 1D curve defined by the Cauchy and Gauss-Newton points.
|
|
Packit |
67cb25 |
An improvement to this is to search for a solution using
|
|
Packit |
67cb25 |
the full two dimensional subspace spanned by the Cauchy
|
|
Packit |
67cb25 |
and Gauss-Newton directions. The dogleg path is of course
|
|
Packit |
67cb25 |
inside this subspace, and so this method solves the TRS
|
|
Packit |
67cb25 |
at least as accurately as the dogleg methods. Since this
|
|
Packit |
67cb25 |
method searches a larger subspace for a solution, it can
|
|
Packit |
67cb25 |
converge more quickly than dogleg on some problems. Because
|
|
Packit |
67cb25 |
the subspace is only two dimensional, this method is
|
|
Packit |
67cb25 |
very efficient and the main computation per iteration is
|
|
Packit |
67cb25 |
to determine the Gauss-Newton point.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Steihaug-Toint Conjugate Gradient
|
|
Packit |
67cb25 |
---------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
One difficulty of the dogleg methods is calculating the
|
|
Packit |
67cb25 |
Gauss-Newton step when the Jacobian matrix is singular. The
|
|
Packit |
67cb25 |
Steihaug-Toint method also computes a generalized dogleg
|
|
Packit |
67cb25 |
step, but avoids solving for the Gauss-Newton step directly,
|
|
Packit |
67cb25 |
instead using an iterative conjugate gradient algorithm. This
|
|
Packit |
67cb25 |
method performs well at points where the Jacobian is singular,
|
|
Packit |
67cb25 |
and is also suitable for large-scale problems where factoring
|
|
Packit |
67cb25 |
the Jacobian matrix could be prohibitively expensive.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Weighted Nonlinear Least-Squares
|
|
Packit |
67cb25 |
================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Weighted nonlinear least-squares fitting minimizes the function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= {1 \over 2} || f ||_W^2 \\
|
|
Packit |
67cb25 |
&= {1 \over 2} \sum_{i=1}^{n} w_i f_i (x_1, \dots, x_p)^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) = (1/2) || f(x) ||_W^2
|
|
Packit |
67cb25 |
= (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`W = \diag(w_1,w_2,...,w_n)` is the weighting matrix,
|
|
Packit |
67cb25 |
and :math:`||f||_W^2 = f^T W f`.
|
|
Packit |
67cb25 |
The weights :math:`w_i` are commonly defined as :math:`w_i = 1/\sigma_i^2`,
|
|
Packit |
67cb25 |
where :math:`\sigma_i` is the error in the :math:`i`-th measurement.
|
|
Packit |
67cb25 |
A simple change of variables :math:`\tilde{f} = W^{1 \over 2} f` yields
|
|
Packit |
67cb25 |
:math:`\Phi(x) = {1 \over 2} ||\tilde{f}||^2`, which is in the
|
|
Packit |
67cb25 |
same form as the unweighted case. The user can either perform this
|
|
Packit |
67cb25 |
transform directly on their function residuals and Jacobian, or use
|
|
Packit |
67cb25 |
the :func:`gsl_multifit_nlinear_winit` interface which automatically
|
|
Packit |
67cb25 |
performs the correct scaling. To manually perform this transformation,
|
|
Packit |
67cb25 |
the residuals and Jacobian should be modified according to
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\tilde{f}_i & = \sqrt{w_i} f_i = {f_i \over \sigma_i} \\
|
|
Packit |
67cb25 |
\tilde{J}_{ij} & = \sqrt{w_i} { \partial f_i \over \partial x_j } = { 1 \over \sigma_i} { \partial f_i \over \partial x_j }
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
f~_i = f_i / \sigma_i
|
|
Packit |
67cb25 |
J~_ij = 1 / \sigma_i df_i/dx_j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For large systems, the user must perform their own weighting.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _sec_tunable-parameters:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Tunable Parameters
|
|
Packit |
67cb25 |
==================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The user can tune nearly all aspects of the iteration at allocation
|
|
Packit |
67cb25 |
time. For the :code:`gsl_multifit_nlinear` interface, the user may
|
|
Packit |
67cb25 |
modify the :type:`gsl_multifit_nlinear_parameters` structure, which is
|
|
Packit |
67cb25 |
defined as follows:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_parameters
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
typedef struct
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
const gsl_multifit_nlinear_trs *trs; /* trust region subproblem method */
|
|
Packit |
67cb25 |
const gsl_multifit_nlinear_scale *scale; /* scaling method */
|
|
Packit |
67cb25 |
const gsl_multifit_nlinear_solver *solver; /* solver method */
|
|
Packit |
67cb25 |
gsl_multifit_nlinear_fdtype fdtype; /* finite difference method */
|
|
Packit |
67cb25 |
double factor_up; /* factor for increasing trust radius */
|
|
Packit |
67cb25 |
double factor_down; /* factor for decreasing trust radius */
|
|
Packit |
67cb25 |
double avmax; /* max allowed |a|/|v| */
|
|
Packit |
67cb25 |
double h_df; /* step size for finite difference Jacobian */
|
|
Packit |
67cb25 |
double h_fvv; /* step size for finite difference fvv */
|
|
Packit |
67cb25 |
} gsl_multifit_nlinear_parameters;
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For the :code:`gsl_multilarge_nlinear` interface, the user may
|
|
Packit |
67cb25 |
modify the :type:`gsl_multilarge_nlinear_parameters` structure, which is
|
|
Packit |
67cb25 |
defined as follows:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multilarge_nlinear_parameters
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
typedef struct
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
const gsl_multilarge_nlinear_trs *trs; /* trust region subproblem method */
|
|
Packit |
67cb25 |
const gsl_multilarge_nlinear_scale *scale; /* scaling method */
|
|
Packit |
67cb25 |
const gsl_multilarge_nlinear_solver *solver; /* solver method */
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_fdtype fdtype; /* finite difference method */
|
|
Packit |
67cb25 |
double factor_up; /* factor for increasing trust radius */
|
|
Packit |
67cb25 |
double factor_down; /* factor for decreasing trust radius */
|
|
Packit |
67cb25 |
double avmax; /* max allowed |a|/|v| */
|
|
Packit |
67cb25 |
double h_df; /* step size for finite difference Jacobian */
|
|
Packit |
67cb25 |
double h_fvv; /* step size for finite difference fvv */
|
|
Packit |
67cb25 |
size_t max_iter; /* maximum iterations for trs method */
|
|
Packit |
67cb25 |
double tol; /* tolerance for solving trs */
|
|
Packit |
67cb25 |
} gsl_multilarge_nlinear_parameters;
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Each of these parameters is discussed in further detail below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_trs
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_trs
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameter :data:`trs` determines the method used to solve the trust region
|
|
Packit |
67cb25 |
subproblem, and may be selected from the following choices,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_trs_lm
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_trs_lm
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This selects the Levenberg-Marquardt algorithm.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_trs_lmaccel
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_trs_lmaccel
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This selects the Levenberg-Marquardt algorithm with geodesic
|
|
Packit |
67cb25 |
acceleration.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_trs_dogleg
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_trs_dogleg
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This selects the dogleg algorithm.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_trs_ddogleg
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_trs_ddogleg
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This selects the double dogleg algorithm.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_trs_subspace2D
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_trs_subspace2D
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This selects the 2D subspace algorithm.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multilarge_nlinear_trs_cgst
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This selects the Steihaug-Toint conjugate gradient algorithm. This
|
|
Packit |
67cb25 |
method is available only for large systems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_scale
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_scale
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameter :data:`scale` determines the diagonal scaling matrix :math:`D` and
|
|
Packit |
67cb25 |
may be selected from the following choices,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_scale_more
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_scale_more
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This damping strategy was suggested by |More|, and
|
|
Packit |
67cb25 |
corresponds to :math:`D^T D = \max(\diag(J^T J))`,
|
|
Packit |
67cb25 |
in other words the maximum elements of
|
|
Packit |
67cb25 |
:math:`\diag(J^T J)` encountered thus far in the iteration.
|
|
Packit |
67cb25 |
This choice of :math:`D` makes the problem scale-invariant,
|
|
Packit |
67cb25 |
so that if the model parameters :math:`x_i` are each scaled
|
|
Packit |
67cb25 |
by an arbitrary constant, :math:`\tilde{x}_i = a_i x_i`, then
|
|
Packit |
67cb25 |
the sequence of iterates produced by the algorithm would
|
|
Packit |
67cb25 |
be unchanged. This method can work very well in cases
|
|
Packit |
67cb25 |
where the model parameters have widely different scales
|
|
Packit |
67cb25 |
(ie: if some parameters are measured in nanometers, while others
|
|
Packit |
67cb25 |
are measured in degrees Kelvin). This strategy has been proven
|
|
Packit |
67cb25 |
effective on a large class of problems and so it is the library
|
|
Packit |
67cb25 |
default, but it may not be the best choice for all problems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_scale_levenberg
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_scale_levenberg
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This damping strategy was originally suggested by Levenberg, and
|
|
Packit |
67cb25 |
corresponds to :math:`D^T D = I`. This method has also proven
|
|
Packit |
67cb25 |
effective on a large class of problems, but is not scale-invariant.
|
|
Packit |
67cb25 |
However, some authors (e.g. Transtrum and Sethna 2012) argue
|
|
Packit |
67cb25 |
that this choice is better for problems which are susceptible
|
|
Packit |
67cb25 |
to parameter evaporation (ie: parameters go to infinity)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_scale_marquardt
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_scale_marquardt
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This damping strategy was suggested by Marquardt, and
|
|
Packit |
67cb25 |
corresponds to :math:`D^T D = \diag(J^T J)`. This
|
|
Packit |
67cb25 |
method is scale-invariant, but it is generally considered
|
|
Packit |
67cb25 |
inferior to both the Levenberg and |More| strategies, though
|
|
Packit |
67cb25 |
may work well on certain classes of problems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_solver
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_solver
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Solving the trust region subproblem on each iteration almost always
|
|
Packit |
67cb25 |
requires the solution of the following linear least squares system
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
J \\
|
|
Packit |
67cb25 |
\sqrt{\mu} D
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
\delta =
|
|
Packit |
67cb25 |
-
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
f \\
|
|
Packit |
67cb25 |
0
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[J; sqrt(mu) D] \delta = - [f; 0]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The :data:`solver` parameter determines how the system is
|
|
Packit |
67cb25 |
solved and can be selected from the following choices:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_solver_qr
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This method solves the system using a rank revealing QR
|
|
Packit |
67cb25 |
decomposition of the Jacobian :math:`J`. This method will
|
|
Packit |
67cb25 |
produce reliable solutions in cases where the Jacobian
|
|
Packit |
67cb25 |
is rank deficient or near-singular but does require about
|
|
Packit |
67cb25 |
twice as many operations as the Cholesky method discussed
|
|
Packit |
67cb25 |
below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_solver_cholesky
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_solver_cholesky
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This method solves the alternate normal equations problem
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \left( J^T J + \mu D^T D \right) \delta = -J^T f
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
( J^T J + \mu D^T D ) \delta = -J^T f
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
by using a Cholesky decomposition of the matrix
|
|
Packit |
67cb25 |
:math:`J^T J + \mu D^T D`. This method is faster than the
|
|
Packit |
67cb25 |
QR approach, however it is susceptible to numerical instabilities
|
|
Packit |
67cb25 |
if the Jacobian matrix is rank deficient or near-singular. In
|
|
Packit |
67cb25 |
these cases, an attempt is made to reduce the condition number
|
|
Packit |
67cb25 |
of the matrix using Jacobi preconditioning, but for highly
|
|
Packit |
67cb25 |
ill-conditioned problems the QR approach is better. If it is
|
|
Packit |
67cb25 |
known that the Jacobian matrix is well conditioned, this method
|
|
Packit |
67cb25 |
is accurate and will perform faster than the QR approach.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_solver_svd
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This method solves the system using a singular value
|
|
Packit |
67cb25 |
decomposition of the Jacobian :math:`J`. This method will
|
|
Packit |
67cb25 |
produce the most reliable solutions for ill-conditioned Jacobians
|
|
Packit |
67cb25 |
but is also the slowest solver method.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_fdtype
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameter :data:`fdtype` specifies whether to use forward or centered
|
|
Packit |
67cb25 |
differences when approximating the Jacobian. This is only
|
|
Packit |
67cb25 |
used when an analytic Jacobian is not provided to the solver.
|
|
Packit |
67cb25 |
This parameter may be set to one of the following choices.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. macro:: GSL_MULTIFIT_NLINEAR_FWDIFF
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies a forward finite difference to approximate
|
|
Packit |
67cb25 |
the Jacobian matrix. The Jacobian matrix will be calculated as
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J_{ij} = {1 \over \Delta_j} \left( f_i(x + \Delta_j e_j) - f_i(x) \right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_ij = 1 / \Delta_j ( f_i(x + \Delta_j e_j) - f_i(x) )
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`\Delta_j = h |x_j|` and :math:`e_j` is the standard
|
|
Packit |
67cb25 |
:math:`j`-th Cartesian unit basis vector so that
|
|
Packit |
67cb25 |
:math:`x + \Delta_j e_j` represents a small (forward) perturbation of
|
|
Packit |
67cb25 |
the :math:`j`-th parameter by an amount :math:`\Delta_j`. The perturbation
|
|
Packit |
67cb25 |
:math:`\Delta_j` is proportional to the current value :math:`|x_j|` which
|
|
Packit |
67cb25 |
helps to calculate an accurate Jacobian when the various parameters have
|
|
Packit |
67cb25 |
different scale sizes. The value of :math:`h` is specified by the :code:`h_df`
|
|
Packit |
67cb25 |
parameter. The accuracy of this method is :math:`O(h)`, and evaluating this
|
|
Packit |
67cb25 |
matrix requires an additional :math:`p` function evaluations.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. macro:: GSL_MULTIFIT_NLINEAR_CTRDIFF
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies a centered finite difference to approximate
|
|
Packit |
67cb25 |
the Jacobian matrix. The Jacobian matrix will be calculated as
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J_{ij} = {1 \over \Delta_j} \left( f_i(x + {1 \over 2} \Delta_j e_j) - f_i(x - {1 \over 2} \Delta_j e_j) \right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_ij = 1 / \Delta_j ( f_i(x + 1/2 \Delta_j e_j) - f_i(x - 1/2 \Delta_j e_j) )
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
See above for a description of :math:`\Delta_j`. The accuracy of this
|
|
Packit |
67cb25 |
method is :math:`O(h^2)`, but evaluating this
|
|
Packit |
67cb25 |
matrix requires an additional :math:`2p` function evaluations.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double factor_up`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When a step is accepted, the trust region radius will be increased
|
|
Packit |
67cb25 |
by this factor. The default value is :math:`3`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double factor_down`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When a step is rejected, the trust region radius will be decreased
|
|
Packit |
67cb25 |
by this factor. The default value is :math:`2`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double avmax`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When using geodesic acceleration to solve a nonlinear least squares problem,
|
|
Packit |
67cb25 |
an important parameter to monitor is the ratio of the acceleration term
|
|
Packit |
67cb25 |
to the velocity term,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: { ||a|| \over ||v|| }
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
|a| / |v|
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If this ratio is small, it means the acceleration correction
|
|
Packit |
67cb25 |
is contributing very little to the step. This could be because
|
|
Packit |
67cb25 |
the problem is not "nonlinear" enough to benefit from
|
|
Packit |
67cb25 |
the acceleration. If the ratio is large (:math:`> 1`) it
|
|
Packit |
67cb25 |
means that the acceleration is larger than the velocity,
|
|
Packit |
67cb25 |
which shouldn't happen since the step represents a truncated
|
|
Packit |
67cb25 |
series and so the second order term :math:`a` should be smaller than
|
|
Packit |
67cb25 |
the first order term :math:`v` to guarantee convergence.
|
|
Packit |
67cb25 |
Therefore any steps with a ratio larger than the parameter
|
|
Packit |
67cb25 |
:data:`avmax` are rejected. :data:`avmax` is set to 0.75 by default.
|
|
Packit |
67cb25 |
For problems which experience difficulty converging, this threshold
|
|
Packit |
67cb25 |
could be lowered.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double h_df`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This parameter specifies the step size for approximating the
|
|
Packit |
67cb25 |
Jacobian matrix with finite differences. It is set to
|
|
Packit |
67cb25 |
:math:`\sqrt{\epsilon}` by default, where :math:`\epsilon`
|
|
Packit |
67cb25 |
is :macro:`GSL_DBL_EPSILON`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double h_fvv`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When using geodesic acceleration, the user must either supply
|
|
Packit |
67cb25 |
a function to calculate :math:`f_{vv}(x)` or the library
|
|
Packit |
67cb25 |
can estimate this second directional derivative using a finite
|
|
Packit |
67cb25 |
difference method. When using finite differences, the library
|
|
Packit |
67cb25 |
must calculate :math:`f(x + h v)` where :math:`h` represents
|
|
Packit |
67cb25 |
a small step in the velocity direction. The parameter
|
|
Packit |
67cb25 |
:data:`h_fvv` defines this step size and is set to 0.02 by
|
|
Packit |
67cb25 |
default.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Initializing the Solver
|
|
Packit |
67cb25 |
=======================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_type
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This structure specifies the type of algorithm which will be used
|
|
Packit |
67cb25 |
to solve a nonlinear least squares problem. It may be selected from the
|
|
Packit |
67cb25 |
following choices,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_nlinear_trust
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies a trust region method. It is currently the only implemented
|
|
Packit |
67cb25 |
nonlinear least squares method.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multifit_nlinear_workspace * gsl_multifit_nlinear_alloc (const gsl_multifit_nlinear_type * T, const gsl_multifit_nlinear_parameters * params, const size_t n, const size_t p)
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_workspace * gsl_multilarge_nlinear_alloc (const gsl_multilarge_nlinear_type * T, const gsl_multilarge_nlinear_parameters * params, const size_t n, const size_t p)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return a pointer to a newly allocated instance of a
|
|
Packit |
67cb25 |
derivative solver of type :data:`T` for :data:`n` observations and :data:`p`
|
|
Packit |
67cb25 |
parameters. The :data:`params` input specifies a tunable set of
|
|
Packit |
67cb25 |
parameters which will affect important details in each iteration
|
|
Packit |
67cb25 |
of the trust region subproblem algorithm. It is recommended to start
|
|
Packit |
67cb25 |
with the suggested default parameters (see
|
|
Packit |
67cb25 |
:func:`gsl_multifit_nlinear_default_parameters` and
|
|
Packit |
67cb25 |
:func:`gsl_multilarge_nlinear_default_parameters`) and then tune
|
|
Packit |
67cb25 |
the parameters once the code is working correctly. See
|
|
Packit |
67cb25 |
:ref:`sec_tunable-parameters`.
|
|
Packit |
67cb25 |
for descriptions of the various parameters.
|
|
Packit |
67cb25 |
For example, the following code creates an instance of a
|
|
Packit |
67cb25 |
Levenberg-Marquardt solver for 100 data points and 3 parameters,
|
|
Packit |
67cb25 |
using suggested defaults::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
const gsl_multifit_nlinear_type * T = gsl_multifit_nlinear_trust;
|
|
Packit |
67cb25 |
gsl_multifit_nlinear_parameters params = gsl_multifit_nlinear_default_parameters();
|
|
Packit |
67cb25 |
gsl_multifit_nlinear_workspace * w = gsl_multifit_nlinear_alloc (T, ¶ms, 100, 3);
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The number of observations :data:`n` must be greater than or equal to
|
|
Packit |
67cb25 |
parameters :data:`p`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If there is insufficient memory to create the solver then the function
|
|
Packit |
67cb25 |
returns a null pointer and the error handler is invoked with an error
|
|
Packit |
67cb25 |
code of :macro:`GSL_ENOMEM`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multifit_nlinear_parameters gsl_multifit_nlinear_default_parameters (void)
|
|
Packit |
67cb25 |
gsl_multilarge_nlinear_parameters gsl_multilarge_nlinear_default_parameters (void)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return a set of recommended default parameters
|
|
Packit |
67cb25 |
for use in solving nonlinear least squares problems. The user
|
|
Packit |
67cb25 |
can tune each parameter to improve the performance on their
|
|
Packit |
67cb25 |
particular problem, see :ref:`sec_tunable-parameters`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_nlinear_init (const gsl_vector * x, gsl_multifit_nlinear_fdf * fdf, gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
int gsl_multifit_nlinear_winit (const gsl_vector * x, const gsl_vector * wts, gsl_multifit_nlinear_fdf * fdf, gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
int gsl_multilarge_nlinear_init (const gsl_vector * x, gsl_multilarge_nlinear_fdf * fdf, gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions initialize, or reinitialize, an existing workspace :data:`w`
|
|
Packit |
67cb25 |
to use the system :data:`fdf` and the initial guess
|
|
Packit |
67cb25 |
:data:`x`. See :ref:`sec_providing-function-minimized`
|
|
Packit |
67cb25 |
for a description of the :data:`fdf` structure.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Optionally, a weight vector :data:`wts` can be given to perform
|
|
Packit |
67cb25 |
a weighted nonlinear regression. Here, the weighting matrix is
|
|
Packit |
67cb25 |
:math:`W = \diag(w_1,w_2,...,w_n)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_multifit_nlinear_free (gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
void gsl_multilarge_nlinear_free (gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions free all the memory associated with the workspace :data:`w`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: const char * gsl_multifit_nlinear_name (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
const char * gsl_multilarge_nlinear_name (const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return a pointer to the name of the solver. For example::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
printf ("w is a '%s' solver\n", gsl_multifit_nlinear_name (w));
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
would print something like :code:`w is a 'trust-region' solver`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: const char * gsl_multifit_nlinear_trs_name (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
const char * gsl_multilarge_nlinear_trs_name (const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return a pointer to the name of the trust region subproblem
|
|
Packit |
67cb25 |
method. For example::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
printf ("w is a '%s' solver\n", gsl_multifit_nlinear_trs_name (w));
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
would print something like :code:`w is a 'levenberg-marquardt' solver`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _sec_providing-function-minimized:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Providing the Function to be Minimized
|
|
Packit |
67cb25 |
======================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The user must provide :math:`n` functions of :math:`p` variables for the
|
|
Packit |
67cb25 |
minimization algorithm to operate on. In order to allow for
|
|
Packit |
67cb25 |
arbitrary parameters the functions are defined by the following data
|
|
Packit |
67cb25 |
types:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_nlinear_fdf
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This data type defines a general system of functions with arbitrary parameters,
|
|
Packit |
67cb25 |
the corresponding Jacobian matrix of derivatives, and optionally the
|
|
Packit |
67cb25 |
second directional derivative of the functions for geodesic acceleration.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`int (* f) (const gsl_vector * x, void * params, gsl_vector * f)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function should store the :math:`n` components of the vector
|
|
Packit |
67cb25 |
:math:`f(x)` in :data:`f` for argument :data:`x` and arbitrary parameters :data:`params`,
|
|
Packit |
67cb25 |
returning an appropriate error code if the function cannot be computed.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`int (* df) (const gsl_vector * x, void * params, gsl_matrix * J)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function should store the :data:`n`-by-:data:`p` matrix result
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J_{ij} = \partial f_i(x) / \partial x_j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_ij = d f_i(x) / d x_j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
in :data:`J` for argument :data:`x`
|
|
Packit |
67cb25 |
and arbitrary parameters :data:`params`, returning an appropriate error code if the
|
|
Packit |
67cb25 |
matrix cannot be computed. If an analytic Jacobian is unavailable, or too expensive
|
|
Packit |
67cb25 |
to compute, this function pointer may be set to :code:`NULL`, in which
|
|
Packit |
67cb25 |
case the Jacobian will be internally computed using finite difference approximations
|
|
Packit |
67cb25 |
of the function :data:`f`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`int (* fvv) (const gsl_vector * x, const gsl_vector * v, void * params, gsl_vector * fvv)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When geodesic acceleration is enabled, this function should store the
|
|
Packit |
67cb25 |
:math:`n` components of the vector
|
|
Packit |
67cb25 |
:math:`f_{vv}(x) = \sum_{\alpha\beta} v_{\alpha} v_{\beta} {\partial \over \partial x_{\alpha}} {\partial \over \partial x_{\beta}} f(x)`,
|
|
Packit |
67cb25 |
representing second directional derivatives of the function to be minimized,
|
|
Packit |
67cb25 |
into the output :data:`fvv`. The parameter vector is provided in :data:`x` and
|
|
Packit |
67cb25 |
the velocity vector is provided in :data:`v`, both of which have :math:`p`
|
|
Packit |
67cb25 |
components. The arbitrary parameters are given in :data:`params`. If
|
|
Packit |
67cb25 |
analytic expressions for :math:`f_{vv}(x)` are unavailable or too difficult
|
|
Packit |
67cb25 |
to compute, this function pointer may be set to :code:`NULL`, in which case
|
|
Packit |
67cb25 |
:math:`f_{vv}(x)` will be computed internally using a finite difference
|
|
Packit |
67cb25 |
approximation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t n`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
the number of functions, i.e. the number of components of the
|
|
Packit |
67cb25 |
vector :data:`f`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t p`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
the number of independent variables, i.e. the number of components of
|
|
Packit |
67cb25 |
the vector :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`void * params`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
a pointer to the arbitrary parameters of the function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevalf`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
function evaluations and is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevaldf`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
Jacobian evaluations and is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevalfvv`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
:math:`f_{vv}(x)` evaluations and is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multilarge_nlinear_fdf
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This data type defines a general system of functions with arbitrary parameters,
|
|
Packit |
67cb25 |
a function to compute :math:`J u` or :math:`J^T u` for a given vector :math:`u`,
|
|
Packit |
67cb25 |
the normal equations matrix :math:`J^T J`,
|
|
Packit |
67cb25 |
and optionally the second directional derivative of the functions for geodesic acceleration.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`int (* f) (const gsl_vector * x, void * params, gsl_vector * f)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function should store the :math:`n` components of the vector
|
|
Packit |
67cb25 |
:math:`f(x)` in :data:`f` for argument :data:`x` and arbitrary parameters :data:`params`,
|
|
Packit |
67cb25 |
returning an appropriate error code if the function cannot be computed.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`int (* df) (CBLAS_TRANSPOSE_t TransJ, const gsl_vector * x, const gsl_vector * u, void * params, gsl_vector * v, gsl_matrix * JTJ)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If :data:`TransJ` is equal to :code:`CblasNoTrans`, then this function should
|
|
Packit |
67cb25 |
compute the matrix-vector product :math:`J u` and store the result in :data:`v`.
|
|
Packit |
67cb25 |
If :data:`TransJ` is equal to :code:`CblasTrans`, then this function should
|
|
Packit |
67cb25 |
compute the matrix-vector product :math:`J^T u` and store the result in :data:`v`.
|
|
Packit |
67cb25 |
Additionally, the normal equations matrix :math:`J^T J` should be stored in the
|
|
Packit |
67cb25 |
lower half of :data:`JTJ`. The input matrix :data:`JTJ` could be set to :code:`NULL`,
|
|
Packit |
67cb25 |
for example by iterative methods which do not require this matrix, so the user
|
|
Packit |
67cb25 |
should check for this prior to constructing the matrix.
|
|
Packit |
67cb25 |
The input :data:`params` contains the arbitrary parameters.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`int (* fvv) (const gsl_vector * x, const gsl_vector * v, void * params, gsl_vector * fvv)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When geodesic acceleration is enabled, this function should store the
|
|
Packit |
67cb25 |
:math:`n` components of the vector
|
|
Packit |
67cb25 |
:math:`f_{vv}(x) = \sum_{\alpha\beta} v_{\alpha} v_{\beta} {\partial \over \partial x_{\alpha}} {\partial \over \partial x_{\beta}} f(x)`,
|
|
Packit |
67cb25 |
representing second directional derivatives of the function to be minimized,
|
|
Packit |
67cb25 |
into the output :data:`fvv`. The parameter vector is provided in :data:`x` and
|
|
Packit |
67cb25 |
the velocity vector is provided in :data:`v`, both of which have :math:`p`
|
|
Packit |
67cb25 |
components. The arbitrary parameters are given in :data:`params`. If
|
|
Packit |
67cb25 |
analytic expressions for :math:`f_{vv}(x)` are unavailable or too difficult
|
|
Packit |
67cb25 |
to compute, this function pointer may be set to :code:`NULL`, in which case
|
|
Packit |
67cb25 |
:math:`f_{vv}(x)` will be computed internally using a finite difference
|
|
Packit |
67cb25 |
approximation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t n`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
the number of functions, i.e. the number of components of the
|
|
Packit |
67cb25 |
vector :data:`f`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t p`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
the number of independent variables, i.e. the number of components of
|
|
Packit |
67cb25 |
the vector :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`void * params`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
a pointer to the arbitrary parameters of the function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevalf`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
function evaluations and is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevaldfu`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
Jacobian matrix-vector evaluations (:math:`J u` or :math:`J^T u`) and
|
|
Packit |
67cb25 |
is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevaldf2`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
:math:`J^T J` evaluations and is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t nevalfvv`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This does not need to be set by the user. It counts the number of
|
|
Packit |
67cb25 |
:math:`f_{vv}(x)` evaluations and is initialized by the :code:`_init` function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Note that when fitting a non-linear model against experimental data,
|
|
Packit |
67cb25 |
the data is passed to the functions above using the
|
|
Packit |
67cb25 |
:data:`params` argument and the trial best-fit parameters through the
|
|
Packit |
67cb25 |
:data:`x` argument.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Iteration
|
|
Packit |
67cb25 |
=========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following functions drive the iteration of each algorithm. Each
|
|
Packit |
67cb25 |
function performs one iteration of the trust region method and updates
|
|
Packit |
67cb25 |
the state of the solver.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_nlinear_iterate (gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
int gsl_multilarge_nlinear_iterate (gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions perform a single iteration of the solver :data:`w`. If
|
|
Packit |
67cb25 |
the iteration encounters an unexpected problem then an error code will
|
|
Packit |
67cb25 |
be returned. The solver workspace maintains a current estimate of the
|
|
Packit |
67cb25 |
best-fit parameters at all times.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The solver workspace :data:`w` contains the following entries, which can
|
|
Packit |
67cb25 |
be used to track the progress of the solution:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`gsl_vector * x`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The current position, length :math:`p`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`gsl_vector * f`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The function residual vector at the current position :math:`f(x)`, length
|
|
Packit |
67cb25 |
:math:`n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`gsl_matrix * J`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The Jacobian matrix at the current position :math:`J(x)`, size
|
|
Packit |
67cb25 |
:math:`n`-by-:math:`p` (only for :code:`gsl_multifit_nlinear` interface).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`gsl_vector * dx`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The difference between the current position and the previous position,
|
|
Packit |
67cb25 |
i.e. the last step :math:`\delta`, taken as a vector, length :math:`p`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These quantities can be accessed with the following functions,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_vector * gsl_multifit_nlinear_position (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
gsl_vector * gsl_multilarge_nlinear_position (const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return the current position :math:`x` (i.e. best-fit
|
|
Packit |
67cb25 |
parameters) of the solver :data:`w`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_vector * gsl_multifit_nlinear_residual (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
gsl_vector * gsl_multilarge_nlinear_residual (const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return the current residual vector :math:`f(x)` of the
|
|
Packit |
67cb25 |
solver :data:`w`. For weighted systems, the residual vector includes the
|
|
Packit |
67cb25 |
weighting factor :math:`\sqrt{W}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_matrix * gsl_multifit_nlinear_jac (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns a pointer to the :math:`n`-by-:math:`p` Jacobian matrix for the
|
|
Packit |
67cb25 |
current iteration of the solver :data:`w`. This function is available only for the
|
|
Packit |
67cb25 |
:code:`gsl_multifit_nlinear` interface.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: size_t gsl_multifit_nlinear_niter (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
size_t gsl_multilarge_nlinear_niter (const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions return the number of iterations performed so far.
|
|
Packit |
67cb25 |
The iteration counter is updated on each call to the
|
|
Packit |
67cb25 |
:code:`_iterate` functions above, and reset to 0 in the
|
|
Packit |
67cb25 |
:code:`_init` functions.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_nlinear_rcond (double * rcond, const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
int gsl_multilarge_nlinear_rcond (double * rcond, const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function estimates the reciprocal condition number
|
|
Packit |
67cb25 |
of the Jacobian matrix at the current position :math:`x` and
|
|
Packit |
67cb25 |
stores it in :data:`rcond`. The computed value is only an estimate
|
|
Packit |
67cb25 |
to give the user a guideline as to the conditioning of their particular
|
|
Packit |
67cb25 |
problem. Its calculation is based on which factorization
|
|
Packit |
67cb25 |
method is used (Cholesky, QR, or SVD).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* For the Cholesky solver, the matrix :math:`J^T J` is factored at each
|
|
Packit |
67cb25 |
iteration. Therefore this function will estimate the 1-norm condition number
|
|
Packit |
67cb25 |
:math:`rcond^2 = 1/(||J^T J||_1 \cdot ||(J^T J)^{-1}||_1)`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* For the QR solver, :math:`J` is factored as :math:`J = Q R` at each
|
|
Packit |
67cb25 |
iteration. For simplicity, this function calculates the 1-norm conditioning of
|
|
Packit |
67cb25 |
only the :math:`R` factor, :math:`rcond = 1 / (||R||_1 \cdot ||R^{-1}||_1)`.
|
|
Packit |
67cb25 |
This can be computed efficiently since :math:`R` is upper triangular.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* For the SVD solver, in order to efficiently solve the trust region
|
|
Packit |
67cb25 |
subproblem, the matrix which is factored is :math:`J D^{-1}`, instead of
|
|
Packit |
67cb25 |
:math:`J` itself. The resulting singular values are used to provide
|
|
Packit |
67cb25 |
the 2-norm reciprocal condition number, as :math:`rcond = \sigma_{min} / \sigma_{max}`.
|
|
Packit |
67cb25 |
Note that when using |More| scaling, :math:`D \ne I` and the resulting
|
|
Packit |
67cb25 |
:data:`rcond` estimate may be significantly different from the true
|
|
Packit |
67cb25 |
:data:`rcond` of :math:`J` itself.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: double gsl_multifit_nlinear_avratio (const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
double gsl_multilarge_nlinear_avratio (const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the current ratio :math:`|a| / |v|` of the acceleration correction term to
|
|
Packit |
67cb25 |
the velocity step term. The acceleration term is computed only by the
|
|
Packit |
67cb25 |
:type:`gsl_multifit_nlinear_trs_lmaccel` and :type:`gsl_multilarge_nlinear_trs_lmaccel` methods, so
|
|
Packit |
67cb25 |
this ratio will be zero for other TRS methods.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: nonlinear fitting, stopping parameters, convergence
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Testing for Convergence
|
|
Packit |
67cb25 |
=======================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A minimization procedure should stop when one of the following conditions is
|
|
Packit |
67cb25 |
true:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* A minimum has been found to within the user-specified precision.
|
|
Packit |
67cb25 |
* A user-specified maximum number of iterations has been reached.
|
|
Packit |
67cb25 |
* An error has occurred.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The handling of these conditions is under user control. The functions
|
|
Packit |
67cb25 |
below allow the user to test the current estimate of the best-fit
|
|
Packit |
67cb25 |
parameters in several standard ways.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_nlinear_test (const double xtol, const double gtol, const double ftol, int * info, const gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
int gsl_multilarge_nlinear_test (const double xtol, const double gtol, const double ftol, int * info, const gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions test for convergence of the minimization method
|
|
Packit |
67cb25 |
using the following criteria:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Testing for a small step size relative to the current parameter vector
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: |\delta_i| \le xtol (|x_i| + xtol)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
for each :math:`0 <= i < p`. Each element of the step vector :math:`\delta`
|
|
Packit |
67cb25 |
is tested individually in case the different parameters have widely
|
|
Packit |
67cb25 |
different scales. Adding :data:`xtol` to :math:`|x_i|` helps the test avoid
|
|
Packit |
67cb25 |
breaking down in situations where the true solution value :math:`x_i = 0`.
|
|
Packit |
67cb25 |
If this test succeeds, :data:`info` is set to 1 and the function
|
|
Packit |
67cb25 |
returns :macro:`GSL_SUCCESS`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A general guideline for selecting the step tolerance is to choose
|
|
Packit |
67cb25 |
:math:`xtol = 10^{-d}` where :math:`d` is the number of accurate
|
|
Packit |
67cb25 |
decimal digits desired in the solution :math:`x`. See Dennis and
|
|
Packit |
67cb25 |
Schnabel for more information.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Testing for a small gradient (:math:`g = \nabla \Phi(x) = J^T f`)
|
|
Packit |
67cb25 |
indicating a local function minimum:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \max_i |g_i \times \max(x_i, 1)| \le gtol \times \max(\Phi(x), 1)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
||g||_inf <= gtol
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This expression tests whether the ratio
|
|
Packit |
67cb25 |
:math:`(\nabla \Phi)_i x_i / \Phi` is small. Testing this scaled gradient
|
|
Packit |
67cb25 |
is a better than :math:`\nabla \Phi` alone since it is a dimensionless
|
|
Packit |
67cb25 |
quantity and so independent of the scale of the problem. The
|
|
Packit |
67cb25 |
:code:`max` arguments help ensure the test doesn't break down in
|
|
Packit |
67cb25 |
regions where :math:`x_i` or :math:`\Phi(x)` are close to 0.
|
|
Packit |
67cb25 |
If this test succeeds, :data:`info` is set to 2 and the function
|
|
Packit |
67cb25 |
returns :macro:`GSL_SUCCESS`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A general guideline for choosing the gradient tolerance is to set
|
|
Packit |
67cb25 |
:code:`gtol = GSL_DBL_EPSILON^(1/3)`. See Dennis and Schnabel for
|
|
Packit |
67cb25 |
more information.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If none of the tests succeed, :data:`info` is set to 0 and the
|
|
Packit |
67cb25 |
function returns :macro:`GSL_CONTINUE`, indicating further iterations
|
|
Packit |
67cb25 |
are required.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
High Level Driver
|
|
Packit |
67cb25 |
=================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These routines provide a high level wrapper that combines the iteration
|
|
Packit |
67cb25 |
and convergence testing for easy use.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_nlinear_driver (const size_t maxiter, const double xtol, const double gtol, const double ftol, void (* callback)(const size_t iter, void * params, const gsl_multifit_linear_workspace * w), void * callback_params, int * info, gsl_multifit_nlinear_workspace * w)
|
|
Packit |
67cb25 |
int gsl_multilarge_nlinear_driver (const size_t maxiter, const double xtol, const double gtol, const double ftol, void (* callback)(const size_t iter, void * params, const gsl_multilarge_linear_workspace * w), void * callback_params, int * info, gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions iterate the nonlinear least squares solver :data:`w` for a
|
|
Packit |
67cb25 |
maximum of :data:`maxiter` iterations. After each iteration, the system is
|
|
Packit |
67cb25 |
tested for convergence with the error tolerances :data:`xtol`, :data:`gtol` and :data:`ftol`.
|
|
Packit |
67cb25 |
Additionally, the user may supply a callback function :data:`callback`
|
|
Packit |
67cb25 |
which is called after each iteration, so that the user may save or print
|
|
Packit |
67cb25 |
relevant quantities for each iteration. The parameter :data:`callback_params`
|
|
Packit |
67cb25 |
is passed to the :data:`callback` function. The parameters :data:`callback`
|
|
Packit |
67cb25 |
and :data:`callback_params` may be set to :code:`NULL` to disable this feature.
|
|
Packit |
67cb25 |
Upon successful convergence, the function returns :macro:`GSL_SUCCESS`
|
|
Packit |
67cb25 |
and sets :data:`info` to the reason for convergence (see
|
|
Packit |
67cb25 |
:func:`gsl_multifit_nlinear_test`). If the function has not
|
|
Packit |
67cb25 |
converged after :data:`maxiter` iterations, :macro:`GSL_EMAXITER` is
|
|
Packit |
67cb25 |
returned. In rare cases, during an iteration the algorithm may
|
|
Packit |
67cb25 |
be unable to find a new acceptable step :math:`\delta` to take. In
|
|
Packit |
67cb25 |
this case, :macro:`GSL_ENOPROG` is returned indicating no further
|
|
Packit |
67cb25 |
progress can be made. If your problem is having difficulty converging,
|
|
Packit |
67cb25 |
see :ref:`sec_nlinear-troubleshooting` for further guidance.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: best-fit parameters, covariance
|
|
Packit |
67cb25 |
single: least squares, covariance of best-fit parameters
|
|
Packit |
67cb25 |
single: covariance matrix, nonlinear fits
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Covariance matrix of best fit parameters
|
|
Packit |
67cb25 |
========================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_nlinear_covar (const gsl_matrix * J, const double epsrel, gsl_matrix * covar)
|
|
Packit |
67cb25 |
int gsl_multilarge_nlinear_covar (gsl_matrix * covar, gsl_multilarge_nlinear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the covariance matrix of best-fit parameters
|
|
Packit |
67cb25 |
using the Jacobian matrix :data:`J` and stores it in :data:`covar`.
|
|
Packit |
67cb25 |
The parameter :data:`epsrel` is used to remove linear-dependent columns
|
|
Packit |
67cb25 |
when :data:`J` is rank deficient.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The covariance matrix is given by,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: C = (J^T J)^{-1}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
or in the weighted case,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: C = (J^T W J)^{-1}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and is computed using the factored form of the Jacobian (Cholesky, QR, or SVD).
|
|
Packit |
67cb25 |
Any columns of :math:`R` which satisfy
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: |R_{kk}| \leq epsrel |R_{11}|
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
are considered linearly-dependent and are excluded from the covariance
|
|
Packit |
67cb25 |
matrix (the corresponding rows and columns of the covariance matrix are
|
|
Packit |
67cb25 |
set to zero).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If the minimisation uses the weighted least-squares function
|
|
Packit |
67cb25 |
:math:`f_i = (Y(x, t_i) - y_i) / \sigma_i` then the covariance
|
|
Packit |
67cb25 |
matrix above gives the statistical error on the best-fit parameters
|
|
Packit |
67cb25 |
resulting from the Gaussian errors :math:`\sigma_i` on
|
|
Packit |
67cb25 |
the underlying data :math:`y_i`. This can be verified from the relation
|
|
Packit |
67cb25 |
:math:`\delta f = J \delta c` and the fact that the fluctuations in :math:`f`
|
|
Packit |
67cb25 |
from the data :math:`y_i` are normalised by :math:`\sigma_i` and
|
|
Packit |
67cb25 |
so satisfy
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \langle \delta f \delta f^T \rangle = I
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
<\delta f \delta f^T> = I
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For an unweighted least-squares function :math:`f_i = (Y(x, t_i) - y_i)`
|
|
Packit |
67cb25 |
the covariance matrix above should be multiplied by the variance
|
|
Packit |
67cb25 |
of the residuals about the best-fit :math:`\sigma^2 = \sum (y_i - Y(x,t_i))^2 / (n-p)`
|
|
Packit |
67cb25 |
to give the variance-covariance
|
|
Packit |
67cb25 |
matrix :math:`\sigma^2 C`. This estimates the statistical error on the
|
|
Packit |
67cb25 |
best-fit parameters from the scatter of the underlying data.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For more information about covariance matrices see
|
|
Packit |
67cb25 |
:ref:`Linear Least-Squares Overview <sec_lls-overview>`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _sec_nlinear-troubleshooting:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Troubleshooting
|
|
Packit |
67cb25 |
===============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When developing a code to solve a nonlinear least squares problem,
|
|
Packit |
67cb25 |
here are a few considerations to keep in mind.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. The most common difficulty is the accurate implementation of the Jacobian
|
|
Packit |
67cb25 |
matrix. If the analytic Jacobian is not properly provided to the
|
|
Packit |
67cb25 |
solver, this can hinder and many times prevent convergence of the method.
|
|
Packit |
67cb25 |
When developing a new nonlinear least squares code, it often helps
|
|
Packit |
67cb25 |
to compare the program output with the internally computed finite
|
|
Packit |
67cb25 |
difference Jacobian and the user supplied analytic Jacobian. If there
|
|
Packit |
67cb25 |
is a large difference in coefficients, it is likely the analytic
|
|
Packit |
67cb25 |
Jacobian is incorrectly implemented.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. If your code is having difficulty converging, the next thing to
|
|
Packit |
67cb25 |
check is the starting point provided to the solver. The methods
|
|
Packit |
67cb25 |
of this chapter are local methods, meaning if you provide a starting
|
|
Packit |
67cb25 |
point far away from the true minimum, the method may converge to
|
|
Packit |
67cb25 |
a local minimum or not converge at all. Sometimes it is possible
|
|
Packit |
67cb25 |
to solve a linearized approximation to the nonlinear problem,
|
|
Packit |
67cb25 |
and use the linear solution as the starting point to the nonlinear
|
|
Packit |
67cb25 |
problem.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. If the various parameters of the coefficient vector :math:`x`
|
|
Packit |
67cb25 |
vary widely in magnitude, then the problem is said to be badly scaled.
|
|
Packit |
67cb25 |
The methods of this chapter do attempt to automatically rescale
|
|
Packit |
67cb25 |
the elements of :math:`x` to have roughly the same order of magnitude,
|
|
Packit |
67cb25 |
but in extreme cases this could still cause problems for convergence.
|
|
Packit |
67cb25 |
In these cases it is recommended for the user to scale their
|
|
Packit |
67cb25 |
parameter vector :math:`x` so that each parameter spans roughly the
|
|
Packit |
67cb25 |
same range, say :math:`[-1,1]`. The solution vector can be backscaled
|
|
Packit |
67cb25 |
to recover the original units of the problem.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Examples
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example programs demonstrate the nonlinear least
|
|
Packit |
67cb25 |
squares fitting capabilities.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Exponential Fitting Example
|
|
Packit |
67cb25 |
---------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example program fits a weighted exponential model with
|
|
Packit |
67cb25 |
background to experimental data, :math:`Y = A \exp(-\lambda t) + b`. The
|
|
Packit |
67cb25 |
first part of the program sets up the functions :func:`expb_f` and
|
|
Packit |
67cb25 |
:func:`expb_df` to calculate the model and its Jacobian. The appropriate
|
|
Packit |
67cb25 |
fitting function is given by,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: f_i = (A \exp(-\lambda t_i) + b) - y_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where we have chosen :math:`t_i = i T / (N - 1)`, where :math:`N` is the number
|
|
Packit |
67cb25 |
of data points fitted, so that :math:`t_i \in [0, T]`. The Jacobian matrix :math:`J` is
|
|
Packit |
67cb25 |
the derivative of these functions with respect to the three parameters
|
|
Packit |
67cb25 |
(:math:`A`, :math:`\lambda`, :math:`b`). It is given by,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J_{ij} = {\partial f_i \over \partial x_j}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_{ij} = d f_i / d x_j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`x_0 = A`, :math:`x_1 = \lambda` and :math:`x_2 = b`.
|
|
Packit |
67cb25 |
The :math:`i`-th row of the Jacobian is therefore
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_{i\cdot} =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{ccc}
|
|
Packit |
67cb25 |
\exp(-\lambda t_i) & -t_i A \exp(-\lambda t_i) & 1
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J(i,:) = [ \exp(-\lambda t_i) ; -t_i A \exp(-\lambda t_i) ; 1 ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The main part of the program sets up a Levenberg-Marquardt solver and
|
|
Packit |
67cb25 |
some simulated random data. The data uses the known parameters
|
|
Packit |
67cb25 |
(5.0,0.1,1.0) combined with Gaussian noise (standard deviation = 0.1)
|
|
Packit |
67cb25 |
with a maximum time :math:`T = 40` and :math:`N = 100` timesteps.
|
|
Packit |
67cb25 |
The initial guess for the parameters is
|
|
Packit |
67cb25 |
chosen as (1.0, 1.0, 0.0). The iteration terminates when the relative
|
|
Packit |
67cb25 |
change in x is smaller than :math:`10^{-8}`, or when the magnitude of
|
|
Packit |
67cb25 |
the gradient falls below :math:`10^{-8}`. Here are the results of running
|
|
Packit |
67cb25 |
the program::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
iter 0: A = 1.0000, lambda = 1.0000, b = 0.0000, cond(J) = inf, |f(x)| = 100.8779
|
|
Packit |
67cb25 |
iter 1: A = 1.2692, lambda = 0.3924, b = 0.0443, cond(J) = 69.5973, |f(x)| = 97.1734
|
|
Packit |
67cb25 |
iter 2: A = 1.6749, lambda = 0.1685, b = 0.1072, cond(J) = 29.5220, |f(x)| = 88.6636
|
|
Packit |
67cb25 |
iter 3: A = 2.5579, lambda = 0.0544, b = 0.2552, cond(J) = 22.9334, |f(x)| = 42.7765
|
|
Packit |
67cb25 |
iter 4: A = 3.0167, lambda = 0.0472, b = 0.3704, cond(J) = 120.2912, |f(x)| = 23.0102
|
|
Packit |
67cb25 |
iter 5: A = 3.3590, lambda = 0.0455, b = 0.4321, cond(J) = 266.5620, |f(x)| = 16.0680
|
|
Packit |
67cb25 |
iter 6: A = 3.6552, lambda = 0.0479, b = 0.4426, cond(J) = 343.8946, |f(x)| = 14.6421
|
|
Packit |
67cb25 |
iter 7: A = 3.9546, lambda = 0.0532, b = 0.4897, cond(J) = 301.4985, |f(x)| = 13.5266
|
|
Packit |
67cb25 |
iter 8: A = 4.1421, lambda = 0.0633, b = 0.6783, cond(J) = 203.9164, |f(x)| = 12.3149
|
|
Packit |
67cb25 |
iter 9: A = 4.3752, lambda = 0.0800, b = 0.9228, cond(J) = 158.2267, |f(x)| = 11.2475
|
|
Packit |
67cb25 |
iter 10: A = 4.6371, lambda = 0.0891, b = 0.9588, cond(J) = 136.6189, |f(x)| = 10.5457
|
|
Packit |
67cb25 |
iter 11: A = 4.7684, lambda = 0.0937, b = 0.9860, cond(J) = 125.4740, |f(x)| = 10.4753
|
|
Packit |
67cb25 |
iter 12: A = 4.7977, lambda = 0.0948, b = 0.9917, cond(J) = 120.1098, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
iter 13: A = 4.8006, lambda = 0.0949, b = 0.9924, cond(J) = 118.9113, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
iter 14: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7661, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
iter 15: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7550, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
iter 16: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
iter 17: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
iter 18: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, |f(x)| = 10.4723
|
|
Packit |
67cb25 |
summary from method 'trust-region/levenberg-marquardt'
|
|
Packit |
67cb25 |
number of iterations: 18
|
|
Packit |
67cb25 |
function evaluations: 25
|
|
Packit |
67cb25 |
Jacobian evaluations: 19
|
|
Packit |
67cb25 |
reason for stopping: small step size
|
|
Packit |
67cb25 |
initial |f(x)| = 100.877904
|
|
Packit |
67cb25 |
final |f(x)| = 10.472268
|
|
Packit |
67cb25 |
chisq/dof = 1.1306
|
|
Packit |
67cb25 |
A = 4.80085 +/- 0.17652
|
|
Packit |
67cb25 |
lambda = 0.09488 +/- 0.00527
|
|
Packit |
67cb25 |
b = 0.99249 +/- 0.04419
|
|
Packit |
67cb25 |
status = success
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The approximate values of the parameters are found correctly, and the
|
|
Packit |
67cb25 |
chi-squared value indicates a good fit (the chi-squared per degree of
|
|
Packit |
67cb25 |
freedom is approximately 1). In this case the errors on the parameters
|
|
Packit |
67cb25 |
can be estimated from the square roots of the diagonal elements of the
|
|
Packit |
67cb25 |
covariance matrix. If the chi-squared value shows a poor fit (i.e.
|
|
Packit |
67cb25 |
:math:`\chi^2/(n-p) \gg 1`
|
|
Packit |
67cb25 |
then the error estimates obtained from the
|
|
Packit |
67cb25 |
covariance matrix will be too small. In the example program the error estimates
|
|
Packit |
67cb25 |
are multiplied by :math:`\sqrt{\chi^2/(n-p)}`
|
|
Packit |
67cb25 |
in this case, a common way of increasing the
|
|
Packit |
67cb25 |
errors for a poor fit. Note that a poor fit will result from the use
|
|
Packit |
67cb25 |
of an inappropriate model, and the scaled error estimates may then
|
|
Packit |
67cb25 |
be outside the range of validity for Gaussian errors.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Additionally, we see that the condition number of :math:`J(x)` stays
|
|
Packit |
67cb25 |
reasonably small throughout the iteration. This indicates we could
|
|
Packit |
67cb25 |
safely switch to the Cholesky solver for speed improvement,
|
|
Packit |
67cb25 |
although this particular system is too small to really benefit.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:numref:`fig_fit-exp` shows the fitted curve with the original data.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_fit-exp:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/fit-exp.png
|
|
Packit |
67cb25 |
:scale: 60%
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Exponential fitted curve with data
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nlfit.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Geodesic Acceleration Example 1
|
|
Packit |
67cb25 |
-------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example program minimizes a modified Rosenbrock function,
|
|
Packit |
67cb25 |
which is characterized by a narrow canyon with steep walls. The
|
|
Packit |
67cb25 |
starting point is selected high on the canyon wall, so the solver
|
|
Packit |
67cb25 |
must first find the canyon bottom and then navigate to the minimum.
|
|
Packit |
67cb25 |
The problem is solved both with and without using geodesic acceleration
|
|
Packit |
67cb25 |
for comparison. The cost function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= {1 \over 2} (f_1^2 + f_2^2) \\
|
|
Packit |
67cb25 |
f_1 &= 100 \left( x_2 - x_1^2 \right) \\
|
|
Packit |
67cb25 |
f_2 &= 1 - x_1
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Phi(x) = 1/2 (f1^2 + f2^2)
|
|
Packit |
67cb25 |
f1 = 100 ( x2 - x1^2 )
|
|
Packit |
67cb25 |
f2 = 1 - x1
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The Jacobian matrix is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{cc}
|
|
Packit |
67cb25 |
{\partial f_1 \over \partial x_1} & {\partial f_1 \over \partial x_2} \\
|
|
Packit |
67cb25 |
{\partial f_2 \over \partial x_1} & {\partial f_2 \over \partial x_2}
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right) =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{cc}
|
|
Packit |
67cb25 |
-200 x_1 & 100 \\
|
|
Packit |
67cb25 |
-1 & 0
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J = [ -200*x1 100 ]
|
|
Packit |
67cb25 |
[ -1 0 ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
In order to use geodesic acceleration, the user must provide
|
|
Packit |
67cb25 |
the second directional derivative of each residual in the
|
|
Packit |
67cb25 |
velocity direction,
|
|
Packit |
67cb25 |
:math:`D_v^2 f_i = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f_i`.
|
|
Packit |
67cb25 |
The velocity vector :math:`v` is provided by the solver. For this example,
|
|
Packit |
67cb25 |
these derivatives are
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
f_{vv} =
|
|
Packit |
67cb25 |
D_v^2
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
f_1 \\
|
|
Packit |
67cb25 |
f_2
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right) =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
-200 v_1^2 \\
|
|
Packit |
67cb25 |
0
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
fvv = [ -200 v1^2 ]
|
|
Packit |
67cb25 |
[ 0 ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The solution of this minimization problem is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
x^{*} &=
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
1 \\
|
|
Packit |
67cb25 |
1
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right) \\
|
|
Packit |
67cb25 |
\Phi(x^{*}) &= 0
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
x* = [ 1 ; 1 ]
|
|
Packit |
67cb25 |
Phi(x*) = 0
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program output is shown below::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Solving system without acceleration ===
|
|
Packit |
67cb25 |
NITER = 53
|
|
Packit |
67cb25 |
NFEV = 56
|
|
Packit |
67cb25 |
NJEV = 54
|
|
Packit |
67cb25 |
NAEV = 0
|
|
Packit |
67cb25 |
initial cost = 2.250225000000e+04
|
|
Packit |
67cb25 |
final cost = 6.674986031430e-18
|
|
Packit |
67cb25 |
final x = (9.999999974165e-01, 9.999999948328e-01)
|
|
Packit |
67cb25 |
final cond(J) = 6.000096055094e+02
|
|
Packit |
67cb25 |
=== Solving system with acceleration ===
|
|
Packit |
67cb25 |
NITER = 15
|
|
Packit |
67cb25 |
NFEV = 17
|
|
Packit |
67cb25 |
NJEV = 16
|
|
Packit |
67cb25 |
NAEV = 16
|
|
Packit |
67cb25 |
initial cost = 2.250225000000e+04
|
|
Packit |
67cb25 |
final cost = 7.518932873279e-19
|
|
Packit |
67cb25 |
final x = (9.999999991329e-01, 9.999999982657e-01)
|
|
Packit |
67cb25 |
final cond(J) = 6.000097233278e+02
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_nlfit2:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/nlfit2.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Paths taken by solver for Rosenbrock function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
We can see that enabling geodesic acceleration requires less
|
|
Packit |
67cb25 |
than a third of the number of Jacobian evaluations in order to locate
|
|
Packit |
67cb25 |
the minimum. The path taken by both methods is shown in :numref:`fig_nlfit2`.
|
|
Packit |
67cb25 |
The contours show the cost function
|
|
Packit |
67cb25 |
:math:`\Phi(x_1,x_2)`. We see that both methods quickly
|
|
Packit |
67cb25 |
find the canyon bottom, but the geodesic acceleration method
|
|
Packit |
67cb25 |
navigates along the bottom to the solution with significantly
|
|
Packit |
67cb25 |
fewer iterations.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nlfit2.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Geodesic Acceleration Example 2
|
|
Packit |
67cb25 |
-------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example fits a set of data to a Gaussian model
|
|
Packit |
67cb25 |
using the Levenberg-Marquardt method with geodesic acceleration.
|
|
Packit |
67cb25 |
The cost function is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= {1 \over 2} \sum_i f_i^2 \\
|
|
Packit |
67cb25 |
f_i &= y_i - Y(a,b,c;t_i)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Phi(x) = 1/2 \sum_i f_i^2
|
|
Packit |
67cb25 |
f_i = y_i - Y(a,b,c;t_i)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`y_i` is the measured data point at time :math:`t_i`, and
|
|
Packit |
67cb25 |
the model is specified by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Y(a,b,c;t) = a \exp{
|
|
Packit |
67cb25 |
\left[
|
|
Packit |
67cb25 |
-{1 \over 2}
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
{ t - b \over c }
|
|
Packit |
67cb25 |
\right)^2
|
|
Packit |
67cb25 |
\right]
|
|
Packit |
67cb25 |
}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Y(a,b,c;t) = a exp(-1/2 ((t-b)/c)^2)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameters :math:`a,b,c` represent the amplitude, mean, and width of the Gaussian
|
|
Packit |
67cb25 |
respectively. The program below generates the :math:`y_i` data on :math:`[0,1]` using
|
|
Packit |
67cb25 |
the values :math:`a = 5`, :math:`b = 0.4`, :math:`c = 0.15` and adding random noise.
|
|
Packit |
67cb25 |
The :math:`i`-th row of the Jacobian is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J_{i,:} =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{ccc}
|
|
Packit |
67cb25 |
{\partial f_i \over \partial a} & {\partial f_i \over \partial b} & {\partial f_i \over \partial c}
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right) =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{ccc}
|
|
Packit |
67cb25 |
-e_i & -{a \over c} z_i e_i & -{a \over c} z_i^2 e_i
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J(i,:) = ( -e_i -(a/c)*z_i*e_i -(a/c)*z_i^2*e_i )
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
z_i &= { t_i - b \over c} \\
|
|
Packit |
67cb25 |
e_i &= \exp{\left( -{1 \over 2} z_i^2 \right)}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
z_i = (t_i - b) / c
|
|
Packit |
67cb25 |
e_i = \exp(-1/2 z_i^2)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
In order to use geodesic acceleration, we need the second directional derivative
|
|
Packit |
67cb25 |
of the residuals in the velocity direction,
|
|
Packit |
67cb25 |
:math:`D_v^2 f_i = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f_i`,
|
|
Packit |
67cb25 |
where :math:`v` is provided by the solver. To compute this, it is helpful to make a table of
|
|
Packit |
67cb25 |
all second derivatives of the residuals :math:`f_i` with respect to each combination of model parameters.
|
|
Packit |
67cb25 |
This table is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\begin{array}{cccc}
|
|
Packit |
67cb25 |
& {\partial \over \partial a} & {\partial \over \partial b} & {\partial \over \partial c} \cr
|
|
Packit |
67cb25 |
{\partial \over \partial a} & 0 & -{z_i \over c} e_i & -{z_i^2 \over c} e_i \cr
|
|
Packit |
67cb25 |
{\partial \over \partial b} & & {a \over c^2} \left( 1 - z_i^2 \right) e_i & {a \over c^2} z_i \left( 2 - z_i^2 \right) e_i \cr
|
|
Packit |
67cb25 |
{\partial \over \partial c} & & & {a \over c^2} z_i^2 \left( 3 - z_i^2 \right) e_i
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The lower half of the table is omitted since it is symmetric. Then, the second directional derivative
|
|
Packit |
67cb25 |
of :math:`f_i` is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: D_v^2 f_i = v_a^2 \partial_a^2 f_i + 2 v_a v_b \partial_a \partial_b f_i + 2 v_a v_c \partial_a \partial_c f_i + v_b^2 \partial_b^2 f_i + 2 v_b v_c \partial_b \partial_c f_i + v_c^2 \partial_c^2 f_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
D_v^2 f_i = v_a^2 (d/da)^2 f_i + 2 v_a v_b (d/da) (d/db) f_i + 2 v_a v_c (d/da) (d/dc) f_i + v_b^2 (d/db)^2 f_i + 2 v_b v_c (d/db) (d/dc) f_i + v_c^2 (d/dc)^2 f_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The factors of 2 come from the symmetry of the mixed second partial derivatives.
|
|
Packit |
67cb25 |
The iteration is started using the initial guess :math:`a = 1, b = 0, c = 1`.
|
|
Packit |
67cb25 |
The program output is shown below::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
iter 0: a = 1.0000, b = 0.0000, c = 1.0000, |a|/|v| = 0.0000 cond(J) = inf, |f(x)| = 35.4785
|
|
Packit |
67cb25 |
iter 1: a = 1.5708, b = 0.5321, c = 0.5219, |a|/|v| = 0.3093 cond(J) = 29.0443, |f(x)| = 31.1042
|
|
Packit |
67cb25 |
iter 2: a = 1.7387, b = 0.4040, c = 0.4568, |a|/|v| = 0.1199 cond(J) = 3.5256, |f(x)| = 28.7217
|
|
Packit |
67cb25 |
iter 3: a = 2.2340, b = 0.3829, c = 0.3053, |a|/|v| = 0.3308 cond(J) = 4.5121, |f(x)| = 23.8074
|
|
Packit |
67cb25 |
iter 4: a = 3.2275, b = 0.3952, c = 0.2243, |a|/|v| = 0.2784 cond(J) = 8.6499, |f(x)| = 15.6003
|
|
Packit |
67cb25 |
iter 5: a = 4.3347, b = 0.3974, c = 0.1752, |a|/|v| = 0.2029 cond(J) = 15.1732, |f(x)| = 7.5908
|
|
Packit |
67cb25 |
iter 6: a = 4.9352, b = 0.3992, c = 0.1536, |a|/|v| = 0.1001 cond(J) = 26.6621, |f(x)| = 4.8402
|
|
Packit |
67cb25 |
iter 7: a = 5.0716, b = 0.3994, c = 0.1498, |a|/|v| = 0.0166 cond(J) = 34.6922, |f(x)| = 4.7103
|
|
Packit |
67cb25 |
iter 8: a = 5.0828, b = 0.3994, c = 0.1495, |a|/|v| = 0.0012 cond(J) = 36.5422, |f(x)| = 4.7095
|
|
Packit |
67cb25 |
iter 9: a = 5.0831, b = 0.3994, c = 0.1495, |a|/|v| = 0.0000 cond(J) = 36.6929, |f(x)| = 4.7095
|
|
Packit |
67cb25 |
iter 10: a = 5.0831, b = 0.3994, c = 0.1495, |a|/|v| = 0.0000 cond(J) = 36.6975, |f(x)| = 4.7095
|
|
Packit |
67cb25 |
iter 11: a = 5.0831, b = 0.3994, c = 0.1495, |a|/|v| = 0.0000 cond(J) = 36.6976, |f(x)| = 4.7095
|
|
Packit |
67cb25 |
NITER = 11
|
|
Packit |
67cb25 |
NFEV = 18
|
|
Packit |
67cb25 |
NJEV = 12
|
|
Packit |
67cb25 |
NAEV = 17
|
|
Packit |
67cb25 |
initial cost = 1.258724737288e+03
|
|
Packit |
67cb25 |
final cost = 2.217977560180e+01
|
|
Packit |
67cb25 |
final x = (5.083101559156e+00, 3.994484109594e-01, 1.494898e-01)
|
|
Packit |
67cb25 |
final cond(J) = 3.669757713403e+01
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
We see the method converges after 11 iterations. For comparison the standard
|
|
Packit |
67cb25 |
Levenberg-Marquardt method requires 26 iterations and so the Gaussian fitting
|
|
Packit |
67cb25 |
problem benefits substantially from the geodesic acceleration correction. The
|
|
Packit |
67cb25 |
column marked :code:`|a|/|v|` above shows the ratio of the acceleration term
|
|
Packit |
67cb25 |
to the velocity term as the iteration progresses. Larger values of this
|
|
Packit |
67cb25 |
ratio indicate that the geodesic acceleration correction term is contributing
|
|
Packit |
67cb25 |
substantial information to the solver relative to the standard LM velocity step.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The data and fitted model are shown in :numref:`fig_nlfit2b`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_nlfit2b:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/nlfit2b.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Gaussian model fitted to data
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nlfit2b.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Comparing TRS Methods Example
|
|
Packit |
67cb25 |
-----------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following program compares all available nonlinear least squares
|
|
Packit |
67cb25 |
trust-region subproblem (TRS) methods on the Branin function, a common
|
|
Packit |
67cb25 |
optimization test problem. The cost function is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= {1 \over 2} (f_1^2 + f_2^2) \\
|
|
Packit |
67cb25 |
f_1 &= x_2 + a_1 x_1^2 + a_2 x_1 + a_3 \\
|
|
Packit |
67cb25 |
f_2 &= \sqrt{a_4} \sqrt{1 + (1 - a_5) \cos{x_1}}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= 1/2 (f_1^2 + f_2^2)
|
|
Packit |
67cb25 |
f_1 &= x_2 + a_1 x_1^2 + a_2 x_1 + a_3
|
|
Packit |
67cb25 |
f_2 &= sqrt(a_4) sqrt(1 + (1 - a_5) cos(x_1))
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
with :math:`a_1 = -{5.1 \over 4 \pi^2}, a_2 = {5 \over \pi}, a_3 = -6, a_4 = 10, a_5 = {1 \over 8\pi}`.
|
|
Packit |
67cb25 |
There are three minima of this function in the range
|
|
Packit |
67cb25 |
:math:`(x_1,x_2) \in [-5,15] \times [-5,15]`. The program
|
|
Packit |
67cb25 |
below uses the starting point :math:`(x_1,x_2) = (6,14.5)`
|
|
Packit |
67cb25 |
and calculates the solution with all available nonlinear
|
|
Packit |
67cb25 |
least squares TRS methods. The program output is shown below::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Method NITER NFEV NJEV Initial Cost Final cost Final cond(J) Final x
|
|
Packit |
67cb25 |
levenberg-marquardt 20 27 21 1.9874e+02 3.9789e-01 6.1399e+07 (-3.14e+00, 1.23e+01)
|
|
Packit |
67cb25 |
levenberg-marquardt+accel 27 36 28 1.9874e+02 3.9789e-01 1.4465e+07 (3.14e+00, 2.27e+00)
|
|
Packit |
67cb25 |
dogleg 23 64 23 1.9874e+02 3.9789e-01 5.0692e+08 (3.14e+00, 2.28e+00)
|
|
Packit |
67cb25 |
double-dogleg 24 69 24 1.9874e+02 3.9789e-01 3.4879e+07 (3.14e+00, 2.27e+00)
|
|
Packit |
67cb25 |
2D-subspace 23 54 24 1.9874e+02 3.9789e-01 2.5142e+07 (3.14e+00, 2.27e+00)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The first row of output above corresponds to standard Levenberg-Marquardt, while
|
|
Packit |
67cb25 |
the second row includes geodesic acceleration. We see that the standard LM method
|
|
Packit |
67cb25 |
converges to the minimum at :math:`(-\pi,12.275)` and also uses the least number
|
|
Packit |
67cb25 |
of iterations and Jacobian evaluations. All other methods converge to the minimum
|
|
Packit |
67cb25 |
:math:`(\pi,2.275)` and perform similarly in terms of number of Jacobian evaluations.
|
|
Packit |
67cb25 |
We see that :math:`J` is fairly ill-conditioned
|
|
Packit |
67cb25 |
at both minima, indicating that the QR (or SVD) solver is the best choice for this problem.
|
|
Packit |
67cb25 |
Since there are only two parameters in this optimization problem, we can easily
|
|
Packit |
67cb25 |
visualize the paths taken by each method, which are shown in :numref:`fig_nlfit3`.
|
|
Packit |
67cb25 |
The figure shows contours of the cost function :math:`\Phi(x_1,x_2)` which exhibits
|
|
Packit |
67cb25 |
three global minima in the range :math:`[-5,15] \times [-5,15]`. The paths taken
|
|
Packit |
67cb25 |
by each solver are shown as colored lines.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_nlfit3:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/nlfit3.png
|
|
Packit |
67cb25 |
:scale: 60%
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Paths taken for different TRS methods for the Branin function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nlfit3.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Large Nonlinear Least Squares Example
|
|
Packit |
67cb25 |
-------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following program illustrates the large nonlinear least
|
|
Packit |
67cb25 |
squares solvers on a system with significant sparse structure
|
|
Packit |
67cb25 |
in the Jacobian. The cost function is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= {1 \over 2} \sum_{i=1}^{p+1} f_i^2 \\
|
|
Packit |
67cb25 |
f_i &= \sqrt{\alpha} (x_i - 1), \quad 1 \le i \le p \\
|
|
Packit |
67cb25 |
f_{p+1} &= ||x||^2 - {1 \over 4}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\Phi(x) &= 1/2 \sum_{i=1}^{p+1} f_i^2
|
|
Packit |
67cb25 |
f_i &= \sqrt{\alpha} (x_i - 1), 1 \le i \le p
|
|
Packit |
67cb25 |
f_{p+1} &= ||x||^2 - 1/4
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
with :math:`\alpha = 10^{-5}`. The residual :math:`f_{p+1}` imposes a constraint on the :math:`p`
|
|
Packit |
67cb25 |
parameters :math:`x`, to ensure that :math:`||x||^2 \approx {1 \over 4}`.
|
|
Packit |
67cb25 |
The :math:`(p+1)`-by-:math:`p` Jacobian for this system is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J(x) =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
\sqrt{\alpha} I_p \\
|
|
Packit |
67cb25 |
2 x^T
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
J(x) = [ \sqrt{alpha} I_p; 2 x^T ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the normal equations matrix is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: J^T J = \alpha I_p + 4 x x^T
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Finally, the second directional derivative of :math:`f` for the
|
|
Packit |
67cb25 |
geodesic acceleration method is
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
f_{vv} = D_v^2 f =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
0 \\
|
|
Packit |
67cb25 |
2 ||v||^2
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
fvv = [ 0 ]
|
|
Packit |
67cb25 |
[ 2 ||v||^2 ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Since the upper :math:`p`-by-:math:`p` block of :math:`J` is diagonal,
|
|
Packit |
67cb25 |
this sparse structure should be exploited in the nonlinear solver.
|
|
Packit |
67cb25 |
For comparison, the following program solves the system for :math:`p = 2000`
|
|
Packit |
67cb25 |
using the dense direct Cholesky solver based on the normal equations matrix
|
|
Packit |
67cb25 |
:math:`J^T J`, as well as the iterative Steihaug-Toint solver, based on
|
|
Packit |
67cb25 |
sparse matrix-vector products :math:`J u` and :math:`J^T u`. The
|
|
Packit |
67cb25 |
program output is shown below::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Method NITER NFEV NJUEV NJTJEV NAEV Init Cost Final cost cond(J) Final |x|^2 Time (s)
|
|
Packit |
67cb25 |
levenberg-marquardt 25 31 26 26 0 7.1218e+18 1.9555e-02 447.50 2.5044e-01 46.28
|
|
Packit |
67cb25 |
levenberg-marquardt+accel 22 23 45 23 22 7.1218e+18 1.9555e-02 447.64 2.5044e-01 33.92
|
|
Packit |
67cb25 |
dogleg 37 87 36 36 0 7.1218e+18 1.9555e-02 447.59 2.5044e-01 56.05
|
|
Packit |
67cb25 |
double-dogleg 35 88 34 34 0 7.1218e+18 1.9555e-02 447.62 2.5044e-01 52.65
|
|
Packit |
67cb25 |
2D-subspace 37 88 36 36 0 7.1218e+18 1.9555e-02 447.71 2.5044e-01 59.75
|
|
Packit |
67cb25 |
steihaug-toint 35 88 345 0 0 7.1218e+18 1.9555e-02 inf 2.5044e-01 0.09
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The first five rows use methods based on factoring the dense :math:`J^T J` matrix
|
|
Packit |
67cb25 |
while the last row uses the iterative Steihaug-Toint method. While the number
|
|
Packit |
67cb25 |
of Jacobian matrix-vector products (NJUEV) is less for the dense methods, the added time
|
|
Packit |
67cb25 |
to construct and factor the :math:`J^T J` matrix (NJTJEV) results in a much larger runtime than the
|
|
Packit |
67cb25 |
iterative method (see last column).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/nlfit4.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
References and Further Reading
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following publications are relevant to the algorithms described
|
|
Packit |
67cb25 |
in this section,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* J.J. |More|, *The Levenberg-Marquardt Algorithm: Implementation and
|
|
Packit |
67cb25 |
Theory*, Lecture Notes in Mathematics, v630 (1978), ed G. Watson.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* H. B. Nielsen, "Damping Parameter in Marquardt's Method",
|
|
Packit |
67cb25 |
IMM Department of Mathematical Modeling, DTU, Tech. Report IMM-REP-1999-05
|
|
Packit |
67cb25 |
(1999).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* K. Madsen and H. B. Nielsen, "Introduction to Optimization and Data
|
|
Packit |
67cb25 |
Fitting", IMM Department of Mathematical Modeling, DTU, 2010.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained
|
|
Packit |
67cb25 |
Optimization and Nonlinear Equations, SIAM, 1996.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* M. K. Transtrum, B. B. Machta, and J. P. Sethna,
|
|
Packit |
67cb25 |
Geometry of nonlinear least squares with applications to sloppy models and optimization,
|
|
Packit |
67cb25 |
Phys. Rev. E 83, 036701, 2011.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* M. K. Transtrum and J. P. Sethna, Improvements to the Levenberg-Marquardt
|
|
Packit |
67cb25 |
algorithm for nonlinear least-squares minimization, arXiv:1201.5885, 2012.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* J.J. |More|, B.S. Garbow, K.E. Hillstrom, "Testing Unconstrained
|
|
Packit |
67cb25 |
Optimization Software", ACM Transactions on Mathematical Software, Vol
|
|
Packit |
67cb25 |
7, No 1 (1981), p 17--41.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* H. B. Nielsen, "UCTP Test Problems for Unconstrained Optimization",
|
|
Packit |
67cb25 |
IMM Department of Mathematical Modeling, DTU, Tech. Report IMM-REP-2000-17
|
|
Packit |
67cb25 |
(2000).
|