Blame doc/nls.rst

Packit 67cb25
.. index::
Packit 67cb25
   single: nonlinear least squares
Packit 67cb25
   single: least squares, nonlinear
Packit 67cb25
Packit 67cb25
*******************************
Packit 67cb25
Nonlinear Least-Squares Fitting
Packit 67cb25
*******************************
Packit 67cb25
Packit 67cb25
.. include:: include.rst
Packit 67cb25
Packit 67cb25
This chapter describes functions for multidimensional nonlinear
Packit 67cb25
least-squares fitting.  There are generally two classes of
Packit 67cb25
algorithms for solving nonlinear least squares problems, which
Packit 67cb25
fall under line search methods and trust region methods.
Packit 67cb25
GSL currently implements only trust region methods and
Packit 67cb25
provides the user with
Packit 67cb25
full access to intermediate steps of the iteration. The user
Packit 67cb25
also has the ability to tune a number of parameters which affect
Packit 67cb25
low-level aspects of the algorithm which can help to accelerate
Packit 67cb25
convergence for the specific problem at hand. GSL provides
Packit 67cb25
two separate interfaces for nonlinear least squares fitting. The
Packit 67cb25
first is designed for small to moderate sized problems, and the
Packit 67cb25
second is designed for very large problems, which may or may not
Packit 67cb25
have significant sparse structure.
Packit 67cb25
Packit 67cb25
The header file :file:`gsl_multifit_nlinear.h` contains prototypes for the
Packit 67cb25
multidimensional nonlinear fitting functions and related declarations
Packit 67cb25
relating to the small to moderate sized systems.
Packit 67cb25
Packit 67cb25
The header file :file:`gsl_multilarge_nlinear.h` contains prototypes for the
Packit 67cb25
multidimensional nonlinear fitting functions and related declarations
Packit 67cb25
relating to large systems.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: nonlinear least squares, overview
Packit 67cb25
Packit 67cb25
Overview
Packit 67cb25
========
Packit 67cb25
Packit 67cb25
The problem of multidimensional nonlinear least-squares fitting requires
Packit 67cb25
the minimization of the squared residuals of :math:`n` functions,
Packit 67cb25
:math:`f_i`, in :math:`p` parameters, :math:`x_i`,
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= {1 \over 2} || f(x) ||^2 \\
Packit 67cb25
              &= {1 \over 2} \sum_{i=1}^{n} f_i (x_1, \dots, x_p)^2
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \Phi(x) = (1/2) || f(x) ||^2
Packit 67cb25
              = (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2 
Packit 67cb25
Packit 67cb25
In trust region methods, the objective (or cost) function :math:`\Phi(x)` is approximated
Packit 67cb25
by a model function :math:`m_k(\delta)` in the vicinity of some point :math:`x_k`. The
Packit 67cb25
model function is often simply a second order Taylor series expansion around the
Packit 67cb25
point :math:`x_k`, ie:
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: \Phi(x_k + \delta) \approx m_k(\delta) = \Phi(x_k) + g_k^T \delta + {1 \over 2} \delta^T B_k \delta
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \Phi(x_k + \delta) ~=~ m_k(\delta) = \Phi(x_k) + g_k^T \delta + 1/2 \delta^T B_k \delta
Packit 67cb25
Packit 67cb25
where :math:`g_k = \nabla \Phi(x_k) = J^T f` is the gradient vector at the point :math:`x_k`,
Packit 67cb25
:math:`B_k = \nabla^2 \Phi(x_k)` is the Hessian matrix at :math:`x_k`, or some
Packit 67cb25
approximation to it, and :math:`J` is the :math:`n`-by-:math:`p` Jacobian matrix
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: J_{ij} = \partial f_i / \partial x_j
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      J_{ij} = d f_i / d x_j
Packit 67cb25
Packit 67cb25
In order to find the next step :math:`\delta`, we minimize the model function
Packit 67cb25
:math:`m_k(\delta)`, but search for solutions only within a region where
Packit 67cb25
we trust that :math:`m_k(\delta)` is a good approximation to the objective
Packit 67cb25
function :math:`\Phi(x_k + \delta)`. In other words,
Packit 67cb25
we seek a solution of the trust region subproblem (TRS)
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: \min_{\delta \in R^p} m_k(\delta) = \Phi(x_k) + g_k^T \delta + {1 \over 2} \delta^T B_k \delta, \qquad\hbox{s.t.}\quad || D_k \delta || \le \Delta_k
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \min_(\delta \in R^p) m_k(\delta), s.t. || D_k \delta || <= \Delta_k
Packit 67cb25
Packit 67cb25
where :math:`\Delta_k > 0` is the trust region radius and :math:`D_k` is
Packit 67cb25
a scaling matrix. If :math:`D_k = I`, then the trust region is a ball
Packit 67cb25
of radius :math:`\Delta_k` centered at :math:`x_k`. In some applications,
Packit 67cb25
the parameter vector :math:`x` may have widely different scales. For
Packit 67cb25
example, one parameter might be a temperature on the order of
Packit 67cb25
:math:`10^3` K, while another might be a length on the order of
Packit 67cb25
:math:`10^{-6}` m. In such cases, a spherical trust region may not
Packit 67cb25
be the best choice, since if :math:`\Phi` changes rapidly along
Packit 67cb25
directions with one scale, and more slowly along directions with
Packit 67cb25
a different scale, the model function :math:`m_k` may be a poor
Packit 67cb25
approximation to :math:`\Phi` along the rapidly changing directions.
Packit 67cb25
In such problems, it may be best to use an elliptical trust region,
Packit 67cb25
by setting :math:`D_k` to a diagonal matrix whose entries are designed
Packit 67cb25
so that the scaled step :math:`D_k \delta` has entries of approximately the same
Packit 67cb25
order of magnitude.
Packit 67cb25
Packit 67cb25
The trust region subproblem above normally amounts to solving a
Packit 67cb25
linear least squares system (or multiple systems) for the step
Packit 67cb25
:math:`\delta`. Once :math:`\delta` is computed, it is checked whether
Packit 67cb25
or not it reduces the objective function :math:`\Phi(x)`. A useful
Packit 67cb25
statistic for this is to look at the ratio
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: \rho_k = { \Phi(x_k) - \Phi(x_k + \delta_k) \over m_k(0) - m_k(\delta_k) }
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \rho_k = ( \Phi(x_k) - \Phi(x_k + \delta_k) / ( m_k(0) - m_k(\delta_k) )
Packit 67cb25
Packit 67cb25
where the numerator is the actual reduction of the objective function
Packit 67cb25
due to the step :math:`\delta_k`, and the denominator is the predicted
Packit 67cb25
reduction due to the model :math:`m_k`. If :math:`\rho_k` is negative,
Packit 67cb25
it means that the step :math:`\delta_k` increased the objective function
Packit 67cb25
and so it is rejected. If :math:`\rho_k` is positive,
Packit 67cb25
then we have found a step which reduced the objective function and
Packit 67cb25
it is accepted. Furthermore, if :math:`\rho_k` is close to 1,
Packit 67cb25
then this indicates that the model function is a good approximation
Packit 67cb25
to the objective function in the trust region, and so on the next
Packit 67cb25
iteration the trust region is enlarged in order to take more ambitious
Packit 67cb25
steps. When a step is rejected, the trust region is made smaller and
Packit 67cb25
the TRS is solved again. An outline for the general trust region method
Packit 67cb25
used by GSL can now be given.
Packit 67cb25
Packit 67cb25
**Trust Region Algorithm**
Packit 67cb25
Packit 67cb25
1. Initialize: given :math:`x_0`, construct :math:`m_0(\delta)`, :math:`D_0` and :math:`\Delta_0 > 0`
Packit 67cb25
Packit 67cb25
2. For k = 0, 1, 2, ...
Packit 67cb25
Packit 67cb25
   a. If converged, then stop
Packit 67cb25
   b. Solve TRS for trial step :math:`\delta_k`
Packit 67cb25
   c. Evaluate trial step by computing :math:`\rho_k`
Packit 67cb25
Packit 67cb25
      1). if step is accepted, set :math:`x_{k+1} = x_k + \delta_k` and increase radius,
Packit 67cb25
          :math:`\Delta_{k+1} = \alpha \Delta_k`
Packit 67cb25
      2). if step is rejected, set :math:`x_{k+1} = x_k` and decrease radius,
Packit 67cb25
          :math:`\Delta_{k+1} = {\Delta_k \over \beta}`; goto 2(b)
Packit 67cb25
Packit 67cb25
   d. Construct :math:`m_{k+1}(\delta)` and :math:`D_{k+1}`
Packit 67cb25
Packit 67cb25
GSL offers the user a number of different algorithms for solving the trust
Packit 67cb25
region subproblem in 2(b), as well as different choices of scaling matrices
Packit 67cb25
:math:`D_k` and different methods of updating the trust region radius
Packit 67cb25
:math:`\Delta_k`. Therefore, while reasonable default methods are provided,
Packit 67cb25
the user has a lot of control to fine-tune the various steps of the
Packit 67cb25
algorithm for their specific problem.
Packit 67cb25
Packit 67cb25
Solving the Trust Region Subproblem (TRS)
Packit 67cb25
=========================================
Packit 67cb25
Packit 67cb25
Below we describe the methods available for solving the trust region
Packit 67cb25
subproblem. The methods available provide either exact or approximate
Packit 67cb25
solutions to the trust region subproblem. In all algorithms below,
Packit 67cb25
the Hessian matrix :math:`B_k` is approximated as :math:`B_k \approx J_k^T J_k`,
Packit 67cb25
where :math:`J_k = J(x_k)`. In all methods, the solution of the TRS
Packit 67cb25
involves solving a linear least squares system involving the Jacobian
Packit 67cb25
matrix. For small to moderate sized problems (:code:`gsl_multifit_nlinear` interface),
Packit 67cb25
this is accomplished by factoring the full Jacobian matrix, which is provided
Packit 67cb25
by the user, with the Cholesky, QR, or SVD decompositions. For large systems
Packit 67cb25
(:code:`gsl_multilarge_nlinear` interface), the user has two choices. One
Packit 67cb25
is to solve the system iteratively, without needing to store the full
Packit 67cb25
Jacobian matrix in memory. With this method, the user must provide a routine
Packit 67cb25
to calculate the matrix-vector products :math:`J u` or :math:`J^T u` for a given vector :math:`u`.
Packit 67cb25
This iterative method is particularly useful for systems where the Jacobian has
Packit 67cb25
sparse structure, since forming matrix-vector products can be done cheaply. The
Packit 67cb25
second option for large systems involves forming the normal equations matrix
Packit 67cb25
:math:`J^T J` and then factoring it using a Cholesky decomposition. The normal
Packit 67cb25
equations matrix is :math:`p`-by-:math:`p`, typically much smaller than the full
Packit 67cb25
:math:`n`-by-:math:`p` Jacobian, and can usually be stored in memory even if the full
Packit 67cb25
Jacobian matrix cannot. This option is useful for large, dense systems, or if the
Packit 67cb25
iterative method has difficulty converging.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: Levenberg-Marquardt algorithm
Packit 67cb25
   single: nonlinear least squares, levenberg-marquardt
Packit 67cb25
Packit 67cb25
Levenberg-Marquardt
Packit 67cb25
-------------------
Packit 67cb25
Packit 67cb25
There is a theorem which states that if :math:`\delta_k` is a solution
Packit 67cb25
to the trust region subproblem given above, then there exists
Packit 67cb25
:math:`\mu_k \ge 0` such that
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: \left( B_k + \mu_k D_k^T D_k \right) \delta_k = -g_k
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      ( B_k + \mu_k D_k^T D_k ) \delta_k = -g_k
Packit 67cb25
Packit 67cb25
with :math:`\mu_k (\Delta_k - ||D_k \delta_k||) = 0`. This
Packit 67cb25
forms the basis of the Levenberg-Marquardt algorithm, which controls
Packit 67cb25
the trust region size by adjusting the parameter :math:`\mu_k`
Packit 67cb25
rather than the radius :math:`\Delta_k` directly. For each radius
Packit 67cb25
:math:`\Delta_k`, there is a unique parameter :math:`\mu_k` which
Packit 67cb25
solves the TRS, and they have an inverse relationship, so that large values of
Packit 67cb25
:math:`\mu_k` correspond to smaller trust regions, while small
Packit 67cb25
values of :math:`\mu_k` correspond to larger trust regions.
Packit 67cb25
Packit 67cb25
With the approximation :math:`B_k \approx J_k^T J_k`, on each iteration,
Packit 67cb25
in order to calculate the step :math:`\delta_k`,
Packit 67cb25
the following linear least squares problem is solved:
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \left[
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          J_k \\
Packit 67cb25
          \sqrt{\mu_k} D_k
Packit 67cb25
        \end{array}
Packit 67cb25
      \right]
Packit 67cb25
      \delta_k =
Packit 67cb25
      -
Packit 67cb25
      \left[
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          f_k \cr
Packit 67cb25
          0
Packit 67cb25
        \end{array}
Packit 67cb25
      \right]
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      [J_k; sqrt(mu_k) D_k] \delta_k = - [f_k; 0]
Packit 67cb25
Packit 67cb25
If the step :math:`\delta_k` is accepted, then
Packit 67cb25
:math:`\mu_k` is decreased on the next iteration in order
Packit 67cb25
to take a larger step, otherwise it is increased to take
Packit 67cb25
a smaller step. The Levenberg-Marquardt algorithm provides
Packit 67cb25
an exact solution of the trust region subproblem, but
Packit 67cb25
typically has a higher computational cost per iteration
Packit 67cb25
than the approximate methods discussed below, since it
Packit 67cb25
may need to solve the least squares system above several
Packit 67cb25
times for different values of :math:`\mu_k`.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: Levenberg-Marquardt algorithm, geodesic acceleration
Packit 67cb25
   single: nonlinear least squares, levenberg-marquardt, geodesic acceleration
Packit 67cb25
Packit 67cb25
Levenberg-Marquardt with Geodesic Acceleration
Packit 67cb25
----------------------------------------------
Packit 67cb25
Packit 67cb25
This method applies a so-called geodesic acceleration correction to
Packit 67cb25
the standard Levenberg-Marquardt step :math:`\delta_k` (Transtrum et al, 2011).
Packit 67cb25
By interpreting :math:`\delta_k` as a first order step along a geodesic in the
Packit 67cb25
model parameter space (ie: a velocity :math:`\delta_k = v_k`), the geodesic
Packit 67cb25
acceleration :math:`a_k` is a second order correction along the
Packit 67cb25
geodesic which is determined by solving the linear least squares system
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \left[
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          J_k \\
Packit 67cb25
          \sqrt{\mu_k} D_k
Packit 67cb25
        \end{array}
Packit 67cb25
      \right]
Packit 67cb25
      a_k =
Packit 67cb25
      -
Packit 67cb25
      \left[
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          f_{vv}(x_k) \\
Packit 67cb25
          0
Packit 67cb25
        \end{array}
Packit 67cb25
      \right]
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      [J_k; sqrt(mu_k) D_k] a_k = - [f_vv(x_k); 0]
Packit 67cb25
Packit 67cb25
where :math:`f_{vv}` is the second directional derivative of
Packit 67cb25
the residual vector in the velocity direction :math:`v`,
Packit 67cb25
:math:`f_{vv}(x) = D_v^2 f = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f(x)`,
Packit 67cb25
where :math:`\alpha` and :math:`\beta` are summed over the :math:`p`
Packit 67cb25
parameters. The new total step is then :math:`\delta_k' = v_k + {1 \over 2}a_k`.
Packit 67cb25
The second order correction :math:`a_k` can be calculated with a modest additional
Packit 67cb25
cost, and has been shown to dramatically reduce the number of iterations
Packit 67cb25
(and expensive Jacobian evaluations) required to reach convergence on a variety
Packit 67cb25
of different problems. In order to utilize the geodesic acceleration, the user must supply a
Packit 67cb25
function which provides the second directional derivative vector
Packit 67cb25
:math:`f_{vv}(x)`, or alternatively the library can use a finite
Packit 67cb25
difference method to estimate this vector with one additional function
Packit 67cb25
evaluation of :math:`f(x + h v)` where :math:`h` is a tunable step size
Packit 67cb25
(see the :code:`h_fvv` parameter description).
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: Dogleg algorithm
Packit 67cb25
   single: nonlinear least squares, dogleg
Packit 67cb25
Packit 67cb25
Dogleg
Packit 67cb25
------
Packit 67cb25
Packit 67cb25
This is Powell's dogleg method, which finds an approximate
Packit 67cb25
solution to the trust region subproblem, by restricting
Packit 67cb25
its search to a piecewise linear "dogleg" path,
Packit 67cb25
composed of the origin, the Cauchy point which represents
Packit 67cb25
the model minimizer along the steepest descent direction,
Packit 67cb25
and the Gauss-Newton point, which is the overall minimizer
Packit 67cb25
of the unconstrained model. The Gauss-Newton step is calculated by
Packit 67cb25
solving
Packit 67cb25
Packit 67cb25
.. math:: J_k \delta_{gn} = -f_k
Packit 67cb25
Packit 67cb25
which is the main computational task for each iteration,
Packit 67cb25
but only needs to be performed once per iteration. If
Packit 67cb25
the Gauss-Newton point is inside the trust region, it is
Packit 67cb25
selected as the step. If it is outside, the method then
Packit 67cb25
calculates the Cauchy point, which is located along the
Packit 67cb25
gradient direction. If the Cauchy point is also outside
Packit 67cb25
the trust region, the method assumes that it is still far
Packit 67cb25
from the minimum and so proceeds along the gradient
Packit 67cb25
direction, truncating the step at the trust region
Packit 67cb25
boundary. If the Cauchy point is inside the trust region,
Packit 67cb25
with the Gauss-Newton point outside, the method
Packit 67cb25
uses a dogleg step, which is a linear combination of the
Packit 67cb25
gradient direction and the Gauss-Newton direction, stopping at the trust
Packit 67cb25
region boundary.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: double Dogleg algorithm
Packit 67cb25
   single: Dogleg algorithm, double
Packit 67cb25
   single: nonlinear least squares, double dogleg
Packit 67cb25
Packit 67cb25
Double Dogleg
Packit 67cb25
-------------
Packit 67cb25
Packit 67cb25
This method is an improvement over the classical dogleg
Packit 67cb25
algorithm, which attempts to include information about
Packit 67cb25
the Gauss-Newton step while the iteration is still far from
Packit 67cb25
the minimum. When the Cauchy point is inside the trust region
Packit 67cb25
and the Gauss-Newton point is outside, the method computes
Packit 67cb25
a scaled Gauss-Newton point and then takes a dogleg step
Packit 67cb25
between the Cauchy point and the scaled Gauss-Newton point.
Packit 67cb25
The scaling is calculated to ensure that the reduction
Packit 67cb25
in the model :math:`m_k` is about the same as the reduction
Packit 67cb25
provided by the Cauchy point.
Packit 67cb25
Packit 67cb25
Two Dimensional Subspace
Packit 67cb25
------------------------
Packit 67cb25
Packit 67cb25
The dogleg methods restrict the search for the TRS solution
Packit 67cb25
to a 1D curve defined by the Cauchy and Gauss-Newton points.
Packit 67cb25
An improvement to this is to search for a solution using
Packit 67cb25
the full two dimensional subspace spanned by the Cauchy
Packit 67cb25
and Gauss-Newton directions. The dogleg path is of course
Packit 67cb25
inside this subspace, and so this method solves the TRS
Packit 67cb25
at least as accurately as the dogleg methods. Since this
Packit 67cb25
method searches a larger subspace for a solution, it can
Packit 67cb25
converge more quickly than dogleg on some problems. Because
Packit 67cb25
the subspace is only two dimensional, this method is
Packit 67cb25
very efficient and the main computation per iteration is
Packit 67cb25
to determine the Gauss-Newton point.
Packit 67cb25
Packit 67cb25
Steihaug-Toint Conjugate Gradient
Packit 67cb25
---------------------------------
Packit 67cb25
Packit 67cb25
One difficulty of the dogleg methods is calculating the
Packit 67cb25
Gauss-Newton step when the Jacobian matrix is singular. The
Packit 67cb25
Steihaug-Toint method also computes a generalized dogleg
Packit 67cb25
step, but avoids solving for the Gauss-Newton step directly,
Packit 67cb25
instead using an iterative conjugate gradient algorithm. This
Packit 67cb25
method performs well at points where the Jacobian is singular,
Packit 67cb25
and is also suitable for large-scale problems where factoring
Packit 67cb25
the Jacobian matrix could be prohibitively expensive.
Packit 67cb25
Packit 67cb25
Weighted Nonlinear Least-Squares
Packit 67cb25
================================
Packit 67cb25
Packit 67cb25
Weighted nonlinear least-squares fitting minimizes the function
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= {1 \over 2} || f ||_W^2 \\
Packit 67cb25
              &= {1 \over 2} \sum_{i=1}^{n} w_i f_i (x_1, \dots, x_p)^2
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \Phi(x) = (1/2) || f(x) ||_W^2
Packit 67cb25
              = (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2 
Packit 67cb25
Packit 67cb25
where :math:`W = \diag(w_1,w_2,...,w_n)` is the weighting matrix,
Packit 67cb25
and :math:`||f||_W^2 = f^T W f`.
Packit 67cb25
The weights :math:`w_i` are commonly defined as :math:`w_i = 1/\sigma_i^2`,
Packit 67cb25
where :math:`\sigma_i` is the error in the :math:`i`-th measurement.
Packit 67cb25
A simple change of variables :math:`\tilde{f} = W^{1 \over 2} f` yields
Packit 67cb25
:math:`\Phi(x) = {1 \over 2} ||\tilde{f}||^2`, which is in the
Packit 67cb25
same form as the unweighted case. The user can either perform this
Packit 67cb25
transform directly on their function residuals and Jacobian, or use
Packit 67cb25
the :func:`gsl_multifit_nlinear_winit` interface which automatically
Packit 67cb25
performs the correct scaling. To manually perform this transformation,
Packit 67cb25
the residuals and Jacobian should be modified according to
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \tilde{f}_i & = \sqrt{w_i} f_i = {f_i \over \sigma_i} \\
Packit 67cb25
      \tilde{J}_{ij} & = \sqrt{w_i} { \partial f_i \over \partial x_j } = { 1 \over \sigma_i} { \partial f_i \over \partial x_j }
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      f~_i = f_i / \sigma_i
Packit 67cb25
      J~_ij = 1 / \sigma_i df_i/dx_j
Packit 67cb25
Packit 67cb25
For large systems, the user must perform their own weighting.
Packit 67cb25
Packit 67cb25
.. _sec_tunable-parameters:
Packit 67cb25
Packit 67cb25
Tunable Parameters
Packit 67cb25
==================
Packit 67cb25
Packit 67cb25
The user can tune nearly all aspects of the iteration at allocation
Packit 67cb25
time. For the :code:`gsl_multifit_nlinear` interface, the user may
Packit 67cb25
modify the :type:`gsl_multifit_nlinear_parameters` structure, which is
Packit 67cb25
defined as follows:
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_parameters
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      typedef struct
Packit 67cb25
      {
Packit 67cb25
        const gsl_multifit_nlinear_trs *trs;        /* trust region subproblem method */
Packit 67cb25
        const gsl_multifit_nlinear_scale *scale;    /* scaling method */
Packit 67cb25
        const gsl_multifit_nlinear_solver *solver;  /* solver method */
Packit 67cb25
        gsl_multifit_nlinear_fdtype fdtype;         /* finite difference method */
Packit 67cb25
        double factor_up;                           /* factor for increasing trust radius */
Packit 67cb25
        double factor_down;                         /* factor for decreasing trust radius */
Packit 67cb25
        double avmax;                               /* max allowed |a|/|v| */
Packit 67cb25
        double h_df;                                /* step size for finite difference Jacobian */
Packit 67cb25
        double h_fvv;                               /* step size for finite difference fvv */
Packit 67cb25
      } gsl_multifit_nlinear_parameters;
Packit 67cb25
Packit 67cb25
For the :code:`gsl_multilarge_nlinear` interface, the user may
Packit 67cb25
modify the :type:`gsl_multilarge_nlinear_parameters` structure, which is
Packit 67cb25
defined as follows:
Packit 67cb25
Packit 67cb25
.. type:: gsl_multilarge_nlinear_parameters
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      typedef struct
Packit 67cb25
      {
Packit 67cb25
        const gsl_multilarge_nlinear_trs *trs;       /* trust region subproblem method */
Packit 67cb25
        const gsl_multilarge_nlinear_scale *scale;   /* scaling method */
Packit 67cb25
        const gsl_multilarge_nlinear_solver *solver; /* solver method */
Packit 67cb25
        gsl_multilarge_nlinear_fdtype fdtype;        /* finite difference method */
Packit 67cb25
        double factor_up;                            /* factor for increasing trust radius */
Packit 67cb25
        double factor_down;                          /* factor for decreasing trust radius */
Packit 67cb25
        double avmax;                                /* max allowed |a|/|v| */
Packit 67cb25
        double h_df;                                 /* step size for finite difference Jacobian */
Packit 67cb25
        double h_fvv;                                /* step size for finite difference fvv */
Packit 67cb25
        size_t max_iter;                             /* maximum iterations for trs method */
Packit 67cb25
        double tol;                                  /* tolerance for solving trs */
Packit 67cb25
      } gsl_multilarge_nlinear_parameters;
Packit 67cb25
Packit 67cb25
Each of these parameters is discussed in further detail below.
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_trs
Packit 67cb25
          gsl_multilarge_nlinear_trs
Packit 67cb25
Packit 67cb25
   The parameter :data:`trs` determines the method used to solve the trust region
Packit 67cb25
   subproblem, and may be selected from the following choices,
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_trs_lm
Packit 67cb25
            gsl_multilarge_nlinear_trs_lm
Packit 67cb25
Packit 67cb25
      This selects the Levenberg-Marquardt algorithm.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_trs_lmaccel
Packit 67cb25
            gsl_multilarge_nlinear_trs_lmaccel
Packit 67cb25
Packit 67cb25
      This selects the Levenberg-Marquardt algorithm with geodesic
Packit 67cb25
      acceleration.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_trs_dogleg
Packit 67cb25
            gsl_multilarge_nlinear_trs_dogleg
Packit 67cb25
Packit 67cb25
      This selects the dogleg algorithm.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_trs_ddogleg
Packit 67cb25
            gsl_multilarge_nlinear_trs_ddogleg
Packit 67cb25
Packit 67cb25
      This selects the double dogleg algorithm.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_trs_subspace2D
Packit 67cb25
            gsl_multilarge_nlinear_trs_subspace2D
Packit 67cb25
Packit 67cb25
      This selects the 2D subspace algorithm.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multilarge_nlinear_trs_cgst
Packit 67cb25
Packit 67cb25
      This selects the Steihaug-Toint conjugate gradient algorithm. This
Packit 67cb25
      method is available only for large systems.
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_scale
Packit 67cb25
          gsl_multilarge_nlinear_scale
Packit 67cb25
Packit 67cb25
   The parameter :data:`scale` determines the diagonal scaling matrix :math:`D` and
Packit 67cb25
   may be selected from the following choices,
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_scale_more
Packit 67cb25
            gsl_multilarge_nlinear_scale_more
Packit 67cb25
Packit 67cb25
      This damping strategy was suggested by |More|, and
Packit 67cb25
      corresponds to :math:`D^T D = \max(\diag(J^T J))`,
Packit 67cb25
      in other words the maximum elements of
Packit 67cb25
      :math:`\diag(J^T J)` encountered thus far in the iteration.
Packit 67cb25
      This choice of :math:`D` makes the problem scale-invariant,
Packit 67cb25
      so that if the model parameters :math:`x_i` are each scaled
Packit 67cb25
      by an arbitrary constant, :math:`\tilde{x}_i = a_i x_i`, then
Packit 67cb25
      the sequence of iterates produced by the algorithm would
Packit 67cb25
      be unchanged. This method can work very well in cases
Packit 67cb25
      where the model parameters have widely different scales
Packit 67cb25
      (ie: if some parameters are measured in nanometers, while others
Packit 67cb25
      are measured in degrees Kelvin). This strategy has been proven
Packit 67cb25
      effective on a large class of problems and so it is the library
Packit 67cb25
      default, but it may not be the best choice for all problems.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_scale_levenberg
Packit 67cb25
            gsl_multilarge_nlinear_scale_levenberg
Packit 67cb25
Packit 67cb25
      This damping strategy was originally suggested by Levenberg, and
Packit 67cb25
      corresponds to :math:`D^T D = I`. This method has also proven
Packit 67cb25
      effective on a large class of problems, but is not scale-invariant.
Packit 67cb25
      However, some authors (e.g. Transtrum and Sethna 2012) argue
Packit 67cb25
      that this choice is better for problems which are susceptible
Packit 67cb25
      to parameter evaporation (ie: parameters go to infinity)
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_scale_marquardt
Packit 67cb25
            gsl_multilarge_nlinear_scale_marquardt
Packit 67cb25
Packit 67cb25
      This damping strategy was suggested by Marquardt, and
Packit 67cb25
      corresponds to :math:`D^T D = \diag(J^T J)`. This
Packit 67cb25
      method is scale-invariant, but it is generally considered
Packit 67cb25
      inferior to both the Levenberg and |More| strategies, though
Packit 67cb25
      may work well on certain classes of problems.
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_solver
Packit 67cb25
          gsl_multilarge_nlinear_solver
Packit 67cb25
Packit 67cb25
   Solving the trust region subproblem on each iteration almost always
Packit 67cb25
   requires the solution of the following linear least squares system
Packit 67cb25
Packit 67cb25
   .. only:: not texinfo
Packit 67cb25
Packit 67cb25
      .. math::
Packit 67cb25
Packit 67cb25
         \left[
Packit 67cb25
           \begin{array}{c}
Packit 67cb25
             J \\
Packit 67cb25
             \sqrt{\mu} D
Packit 67cb25
           \end{array}
Packit 67cb25
         \right]
Packit 67cb25
         \delta =
Packit 67cb25
         -
Packit 67cb25
         \left[
Packit 67cb25
           \begin{array}{c}
Packit 67cb25
             f \\
Packit 67cb25
             0
Packit 67cb25
           \end{array}
Packit 67cb25
         \right]
Packit 67cb25
Packit 67cb25
   .. only:: texinfo
Packit 67cb25
Packit 67cb25
      ::
Packit 67cb25
Packit 67cb25
         [J; sqrt(mu) D] \delta = - [f; 0]
Packit 67cb25
Packit 67cb25
   The :data:`solver` parameter determines how the system is
Packit 67cb25
   solved and can be selected from the following choices:
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_solver_qr
Packit 67cb25
Packit 67cb25
      This method solves the system using a rank revealing QR
Packit 67cb25
      decomposition of the Jacobian :math:`J`. This method will
Packit 67cb25
      produce reliable solutions in cases where the Jacobian
Packit 67cb25
      is rank deficient or near-singular but does require about
Packit 67cb25
      twice as many operations as the Cholesky method discussed
Packit 67cb25
      below.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_solver_cholesky
Packit 67cb25
            gsl_multilarge_nlinear_solver_cholesky
Packit 67cb25
Packit 67cb25
      This method solves the alternate normal equations problem
Packit 67cb25
Packit 67cb25
      .. only:: not texinfo
Packit 67cb25
Packit 67cb25
         .. math:: \left( J^T J + \mu D^T D \right) \delta = -J^T f
Packit 67cb25
Packit 67cb25
      .. only:: texinfo
Packit 67cb25
Packit 67cb25
         ::
Packit 67cb25
Packit 67cb25
            ( J^T J + \mu D^T D ) \delta = -J^T f
Packit 67cb25
Packit 67cb25
      by using a Cholesky decomposition of the matrix
Packit 67cb25
      :math:`J^T J + \mu D^T D`. This method is faster than the
Packit 67cb25
      QR approach, however it is susceptible to numerical instabilities
Packit 67cb25
      if the Jacobian matrix is rank deficient or near-singular. In
Packit 67cb25
      these cases, an attempt is made to reduce the condition number
Packit 67cb25
      of the matrix using Jacobi preconditioning, but for highly
Packit 67cb25
      ill-conditioned problems the QR approach is better. If it is
Packit 67cb25
      known that the Jacobian matrix is well conditioned, this method
Packit 67cb25
      is accurate and will perform faster than the QR approach.
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_solver_svd
Packit 67cb25
Packit 67cb25
      This method solves the system using a singular value
Packit 67cb25
      decomposition of the Jacobian :math:`J`. This method will
Packit 67cb25
      produce the most reliable solutions for ill-conditioned Jacobians
Packit 67cb25
      but is also the slowest solver method.
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_fdtype
Packit 67cb25
Packit 67cb25
   The parameter :data:`fdtype` specifies whether to use forward or centered
Packit 67cb25
   differences when approximating the Jacobian. This is only
Packit 67cb25
   used when an analytic Jacobian is not provided to the solver.
Packit 67cb25
   This parameter may be set to one of the following choices.
Packit 67cb25
Packit 67cb25
   .. macro:: GSL_MULTIFIT_NLINEAR_FWDIFF
Packit 67cb25
Packit 67cb25
      This specifies a forward finite difference to approximate
Packit 67cb25
      the Jacobian matrix. The Jacobian matrix will be calculated as
Packit 67cb25
Packit 67cb25
      .. only:: not texinfo
Packit 67cb25
Packit 67cb25
         .. math:: J_{ij} = {1 \over \Delta_j} \left( f_i(x + \Delta_j e_j) - f_i(x) \right)
Packit 67cb25
Packit 67cb25
      .. only:: texinfo
Packit 67cb25
Packit 67cb25
         ::
Packit 67cb25
Packit 67cb25
            J_ij = 1 / \Delta_j ( f_i(x + \Delta_j e_j) - f_i(x) )
Packit 67cb25
Packit 67cb25
      where :math:`\Delta_j = h |x_j|` and :math:`e_j` is the standard
Packit 67cb25
      :math:`j`-th Cartesian unit basis vector so that
Packit 67cb25
      :math:`x + \Delta_j e_j` represents a small (forward) perturbation of
Packit 67cb25
      the :math:`j`-th parameter by an amount :math:`\Delta_j`. The perturbation
Packit 67cb25
      :math:`\Delta_j` is proportional to the current value :math:`|x_j|` which
Packit 67cb25
      helps to calculate an accurate Jacobian when the various parameters have
Packit 67cb25
      different scale sizes. The value of :math:`h` is specified by the :code:`h_df`
Packit 67cb25
      parameter. The accuracy of this method is :math:`O(h)`, and evaluating this
Packit 67cb25
      matrix requires an additional :math:`p` function evaluations.
Packit 67cb25
Packit 67cb25
   .. macro:: GSL_MULTIFIT_NLINEAR_CTRDIFF
Packit 67cb25
Packit 67cb25
      This specifies a centered finite difference to approximate
Packit 67cb25
      the Jacobian matrix. The Jacobian matrix will be calculated as
Packit 67cb25
Packit 67cb25
      .. only:: not texinfo
Packit 67cb25
Packit 67cb25
         .. math:: J_{ij} = {1 \over \Delta_j} \left( f_i(x + {1 \over 2} \Delta_j e_j) - f_i(x - {1 \over 2} \Delta_j e_j) \right)
Packit 67cb25
Packit 67cb25
      .. only:: texinfo
Packit 67cb25
Packit 67cb25
         ::
Packit 67cb25
Packit 67cb25
            J_ij = 1 / \Delta_j ( f_i(x + 1/2 \Delta_j e_j) - f_i(x - 1/2 \Delta_j e_j) )
Packit 67cb25
Packit 67cb25
      See above for a description of :math:`\Delta_j`. The accuracy of this
Packit 67cb25
      method is :math:`O(h^2)`, but evaluating this
Packit 67cb25
      matrix requires an additional :math:`2p` function evaluations.
Packit 67cb25
Packit 67cb25
:code:`double factor_up`
Packit 67cb25
Packit 67cb25
When a step is accepted, the trust region radius will be increased
Packit 67cb25
by this factor. The default value is :math:`3`.
Packit 67cb25
Packit 67cb25
:code:`double factor_down`
Packit 67cb25
Packit 67cb25
When a step is rejected, the trust region radius will be decreased
Packit 67cb25
by this factor. The default value is :math:`2`.
Packit 67cb25
Packit 67cb25
:code:`double avmax`
Packit 67cb25
Packit 67cb25
When using geodesic acceleration to solve a nonlinear least squares problem,
Packit 67cb25
an important parameter to monitor is the ratio of the acceleration term
Packit 67cb25
to the velocity term,
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: { ||a|| \over ||v|| }
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      |a| / |v|
Packit 67cb25
Packit 67cb25
If this ratio is small, it means the acceleration correction
Packit 67cb25
is contributing very little to the step. This could be because
Packit 67cb25
the problem is not "nonlinear" enough to benefit from
Packit 67cb25
the acceleration. If the ratio is large (:math:`> 1`) it
Packit 67cb25
means that the acceleration is larger than the velocity,
Packit 67cb25
which shouldn't happen since the step represents a truncated
Packit 67cb25
series and so the second order term :math:`a` should be smaller than
Packit 67cb25
the first order term :math:`v` to guarantee convergence.
Packit 67cb25
Therefore any steps with a ratio larger than the parameter
Packit 67cb25
:data:`avmax` are rejected. :data:`avmax` is set to 0.75 by default.
Packit 67cb25
For problems which experience difficulty converging, this threshold
Packit 67cb25
could be lowered.
Packit 67cb25
Packit 67cb25
:code:`double h_df`
Packit 67cb25
Packit 67cb25
This parameter specifies the step size for approximating the
Packit 67cb25
Jacobian matrix with finite differences. It is set to
Packit 67cb25
:math:`\sqrt{\epsilon}` by default, where :math:`\epsilon`
Packit 67cb25
is :macro:`GSL_DBL_EPSILON`.
Packit 67cb25
Packit 67cb25
:code:`double h_fvv`
Packit 67cb25
Packit 67cb25
When using geodesic acceleration, the user must either supply
Packit 67cb25
a function to calculate :math:`f_{vv}(x)` or the library
Packit 67cb25
can estimate this second directional derivative using a finite
Packit 67cb25
difference method. When using finite differences, the library
Packit 67cb25
must calculate :math:`f(x + h v)` where :math:`h` represents
Packit 67cb25
a small step in the velocity direction. The parameter
Packit 67cb25
:data:`h_fvv` defines this step size and is set to 0.02 by
Packit 67cb25
default.
Packit 67cb25
Packit 67cb25
Initializing the Solver
Packit 67cb25
=======================
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_type
Packit 67cb25
Packit 67cb25
   This structure specifies the type of algorithm which will be used
Packit 67cb25
   to solve a nonlinear least squares problem. It may be selected from the
Packit 67cb25
   following choices,
Packit 67cb25
Packit 67cb25
   .. var:: gsl_multifit_nlinear_trust
Packit 67cb25
Packit 67cb25
      This specifies a trust region method. It is currently the only implemented
Packit 67cb25
      nonlinear least squares method.
Packit 67cb25
Packit 67cb25
.. function:: gsl_multifit_nlinear_workspace * gsl_multifit_nlinear_alloc (const gsl_multifit_nlinear_type * T, const gsl_multifit_nlinear_parameters * params, const size_t n, const size_t p)
Packit 67cb25
              gsl_multilarge_nlinear_workspace * gsl_multilarge_nlinear_alloc (const gsl_multilarge_nlinear_type * T, const gsl_multilarge_nlinear_parameters * params, const size_t n, const size_t p)
Packit 67cb25
Packit 67cb25
   These functions return a pointer to a newly allocated instance of a
Packit 67cb25
   derivative solver of type :data:`T` for :data:`n` observations and :data:`p`
Packit 67cb25
   parameters. The :data:`params` input specifies a tunable set of
Packit 67cb25
   parameters which will affect important details in each iteration
Packit 67cb25
   of the trust region subproblem algorithm. It is recommended to start
Packit 67cb25
   with the suggested default parameters (see
Packit 67cb25
   :func:`gsl_multifit_nlinear_default_parameters` and
Packit 67cb25
   :func:`gsl_multilarge_nlinear_default_parameters`) and then tune
Packit 67cb25
   the parameters once the code is working correctly. See
Packit 67cb25
   :ref:`sec_tunable-parameters`.
Packit 67cb25
   for descriptions of the various parameters.
Packit 67cb25
   For example, the following code creates an instance of a
Packit 67cb25
   Levenberg-Marquardt solver for 100 data points and 3 parameters,
Packit 67cb25
   using suggested defaults::
Packit 67cb25
Packit 67cb25
      const gsl_multifit_nlinear_type * T = gsl_multifit_nlinear_trust;
Packit 67cb25
      gsl_multifit_nlinear_parameters params = gsl_multifit_nlinear_default_parameters();
Packit 67cb25
      gsl_multifit_nlinear_workspace * w = gsl_multifit_nlinear_alloc (T, &params, 100, 3);
Packit 67cb25
Packit 67cb25
   The number of observations :data:`n` must be greater than or equal to
Packit 67cb25
   parameters :data:`p`.
Packit 67cb25
Packit 67cb25
   If there is insufficient memory to create the solver then the function
Packit 67cb25
   returns a null pointer and the error handler is invoked with an error
Packit 67cb25
   code of :macro:`GSL_ENOMEM`.
Packit 67cb25
Packit 67cb25
.. function:: gsl_multifit_nlinear_parameters gsl_multifit_nlinear_default_parameters (void)
Packit 67cb25
              gsl_multilarge_nlinear_parameters gsl_multilarge_nlinear_default_parameters (void)
Packit 67cb25
Packit 67cb25
   These functions return a set of recommended default parameters
Packit 67cb25
   for use in solving nonlinear least squares problems. The user
Packit 67cb25
   can tune each parameter to improve the performance on their
Packit 67cb25
   particular problem, see :ref:`sec_tunable-parameters`.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_multifit_nlinear_init (const gsl_vector * x, gsl_multifit_nlinear_fdf * fdf, gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              int gsl_multifit_nlinear_winit (const gsl_vector * x, const gsl_vector * wts, gsl_multifit_nlinear_fdf * fdf, gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              int gsl_multilarge_nlinear_init (const gsl_vector * x, gsl_multilarge_nlinear_fdf * fdf, gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions initialize, or reinitialize, an existing workspace :data:`w`
Packit 67cb25
   to use the system :data:`fdf` and the initial guess
Packit 67cb25
   :data:`x`. See :ref:`sec_providing-function-minimized`
Packit 67cb25
   for a description of the :data:`fdf` structure.
Packit 67cb25
Packit 67cb25
   Optionally, a weight vector :data:`wts` can be given to perform
Packit 67cb25
   a weighted nonlinear regression. Here, the weighting matrix is
Packit 67cb25
   :math:`W = \diag(w_1,w_2,...,w_n)`.
Packit 67cb25
Packit 67cb25
.. function:: void gsl_multifit_nlinear_free (gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              void gsl_multilarge_nlinear_free (gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions free all the memory associated with the workspace :data:`w`.
Packit 67cb25
Packit 67cb25
.. function:: const char * gsl_multifit_nlinear_name (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              const char * gsl_multilarge_nlinear_name (const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions return a pointer to the name of the solver.  For example::
Packit 67cb25
Packit 67cb25
      printf ("w is a '%s' solver\n", gsl_multifit_nlinear_name (w));
Packit 67cb25
Packit 67cb25
   would print something like :code:`w is a 'trust-region' solver`.
Packit 67cb25
Packit 67cb25
.. function:: const char * gsl_multifit_nlinear_trs_name (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              const char * gsl_multilarge_nlinear_trs_name (const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions return a pointer to the name of the trust region subproblem
Packit 67cb25
   method.  For example::
Packit 67cb25
Packit 67cb25
      printf ("w is a '%s' solver\n", gsl_multifit_nlinear_trs_name (w));
Packit 67cb25
Packit 67cb25
   would print something like :code:`w is a 'levenberg-marquardt' solver`.
Packit 67cb25
Packit 67cb25
.. _sec_providing-function-minimized:
Packit 67cb25
Packit 67cb25
Providing the Function to be Minimized
Packit 67cb25
======================================
Packit 67cb25
Packit 67cb25
The user must provide :math:`n` functions of :math:`p` variables for the
Packit 67cb25
minimization algorithm to operate on.  In order to allow for
Packit 67cb25
arbitrary parameters the functions are defined by the following data
Packit 67cb25
types:
Packit 67cb25
Packit 67cb25
.. type:: gsl_multifit_nlinear_fdf
Packit 67cb25
Packit 67cb25
   This data type defines a general system of functions with arbitrary parameters,
Packit 67cb25
   the corresponding Jacobian matrix of derivatives, and optionally the
Packit 67cb25
   second directional derivative of the functions for geodesic acceleration.
Packit 67cb25
Packit 67cb25
   :code:`int (* f) (const gsl_vector * x, void * params, gsl_vector * f)`
Packit 67cb25
Packit 67cb25
      This function should store the :math:`n` components of the vector
Packit 67cb25
      :math:`f(x)` in :data:`f` for argument :data:`x` and arbitrary parameters :data:`params`,
Packit 67cb25
      returning an appropriate error code if the function cannot be computed.
Packit 67cb25
Packit 67cb25
   :code:`int (* df) (const gsl_vector * x, void * params, gsl_matrix * J)`
Packit 67cb25
Packit 67cb25
      This function should store the :data:`n`-by-:data:`p` matrix result
Packit 67cb25
Packit 67cb25
      .. only:: not texinfo
Packit 67cb25
Packit 67cb25
         .. math:: J_{ij} = \partial f_i(x) / \partial x_j
Packit 67cb25
Packit 67cb25
      .. only:: texinfo
Packit 67cb25
Packit 67cb25
         ::
Packit 67cb25
Packit 67cb25
            J_ij = d f_i(x) / d x_j
Packit 67cb25
Packit 67cb25
      in :data:`J` for argument :data:`x` 
Packit 67cb25
      and arbitrary parameters :data:`params`, returning an appropriate error code if the
Packit 67cb25
      matrix cannot be computed. If an analytic Jacobian is unavailable, or too expensive
Packit 67cb25
      to compute, this function pointer may be set to :code:`NULL`, in which
Packit 67cb25
      case the Jacobian will be internally computed using finite difference approximations
Packit 67cb25
      of the function :data:`f`.
Packit 67cb25
Packit 67cb25
   :code:`int (* fvv) (const gsl_vector * x, const gsl_vector * v, void * params, gsl_vector * fvv)`
Packit 67cb25
Packit 67cb25
      When geodesic acceleration is enabled, this function should store the
Packit 67cb25
      :math:`n` components of the vector
Packit 67cb25
      :math:`f_{vv}(x) = \sum_{\alpha\beta} v_{\alpha} v_{\beta} {\partial \over \partial x_{\alpha}} {\partial \over \partial x_{\beta}} f(x)`,
Packit 67cb25
      representing second directional derivatives of the function to be minimized,
Packit 67cb25
      into the output :data:`fvv`. The parameter vector is provided in :data:`x` and
Packit 67cb25
      the velocity vector is provided in :data:`v`, both of which have :math:`p`
Packit 67cb25
      components. The arbitrary parameters are given in :data:`params`. If
Packit 67cb25
      analytic expressions for :math:`f_{vv}(x)` are unavailable or too difficult
Packit 67cb25
      to compute, this function pointer may be set to :code:`NULL`, in which case
Packit 67cb25
      :math:`f_{vv}(x)` will be computed internally using a finite difference
Packit 67cb25
      approximation.
Packit 67cb25
Packit 67cb25
   :code:`size_t n`
Packit 67cb25
Packit 67cb25
       the number of functions, i.e. the number of components of the
Packit 67cb25
       vector :data:`f`.
Packit 67cb25
Packit 67cb25
   :code:`size_t p`
Packit 67cb25
Packit 67cb25
      the number of independent variables, i.e. the number of components of
Packit 67cb25
      the vector :data:`x`.
Packit 67cb25
Packit 67cb25
   :code:`void * params`
Packit 67cb25
Packit 67cb25
      a pointer to the arbitrary parameters of the function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevalf`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      function evaluations and is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevaldf`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      Jacobian evaluations and is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevalfvv`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      :math:`f_{vv}(x)` evaluations and is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
.. type:: gsl_multilarge_nlinear_fdf
Packit 67cb25
Packit 67cb25
   This data type defines a general system of functions with arbitrary parameters,
Packit 67cb25
   a function to compute :math:`J u` or :math:`J^T u` for a given vector :math:`u`,
Packit 67cb25
   the normal equations matrix :math:`J^T J`,
Packit 67cb25
   and optionally the second directional derivative of the functions for geodesic acceleration.
Packit 67cb25
Packit 67cb25
   :code:`int (* f) (const gsl_vector * x, void * params, gsl_vector * f)`
Packit 67cb25
Packit 67cb25
      This function should store the :math:`n` components of the vector
Packit 67cb25
      :math:`f(x)` in :data:`f` for argument :data:`x` and arbitrary parameters :data:`params`,
Packit 67cb25
      returning an appropriate error code if the function cannot be computed.
Packit 67cb25
Packit 67cb25
   :code:`int (* df) (CBLAS_TRANSPOSE_t TransJ, const gsl_vector * x, const gsl_vector * u, void * params, gsl_vector * v, gsl_matrix * JTJ)`
Packit 67cb25
Packit 67cb25
      If :data:`TransJ` is equal to :code:`CblasNoTrans`, then this function should
Packit 67cb25
      compute the matrix-vector product :math:`J u` and store the result in :data:`v`.
Packit 67cb25
      If :data:`TransJ` is equal to :code:`CblasTrans`, then this function should
Packit 67cb25
      compute the matrix-vector product :math:`J^T u` and store the result in :data:`v`.
Packit 67cb25
      Additionally, the normal equations matrix :math:`J^T J` should be stored in the
Packit 67cb25
      lower half of :data:`JTJ`. The input matrix :data:`JTJ` could be set to :code:`NULL`,
Packit 67cb25
      for example by iterative methods which do not require this matrix, so the user
Packit 67cb25
      should check for this prior to constructing the matrix.
Packit 67cb25
      The input :data:`params` contains the arbitrary parameters.
Packit 67cb25
Packit 67cb25
   :code:`int (* fvv) (const gsl_vector * x, const gsl_vector * v, void * params, gsl_vector * fvv)`
Packit 67cb25
Packit 67cb25
      When geodesic acceleration is enabled, this function should store the
Packit 67cb25
      :math:`n` components of the vector
Packit 67cb25
      :math:`f_{vv}(x) = \sum_{\alpha\beta} v_{\alpha} v_{\beta} {\partial \over \partial x_{\alpha}} {\partial \over \partial x_{\beta}} f(x)`,
Packit 67cb25
      representing second directional derivatives of the function to be minimized,
Packit 67cb25
      into the output :data:`fvv`. The parameter vector is provided in :data:`x` and
Packit 67cb25
      the velocity vector is provided in :data:`v`, both of which have :math:`p`
Packit 67cb25
      components. The arbitrary parameters are given in :data:`params`. If
Packit 67cb25
      analytic expressions for :math:`f_{vv}(x)` are unavailable or too difficult
Packit 67cb25
      to compute, this function pointer may be set to :code:`NULL`, in which case
Packit 67cb25
      :math:`f_{vv}(x)` will be computed internally using a finite difference
Packit 67cb25
      approximation.
Packit 67cb25
Packit 67cb25
   :code:`size_t n`
Packit 67cb25
Packit 67cb25
      the number of functions, i.e. the number of components of the
Packit 67cb25
      vector :data:`f`.
Packit 67cb25
Packit 67cb25
   :code:`size_t p`
Packit 67cb25
Packit 67cb25
      the number of independent variables, i.e. the number of components of
Packit 67cb25
      the vector :data:`x`.
Packit 67cb25
Packit 67cb25
   :code:`void * params`
Packit 67cb25
Packit 67cb25
      a pointer to the arbitrary parameters of the function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevalf`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      function evaluations and is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevaldfu`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      Jacobian matrix-vector evaluations (:math:`J u` or :math:`J^T u`) and
Packit 67cb25
      is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevaldf2`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      :math:`J^T J` evaluations and is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
   :code:`size_t nevalfvv`
Packit 67cb25
Packit 67cb25
      This does not need to be set by the user. It counts the number of
Packit 67cb25
      :math:`f_{vv}(x)` evaluations and is initialized by the :code:`_init` function.
Packit 67cb25
Packit 67cb25
Note that when fitting a non-linear model against experimental data,
Packit 67cb25
the data is passed to the functions above using the
Packit 67cb25
:data:`params` argument and the trial best-fit parameters through the
Packit 67cb25
:data:`x` argument.
Packit 67cb25
Packit 67cb25
Iteration
Packit 67cb25
=========
Packit 67cb25
Packit 67cb25
The following functions drive the iteration of each algorithm.  Each
Packit 67cb25
function performs one iteration of the trust region method and updates
Packit 67cb25
the state of the solver.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_multifit_nlinear_iterate (gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              int gsl_multilarge_nlinear_iterate (gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions perform a single iteration of the solver :data:`w`.  If
Packit 67cb25
   the iteration encounters an unexpected problem then an error code will
Packit 67cb25
   be returned.  The solver workspace maintains a current estimate of the
Packit 67cb25
   best-fit parameters at all times.
Packit 67cb25
Packit 67cb25
The solver workspace :data:`w` contains the following entries, which can
Packit 67cb25
be used to track the progress of the solution:
Packit 67cb25
Packit 67cb25
:code:`gsl_vector * x`
Packit 67cb25
Packit 67cb25
  The current position, length :math:`p`.
Packit 67cb25
Packit 67cb25
:code:`gsl_vector * f`
Packit 67cb25
Packit 67cb25
  The function residual vector at the current position :math:`f(x)`, length
Packit 67cb25
  :math:`n`.
Packit 67cb25
Packit 67cb25
:code:`gsl_matrix * J`
Packit 67cb25
Packit 67cb25
  The Jacobian matrix at the current position :math:`J(x)`, size
Packit 67cb25
  :math:`n`-by-:math:`p` (only for :code:`gsl_multifit_nlinear` interface).
Packit 67cb25
Packit 67cb25
:code:`gsl_vector * dx`
Packit 67cb25
Packit 67cb25
  The difference between the current position and the previous position,
Packit 67cb25
  i.e. the last step :math:`\delta`, taken as a vector, length :math:`p`.
Packit 67cb25
Packit 67cb25
These quantities can be accessed with the following functions,
Packit 67cb25
Packit 67cb25
.. function:: gsl_vector * gsl_multifit_nlinear_position (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              gsl_vector * gsl_multilarge_nlinear_position (const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions return the current position :math:`x` (i.e. best-fit
Packit 67cb25
   parameters) of the solver :data:`w`.
Packit 67cb25
Packit 67cb25
.. function:: gsl_vector * gsl_multifit_nlinear_residual (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              gsl_vector * gsl_multilarge_nlinear_residual (const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions return the current residual vector :math:`f(x)` of the
Packit 67cb25
   solver :data:`w`.  For weighted systems, the residual vector includes the
Packit 67cb25
   weighting factor :math:`\sqrt{W}`.
Packit 67cb25
Packit 67cb25
.. function:: gsl_matrix * gsl_multifit_nlinear_jac (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   This function returns a pointer to the :math:`n`-by-:math:`p` Jacobian matrix for the
Packit 67cb25
   current iteration of the solver :data:`w`. This function is available only for the
Packit 67cb25
   :code:`gsl_multifit_nlinear` interface.
Packit 67cb25
Packit 67cb25
.. function:: size_t gsl_multifit_nlinear_niter (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              size_t gsl_multilarge_nlinear_niter (const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions return the number of iterations performed so far.
Packit 67cb25
   The iteration counter is updated on each call to the
Packit 67cb25
   :code:`_iterate` functions above, and reset to 0 in the
Packit 67cb25
   :code:`_init` functions.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_multifit_nlinear_rcond (double * rcond, const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              int gsl_multilarge_nlinear_rcond (double * rcond, const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   This function estimates the reciprocal condition number
Packit 67cb25
   of the Jacobian matrix at the current position :math:`x` and
Packit 67cb25
   stores it in :data:`rcond`. The computed value is only an estimate
Packit 67cb25
   to give the user a guideline as to the conditioning of their particular
Packit 67cb25
   problem. Its calculation is based on which factorization
Packit 67cb25
   method is used (Cholesky, QR, or SVD). 
Packit 67cb25
Packit 67cb25
   * For the Cholesky solver, the matrix :math:`J^T J` is factored at each
Packit 67cb25
     iteration. Therefore this function will estimate the 1-norm condition number
Packit 67cb25
     :math:`rcond^2 = 1/(||J^T J||_1 \cdot ||(J^T J)^{-1}||_1)`
Packit 67cb25
Packit 67cb25
   * For the QR solver, :math:`J` is factored as :math:`J = Q R` at each
Packit 67cb25
     iteration. For simplicity, this function calculates the 1-norm conditioning of
Packit 67cb25
     only the :math:`R` factor, :math:`rcond = 1 / (||R||_1 \cdot ||R^{-1}||_1)`.
Packit 67cb25
     This can be computed efficiently since :math:`R` is upper triangular.
Packit 67cb25
Packit 67cb25
   * For the SVD solver, in order to efficiently solve the trust region
Packit 67cb25
     subproblem, the matrix which is factored is :math:`J D^{-1}`, instead of
Packit 67cb25
     :math:`J` itself. The resulting singular values are used to provide
Packit 67cb25
     the 2-norm reciprocal condition number, as :math:`rcond = \sigma_{min} / \sigma_{max}`.
Packit 67cb25
     Note that when using |More| scaling, :math:`D \ne I` and the resulting
Packit 67cb25
     :data:`rcond` estimate may be significantly different from the true
Packit 67cb25
     :data:`rcond` of :math:`J` itself.
Packit 67cb25
Packit 67cb25
.. function:: double gsl_multifit_nlinear_avratio (const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              double gsl_multilarge_nlinear_avratio (const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   This function returns the current ratio :math:`|a| / |v|` of the acceleration correction term to
Packit 67cb25
   the velocity step term. The acceleration term is computed only by the
Packit 67cb25
   :type:`gsl_multifit_nlinear_trs_lmaccel` and :type:`gsl_multilarge_nlinear_trs_lmaccel` methods, so
Packit 67cb25
   this ratio will be zero for other TRS methods.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: nonlinear fitting, stopping parameters, convergence
Packit 67cb25
Packit 67cb25
Testing for Convergence
Packit 67cb25
=======================
Packit 67cb25
Packit 67cb25
A minimization procedure should stop when one of the following conditions is
Packit 67cb25
true:
Packit 67cb25
Packit 67cb25
* A minimum has been found to within the user-specified precision.
Packit 67cb25
* A user-specified maximum number of iterations has been reached.
Packit 67cb25
* An error has occurred.
Packit 67cb25
Packit 67cb25
The handling of these conditions is under user control.  The functions
Packit 67cb25
below allow the user to test the current estimate of the best-fit
Packit 67cb25
parameters in several standard ways.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_multifit_nlinear_test (const double xtol, const double gtol, const double ftol, int * info, const gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              int gsl_multilarge_nlinear_test (const double xtol, const double gtol, const double ftol, int * info, const gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions test for convergence of the minimization method
Packit 67cb25
   using the following criteria:
Packit 67cb25
Packit 67cb25
   * Testing for a small step size relative to the current parameter vector
Packit 67cb25
Packit 67cb25
     .. math:: |\delta_i| \le xtol (|x_i| + xtol)
Packit 67cb25
Packit 67cb25
     for each :math:`0 <= i < p`. Each element of the step vector :math:`\delta`
Packit 67cb25
     is tested individually in case the different parameters have widely
Packit 67cb25
     different scales. Adding :data:`xtol` to :math:`|x_i|` helps the test avoid
Packit 67cb25
     breaking down in situations where the true solution value :math:`x_i = 0`.
Packit 67cb25
     If this test succeeds, :data:`info` is set to 1 and the function
Packit 67cb25
     returns :macro:`GSL_SUCCESS`.
Packit 67cb25
Packit 67cb25
     A general guideline for selecting the step tolerance is to choose
Packit 67cb25
     :math:`xtol = 10^{-d}` where :math:`d` is the number of accurate
Packit 67cb25
     decimal digits desired in the solution :math:`x`. See Dennis and
Packit 67cb25
     Schnabel for more information.
Packit 67cb25
Packit 67cb25
   * Testing for a small gradient (:math:`g = \nabla \Phi(x) = J^T f`)
Packit 67cb25
     indicating a local function minimum:
Packit 67cb25
Packit 67cb25
     .. only:: not texinfo
Packit 67cb25
Packit 67cb25
        .. math:: \max_i |g_i \times \max(x_i, 1)| \le gtol \times \max(\Phi(x), 1)
Packit 67cb25
Packit 67cb25
     .. only:: texinfo
Packit 67cb25
Packit 67cb25
        ::
Packit 67cb25
Packit 67cb25
           ||g||_inf <= gtol
Packit 67cb25
Packit 67cb25
     This expression tests whether the ratio
Packit 67cb25
     :math:`(\nabla \Phi)_i x_i / \Phi` is small. Testing this scaled gradient
Packit 67cb25
     is a better than :math:`\nabla \Phi` alone since it is a dimensionless
Packit 67cb25
     quantity and so independent of the scale of the problem. The
Packit 67cb25
     :code:`max` arguments help ensure the test doesn't break down in
Packit 67cb25
     regions where :math:`x_i` or :math:`\Phi(x)` are close to 0.
Packit 67cb25
     If this test succeeds, :data:`info` is set to 2 and the function
Packit 67cb25
     returns :macro:`GSL_SUCCESS`.
Packit 67cb25
Packit 67cb25
     A general guideline for choosing the gradient tolerance is to set
Packit 67cb25
     :code:`gtol = GSL_DBL_EPSILON^(1/3)`. See Dennis and Schnabel for
Packit 67cb25
     more information.
Packit 67cb25
Packit 67cb25
   If none of the tests succeed, :data:`info` is set to 0 and the
Packit 67cb25
   function returns :macro:`GSL_CONTINUE`, indicating further iterations
Packit 67cb25
   are required.
Packit 67cb25
Packit 67cb25
High Level Driver
Packit 67cb25
=================
Packit 67cb25
Packit 67cb25
These routines provide a high level wrapper that combines the iteration
Packit 67cb25
and convergence testing for easy use.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_multifit_nlinear_driver (const size_t maxiter, const double xtol, const double gtol, const double ftol, void (* callback)(const size_t iter, void * params, const gsl_multifit_linear_workspace * w), void * callback_params, int * info, gsl_multifit_nlinear_workspace * w)
Packit 67cb25
              int gsl_multilarge_nlinear_driver (const size_t maxiter, const double xtol, const double gtol, const double ftol, void (* callback)(const size_t iter, void * params, const gsl_multilarge_linear_workspace * w), void * callback_params, int * info, gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions iterate the nonlinear least squares solver :data:`w` for a
Packit 67cb25
   maximum of :data:`maxiter` iterations. After each iteration, the system is
Packit 67cb25
   tested for convergence with the error tolerances :data:`xtol`, :data:`gtol` and :data:`ftol`.
Packit 67cb25
   Additionally, the user may supply a callback function :data:`callback`
Packit 67cb25
   which is called after each iteration, so that the user may save or print
Packit 67cb25
   relevant quantities for each iteration. The parameter :data:`callback_params`
Packit 67cb25
   is passed to the :data:`callback` function. The parameters :data:`callback`
Packit 67cb25
   and :data:`callback_params` may be set to :code:`NULL` to disable this feature.
Packit 67cb25
   Upon successful convergence, the function returns :macro:`GSL_SUCCESS`
Packit 67cb25
   and sets :data:`info` to the reason for convergence (see
Packit 67cb25
   :func:`gsl_multifit_nlinear_test`). If the function has not
Packit 67cb25
   converged after :data:`maxiter` iterations, :macro:`GSL_EMAXITER` is
Packit 67cb25
   returned. In rare cases, during an iteration the algorithm may
Packit 67cb25
   be unable to find a new acceptable step :math:`\delta` to take. In
Packit 67cb25
   this case, :macro:`GSL_ENOPROG` is returned indicating no further
Packit 67cb25
   progress can be made. If your problem is having difficulty converging,
Packit 67cb25
   see :ref:`sec_nlinear-troubleshooting` for further guidance.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: best-fit parameters, covariance
Packit 67cb25
   single: least squares, covariance of best-fit parameters
Packit 67cb25
   single: covariance matrix, nonlinear fits
Packit 67cb25
Packit 67cb25
Covariance matrix of best fit parameters
Packit 67cb25
========================================
Packit 67cb25
Packit 67cb25
.. function:: int gsl_multifit_nlinear_covar (const gsl_matrix * J, const double epsrel, gsl_matrix * covar)
Packit 67cb25
              int gsl_multilarge_nlinear_covar (gsl_matrix * covar, gsl_multilarge_nlinear_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the covariance matrix of best-fit parameters
Packit 67cb25
   using the Jacobian matrix :data:`J` and stores it in :data:`covar`.
Packit 67cb25
   The parameter :data:`epsrel` is used to remove linear-dependent columns
Packit 67cb25
   when :data:`J` is rank deficient.
Packit 67cb25
Packit 67cb25
   The covariance matrix is given by,
Packit 67cb25
Packit 67cb25
   .. math:: C = (J^T J)^{-1}
Packit 67cb25
Packit 67cb25
   or in the weighted case,
Packit 67cb25
Packit 67cb25
   .. math:: C = (J^T W J)^{-1}
Packit 67cb25
Packit 67cb25
   and is computed using the factored form of the Jacobian (Cholesky, QR, or SVD).
Packit 67cb25
   Any columns of :math:`R` which satisfy 
Packit 67cb25
Packit 67cb25
   .. math:: |R_{kk}| \leq epsrel |R_{11}|
Packit 67cb25
Packit 67cb25
   are considered linearly-dependent and are excluded from the covariance
Packit 67cb25
   matrix (the corresponding rows and columns of the covariance matrix are
Packit 67cb25
   set to zero).
Packit 67cb25
Packit 67cb25
   If the minimisation uses the weighted least-squares function
Packit 67cb25
   :math:`f_i = (Y(x, t_i) - y_i) / \sigma_i` then the covariance
Packit 67cb25
   matrix above gives the statistical error on the best-fit parameters
Packit 67cb25
   resulting from the Gaussian errors :math:`\sigma_i` on 
Packit 67cb25
   the underlying data :math:`y_i`.  This can be verified from the relation 
Packit 67cb25
   :math:`\delta f = J \delta c` and the fact that the fluctuations in :math:`f`
Packit 67cb25
   from the data :math:`y_i` are normalised by :math:`\sigma_i` and 
Packit 67cb25
   so satisfy
Packit 67cb25
   
Packit 67cb25
   .. only:: not texinfo
Packit 67cb25
   
Packit 67cb25
      .. math:: \langle \delta f \delta f^T \rangle = I
Packit 67cb25
Packit 67cb25
   .. only:: texinfo
Packit 67cb25
Packit 67cb25
      ::
Packit 67cb25
Packit 67cb25
         <\delta f \delta f^T> = I
Packit 67cb25
Packit 67cb25
   For an unweighted least-squares function :math:`f_i = (Y(x, t_i) - y_i)`
Packit 67cb25
   the covariance matrix above should be multiplied by the variance
Packit 67cb25
   of the residuals about the best-fit :math:`\sigma^2 = \sum (y_i - Y(x,t_i))^2 / (n-p)`
Packit 67cb25
   to give the variance-covariance
Packit 67cb25
   matrix :math:`\sigma^2 C`.  This estimates the statistical error on the
Packit 67cb25
   best-fit parameters from the scatter of the underlying data.
Packit 67cb25
Packit 67cb25
   For more information about covariance matrices see
Packit 67cb25
   :ref:`Linear Least-Squares Overview <sec_lls-overview>`.
Packit 67cb25
Packit 67cb25
.. _sec_nlinear-troubleshooting:
Packit 67cb25
Packit 67cb25
Troubleshooting
Packit 67cb25
===============
Packit 67cb25
Packit 67cb25
When developing a code to solve a nonlinear least squares problem,
Packit 67cb25
here are a few considerations to keep in mind.
Packit 67cb25
Packit 67cb25
#. The most common difficulty is the accurate implementation of the Jacobian
Packit 67cb25
   matrix. If the analytic Jacobian is not properly provided to the
Packit 67cb25
   solver, this can hinder and many times prevent convergence of the method.
Packit 67cb25
   When developing a new nonlinear least squares code, it often helps
Packit 67cb25
   to compare the program output with the internally computed finite
Packit 67cb25
   difference Jacobian and the user supplied analytic Jacobian. If there
Packit 67cb25
   is a large difference in coefficients, it is likely the analytic
Packit 67cb25
   Jacobian is incorrectly implemented.
Packit 67cb25
Packit 67cb25
#. If your code is having difficulty converging, the next thing to
Packit 67cb25
   check is the starting point provided to the solver. The methods
Packit 67cb25
   of this chapter are local methods, meaning if you provide a starting
Packit 67cb25
   point far away from the true minimum, the method may converge to
Packit 67cb25
   a local minimum or not converge at all. Sometimes it is possible
Packit 67cb25
   to solve a linearized approximation to the nonlinear problem,
Packit 67cb25
   and use the linear solution as the starting point to the nonlinear
Packit 67cb25
   problem.
Packit 67cb25
Packit 67cb25
#. If the various parameters of the coefficient vector :math:`x`
Packit 67cb25
   vary widely in magnitude, then the problem is said to be badly scaled.
Packit 67cb25
   The methods of this chapter do attempt to automatically rescale
Packit 67cb25
   the elements of :math:`x` to have roughly the same order of magnitude,
Packit 67cb25
   but in extreme cases this could still cause problems for convergence.
Packit 67cb25
   In these cases it is recommended for the user to scale their
Packit 67cb25
   parameter vector :math:`x` so that each parameter spans roughly the
Packit 67cb25
   same range, say :math:`[-1,1]`. The solution vector can be backscaled
Packit 67cb25
   to recover the original units of the problem.
Packit 67cb25
Packit 67cb25
Examples
Packit 67cb25
========
Packit 67cb25
Packit 67cb25
The following example programs demonstrate the nonlinear least
Packit 67cb25
squares fitting capabilities.
Packit 67cb25
Packit 67cb25
Exponential Fitting Example
Packit 67cb25
---------------------------
Packit 67cb25
Packit 67cb25
The following example program fits a weighted exponential model with
Packit 67cb25
background to experimental data, :math:`Y = A \exp(-\lambda t) + b`. The
Packit 67cb25
first part of the program sets up the functions :func:`expb_f` and
Packit 67cb25
:func:`expb_df` to calculate the model and its Jacobian.  The appropriate
Packit 67cb25
fitting function is given by,
Packit 67cb25
Packit 67cb25
.. math:: f_i = (A \exp(-\lambda t_i) + b) - y_i
Packit 67cb25
Packit 67cb25
where we have chosen :math:`t_i = i T / (N - 1)`, where :math:`N` is the number
Packit 67cb25
of data points fitted, so that :math:`t_i \in [0, T]`. The Jacobian matrix :math:`J` is
Packit 67cb25
the derivative of these functions with respect to the three parameters
Packit 67cb25
(:math:`A`, :math:`\lambda`, :math:`b`).  It is given by,
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: J_{ij} = {\partial f_i \over \partial x_j}
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      J_{ij} = d f_i / d x_j
Packit 67cb25
Packit 67cb25
where :math:`x_0 = A`, :math:`x_1 = \lambda` and :math:`x_2 = b`.
Packit 67cb25
The :math:`i`-th row of the Jacobian is therefore
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      J_{i\cdot} =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{ccc}
Packit 67cb25
          \exp(-\lambda t_i) & -t_i A \exp(-\lambda t_i) & 1
Packit 67cb25
        \end{array}
Packit 67cb25
      \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      J(i,:) = [ \exp(-\lambda t_i) ; -t_i A \exp(-\lambda t_i) ; 1 ]
Packit 67cb25
Packit 67cb25
The main part of the program sets up a Levenberg-Marquardt solver and
Packit 67cb25
some simulated random data. The data uses the known parameters
Packit 67cb25
(5.0,0.1,1.0) combined with Gaussian noise (standard deviation = 0.1)
Packit 67cb25
with a maximum time :math:`T = 40` and :math:`N = 100` timesteps.
Packit 67cb25
The initial guess for the parameters is
Packit 67cb25
chosen as (1.0, 1.0, 0.0). The iteration terminates when the relative
Packit 67cb25
change in x is smaller than :math:`10^{-8}`, or when the magnitude of
Packit 67cb25
the gradient falls below :math:`10^{-8}`. Here are the results of running
Packit 67cb25
the program::
Packit 67cb25
Packit 67cb25
  iter  0: A = 1.0000, lambda = 1.0000, b = 0.0000, cond(J) =      inf, |f(x)| = 100.8779
Packit 67cb25
  iter  1: A = 1.2692, lambda = 0.3924, b = 0.0443, cond(J) =  69.5973, |f(x)| = 97.1734
Packit 67cb25
  iter  2: A = 1.6749, lambda = 0.1685, b = 0.1072, cond(J) =  29.5220, |f(x)| = 88.6636
Packit 67cb25
  iter  3: A = 2.5579, lambda = 0.0544, b = 0.2552, cond(J) =  22.9334, |f(x)| = 42.7765
Packit 67cb25
  iter  4: A = 3.0167, lambda = 0.0472, b = 0.3704, cond(J) = 120.2912, |f(x)| = 23.0102
Packit 67cb25
  iter  5: A = 3.3590, lambda = 0.0455, b = 0.4321, cond(J) = 266.5620, |f(x)| = 16.0680
Packit 67cb25
  iter  6: A = 3.6552, lambda = 0.0479, b = 0.4426, cond(J) = 343.8946, |f(x)| = 14.6421
Packit 67cb25
  iter  7: A = 3.9546, lambda = 0.0532, b = 0.4897, cond(J) = 301.4985, |f(x)| = 13.5266
Packit 67cb25
  iter  8: A = 4.1421, lambda = 0.0633, b = 0.6783, cond(J) = 203.9164, |f(x)| = 12.3149
Packit 67cb25
  iter  9: A = 4.3752, lambda = 0.0800, b = 0.9228, cond(J) = 158.2267, |f(x)| = 11.2475
Packit 67cb25
  iter 10: A = 4.6371, lambda = 0.0891, b = 0.9588, cond(J) = 136.6189, |f(x)| = 10.5457
Packit 67cb25
  iter 11: A = 4.7684, lambda = 0.0937, b = 0.9860, cond(J) = 125.4740, |f(x)| = 10.4753
Packit 67cb25
  iter 12: A = 4.7977, lambda = 0.0948, b = 0.9917, cond(J) = 120.1098, |f(x)| = 10.4723
Packit 67cb25
  iter 13: A = 4.8006, lambda = 0.0949, b = 0.9924, cond(J) = 118.9113, |f(x)| = 10.4723
Packit 67cb25
  iter 14: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7661, |f(x)| = 10.4723
Packit 67cb25
  iter 15: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7550, |f(x)| = 10.4723
Packit 67cb25
  iter 16: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, |f(x)| = 10.4723
Packit 67cb25
  iter 17: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, |f(x)| = 10.4723
Packit 67cb25
  iter 18: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, |f(x)| = 10.4723
Packit 67cb25
  summary from method 'trust-region/levenberg-marquardt'
Packit 67cb25
  number of iterations: 18
Packit 67cb25
  function evaluations: 25
Packit 67cb25
  Jacobian evaluations: 19
Packit 67cb25
  reason for stopping: small step size
Packit 67cb25
  initial |f(x)| = 100.877904
Packit 67cb25
  final   |f(x)| = 10.472268
Packit 67cb25
  chisq/dof = 1.1306
Packit 67cb25
  A      = 4.80085 +/- 0.17652
Packit 67cb25
  lambda = 0.09488 +/- 0.00527
Packit 67cb25
  b      = 0.99249 +/- 0.04419
Packit 67cb25
  status = success
Packit 67cb25
Packit 67cb25
The approximate values of the parameters are found correctly, and the
Packit 67cb25
chi-squared value indicates a good fit (the chi-squared per degree of
Packit 67cb25
freedom is approximately 1).  In this case the errors on the parameters
Packit 67cb25
can be estimated from the square roots of the diagonal elements of the
Packit 67cb25
covariance matrix. If the chi-squared value shows a poor fit (i.e.
Packit 67cb25
:math:`\chi^2/(n-p) \gg 1`
Packit 67cb25
then the error estimates obtained from the
Packit 67cb25
covariance matrix will be too small.  In the example program the error estimates
Packit 67cb25
are multiplied by :math:`\sqrt{\chi^2/(n-p)}`
Packit 67cb25
in this case, a common way of increasing the
Packit 67cb25
errors for a poor fit.  Note that a poor fit will result from the use
Packit 67cb25
of an inappropriate model, and the scaled error estimates may then
Packit 67cb25
be outside the range of validity for Gaussian errors.
Packit 67cb25
Packit 67cb25
Additionally, we see that the condition number of :math:`J(x)` stays
Packit 67cb25
reasonably small throughout the iteration. This indicates we could
Packit 67cb25
safely switch to the Cholesky solver for speed improvement,
Packit 67cb25
although this particular system is too small to really benefit.
Packit 67cb25
Packit 67cb25
:numref:`fig_fit-exp` shows the fitted curve with the original data.
Packit 67cb25
Packit 67cb25
.. _fig_fit-exp:
Packit 67cb25
Packit 67cb25
.. figure:: /images/fit-exp.png
Packit 67cb25
   :scale: 60%
Packit 67cb25
Packit 67cb25
   Exponential fitted curve with data
Packit 67cb25
Packit 67cb25
.. include:: examples/nlfit.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
Geodesic Acceleration Example 1
Packit 67cb25
-------------------------------
Packit 67cb25
Packit 67cb25
The following example program minimizes a modified Rosenbrock function,
Packit 67cb25
which is characterized by a narrow canyon with steep walls. The
Packit 67cb25
starting point is selected high on the canyon wall, so the solver
Packit 67cb25
must first find the canyon bottom and then navigate to the minimum.
Packit 67cb25
The problem is solved both with and without using geodesic acceleration
Packit 67cb25
for comparison. The cost function is given by
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= {1 \over 2} (f_1^2 + f_2^2) \\
Packit 67cb25
      f_1 &= 100 \left( x_2 - x_1^2 \right) \\
Packit 67cb25
      f_2 &= 1 - x_1
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      Phi(x) = 1/2 (f1^2 + f2^2)
Packit 67cb25
      f1 = 100 ( x2 - x1^2 )
Packit 67cb25
      f2 = 1 - x1
Packit 67cb25
Packit 67cb25
The Jacobian matrix is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      J =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{cc}
Packit 67cb25
          {\partial f_1 \over \partial x_1} & {\partial f_1 \over \partial x_2} \\
Packit 67cb25
          {\partial f_2 \over \partial x_1} & {\partial f_2 \over \partial x_2}
Packit 67cb25
        \end{array}
Packit 67cb25
      \right) =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{cc}
Packit 67cb25
          -200 x_1 & 100 \\
Packit 67cb25
          -1 & 0
Packit 67cb25
        \end{array}
Packit 67cb25
      \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      J = [ -200*x1 100  ]
Packit 67cb25
          [   -1     0   ]
Packit 67cb25
Packit 67cb25
In order to use geodesic acceleration, the user must provide
Packit 67cb25
the second directional derivative of each residual in the
Packit 67cb25
velocity direction,
Packit 67cb25
:math:`D_v^2 f_i = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f_i`.
Packit 67cb25
The velocity vector :math:`v` is provided by the solver. For this example,
Packit 67cb25
these derivatives are
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      f_{vv} =
Packit 67cb25
      D_v^2
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          f_1 \\
Packit 67cb25
          f_2
Packit 67cb25
        \end{array}
Packit 67cb25
      \right) =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          -200 v_1^2 \\
Packit 67cb25
          0
Packit 67cb25
        \end{array}
Packit 67cb25
      \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      fvv = [ -200 v1^2 ]
Packit 67cb25
            [     0     ]
Packit 67cb25
Packit 67cb25
The solution of this minimization problem is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      x^{*} &=
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          1 \\
Packit 67cb25
          1
Packit 67cb25
        \end{array}
Packit 67cb25
      \right) \\
Packit 67cb25
      \Phi(x^{*}) &= 0
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      x* = [ 1 ; 1 ]
Packit 67cb25
      Phi(x*) = 0
Packit 67cb25
Packit 67cb25
The program output is shown below::
Packit 67cb25
Packit 67cb25
  === Solving system without acceleration ===
Packit 67cb25
  NITER         = 53
Packit 67cb25
  NFEV          = 56
Packit 67cb25
  NJEV          = 54
Packit 67cb25
  NAEV          = 0
Packit 67cb25
  initial cost  = 2.250225000000e+04
Packit 67cb25
  final cost    = 6.674986031430e-18
Packit 67cb25
  final x       = (9.999999974165e-01, 9.999999948328e-01)
Packit 67cb25
  final cond(J) = 6.000096055094e+02
Packit 67cb25
  === Solving system with acceleration ===
Packit 67cb25
  NITER         = 15
Packit 67cb25
  NFEV          = 17
Packit 67cb25
  NJEV          = 16
Packit 67cb25
  NAEV          = 16
Packit 67cb25
  initial cost  = 2.250225000000e+04
Packit 67cb25
  final cost    = 7.518932873279e-19
Packit 67cb25
  final x       = (9.999999991329e-01, 9.999999982657e-01)
Packit 67cb25
  final cond(J) = 6.000097233278e+02
Packit 67cb25
Packit 67cb25
.. _fig_nlfit2:
Packit 67cb25
Packit 67cb25
.. figure:: /images/nlfit2.png
Packit 67cb25
Packit 67cb25
   Paths taken by solver for Rosenbrock function
Packit 67cb25
Packit 67cb25
We can see that enabling geodesic acceleration requires less
Packit 67cb25
than a third of the number of Jacobian evaluations in order to locate
Packit 67cb25
the minimum. The path taken by both methods is shown in :numref:`fig_nlfit2`.
Packit 67cb25
The contours show the cost function
Packit 67cb25
:math:`\Phi(x_1,x_2)`. We see that both methods quickly
Packit 67cb25
find the canyon bottom, but the geodesic acceleration method
Packit 67cb25
navigates along the bottom to the solution with significantly
Packit 67cb25
fewer iterations.
Packit 67cb25
Packit 67cb25
The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/nlfit2.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
Geodesic Acceleration Example 2
Packit 67cb25
-------------------------------
Packit 67cb25
Packit 67cb25
The following example fits a set of data to a Gaussian model
Packit 67cb25
using the Levenberg-Marquardt method with geodesic acceleration.
Packit 67cb25
The cost function is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= {1 \over 2} \sum_i f_i^2 \\
Packit 67cb25
      f_i &= y_i - Y(a,b,c;t_i)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      Phi(x) = 1/2 \sum_i f_i^2
Packit 67cb25
      f_i = y_i - Y(a,b,c;t_i)
Packit 67cb25
Packit 67cb25
where :math:`y_i` is the measured data point at time :math:`t_i`, and
Packit 67cb25
the model is specified by
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      Y(a,b,c;t) = a \exp{
Packit 67cb25
      \left[
Packit 67cb25
      -{1 \over 2}
Packit 67cb25
      \left(
Packit 67cb25
      { t - b \over c }
Packit 67cb25
      \right)^2
Packit 67cb25
      \right]
Packit 67cb25
      }
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      Y(a,b,c;t) = a exp(-1/2 ((t-b)/c)^2)
Packit 67cb25
Packit 67cb25
The parameters :math:`a,b,c` represent the amplitude, mean, and width of the Gaussian
Packit 67cb25
respectively. The program below generates the :math:`y_i` data on :math:`[0,1]` using
Packit 67cb25
the values :math:`a = 5`, :math:`b = 0.4`, :math:`c = 0.15` and adding random noise.
Packit 67cb25
The :math:`i`-th row of the Jacobian is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      J_{i,:} =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{ccc}
Packit 67cb25
          {\partial f_i \over \partial a} & {\partial f_i \over \partial b} & {\partial f_i \over \partial c}
Packit 67cb25
        \end{array}
Packit 67cb25
      \right) =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{ccc}
Packit 67cb25
          -e_i & -{a \over c} z_i e_i & -{a \over c} z_i^2 e_i
Packit 67cb25
        \end{array}
Packit 67cb25
      \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      J(i,:) = ( -e_i  -(a/c)*z_i*e_i  -(a/c)*z_i^2*e_i )
Packit 67cb25
Packit 67cb25
where
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      z_i &= { t_i - b \over c} \\
Packit 67cb25
      e_i &= \exp{\left( -{1 \over 2} z_i^2 \right)}
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      z_i = (t_i - b) / c
Packit 67cb25
      e_i = \exp(-1/2 z_i^2)
Packit 67cb25
Packit 67cb25
In order to use geodesic acceleration, we need the second directional derivative
Packit 67cb25
of the residuals in the velocity direction,
Packit 67cb25
:math:`D_v^2 f_i = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f_i`,
Packit 67cb25
where :math:`v` is provided by the solver. To compute this, it is helpful to make a table of
Packit 67cb25
all second derivatives of the residuals :math:`f_i` with respect to each combination of model parameters.
Packit 67cb25
This table is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \begin{array}{cccc}
Packit 67cb25
        & {\partial \over \partial a} & {\partial \over \partial b} & {\partial \over \partial c} \cr
Packit 67cb25
        {\partial \over \partial a} & 0 & -{z_i \over c} e_i & -{z_i^2 \over c} e_i \cr
Packit 67cb25
        {\partial \over \partial b} & & {a \over c^2} \left( 1 - z_i^2 \right) e_i & {a \over c^2} z_i \left( 2 - z_i^2 \right) e_i \cr
Packit 67cb25
        {\partial \over \partial c} & & & {a \over c^2} z_i^2 \left( 3 - z_i^2 \right) e_i
Packit 67cb25
      \end{array}
Packit 67cb25
Packit 67cb25
The lower half of the table is omitted since it is symmetric. Then, the second directional derivative
Packit 67cb25
of :math:`f_i` is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: D_v^2 f_i = v_a^2 \partial_a^2 f_i + 2 v_a v_b \partial_a \partial_b f_i + 2 v_a v_c \partial_a \partial_c f_i + v_b^2 \partial_b^2 f_i + 2 v_b v_c \partial_b \partial_c f_i + v_c^2 \partial_c^2 f_i
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      D_v^2 f_i = v_a^2 (d/da)^2 f_i + 2 v_a v_b (d/da) (d/db) f_i + 2 v_a v_c (d/da) (d/dc) f_i + v_b^2 (d/db)^2 f_i + 2 v_b v_c (d/db) (d/dc) f_i + v_c^2 (d/dc)^2 f_i
Packit 67cb25
Packit 67cb25
The factors of 2 come from the symmetry of the mixed second partial derivatives.
Packit 67cb25
The iteration is started using the initial guess :math:`a = 1, b = 0, c = 1`.
Packit 67cb25
The program output is shown below::
Packit 67cb25
Packit 67cb25
  iter  0: a = 1.0000, b = 0.0000, c = 1.0000, |a|/|v| = 0.0000 cond(J) =      inf, |f(x)| = 35.4785
Packit 67cb25
  iter  1: a = 1.5708, b = 0.5321, c = 0.5219, |a|/|v| = 0.3093 cond(J) =  29.0443, |f(x)| = 31.1042
Packit 67cb25
  iter  2: a = 1.7387, b = 0.4040, c = 0.4568, |a|/|v| = 0.1199 cond(J) =   3.5256, |f(x)| = 28.7217
Packit 67cb25
  iter  3: a = 2.2340, b = 0.3829, c = 0.3053, |a|/|v| = 0.3308 cond(J) =   4.5121, |f(x)| = 23.8074
Packit 67cb25
  iter  4: a = 3.2275, b = 0.3952, c = 0.2243, |a|/|v| = 0.2784 cond(J) =   8.6499, |f(x)| = 15.6003
Packit 67cb25
  iter  5: a = 4.3347, b = 0.3974, c = 0.1752, |a|/|v| = 0.2029 cond(J) =  15.1732, |f(x)| = 7.5908
Packit 67cb25
  iter  6: a = 4.9352, b = 0.3992, c = 0.1536, |a|/|v| = 0.1001 cond(J) =  26.6621, |f(x)| = 4.8402
Packit 67cb25
  iter  7: a = 5.0716, b = 0.3994, c = 0.1498, |a|/|v| = 0.0166 cond(J) =  34.6922, |f(x)| = 4.7103
Packit 67cb25
  iter  8: a = 5.0828, b = 0.3994, c = 0.1495, |a|/|v| = 0.0012 cond(J) =  36.5422, |f(x)| = 4.7095
Packit 67cb25
  iter  9: a = 5.0831, b = 0.3994, c = 0.1495, |a|/|v| = 0.0000 cond(J) =  36.6929, |f(x)| = 4.7095
Packit 67cb25
  iter 10: a = 5.0831, b = 0.3994, c = 0.1495, |a|/|v| = 0.0000 cond(J) =  36.6975, |f(x)| = 4.7095
Packit 67cb25
  iter 11: a = 5.0831, b = 0.3994, c = 0.1495, |a|/|v| = 0.0000 cond(J) =  36.6976, |f(x)| = 4.7095
Packit 67cb25
  NITER         = 11
Packit 67cb25
  NFEV          = 18
Packit 67cb25
  NJEV          = 12
Packit 67cb25
  NAEV          = 17
Packit 67cb25
  initial cost  = 1.258724737288e+03
Packit 67cb25
  final cost    = 2.217977560180e+01
Packit 67cb25
  final x       = (5.083101559156e+00, 3.994484109594e-01, 1.494898e-01)
Packit 67cb25
  final cond(J) = 3.669757713403e+01
Packit 67cb25
Packit 67cb25
We see the method converges after 11 iterations. For comparison the standard
Packit 67cb25
Levenberg-Marquardt method requires 26 iterations and so the Gaussian fitting
Packit 67cb25
problem benefits substantially from the geodesic acceleration correction. The
Packit 67cb25
column marked :code:`|a|/|v|` above shows the ratio of the acceleration term
Packit 67cb25
to the velocity term as the iteration progresses. Larger values of this
Packit 67cb25
ratio indicate that the geodesic acceleration correction term is contributing
Packit 67cb25
substantial information to the solver relative to the standard LM velocity step.
Packit 67cb25
Packit 67cb25
The data and fitted model are shown in :numref:`fig_nlfit2b`.
Packit 67cb25
Packit 67cb25
.. _fig_nlfit2b:
Packit 67cb25
Packit 67cb25
.. figure:: /images/nlfit2b.png
Packit 67cb25
Packit 67cb25
   Gaussian model fitted to data
Packit 67cb25
Packit 67cb25
The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/nlfit2b.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
Comparing TRS Methods Example
Packit 67cb25
-----------------------------
Packit 67cb25
Packit 67cb25
The following program compares all available nonlinear least squares
Packit 67cb25
trust-region subproblem (TRS) methods on the Branin function, a common
Packit 67cb25
optimization test problem. The cost function is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= {1 \over 2} (f_1^2 + f_2^2) \\
Packit 67cb25
      f_1 &= x_2 + a_1 x_1^2 + a_2 x_1 + a_3 \\
Packit 67cb25
      f_2 &= \sqrt{a_4} \sqrt{1 + (1 - a_5) \cos{x_1}}
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= 1/2 (f_1^2 + f_2^2)
Packit 67cb25
      f_1 &= x_2 + a_1 x_1^2 + a_2 x_1 + a_3
Packit 67cb25
      f_2 &= sqrt(a_4) sqrt(1 + (1 - a_5) cos(x_1))
Packit 67cb25
Packit 67cb25
with :math:`a_1 = -{5.1 \over 4 \pi^2}, a_2 = {5 \over \pi}, a_3 = -6, a_4 = 10, a_5 = {1 \over 8\pi}`.
Packit 67cb25
There are three minima of this function in the range
Packit 67cb25
:math:`(x_1,x_2) \in [-5,15] \times [-5,15]`. The program
Packit 67cb25
below uses the starting point :math:`(x_1,x_2) = (6,14.5)`
Packit 67cb25
and calculates the solution with all available nonlinear
Packit 67cb25
least squares TRS methods. The program output is shown below::
Packit 67cb25
Packit 67cb25
  Method                    NITER  NFEV  NJEV  Initial Cost  Final cost   Final cond(J) Final x        
Packit 67cb25
  levenberg-marquardt       20     27    21    1.9874e+02    3.9789e-01   6.1399e+07    (-3.14e+00, 1.23e+01)
Packit 67cb25
  levenberg-marquardt+accel 27     36    28    1.9874e+02    3.9789e-01   1.4465e+07    (3.14e+00, 2.27e+00)
Packit 67cb25
  dogleg                    23     64    23    1.9874e+02    3.9789e-01   5.0692e+08    (3.14e+00, 2.28e+00)
Packit 67cb25
  double-dogleg             24     69    24    1.9874e+02    3.9789e-01   3.4879e+07    (3.14e+00, 2.27e+00)
Packit 67cb25
  2D-subspace               23     54    24    1.9874e+02    3.9789e-01   2.5142e+07    (3.14e+00, 2.27e+00)
Packit 67cb25
Packit 67cb25
The first row of output above corresponds to standard Levenberg-Marquardt, while
Packit 67cb25
the second row includes geodesic acceleration. We see that the standard LM method
Packit 67cb25
converges to the minimum at :math:`(-\pi,12.275)` and also uses the least number
Packit 67cb25
of iterations and Jacobian evaluations. All other methods converge to the minimum
Packit 67cb25
:math:`(\pi,2.275)` and perform similarly in terms of number of Jacobian evaluations.
Packit 67cb25
We see that :math:`J` is fairly ill-conditioned
Packit 67cb25
at both minima, indicating that the QR (or SVD) solver is the best choice for this problem.
Packit 67cb25
Since there are only two parameters in this optimization problem, we can easily
Packit 67cb25
visualize the paths taken by each method, which are shown in :numref:`fig_nlfit3`.
Packit 67cb25
The figure shows contours of the cost function :math:`\Phi(x_1,x_2)` which exhibits
Packit 67cb25
three global minima in the range :math:`[-5,15] \times [-5,15]`. The paths taken
Packit 67cb25
by each solver are shown as colored lines.
Packit 67cb25
Packit 67cb25
.. _fig_nlfit3:
Packit 67cb25
Packit 67cb25
.. figure:: /images/nlfit3.png
Packit 67cb25
   :scale: 60%
Packit 67cb25
Packit 67cb25
   Paths taken for different TRS methods for the Branin function
Packit 67cb25
Packit 67cb25
The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/nlfit3.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
Large Nonlinear Least Squares Example
Packit 67cb25
-------------------------------------
Packit 67cb25
Packit 67cb25
The following program illustrates the large nonlinear least
Packit 67cb25
squares solvers on a system with significant sparse structure
Packit 67cb25
in the Jacobian. The cost function is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= {1 \over 2} \sum_{i=1}^{p+1} f_i^2 \\
Packit 67cb25
      f_i &= \sqrt{\alpha} (x_i - 1), \quad 1 \le i \le p \\
Packit 67cb25
      f_{p+1} &= ||x||^2 - {1 \over 4}
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \Phi(x) &= 1/2 \sum_{i=1}^{p+1} f_i^2
Packit 67cb25
      f_i &= \sqrt{\alpha} (x_i - 1), 1 \le i \le p
Packit 67cb25
      f_{p+1} &= ||x||^2 - 1/4
Packit 67cb25
Packit 67cb25
with :math:`\alpha = 10^{-5}`. The residual :math:`f_{p+1}` imposes a constraint on the :math:`p`
Packit 67cb25
parameters :math:`x`, to ensure that :math:`||x||^2 \approx {1 \over 4}`.
Packit 67cb25
The :math:`(p+1)`-by-:math:`p` Jacobian for this system is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      J(x) =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          \sqrt{\alpha} I_p \\
Packit 67cb25
          2 x^T
Packit 67cb25
        \end{array}
Packit 67cb25
      \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
     J(x) = [ \sqrt{alpha} I_p; 2 x^T ]
Packit 67cb25
Packit 67cb25
and the normal equations matrix is
Packit 67cb25
Packit 67cb25
.. math:: J^T J = \alpha I_p + 4 x x^T
Packit 67cb25
Packit 67cb25
Finally, the second directional derivative of :math:`f` for the
Packit 67cb25
geodesic acceleration method is
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math::
Packit 67cb25
Packit 67cb25
      f_{vv} = D_v^2 f =
Packit 67cb25
      \left(
Packit 67cb25
        \begin{array}{c}
Packit 67cb25
          0 \\
Packit 67cb25
          2 ||v||^2
Packit 67cb25
        \end{array}
Packit 67cb25
      \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      fvv = [     0     ]
Packit 67cb25
            [ 2 ||v||^2 ]
Packit 67cb25
Packit 67cb25
Since the upper :math:`p`-by-:math:`p` block of :math:`J` is diagonal,
Packit 67cb25
this sparse structure should be exploited in the nonlinear solver.
Packit 67cb25
For comparison, the following program solves the system for :math:`p = 2000`
Packit 67cb25
using the dense direct Cholesky solver based on the normal equations matrix
Packit 67cb25
:math:`J^T J`, as well as the iterative Steihaug-Toint solver, based on
Packit 67cb25
sparse matrix-vector products :math:`J u` and :math:`J^T u`. The
Packit 67cb25
program output is shown below::
Packit 67cb25
Packit 67cb25
  Method                    NITER NFEV NJUEV NJTJEV NAEV Init Cost  Final cost cond(J) Final |x|^2 Time (s)  
Packit 67cb25
  levenberg-marquardt       25    31   26    26     0    7.1218e+18 1.9555e-02 447.50  2.5044e-01  46.28
Packit 67cb25
  levenberg-marquardt+accel 22    23   45    23     22   7.1218e+18 1.9555e-02 447.64  2.5044e-01  33.92
Packit 67cb25
  dogleg                    37    87   36    36     0    7.1218e+18 1.9555e-02 447.59  2.5044e-01  56.05
Packit 67cb25
  double-dogleg             35    88   34    34     0    7.1218e+18 1.9555e-02 447.62  2.5044e-01  52.65
Packit 67cb25
  2D-subspace               37    88   36    36     0    7.1218e+18 1.9555e-02 447.71  2.5044e-01  59.75
Packit 67cb25
  steihaug-toint            35    88   345   0      0    7.1218e+18 1.9555e-02 inf     2.5044e-01  0.09
Packit 67cb25
Packit 67cb25
The first five rows use methods based on factoring the dense :math:`J^T J` matrix
Packit 67cb25
while the last row uses the iterative Steihaug-Toint method. While the number
Packit 67cb25
of Jacobian matrix-vector products (NJUEV) is less for the dense methods, the added time
Packit 67cb25
to construct and factor the :math:`J^T J` matrix (NJTJEV) results in a much larger runtime than the
Packit 67cb25
iterative method (see last column).
Packit 67cb25
Packit 67cb25
The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/nlfit4.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
References and Further Reading
Packit 67cb25
==============================
Packit 67cb25
Packit 67cb25
The following publications are relevant to the algorithms described
Packit 67cb25
in this section,
Packit 67cb25
Packit 67cb25
* J.J. |More|, *The Levenberg-Marquardt Algorithm: Implementation and
Packit 67cb25
  Theory*, Lecture Notes in Mathematics, v630 (1978), ed G. Watson.
Packit 67cb25
Packit 67cb25
* H. B. Nielsen, "Damping Parameter in Marquardt's Method",
Packit 67cb25
  IMM Department of Mathematical Modeling, DTU, Tech. Report IMM-REP-1999-05
Packit 67cb25
  (1999).
Packit 67cb25
Packit 67cb25
* K. Madsen and H. B. Nielsen, "Introduction to Optimization and Data
Packit 67cb25
  Fitting", IMM Department of Mathematical Modeling, DTU, 2010.
Packit 67cb25
Packit 67cb25
* J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained
Packit 67cb25
  Optimization and Nonlinear Equations, SIAM, 1996.
Packit 67cb25
Packit 67cb25
* M. K. Transtrum, B. B. Machta, and J. P. Sethna,
Packit 67cb25
  Geometry of nonlinear least squares with applications to sloppy models and optimization,
Packit 67cb25
  Phys. Rev. E 83, 036701, 2011.
Packit 67cb25
Packit 67cb25
* M. K. Transtrum and J. P. Sethna, Improvements to the Levenberg-Marquardt
Packit 67cb25
  algorithm for nonlinear least-squares minimization, arXiv:1201.5885, 2012.
Packit 67cb25
Packit 67cb25
* J.J. |More|, B.S. Garbow, K.E. Hillstrom, "Testing Unconstrained
Packit 67cb25
  Optimization Software", ACM Transactions on Mathematical Software, Vol
Packit 67cb25
  7, No 1 (1981), p 17--41.
Packit 67cb25
Packit 67cb25
* H. B. Nielsen, "UCTP Test Problems for Unconstrained Optimization",
Packit 67cb25
  IMM Department of Mathematical Modeling, DTU, Tech. Report IMM-REP-2000-17
Packit 67cb25
  (2000).