Tree - source-git/gsl - CentOS Git server

source-git / gsl

Blame doc/nls.rst

Blob History Raw

Packit	67cb25	`.. index::`
Packit	67cb25	`single: nonlinear least squares`
Packit	67cb25	`single: least squares, nonlinear`
Packit	67cb25
Packit	67cb25	`*******************************`
Packit	67cb25	`Nonlinear Least-Squares Fitting`
Packit	67cb25	`*******************************`
Packit	67cb25
Packit	67cb25	`.. include:: include.rst`
Packit	67cb25
Packit	67cb25	`This chapter describes functions for multidimensional nonlinear`
Packit	67cb25	`least-squares fitting. There are generally two classes of`
Packit	67cb25	`algorithms for solving nonlinear least squares problems, which`
Packit	67cb25	`fall under line search methods and trust region methods.`
Packit	67cb25	`GSL currently implements only trust region methods and`
Packit	67cb25	`provides the user with`
Packit	67cb25	`full access to intermediate steps of the iteration. The user`
Packit	67cb25	`also has the ability to tune a number of parameters which affect`
Packit	67cb25	`low-level aspects of the algorithm which can help to accelerate`
Packit	67cb25	`convergence for the specific problem at hand. GSL provides`
Packit	67cb25	`two separate interfaces for nonlinear least squares fitting. The`
Packit	67cb25	`first is designed for small to moderate sized problems, and the`
Packit	67cb25	`second is designed for very large problems, which may or may not`
Packit	67cb25	`have significant sparse structure.`
Packit	67cb25
Packit	67cb25	The header file :file:`gsl_multifit_nlinear.h` contains prototypes for the
Packit	67cb25	`multidimensional nonlinear fitting functions and related declarations`
Packit	67cb25	`relating to the small to moderate sized systems.`
Packit	67cb25
Packit	67cb25	The header file :file:`gsl_multilarge_nlinear.h` contains prototypes for the
Packit	67cb25	`multidimensional nonlinear fitting functions and related declarations`
Packit	67cb25	`relating to large systems.`
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: nonlinear least squares, overview`
Packit	67cb25
Packit	67cb25	`Overview`
Packit	67cb25	`========`
Packit	67cb25
Packit	67cb25	`The problem of multidimensional nonlinear least-squares fitting requires`
Packit	67cb25	the minimization of the squared residuals of :math:`n` functions,
Packit	67cb25	:math:`f_i`, in :math:`p` parameters, :math:`x_i`,
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= {1 \over 2} \|\| f(x) \|\|^2 \\`
Packit	67cb25	`&= {1 \over 2} \sum_{i=1}^{n} f_i (x_1, \dots, x_p)^2`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\Phi(x) = (1/2) \|\| f(x) \|\|^2`
Packit	67cb25	`= (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2`
Packit	67cb25
Packit	67cb25	In trust region methods, the objective (or cost) function :math:`\Phi(x)` is approximated
Packit	67cb25	by a model function :math:`m_k(\delta)` in the vicinity of some point :math:`x_k`. The
Packit	67cb25	`model function is often simply a second order Taylor series expansion around the`
Packit	67cb25	point :math:`x_k`, ie:
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \Phi(x_k + \delta) \approx m_k(\delta) = \Phi(x_k) + g_k^T \delta + {1 \over 2} \delta^T B_k \delta`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\Phi(x_k + \delta) ~=~ m_k(\delta) = \Phi(x_k) + g_k^T \delta + 1/2 \delta^T B_k \delta`
Packit	67cb25
Packit	67cb25	where :math:`g_k = \nabla \Phi(x_k) = J^T f` is the gradient vector at the point :math:`x_k`,
Packit	67cb25	:math:`B_k = \nabla^2 \Phi(x_k)` is the Hessian matrix at :math:`x_k`, or some
Packit	67cb25	approximation to it, and :math:`J` is the :math:`n`-by-:math:`p` Jacobian matrix
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: J_{ij} = \partial f_i / \partial x_j`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J_{ij} = d f_i / d x_j`
Packit	67cb25
Packit	67cb25	In order to find the next step :math:`\delta`, we minimize the model function
Packit	67cb25	:math:`m_k(\delta)`, but search for solutions only within a region where
Packit	67cb25	we trust that :math:`m_k(\delta)` is a good approximation to the objective
Packit	67cb25	function :math:`\Phi(x_k + \delta)`. In other words,
Packit	67cb25	`we seek a solution of the trust region subproblem (TRS)`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \min_{\delta \in R^p} m_k(\delta) = \Phi(x_k) + g_k^T \delta + {1 \over 2} \delta^T B_k \delta, \qquad\hbox{s.t.}\quad \|\| D_k \delta \|\| \le \Delta_k`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\min_(\delta \in R^p) m_k(\delta), s.t. \|\| D_k \delta \|\| <= \Delta_k`
Packit	67cb25
Packit	67cb25	where :math:`\Delta_k > 0` is the trust region radius and :math:`D_k` is
Packit	67cb25	a scaling matrix. If :math:`D_k = I`, then the trust region is a ball
Packit	67cb25	of radius :math:`\Delta_k` centered at :math:`x_k`. In some applications,
Packit	67cb25	the parameter vector :math:`x` may have widely different scales. For
Packit	67cb25	`example, one parameter might be a temperature on the order of`
Packit	67cb25	:math:`10^3` K, while another might be a length on the order of
Packit	67cb25	:math:`10^{-6}` m. In such cases, a spherical trust region may not
Packit	67cb25	be the best choice, since if :math:`\Phi` changes rapidly along
Packit	67cb25	`directions with one scale, and more slowly along directions with`
Packit	67cb25	a different scale, the model function :math:`m_k` may be a poor
Packit	67cb25	approximation to :math:`\Phi` along the rapidly changing directions.
Packit	67cb25	`In such problems, it may be best to use an elliptical trust region,`
Packit	67cb25	by setting :math:`D_k` to a diagonal matrix whose entries are designed
Packit	67cb25	so that the scaled step :math:`D_k \delta` has entries of approximately the same
Packit	67cb25	`order of magnitude.`
Packit	67cb25
Packit	67cb25	`The trust region subproblem above normally amounts to solving a`
Packit	67cb25	`linear least squares system (or multiple systems) for the step`
Packit	67cb25	:math:`\delta`. Once :math:`\delta` is computed, it is checked whether
Packit	67cb25	or not it reduces the objective function :math:`\Phi(x)`. A useful
Packit	67cb25	`statistic for this is to look at the ratio`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \rho_k = { \Phi(x_k) - \Phi(x_k + \delta_k) \over m_k(0) - m_k(\delta_k) }`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\rho_k = ( \Phi(x_k) - \Phi(x_k + \delta_k) / ( m_k(0) - m_k(\delta_k) )`
Packit	67cb25
Packit	67cb25	`where the numerator is the actual reduction of the objective function`
Packit	67cb25	due to the step :math:`\delta_k`, and the denominator is the predicted
Packit	67cb25	reduction due to the model :math:`m_k`. If :math:`\rho_k` is negative,
Packit	67cb25	it means that the step :math:`\delta_k` increased the objective function
Packit	67cb25	and so it is rejected. If :math:`\rho_k` is positive,
Packit	67cb25	`then we have found a step which reduced the objective function and`
Packit	67cb25	it is accepted. Furthermore, if :math:`\rho_k` is close to 1,
Packit	67cb25	`then this indicates that the model function is a good approximation`
Packit	67cb25	`to the objective function in the trust region, and so on the next`
Packit	67cb25	`iteration the trust region is enlarged in order to take more ambitious`
Packit	67cb25	`steps. When a step is rejected, the trust region is made smaller and`
Packit	67cb25	`the TRS is solved again. An outline for the general trust region method`
Packit	67cb25	`used by GSL can now be given.`
Packit	67cb25
Packit	67cb25	`Trust Region Algorithm`
Packit	67cb25
Packit	67cb25	1. Initialize: given :math:`x_0`, construct :math:`m_0(\delta)`, :math:`D_0` and :math:`\Delta_0 > 0`
Packit	67cb25
Packit	67cb25	`2. For k = 0, 1, 2, ...`
Packit	67cb25
Packit	67cb25	`a. If converged, then stop`
Packit	67cb25	b. Solve TRS for trial step :math:`\delta_k`
Packit	67cb25	c. Evaluate trial step by computing :math:`\rho_k`
Packit	67cb25
Packit	67cb25	1). if step is accepted, set :math:`x_{k+1} = x_k + \delta_k` and increase radius,
Packit	67cb25	:math:`\Delta_{k+1} = \alpha \Delta_k`
Packit	67cb25	2). if step is rejected, set :math:`x_{k+1} = x_k` and decrease radius,
Packit	67cb25	:math:`\Delta_{k+1} = {\Delta_k \over \beta}`; goto 2(b)
Packit	67cb25
Packit	67cb25	d. Construct :math:`m_{k+1}(\delta)` and :math:`D_{k+1}`
Packit	67cb25
Packit	67cb25	`GSL offers the user a number of different algorithms for solving the trust`
Packit	67cb25	`region subproblem in 2(b), as well as different choices of scaling matrices`
Packit	67cb25	:math:`D_k` and different methods of updating the trust region radius
Packit	67cb25	:math:`\Delta_k`. Therefore, while reasonable default methods are provided,
Packit	67cb25	`the user has a lot of control to fine-tune the various steps of the`
Packit	67cb25	`algorithm for their specific problem.`
Packit	67cb25
Packit	67cb25	`Solving the Trust Region Subproblem (TRS)`
Packit	67cb25	`=========================================`
Packit	67cb25
Packit	67cb25	`Below we describe the methods available for solving the trust region`
Packit	67cb25	`subproblem. The methods available provide either exact or approximate`
Packit	67cb25	`solutions to the trust region subproblem. In all algorithms below,`
Packit	67cb25	the Hessian matrix :math:`B_k` is approximated as :math:`B_k \approx J_k^T J_k`,
Packit	67cb25	where :math:`J_k = J(x_k)`. In all methods, the solution of the TRS
Packit	67cb25	`involves solving a linear least squares system involving the Jacobian`
Packit	67cb25	matrix. For small to moderate sized problems (:code:`gsl_multifit_nlinear` interface),
Packit	67cb25	`this is accomplished by factoring the full Jacobian matrix, which is provided`
Packit	67cb25	`by the user, with the Cholesky, QR, or SVD decompositions. For large systems`
Packit	67cb25	(:code:`gsl_multilarge_nlinear` interface), the user has two choices. One
Packit	67cb25	`is to solve the system iteratively, without needing to store the full`
Packit	67cb25	`Jacobian matrix in memory. With this method, the user must provide a routine`
Packit	67cb25	to calculate the matrix-vector products :math:`J u` or :math:`J^T u` for a given vector :math:`u`.
Packit	67cb25	`This iterative method is particularly useful for systems where the Jacobian has`
Packit	67cb25	`sparse structure, since forming matrix-vector products can be done cheaply. The`
Packit	67cb25	`second option for large systems involves forming the normal equations matrix`
Packit	67cb25	:math:`J^T J` and then factoring it using a Cholesky decomposition. The normal
Packit	67cb25	equations matrix is :math:`p`-by-:math:`p`, typically much smaller than the full
Packit	67cb25	:math:`n`-by-:math:`p` Jacobian, and can usually be stored in memory even if the full
Packit	67cb25	`Jacobian matrix cannot. This option is useful for large, dense systems, or if the`
Packit	67cb25	`iterative method has difficulty converging.`
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: Levenberg-Marquardt algorithm`
Packit	67cb25	`single: nonlinear least squares, levenberg-marquardt`
Packit	67cb25
Packit	67cb25	`Levenberg-Marquardt`
Packit	67cb25	`-------------------`
Packit	67cb25
Packit	67cb25	There is a theorem which states that if :math:`\delta_k` is a solution
Packit	67cb25	`to the trust region subproblem given above, then there exists`
Packit	67cb25	:math:`\mu_k \ge 0` such that
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \left( B_k + \mu_k D_k^T D_k \right) \delta_k = -g_k`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`( B_k + \mu_k D_k^T D_k ) \delta_k = -g_k`
Packit	67cb25
Packit	67cb25	with :math:`\mu_k (\Delta_k - \|\|D_k \delta_k\|\|) = 0`. This
Packit	67cb25	`forms the basis of the Levenberg-Marquardt algorithm, which controls`
Packit	67cb25	the trust region size by adjusting the parameter :math:`\mu_k`
Packit	67cb25	rather than the radius :math:`\Delta_k` directly. For each radius
Packit	67cb25	:math:`\Delta_k`, there is a unique parameter :math:`\mu_k` which
Packit	67cb25	`solves the TRS, and they have an inverse relationship, so that large values of`
Packit	67cb25	:math:`\mu_k` correspond to smaller trust regions, while small
Packit	67cb25	values of :math:`\mu_k` correspond to larger trust regions.
Packit	67cb25
Packit	67cb25	With the approximation :math:`B_k \approx J_k^T J_k`, on each iteration,
Packit	67cb25	in order to calculate the step :math:`\delta_k`,
Packit	67cb25	`the following linear least squares problem is solved:`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\left[`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`J_k \\`
Packit	67cb25	`\sqrt{\mu_k} D_k`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right]`
Packit	67cb25	`\delta_k =`
Packit	67cb25	`-`
Packit	67cb25	`\left[`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`f_k \cr`
Packit	67cb25	`0`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right]`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`[J_k; sqrt(mu_k) D_k] \delta_k = - [f_k; 0]`
Packit	67cb25
Packit	67cb25	If the step :math:`\delta_k` is accepted, then
Packit	67cb25	:math:`\mu_k` is decreased on the next iteration in order
Packit	67cb25	`to take a larger step, otherwise it is increased to take`
Packit	67cb25	`a smaller step. The Levenberg-Marquardt algorithm provides`
Packit	67cb25	`an exact solution of the trust region subproblem, but`
Packit	67cb25	`typically has a higher computational cost per iteration`
Packit	67cb25	`than the approximate methods discussed below, since it`
Packit	67cb25	`may need to solve the least squares system above several`
Packit	67cb25	times for different values of :math:`\mu_k`.
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: Levenberg-Marquardt algorithm, geodesic acceleration`
Packit	67cb25	`single: nonlinear least squares, levenberg-marquardt, geodesic acceleration`
Packit	67cb25
Packit	67cb25	`Levenberg-Marquardt with Geodesic Acceleration`
Packit	67cb25	`----------------------------------------------`
Packit	67cb25
Packit	67cb25	`This method applies a so-called geodesic acceleration correction to`
Packit	67cb25	the standard Levenberg-Marquardt step :math:`\delta_k` (Transtrum et al, 2011).
Packit	67cb25	By interpreting :math:`\delta_k` as a first order step along a geodesic in the
Packit	67cb25	model parameter space (ie: a velocity :math:`\delta_k = v_k`), the geodesic
Packit	67cb25	acceleration :math:`a_k` is a second order correction along the
Packit	67cb25	`geodesic which is determined by solving the linear least squares system`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\left[`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`J_k \\`
Packit	67cb25	`\sqrt{\mu_k} D_k`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right]`
Packit	67cb25	`a_k =`
Packit	67cb25	`-`
Packit	67cb25	`\left[`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`f_{vv}(x_k) \\`
Packit	67cb25	`0`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right]`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`[J_k; sqrt(mu_k) D_k] a_k = - [f_vv(x_k); 0]`
Packit	67cb25
Packit	67cb25	where :math:`f_{vv}` is the second directional derivative of
Packit	67cb25	the residual vector in the velocity direction :math:`v`,
Packit	67cb25	:math:`f_{vv}(x) = D_v^2 f = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f(x)`,
Packit	67cb25	where :math:`\alpha` and :math:`\beta` are summed over the :math:`p`
Packit	67cb25	parameters. The new total step is then :math:`\delta_k' = v_k + {1 \over 2}a_k`.
Packit	67cb25	The second order correction :math:`a_k` can be calculated with a modest additional
Packit	67cb25	`cost, and has been shown to dramatically reduce the number of iterations`
Packit	67cb25	`(and expensive Jacobian evaluations) required to reach convergence on a variety`
Packit	67cb25	`of different problems. In order to utilize the geodesic acceleration, the user must supply a`
Packit	67cb25	`function which provides the second directional derivative vector`
Packit	67cb25	:math:`f_{vv}(x)`, or alternatively the library can use a finite
Packit	67cb25	`difference method to estimate this vector with one additional function`
Packit	67cb25	evaluation of :math:`f(x + h v)` where :math:`h` is a tunable step size
Packit	67cb25	(see the :code:`h_fvv` parameter description).
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: Dogleg algorithm`
Packit	67cb25	`single: nonlinear least squares, dogleg`
Packit	67cb25
Packit	67cb25	`Dogleg`
Packit	67cb25	`------`
Packit	67cb25
Packit	67cb25	`This is Powell's dogleg method, which finds an approximate`
Packit	67cb25	`solution to the trust region subproblem, by restricting`
Packit	67cb25	`its search to a piecewise linear "dogleg" path,`
Packit	67cb25	`composed of the origin, the Cauchy point which represents`
Packit	67cb25	`the model minimizer along the steepest descent direction,`
Packit	67cb25	`and the Gauss-Newton point, which is the overall minimizer`
Packit	67cb25	`of the unconstrained model. The Gauss-Newton step is calculated by`
Packit	67cb25	`solving`
Packit	67cb25
Packit	67cb25	`.. math:: J_k \delta_{gn} = -f_k`
Packit	67cb25
Packit	67cb25	`which is the main computational task for each iteration,`
Packit	67cb25	`but only needs to be performed once per iteration. If`
Packit	67cb25	`the Gauss-Newton point is inside the trust region, it is`
Packit	67cb25	`selected as the step. If it is outside, the method then`
Packit	67cb25	`calculates the Cauchy point, which is located along the`
Packit	67cb25	`gradient direction. If the Cauchy point is also outside`
Packit	67cb25	`the trust region, the method assumes that it is still far`
Packit	67cb25	`from the minimum and so proceeds along the gradient`
Packit	67cb25	`direction, truncating the step at the trust region`
Packit	67cb25	`boundary. If the Cauchy point is inside the trust region,`
Packit	67cb25	`with the Gauss-Newton point outside, the method`
Packit	67cb25	`uses a dogleg step, which is a linear combination of the`
Packit	67cb25	`gradient direction and the Gauss-Newton direction, stopping at the trust`
Packit	67cb25	`region boundary.`
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: double Dogleg algorithm`
Packit	67cb25	`single: Dogleg algorithm, double`
Packit	67cb25	`single: nonlinear least squares, double dogleg`
Packit	67cb25
Packit	67cb25	`Double Dogleg`
Packit	67cb25	`-------------`
Packit	67cb25
Packit	67cb25	`This method is an improvement over the classical dogleg`
Packit	67cb25	`algorithm, which attempts to include information about`
Packit	67cb25	`the Gauss-Newton step while the iteration is still far from`
Packit	67cb25	`the minimum. When the Cauchy point is inside the trust region`
Packit	67cb25	`and the Gauss-Newton point is outside, the method computes`
Packit	67cb25	`a scaled Gauss-Newton point and then takes a dogleg step`
Packit	67cb25	`between the Cauchy point and the scaled Gauss-Newton point.`
Packit	67cb25	`The scaling is calculated to ensure that the reduction`
Packit	67cb25	in the model :math:`m_k` is about the same as the reduction
Packit	67cb25	`provided by the Cauchy point.`
Packit	67cb25
Packit	67cb25	`Two Dimensional Subspace`
Packit	67cb25	`------------------------`
Packit	67cb25
Packit	67cb25	`The dogleg methods restrict the search for the TRS solution`
Packit	67cb25	`to a 1D curve defined by the Cauchy and Gauss-Newton points.`
Packit	67cb25	`An improvement to this is to search for a solution using`
Packit	67cb25	`the full two dimensional subspace spanned by the Cauchy`
Packit	67cb25	`and Gauss-Newton directions. The dogleg path is of course`
Packit	67cb25	`inside this subspace, and so this method solves the TRS`
Packit	67cb25	`at least as accurately as the dogleg methods. Since this`
Packit	67cb25	`method searches a larger subspace for a solution, it can`
Packit	67cb25	`converge more quickly than dogleg on some problems. Because`
Packit	67cb25	`the subspace is only two dimensional, this method is`
Packit	67cb25	`very efficient and the main computation per iteration is`
Packit	67cb25	`to determine the Gauss-Newton point.`
Packit	67cb25
Packit	67cb25	`Steihaug-Toint Conjugate Gradient`
Packit	67cb25	`---------------------------------`
Packit	67cb25
Packit	67cb25	`One difficulty of the dogleg methods is calculating the`
Packit	67cb25	`Gauss-Newton step when the Jacobian matrix is singular. The`
Packit	67cb25	`Steihaug-Toint method also computes a generalized dogleg`
Packit	67cb25	`step, but avoids solving for the Gauss-Newton step directly,`
Packit	67cb25	`instead using an iterative conjugate gradient algorithm. This`
Packit	67cb25	`method performs well at points where the Jacobian is singular,`
Packit	67cb25	`and is also suitable for large-scale problems where factoring`
Packit	67cb25	`the Jacobian matrix could be prohibitively expensive.`
Packit	67cb25
Packit	67cb25	`Weighted Nonlinear Least-Squares`
Packit	67cb25	`================================`
Packit	67cb25
Packit	67cb25	`Weighted nonlinear least-squares fitting minimizes the function`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= {1 \over 2} \|\| f \|\|_W^2 \\`
Packit	67cb25	`&= {1 \over 2} \sum_{i=1}^{n} w_i f_i (x_1, \dots, x_p)^2`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\Phi(x) = (1/2) \|\| f(x) \|\|_W^2`
Packit	67cb25	`= (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2`
Packit	67cb25
Packit	67cb25	where :math:`W = \diag(w_1,w_2,...,w_n)` is the weighting matrix,
Packit	67cb25	and :math:`\|\|f\|\|_W^2 = f^T W f`.
Packit	67cb25	The weights :math:`w_i` are commonly defined as :math:`w_i = 1/\sigma_i^2`,
Packit	67cb25	where :math:`\sigma_i` is the error in the :math:`i`-th measurement.
Packit	67cb25	A simple change of variables :math:`\tilde{f} = W^{1 \over 2} f` yields
Packit	67cb25	:math:`\Phi(x) = {1 \over 2} \|\|\tilde{f}\|\|^2`, which is in the
Packit	67cb25	`same form as the unweighted case. The user can either perform this`
Packit	67cb25	`transform directly on their function residuals and Jacobian, or use`
Packit	67cb25	the :func:`gsl_multifit_nlinear_winit` interface which automatically
Packit	67cb25	`performs the correct scaling. To manually perform this transformation,`
Packit	67cb25	`the residuals and Jacobian should be modified according to`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\tilde{f}_i & = \sqrt{w_i} f_i = {f_i \over \sigma_i} \\`
Packit	67cb25	`\tilde{J}_{ij} & = \sqrt{w_i} { \partial f_i \over \partial x_j } = { 1 \over \sigma_i} { \partial f_i \over \partial x_j }`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`f~_i = f_i / \sigma_i`
Packit	67cb25	`J~_ij = 1 / \sigma_i df_i/dx_j`
Packit	67cb25
Packit	67cb25	`For large systems, the user must perform their own weighting.`
Packit	67cb25
Packit	67cb25	`.. _sec_tunable-parameters:`
Packit	67cb25
Packit	67cb25	`Tunable Parameters`
Packit	67cb25	`==================`
Packit	67cb25
Packit	67cb25	`The user can tune nearly all aspects of the iteration at allocation`
Packit	67cb25	time. For the :code:`gsl_multifit_nlinear` interface, the user may
Packit	67cb25	modify the :type:`gsl_multifit_nlinear_parameters` structure, which is
Packit	67cb25	`defined as follows:`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_parameters`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`typedef struct`
Packit	67cb25	`{`
Packit	67cb25	`const gsl_multifit_nlinear_trs trs; / trust region subproblem method */`
Packit	67cb25	`const gsl_multifit_nlinear_scale scale; / scaling method */`
Packit	67cb25	`const gsl_multifit_nlinear_solver solver; / solver method */`
Packit	67cb25	`gsl_multifit_nlinear_fdtype fdtype; /* finite difference method */`
Packit	67cb25	`double factor_up; /* factor for increasing trust radius */`
Packit	67cb25	`double factor_down; /* factor for decreasing trust radius */`
Packit	67cb25	`double avmax; /* max allowed \|a\|/\|v\| */`
Packit	67cb25	`double h_df; /* step size for finite difference Jacobian */`
Packit	67cb25	`double h_fvv; /* step size for finite difference fvv */`
Packit	67cb25	`} gsl_multifit_nlinear_parameters;`
Packit	67cb25
Packit	67cb25	For the :code:`gsl_multilarge_nlinear` interface, the user may
Packit	67cb25	modify the :type:`gsl_multilarge_nlinear_parameters` structure, which is
Packit	67cb25	`defined as follows:`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multilarge_nlinear_parameters`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`typedef struct`
Packit	67cb25	`{`
Packit	67cb25	`const gsl_multilarge_nlinear_trs trs; / trust region subproblem method */`
Packit	67cb25	`const gsl_multilarge_nlinear_scale scale; / scaling method */`
Packit	67cb25	`const gsl_multilarge_nlinear_solver solver; / solver method */`
Packit	67cb25	`gsl_multilarge_nlinear_fdtype fdtype; /* finite difference method */`
Packit	67cb25	`double factor_up; /* factor for increasing trust radius */`
Packit	67cb25	`double factor_down; /* factor for decreasing trust radius */`
Packit	67cb25	`double avmax; /* max allowed \|a\|/\|v\| */`
Packit	67cb25	`double h_df; /* step size for finite difference Jacobian */`
Packit	67cb25	`double h_fvv; /* step size for finite difference fvv */`
Packit	67cb25	`size_t max_iter; /* maximum iterations for trs method */`
Packit	67cb25	`double tol; /* tolerance for solving trs */`
Packit	67cb25	`} gsl_multilarge_nlinear_parameters;`
Packit	67cb25
Packit	67cb25	`Each of these parameters is discussed in further detail below.`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_trs`
Packit	67cb25	`gsl_multilarge_nlinear_trs`
Packit	67cb25
Packit	67cb25	The parameter :data:`trs` determines the method used to solve the trust region
Packit	67cb25	`subproblem, and may be selected from the following choices,`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_trs_lm`
Packit	67cb25	`gsl_multilarge_nlinear_trs_lm`
Packit	67cb25
Packit	67cb25	`This selects the Levenberg-Marquardt algorithm.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_trs_lmaccel`
Packit	67cb25	`gsl_multilarge_nlinear_trs_lmaccel`
Packit	67cb25
Packit	67cb25	`This selects the Levenberg-Marquardt algorithm with geodesic`
Packit	67cb25	`acceleration.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_trs_dogleg`
Packit	67cb25	`gsl_multilarge_nlinear_trs_dogleg`
Packit	67cb25
Packit	67cb25	`This selects the dogleg algorithm.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_trs_ddogleg`
Packit	67cb25	`gsl_multilarge_nlinear_trs_ddogleg`
Packit	67cb25
Packit	67cb25	`This selects the double dogleg algorithm.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_trs_subspace2D`
Packit	67cb25	`gsl_multilarge_nlinear_trs_subspace2D`
Packit	67cb25
Packit	67cb25	`This selects the 2D subspace algorithm.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multilarge_nlinear_trs_cgst`
Packit	67cb25
Packit	67cb25	`This selects the Steihaug-Toint conjugate gradient algorithm. This`
Packit	67cb25	`method is available only for large systems.`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_scale`
Packit	67cb25	`gsl_multilarge_nlinear_scale`
Packit	67cb25
Packit	67cb25	The parameter :data:`scale` determines the diagonal scaling matrix :math:`D` and
Packit	67cb25	`may be selected from the following choices,`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_scale_more`
Packit	67cb25	`gsl_multilarge_nlinear_scale_more`
Packit	67cb25
Packit	67cb25	`This damping strategy was suggested by \|More\|, and`
Packit	67cb25	corresponds to :math:`D^T D = \max(\diag(J^T J))`,
Packit	67cb25	`in other words the maximum elements of`
Packit	67cb25	:math:`\diag(J^T J)` encountered thus far in the iteration.
Packit	67cb25	This choice of :math:`D` makes the problem scale-invariant,
Packit	67cb25	so that if the model parameters :math:`x_i` are each scaled
Packit	67cb25	by an arbitrary constant, :math:`\tilde{x}_i = a_i x_i`, then
Packit	67cb25	`the sequence of iterates produced by the algorithm would`
Packit	67cb25	`be unchanged. This method can work very well in cases`
Packit	67cb25	`where the model parameters have widely different scales`
Packit	67cb25	`(ie: if some parameters are measured in nanometers, while others`
Packit	67cb25	`are measured in degrees Kelvin). This strategy has been proven`
Packit	67cb25	`effective on a large class of problems and so it is the library`
Packit	67cb25	`default, but it may not be the best choice for all problems.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_scale_levenberg`
Packit	67cb25	`gsl_multilarge_nlinear_scale_levenberg`
Packit	67cb25
Packit	67cb25	`This damping strategy was originally suggested by Levenberg, and`
Packit	67cb25	corresponds to :math:`D^T D = I`. This method has also proven
Packit	67cb25	`effective on a large class of problems, but is not scale-invariant.`
Packit	67cb25	`However, some authors (e.g. Transtrum and Sethna 2012) argue`
Packit	67cb25	`that this choice is better for problems which are susceptible`
Packit	67cb25	`to parameter evaporation (ie: parameters go to infinity)`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_scale_marquardt`
Packit	67cb25	`gsl_multilarge_nlinear_scale_marquardt`
Packit	67cb25
Packit	67cb25	`This damping strategy was suggested by Marquardt, and`
Packit	67cb25	corresponds to :math:`D^T D = \diag(J^T J)`. This
Packit	67cb25	`method is scale-invariant, but it is generally considered`
Packit	67cb25	`inferior to both the Levenberg and \|More\| strategies, though`
Packit	67cb25	`may work well on certain classes of problems.`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_solver`
Packit	67cb25	`gsl_multilarge_nlinear_solver`
Packit	67cb25
Packit	67cb25	`Solving the trust region subproblem on each iteration almost always`
Packit	67cb25	`requires the solution of the following linear least squares system`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\left[`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`J \\`
Packit	67cb25	`\sqrt{\mu} D`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right]`
Packit	67cb25	`\delta =`
Packit	67cb25	`-`
Packit	67cb25	`\left[`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`f \\`
Packit	67cb25	`0`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right]`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`[J; sqrt(mu) D] \delta = - [f; 0]`
Packit	67cb25
Packit	67cb25	The :data:`solver` parameter determines how the system is
Packit	67cb25	`solved and can be selected from the following choices:`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_solver_qr`
Packit	67cb25
Packit	67cb25	`This method solves the system using a rank revealing QR`
Packit	67cb25	decomposition of the Jacobian :math:`J`. This method will
Packit	67cb25	`produce reliable solutions in cases where the Jacobian`
Packit	67cb25	`is rank deficient or near-singular but does require about`
Packit	67cb25	`twice as many operations as the Cholesky method discussed`
Packit	67cb25	`below.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_solver_cholesky`
Packit	67cb25	`gsl_multilarge_nlinear_solver_cholesky`
Packit	67cb25
Packit	67cb25	`This method solves the alternate normal equations problem`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \left( J^T J + \mu D^T D \right) \delta = -J^T f`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`( J^T J + \mu D^T D ) \delta = -J^T f`
Packit	67cb25
Packit	67cb25	`by using a Cholesky decomposition of the matrix`
Packit	67cb25	:math:`J^T J + \mu D^T D`. This method is faster than the
Packit	67cb25	`QR approach, however it is susceptible to numerical instabilities`
Packit	67cb25	`if the Jacobian matrix is rank deficient or near-singular. In`
Packit	67cb25	`these cases, an attempt is made to reduce the condition number`
Packit	67cb25	`of the matrix using Jacobi preconditioning, but for highly`
Packit	67cb25	`ill-conditioned problems the QR approach is better. If it is`
Packit	67cb25	`known that the Jacobian matrix is well conditioned, this method`
Packit	67cb25	`is accurate and will perform faster than the QR approach.`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_solver_svd`
Packit	67cb25
Packit	67cb25	`This method solves the system using a singular value`
Packit	67cb25	decomposition of the Jacobian :math:`J`. This method will
Packit	67cb25	`produce the most reliable solutions for ill-conditioned Jacobians`
Packit	67cb25	`but is also the slowest solver method.`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_fdtype`
Packit	67cb25
Packit	67cb25	The parameter :data:`fdtype` specifies whether to use forward or centered
Packit	67cb25	`differences when approximating the Jacobian. This is only`
Packit	67cb25	`used when an analytic Jacobian is not provided to the solver.`
Packit	67cb25	`This parameter may be set to one of the following choices.`
Packit	67cb25
Packit	67cb25	`.. macro:: GSL_MULTIFIT_NLINEAR_FWDIFF`
Packit	67cb25
Packit	67cb25	`This specifies a forward finite difference to approximate`
Packit	67cb25	`the Jacobian matrix. The Jacobian matrix will be calculated as`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: J_{ij} = {1 \over \Delta_j} \left( f_i(x + \Delta_j e_j) - f_i(x) \right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J_ij = 1 / \Delta_j ( f_i(x + \Delta_j e_j) - f_i(x) )`
Packit	67cb25
Packit	67cb25	where :math:`\Delta_j = h \|x_j\|` and :math:`e_j` is the standard
Packit	67cb25	:math:`j`-th Cartesian unit basis vector so that
Packit	67cb25	:math:`x + \Delta_j e_j` represents a small (forward) perturbation of
Packit	67cb25	the :math:`j`-th parameter by an amount :math:`\Delta_j`. The perturbation
Packit	67cb25	:math:`\Delta_j` is proportional to the current value :math:`\|x_j\|` which
Packit	67cb25	`helps to calculate an accurate Jacobian when the various parameters have`
Packit	67cb25	different scale sizes. The value of :math:`h` is specified by the :code:`h_df`
Packit	67cb25	parameter. The accuracy of this method is :math:`O(h)`, and evaluating this
Packit	67cb25	matrix requires an additional :math:`p` function evaluations.
Packit	67cb25
Packit	67cb25	`.. macro:: GSL_MULTIFIT_NLINEAR_CTRDIFF`
Packit	67cb25
Packit	67cb25	`This specifies a centered finite difference to approximate`
Packit	67cb25	`the Jacobian matrix. The Jacobian matrix will be calculated as`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: J_{ij} = {1 \over \Delta_j} \left( f_i(x + {1 \over 2} \Delta_j e_j) - f_i(x - {1 \over 2} \Delta_j e_j) \right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J_ij = 1 / \Delta_j ( f_i(x + 1/2 \Delta_j e_j) - f_i(x - 1/2 \Delta_j e_j) )`
Packit	67cb25
Packit	67cb25	See above for a description of :math:`\Delta_j`. The accuracy of this
Packit	67cb25	method is :math:`O(h^2)`, but evaluating this
Packit	67cb25	matrix requires an additional :math:`2p` function evaluations.
Packit	67cb25
Packit	67cb25	:code:`double factor_up`
Packit	67cb25
Packit	67cb25	`When a step is accepted, the trust region radius will be increased`
Packit	67cb25	by this factor. The default value is :math:`3`.
Packit	67cb25
Packit	67cb25	:code:`double factor_down`
Packit	67cb25
Packit	67cb25	`When a step is rejected, the trust region radius will be decreased`
Packit	67cb25	by this factor. The default value is :math:`2`.
Packit	67cb25
Packit	67cb25	:code:`double avmax`
Packit	67cb25
Packit	67cb25	`When using geodesic acceleration to solve a nonlinear least squares problem,`
Packit	67cb25	`an important parameter to monitor is the ratio of the acceleration term`
Packit	67cb25	`to the velocity term,`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: { \|\|a\|\| \over \|\|v\|\| }`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\|a\| / \|v\|`
Packit	67cb25
Packit	67cb25	`If this ratio is small, it means the acceleration correction`
Packit	67cb25	`is contributing very little to the step. This could be because`
Packit	67cb25	`the problem is not "nonlinear" enough to benefit from`
Packit	67cb25	the acceleration. If the ratio is large (:math:`> 1`) it
Packit	67cb25	`means that the acceleration is larger than the velocity,`
Packit	67cb25	`which shouldn't happen since the step represents a truncated`
Packit	67cb25	series and so the second order term :math:`a` should be smaller than
Packit	67cb25	the first order term :math:`v` to guarantee convergence.
Packit	67cb25	`Therefore any steps with a ratio larger than the parameter`
Packit	67cb25	:data:`avmax` are rejected. :data:`avmax` is set to 0.75 by default.
Packit	67cb25	`For problems which experience difficulty converging, this threshold`
Packit	67cb25	`could be lowered.`
Packit	67cb25
Packit	67cb25	:code:`double h_df`
Packit	67cb25
Packit	67cb25	`This parameter specifies the step size for approximating the`
Packit	67cb25	`Jacobian matrix with finite differences. It is set to`
Packit	67cb25	:math:`\sqrt{\epsilon}` by default, where :math:`\epsilon`
Packit	67cb25	is :macro:`GSL_DBL_EPSILON`.
Packit	67cb25
Packit	67cb25	:code:`double h_fvv`
Packit	67cb25
Packit	67cb25	`When using geodesic acceleration, the user must either supply`
Packit	67cb25	a function to calculate :math:`f_{vv}(x)` or the library
Packit	67cb25	`can estimate this second directional derivative using a finite`
Packit	67cb25	`difference method. When using finite differences, the library`
Packit	67cb25	must calculate :math:`f(x + h v)` where :math:`h` represents
Packit	67cb25	`a small step in the velocity direction. The parameter`
Packit	67cb25	:data:`h_fvv` defines this step size and is set to 0.02 by
Packit	67cb25	`default.`
Packit	67cb25
Packit	67cb25	`Initializing the Solver`
Packit	67cb25	`=======================`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_type`
Packit	67cb25
Packit	67cb25	`This structure specifies the type of algorithm which will be used`
Packit	67cb25	`to solve a nonlinear least squares problem. It may be selected from the`
Packit	67cb25	`following choices,`
Packit	67cb25
Packit	67cb25	`.. var:: gsl_multifit_nlinear_trust`
Packit	67cb25
Packit	67cb25	`This specifies a trust region method. It is currently the only implemented`
Packit	67cb25	`nonlinear least squares method.`
Packit	67cb25
Packit	67cb25	`.. function:: gsl_multifit_nlinear_workspace * gsl_multifit_nlinear_alloc (const gsl_multifit_nlinear_type * T, const gsl_multifit_nlinear_parameters * params, const size_t n, const size_t p)`
Packit	67cb25	`gsl_multilarge_nlinear_workspace * gsl_multilarge_nlinear_alloc (const gsl_multilarge_nlinear_type * T, const gsl_multilarge_nlinear_parameters * params, const size_t n, const size_t p)`
Packit	67cb25
Packit	67cb25	`These functions return a pointer to a newly allocated instance of a`
Packit	67cb25	derivative solver of type :data:`T` for :data:`n` observations and :data:`p`
Packit	67cb25	parameters. The :data:`params` input specifies a tunable set of
Packit	67cb25	`parameters which will affect important details in each iteration`
Packit	67cb25	`of the trust region subproblem algorithm. It is recommended to start`
Packit	67cb25	`with the suggested default parameters (see`
Packit	67cb25	:func:`gsl_multifit_nlinear_default_parameters` and
Packit	67cb25	:func:`gsl_multilarge_nlinear_default_parameters`) and then tune
Packit	67cb25	`the parameters once the code is working correctly. See`
Packit	67cb25	:ref:`sec_tunable-parameters`.
Packit	67cb25	`for descriptions of the various parameters.`
Packit	67cb25	`For example, the following code creates an instance of a`
Packit	67cb25	`Levenberg-Marquardt solver for 100 data points and 3 parameters,`
Packit	67cb25	`using suggested defaults::`
Packit	67cb25
Packit	67cb25	`const gsl_multifit_nlinear_type * T = gsl_multifit_nlinear_trust;`
Packit	67cb25	`gsl_multifit_nlinear_parameters params = gsl_multifit_nlinear_default_parameters();`
Packit	67cb25	`gsl_multifit_nlinear_workspace * w = gsl_multifit_nlinear_alloc (T, &params, 100, 3);`
Packit	67cb25
Packit	67cb25	The number of observations :data:`n` must be greater than or equal to
Packit	67cb25	parameters :data:`p`.
Packit	67cb25
Packit	67cb25	`If there is insufficient memory to create the solver then the function`
Packit	67cb25	`returns a null pointer and the error handler is invoked with an error`
Packit	67cb25	code of :macro:`GSL_ENOMEM`.
Packit	67cb25
Packit	67cb25	`.. function:: gsl_multifit_nlinear_parameters gsl_multifit_nlinear_default_parameters (void)`
Packit	67cb25	`gsl_multilarge_nlinear_parameters gsl_multilarge_nlinear_default_parameters (void)`
Packit	67cb25
Packit	67cb25	`These functions return a set of recommended default parameters`
Packit	67cb25	`for use in solving nonlinear least squares problems. The user`
Packit	67cb25	`can tune each parameter to improve the performance on their`
Packit	67cb25	particular problem, see :ref:`sec_tunable-parameters`.
Packit	67cb25
Packit	67cb25	`.. function:: int gsl_multifit_nlinear_init (const gsl_vector * x, gsl_multifit_nlinear_fdf * fdf, gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`int gsl_multifit_nlinear_winit (const gsl_vector * x, const gsl_vector * wts, gsl_multifit_nlinear_fdf * fdf, gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`int gsl_multilarge_nlinear_init (const gsl_vector * x, gsl_multilarge_nlinear_fdf * fdf, gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	These functions initialize, or reinitialize, an existing workspace :data:`w`
Packit	67cb25	to use the system :data:`fdf` and the initial guess
Packit	67cb25	:data:`x`. See :ref:`sec_providing-function-minimized`
Packit	67cb25	for a description of the :data:`fdf` structure.
Packit	67cb25
Packit	67cb25	Optionally, a weight vector :data:`wts` can be given to perform
Packit	67cb25	`a weighted nonlinear regression. Here, the weighting matrix is`
Packit	67cb25	:math:`W = \diag(w_1,w_2,...,w_n)`.
Packit	67cb25
Packit	67cb25	`.. function:: void gsl_multifit_nlinear_free (gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`void gsl_multilarge_nlinear_free (gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	These functions free all the memory associated with the workspace :data:`w`.
Packit	67cb25
Packit	67cb25	`.. function:: const char * gsl_multifit_nlinear_name (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`const char * gsl_multilarge_nlinear_name (const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	`These functions return a pointer to the name of the solver. For example::`
Packit	67cb25
Packit	67cb25	`printf ("w is a '%s' solver\n", gsl_multifit_nlinear_name (w));`
Packit	67cb25
Packit	67cb25	would print something like :code:`w is a 'trust-region' solver`.
Packit	67cb25
Packit	67cb25	`.. function:: const char * gsl_multifit_nlinear_trs_name (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`const char * gsl_multilarge_nlinear_trs_name (const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	`These functions return a pointer to the name of the trust region subproblem`
Packit	67cb25	`method. For example::`
Packit	67cb25
Packit	67cb25	`printf ("w is a '%s' solver\n", gsl_multifit_nlinear_trs_name (w));`
Packit	67cb25
Packit	67cb25	would print something like :code:`w is a 'levenberg-marquardt' solver`.
Packit	67cb25
Packit	67cb25	`.. _sec_providing-function-minimized:`
Packit	67cb25
Packit	67cb25	`Providing the Function to be Minimized`
Packit	67cb25	`======================================`
Packit	67cb25
Packit	67cb25	The user must provide :math:`n` functions of :math:`p` variables for the
Packit	67cb25	`minimization algorithm to operate on. In order to allow for`
Packit	67cb25	`arbitrary parameters the functions are defined by the following data`
Packit	67cb25	`types:`
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multifit_nlinear_fdf`
Packit	67cb25
Packit	67cb25	`This data type defines a general system of functions with arbitrary parameters,`
Packit	67cb25	`the corresponding Jacobian matrix of derivatives, and optionally the`
Packit	67cb25	`second directional derivative of the functions for geodesic acceleration.`
Packit	67cb25
Packit	67cb25	:code:`int (* f) (const gsl_vector * x, void * params, gsl_vector * f)`
Packit	67cb25
Packit	67cb25	This function should store the :math:`n` components of the vector
Packit	67cb25	:math:`f(x)` in :data:`f` for argument :data:`x` and arbitrary parameters :data:`params`,
Packit	67cb25	`returning an appropriate error code if the function cannot be computed.`
Packit	67cb25
Packit	67cb25	:code:`int (* df) (const gsl_vector * x, void * params, gsl_matrix * J)`
Packit	67cb25
Packit	67cb25	This function should store the :data:`n`-by-:data:`p` matrix result
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: J_{ij} = \partial f_i(x) / \partial x_j`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J_ij = d f_i(x) / d x_j`
Packit	67cb25
Packit	67cb25	in :data:`J` for argument :data:`x`
Packit	67cb25	and arbitrary parameters :data:`params`, returning an appropriate error code if the
Packit	67cb25	`matrix cannot be computed. If an analytic Jacobian is unavailable, or too expensive`
Packit	67cb25	to compute, this function pointer may be set to :code:`NULL`, in which
Packit	67cb25	`case the Jacobian will be internally computed using finite difference approximations`
Packit	67cb25	of the function :data:`f`.
Packit	67cb25
Packit	67cb25	:code:`int (* fvv) (const gsl_vector * x, const gsl_vector * v, void * params, gsl_vector * fvv)`
Packit	67cb25
Packit	67cb25	`When geodesic acceleration is enabled, this function should store the`
Packit	67cb25	:math:`n` components of the vector
Packit	67cb25	:math:`f_{vv}(x) = \sum_{\alpha\beta} v_{\alpha} v_{\beta} {\partial \over \partial x_{\alpha}} {\partial \over \partial x_{\beta}} f(x)`,
Packit	67cb25	`representing second directional derivatives of the function to be minimized,`
Packit	67cb25	into the output :data:`fvv`. The parameter vector is provided in :data:`x` and
Packit	67cb25	the velocity vector is provided in :data:`v`, both of which have :math:`p`
Packit	67cb25	components. The arbitrary parameters are given in :data:`params`. If
Packit	67cb25	analytic expressions for :math:`f_{vv}(x)` are unavailable or too difficult
Packit	67cb25	to compute, this function pointer may be set to :code:`NULL`, in which case
Packit	67cb25	:math:`f_{vv}(x)` will be computed internally using a finite difference
Packit	67cb25	`approximation.`
Packit	67cb25
Packit	67cb25	:code:`size_t n`
Packit	67cb25
Packit	67cb25	`the number of functions, i.e. the number of components of the`
Packit	67cb25	vector :data:`f`.
Packit	67cb25
Packit	67cb25	:code:`size_t p`
Packit	67cb25
Packit	67cb25	`the number of independent variables, i.e. the number of components of`
Packit	67cb25	the vector :data:`x`.
Packit	67cb25
Packit	67cb25	:code:`void * params`
Packit	67cb25
Packit	67cb25	`a pointer to the arbitrary parameters of the function.`
Packit	67cb25
Packit	67cb25	:code:`size_t nevalf`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	function evaluations and is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	:code:`size_t nevaldf`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	Jacobian evaluations and is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	:code:`size_t nevalfvv`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	:math:`f_{vv}(x)` evaluations and is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	`.. type:: gsl_multilarge_nlinear_fdf`
Packit	67cb25
Packit	67cb25	`This data type defines a general system of functions with arbitrary parameters,`
Packit	67cb25	a function to compute :math:`J u` or :math:`J^T u` for a given vector :math:`u`,
Packit	67cb25	the normal equations matrix :math:`J^T J`,
Packit	67cb25	`and optionally the second directional derivative of the functions for geodesic acceleration.`
Packit	67cb25
Packit	67cb25	:code:`int (* f) (const gsl_vector * x, void * params, gsl_vector * f)`
Packit	67cb25
Packit	67cb25	This function should store the :math:`n` components of the vector
Packit	67cb25	:math:`f(x)` in :data:`f` for argument :data:`x` and arbitrary parameters :data:`params`,
Packit	67cb25	`returning an appropriate error code if the function cannot be computed.`
Packit	67cb25
Packit	67cb25	:code:`int (* df) (CBLAS_TRANSPOSE_t TransJ, const gsl_vector * x, const gsl_vector * u, void * params, gsl_vector * v, gsl_matrix * JTJ)`
Packit	67cb25
Packit	67cb25	If :data:`TransJ` is equal to :code:`CblasNoTrans`, then this function should
Packit	67cb25	compute the matrix-vector product :math:`J u` and store the result in :data:`v`.
Packit	67cb25	If :data:`TransJ` is equal to :code:`CblasTrans`, then this function should
Packit	67cb25	compute the matrix-vector product :math:`J^T u` and store the result in :data:`v`.
Packit	67cb25	Additionally, the normal equations matrix :math:`J^T J` should be stored in the
Packit	67cb25	lower half of :data:`JTJ`. The input matrix :data:`JTJ` could be set to :code:`NULL`,
Packit	67cb25	`for example by iterative methods which do not require this matrix, so the user`
Packit	67cb25	`should check for this prior to constructing the matrix.`
Packit	67cb25	The input :data:`params` contains the arbitrary parameters.
Packit	67cb25
Packit	67cb25	:code:`int (* fvv) (const gsl_vector * x, const gsl_vector * v, void * params, gsl_vector * fvv)`
Packit	67cb25
Packit	67cb25	`When geodesic acceleration is enabled, this function should store the`
Packit	67cb25	:math:`n` components of the vector
Packit	67cb25	:math:`f_{vv}(x) = \sum_{\alpha\beta} v_{\alpha} v_{\beta} {\partial \over \partial x_{\alpha}} {\partial \over \partial x_{\beta}} f(x)`,
Packit	67cb25	`representing second directional derivatives of the function to be minimized,`
Packit	67cb25	into the output :data:`fvv`. The parameter vector is provided in :data:`x` and
Packit	67cb25	the velocity vector is provided in :data:`v`, both of which have :math:`p`
Packit	67cb25	components. The arbitrary parameters are given in :data:`params`. If
Packit	67cb25	analytic expressions for :math:`f_{vv}(x)` are unavailable or too difficult
Packit	67cb25	to compute, this function pointer may be set to :code:`NULL`, in which case
Packit	67cb25	:math:`f_{vv}(x)` will be computed internally using a finite difference
Packit	67cb25	`approximation.`
Packit	67cb25
Packit	67cb25	:code:`size_t n`
Packit	67cb25
Packit	67cb25	`the number of functions, i.e. the number of components of the`
Packit	67cb25	vector :data:`f`.
Packit	67cb25
Packit	67cb25	:code:`size_t p`
Packit	67cb25
Packit	67cb25	`the number of independent variables, i.e. the number of components of`
Packit	67cb25	the vector :data:`x`.
Packit	67cb25
Packit	67cb25	:code:`void * params`
Packit	67cb25
Packit	67cb25	`a pointer to the arbitrary parameters of the function.`
Packit	67cb25
Packit	67cb25	:code:`size_t nevalf`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	function evaluations and is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	:code:`size_t nevaldfu`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	Jacobian matrix-vector evaluations (:math:`J u` or :math:`J^T u`) and
Packit	67cb25	is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	:code:`size_t nevaldf2`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	:math:`J^T J` evaluations and is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	:code:`size_t nevalfvv`
Packit	67cb25
Packit	67cb25	`This does not need to be set by the user. It counts the number of`
Packit	67cb25	:math:`f_{vv}(x)` evaluations and is initialized by the :code:`_init` function.
Packit	67cb25
Packit	67cb25	`Note that when fitting a non-linear model against experimental data,`
Packit	67cb25	`the data is passed to the functions above using the`
Packit	67cb25	:data:`params` argument and the trial best-fit parameters through the
Packit	67cb25	:data:`x` argument.
Packit	67cb25
Packit	67cb25	`Iteration`
Packit	67cb25	`=========`
Packit	67cb25
Packit	67cb25	`The following functions drive the iteration of each algorithm. Each`
Packit	67cb25	`function performs one iteration of the trust region method and updates`
Packit	67cb25	`the state of the solver.`
Packit	67cb25
Packit	67cb25	`.. function:: int gsl_multifit_nlinear_iterate (gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`int gsl_multilarge_nlinear_iterate (gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	These functions perform a single iteration of the solver :data:`w`. If
Packit	67cb25	`the iteration encounters an unexpected problem then an error code will`
Packit	67cb25	`be returned. The solver workspace maintains a current estimate of the`
Packit	67cb25	`best-fit parameters at all times.`
Packit	67cb25
Packit	67cb25	The solver workspace :data:`w` contains the following entries, which can
Packit	67cb25	`be used to track the progress of the solution:`
Packit	67cb25
Packit	67cb25	:code:`gsl_vector * x`
Packit	67cb25
Packit	67cb25	The current position, length :math:`p`.
Packit	67cb25
Packit	67cb25	:code:`gsl_vector * f`
Packit	67cb25
Packit	67cb25	The function residual vector at the current position :math:`f(x)`, length
Packit	67cb25	:math:`n`.
Packit	67cb25
Packit	67cb25	:code:`gsl_matrix * J`
Packit	67cb25
Packit	67cb25	The Jacobian matrix at the current position :math:`J(x)`, size
Packit	67cb25	:math:`n`-by-:math:`p` (only for :code:`gsl_multifit_nlinear` interface).
Packit	67cb25
Packit	67cb25	:code:`gsl_vector * dx`
Packit	67cb25
Packit	67cb25	`The difference between the current position and the previous position,`
Packit	67cb25	i.e. the last step :math:`\delta`, taken as a vector, length :math:`p`.
Packit	67cb25
Packit	67cb25	`These quantities can be accessed with the following functions,`
Packit	67cb25
Packit	67cb25	`.. function:: gsl_vector * gsl_multifit_nlinear_position (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`gsl_vector * gsl_multilarge_nlinear_position (const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	These functions return the current position :math:`x` (i.e. best-fit
Packit	67cb25	parameters) of the solver :data:`w`.
Packit	67cb25
Packit	67cb25	`.. function:: gsl_vector * gsl_multifit_nlinear_residual (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`gsl_vector * gsl_multilarge_nlinear_residual (const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	These functions return the current residual vector :math:`f(x)` of the
Packit	67cb25	solver :data:`w`. For weighted systems, the residual vector includes the
Packit	67cb25	weighting factor :math:`\sqrt{W}`.
Packit	67cb25
Packit	67cb25	`.. function:: gsl_matrix * gsl_multifit_nlinear_jac (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	This function returns a pointer to the :math:`n`-by-:math:`p` Jacobian matrix for the
Packit	67cb25	current iteration of the solver :data:`w`. This function is available only for the
Packit	67cb25	:code:`gsl_multifit_nlinear` interface.
Packit	67cb25
Packit	67cb25	`.. function:: size_t gsl_multifit_nlinear_niter (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`size_t gsl_multilarge_nlinear_niter (const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	`These functions return the number of iterations performed so far.`
Packit	67cb25	`The iteration counter is updated on each call to the`
Packit	67cb25	:code:`_iterate` functions above, and reset to 0 in the
Packit	67cb25	:code:`_init` functions.
Packit	67cb25
Packit	67cb25	`.. function:: int gsl_multifit_nlinear_rcond (double * rcond, const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`int gsl_multilarge_nlinear_rcond (double * rcond, const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	`This function estimates the reciprocal condition number`
Packit	67cb25	of the Jacobian matrix at the current position :math:`x` and
Packit	67cb25	stores it in :data:`rcond`. The computed value is only an estimate
Packit	67cb25	`to give the user a guideline as to the conditioning of their particular`
Packit	67cb25	`problem. Its calculation is based on which factorization`
Packit	67cb25	`method is used (Cholesky, QR, or SVD).`
Packit	67cb25
Packit	67cb25	* For the Cholesky solver, the matrix :math:`J^T J` is factored at each
Packit	67cb25	`iteration. Therefore this function will estimate the 1-norm condition number`
Packit	67cb25	:math:`rcond^2 = 1/(\|\|J^T J\|\|_1 \cdot \|\|(J^T J)^{-1}\|\|_1)`
Packit	67cb25
Packit	67cb25	* For the QR solver, :math:`J` is factored as :math:`J = Q R` at each
Packit	67cb25	`iteration. For simplicity, this function calculates the 1-norm conditioning of`
Packit	67cb25	only the :math:`R` factor, :math:`rcond = 1 / (\|\|R\|\|_1 \cdot \|\|R^{-1}\|\|_1)`.
Packit	67cb25	This can be computed efficiently since :math:`R` is upper triangular.
Packit	67cb25
Packit	67cb25	`* For the SVD solver, in order to efficiently solve the trust region`
Packit	67cb25	subproblem, the matrix which is factored is :math:`J D^{-1}`, instead of
Packit	67cb25	:math:`J` itself. The resulting singular values are used to provide
Packit	67cb25	the 2-norm reciprocal condition number, as :math:`rcond = \sigma_{min} / \sigma_{max}`.
Packit	67cb25	Note that when using \|More\| scaling, :math:`D \ne I` and the resulting
Packit	67cb25	:data:`rcond` estimate may be significantly different from the true
Packit	67cb25	:data:`rcond` of :math:`J` itself.
Packit	67cb25
Packit	67cb25	`.. function:: double gsl_multifit_nlinear_avratio (const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`double gsl_multilarge_nlinear_avratio (const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	This function returns the current ratio :math:`\|a\| / \|v\|` of the acceleration correction term to
Packit	67cb25	`the velocity step term. The acceleration term is computed only by the`
Packit	67cb25	:type:`gsl_multifit_nlinear_trs_lmaccel` and :type:`gsl_multilarge_nlinear_trs_lmaccel` methods, so
Packit	67cb25	`this ratio will be zero for other TRS methods.`
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: nonlinear fitting, stopping parameters, convergence`
Packit	67cb25
Packit	67cb25	`Testing for Convergence`
Packit	67cb25	`=======================`
Packit	67cb25
Packit	67cb25	`A minimization procedure should stop when one of the following conditions is`
Packit	67cb25	`true:`
Packit	67cb25
Packit	67cb25	`* A minimum has been found to within the user-specified precision.`
Packit	67cb25	`* A user-specified maximum number of iterations has been reached.`
Packit	67cb25	`* An error has occurred.`
Packit	67cb25
Packit	67cb25	`The handling of these conditions is under user control. The functions`
Packit	67cb25	`below allow the user to test the current estimate of the best-fit`
Packit	67cb25	`parameters in several standard ways.`
Packit	67cb25
Packit	67cb25	`.. function:: int gsl_multifit_nlinear_test (const double xtol, const double gtol, const double ftol, int * info, const gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`int gsl_multilarge_nlinear_test (const double xtol, const double gtol, const double ftol, int * info, const gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	`These functions test for convergence of the minimization method`
Packit	67cb25	`using the following criteria:`
Packit	67cb25
Packit	67cb25	`* Testing for a small step size relative to the current parameter vector`
Packit	67cb25
Packit	67cb25	`.. math:: \|\delta_i\| \le xtol (\|x_i\| + xtol)`
Packit	67cb25
Packit	67cb25	for each :math:`0 <= i < p`. Each element of the step vector :math:`\delta`
Packit	67cb25	`is tested individually in case the different parameters have widely`
Packit	67cb25	different scales. Adding :data:`xtol` to :math:`\|x_i\|` helps the test avoid
Packit	67cb25	breaking down in situations where the true solution value :math:`x_i = 0`.
Packit	67cb25	If this test succeeds, :data:`info` is set to 1 and the function
Packit	67cb25	returns :macro:`GSL_SUCCESS`.
Packit	67cb25
Packit	67cb25	`A general guideline for selecting the step tolerance is to choose`
Packit	67cb25	:math:`xtol = 10^{-d}` where :math:`d` is the number of accurate
Packit	67cb25	decimal digits desired in the solution :math:`x`. See Dennis and
Packit	67cb25	`Schnabel for more information.`
Packit	67cb25
Packit	67cb25	* Testing for a small gradient (:math:`g = \nabla \Phi(x) = J^T f`)
Packit	67cb25	`indicating a local function minimum:`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \max_i \|g_i \times \max(x_i, 1)\| \le gtol \times \max(\Phi(x), 1)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\|\|g\|\|_inf <= gtol`
Packit	67cb25
Packit	67cb25	`This expression tests whether the ratio`
Packit	67cb25	:math:`(\nabla \Phi)_i x_i / \Phi` is small. Testing this scaled gradient
Packit	67cb25	is a better than :math:`\nabla \Phi` alone since it is a dimensionless
Packit	67cb25	`quantity and so independent of the scale of the problem. The`
Packit	67cb25	:code:`max` arguments help ensure the test doesn't break down in
Packit	67cb25	regions where :math:`x_i` or :math:`\Phi(x)` are close to 0.
Packit	67cb25	If this test succeeds, :data:`info` is set to 2 and the function
Packit	67cb25	returns :macro:`GSL_SUCCESS`.
Packit	67cb25
Packit	67cb25	`A general guideline for choosing the gradient tolerance is to set`
Packit	67cb25	:code:`gtol = GSL_DBL_EPSILON^(1/3)`. See Dennis and Schnabel for
Packit	67cb25	`more information.`
Packit	67cb25
Packit	67cb25	If none of the tests succeed, :data:`info` is set to 0 and the
Packit	67cb25	function returns :macro:`GSL_CONTINUE`, indicating further iterations
Packit	67cb25	`are required.`
Packit	67cb25
Packit	67cb25	`High Level Driver`
Packit	67cb25	`=================`
Packit	67cb25
Packit	67cb25	`These routines provide a high level wrapper that combines the iteration`
Packit	67cb25	`and convergence testing for easy use.`
Packit	67cb25
Packit	67cb25	`.. function:: int gsl_multifit_nlinear_driver (const size_t maxiter, const double xtol, const double gtol, const double ftol, void (* callback)(const size_t iter, void * params, const gsl_multifit_linear_workspace * w), void * callback_params, int * info, gsl_multifit_nlinear_workspace * w)`
Packit	67cb25	`int gsl_multilarge_nlinear_driver (const size_t maxiter, const double xtol, const double gtol, const double ftol, void (* callback)(const size_t iter, void * params, const gsl_multilarge_linear_workspace * w), void * callback_params, int * info, gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	These functions iterate the nonlinear least squares solver :data:`w` for a
Packit	67cb25	maximum of :data:`maxiter` iterations. After each iteration, the system is
Packit	67cb25	tested for convergence with the error tolerances :data:`xtol`, :data:`gtol` and :data:`ftol`.
Packit	67cb25	Additionally, the user may supply a callback function :data:`callback`
Packit	67cb25	`which is called after each iteration, so that the user may save or print`
Packit	67cb25	relevant quantities for each iteration. The parameter :data:`callback_params`
Packit	67cb25	is passed to the :data:`callback` function. The parameters :data:`callback`
Packit	67cb25	and :data:`callback_params` may be set to :code:`NULL` to disable this feature.
Packit	67cb25	Upon successful convergence, the function returns :macro:`GSL_SUCCESS`
Packit	67cb25	and sets :data:`info` to the reason for convergence (see
Packit	67cb25	:func:`gsl_multifit_nlinear_test`). If the function has not
Packit	67cb25	converged after :data:`maxiter` iterations, :macro:`GSL_EMAXITER` is
Packit	67cb25	`returned. In rare cases, during an iteration the algorithm may`
Packit	67cb25	be unable to find a new acceptable step :math:`\delta` to take. In
Packit	67cb25	this case, :macro:`GSL_ENOPROG` is returned indicating no further
Packit	67cb25	`progress can be made. If your problem is having difficulty converging,`
Packit	67cb25	see :ref:`sec_nlinear-troubleshooting` for further guidance.
Packit	67cb25
Packit	67cb25	`.. index::`
Packit	67cb25	`single: best-fit parameters, covariance`
Packit	67cb25	`single: least squares, covariance of best-fit parameters`
Packit	67cb25	`single: covariance matrix, nonlinear fits`
Packit	67cb25
Packit	67cb25	`Covariance matrix of best fit parameters`
Packit	67cb25	`========================================`
Packit	67cb25
Packit	67cb25	`.. function:: int gsl_multifit_nlinear_covar (const gsl_matrix * J, const double epsrel, gsl_matrix * covar)`
Packit	67cb25	`int gsl_multilarge_nlinear_covar (gsl_matrix * covar, gsl_multilarge_nlinear_workspace * w)`
Packit	67cb25
Packit	67cb25	`This function computes the covariance matrix of best-fit parameters`
Packit	67cb25	using the Jacobian matrix :data:`J` and stores it in :data:`covar`.
Packit	67cb25	The parameter :data:`epsrel` is used to remove linear-dependent columns
Packit	67cb25	when :data:`J` is rank deficient.
Packit	67cb25
Packit	67cb25	`The covariance matrix is given by,`
Packit	67cb25
Packit	67cb25	`.. math:: C = (J^T J)^{-1}`
Packit	67cb25
Packit	67cb25	`or in the weighted case,`
Packit	67cb25
Packit	67cb25	`.. math:: C = (J^T W J)^{-1}`
Packit	67cb25
Packit	67cb25	`and is computed using the factored form of the Jacobian (Cholesky, QR, or SVD).`
Packit	67cb25	Any columns of :math:`R` which satisfy
Packit	67cb25
Packit	67cb25	`.. math:: \|R_{kk}\| \leq epsrel \|R_{11}\|`
Packit	67cb25
Packit	67cb25	`are considered linearly-dependent and are excluded from the covariance`
Packit	67cb25	`matrix (the corresponding rows and columns of the covariance matrix are`
Packit	67cb25	`set to zero).`
Packit	67cb25
Packit	67cb25	`If the minimisation uses the weighted least-squares function`
Packit	67cb25	:math:`f_i = (Y(x, t_i) - y_i) / \sigma_i` then the covariance
Packit	67cb25	`matrix above gives the statistical error on the best-fit parameters`
Packit	67cb25	resulting from the Gaussian errors :math:`\sigma_i` on
Packit	67cb25	the underlying data :math:`y_i`. This can be verified from the relation
Packit	67cb25	:math:`\delta f = J \delta c` and the fact that the fluctuations in :math:`f`
Packit	67cb25	from the data :math:`y_i` are normalised by :math:`\sigma_i` and
Packit	67cb25	`so satisfy`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: \langle \delta f \delta f^T \rangle = I`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`<\delta f \delta f^T> = I`
Packit	67cb25
Packit	67cb25	For an unweighted least-squares function :math:`f_i = (Y(x, t_i) - y_i)`
Packit	67cb25	`the covariance matrix above should be multiplied by the variance`
Packit	67cb25	of the residuals about the best-fit :math:`\sigma^2 = \sum (y_i - Y(x,t_i))^2 / (n-p)`
Packit	67cb25	`to give the variance-covariance`
Packit	67cb25	matrix :math:`\sigma^2 C`. This estimates the statistical error on the
Packit	67cb25	`best-fit parameters from the scatter of the underlying data.`
Packit	67cb25
Packit	67cb25	`For more information about covariance matrices see`
Packit	67cb25	:ref:`Linear Least-Squares Overview <sec_lls-overview>`.
Packit	67cb25
Packit	67cb25	`.. _sec_nlinear-troubleshooting:`
Packit	67cb25
Packit	67cb25	`Troubleshooting`
Packit	67cb25	`===============`
Packit	67cb25
Packit	67cb25	`When developing a code to solve a nonlinear least squares problem,`
Packit	67cb25	`here are a few considerations to keep in mind.`
Packit	67cb25
Packit	67cb25	`#. The most common difficulty is the accurate implementation of the Jacobian`
Packit	67cb25	`matrix. If the analytic Jacobian is not properly provided to the`
Packit	67cb25	`solver, this can hinder and many times prevent convergence of the method.`
Packit	67cb25	`When developing a new nonlinear least squares code, it often helps`
Packit	67cb25	`to compare the program output with the internally computed finite`
Packit	67cb25	`difference Jacobian and the user supplied analytic Jacobian. If there`
Packit	67cb25	`is a large difference in coefficients, it is likely the analytic`
Packit	67cb25	`Jacobian is incorrectly implemented.`
Packit	67cb25
Packit	67cb25	`#. If your code is having difficulty converging, the next thing to`
Packit	67cb25	`check is the starting point provided to the solver. The methods`
Packit	67cb25	`of this chapter are local methods, meaning if you provide a starting`
Packit	67cb25	`point far away from the true minimum, the method may converge to`
Packit	67cb25	`a local minimum or not converge at all. Sometimes it is possible`
Packit	67cb25	`to solve a linearized approximation to the nonlinear problem,`
Packit	67cb25	`and use the linear solution as the starting point to the nonlinear`
Packit	67cb25	`problem.`
Packit	67cb25
Packit	67cb25	#. If the various parameters of the coefficient vector :math:`x`
Packit	67cb25	`vary widely in magnitude, then the problem is said to be badly scaled.`
Packit	67cb25	`The methods of this chapter do attempt to automatically rescale`
Packit	67cb25	the elements of :math:`x` to have roughly the same order of magnitude,
Packit	67cb25	`but in extreme cases this could still cause problems for convergence.`
Packit	67cb25	`In these cases it is recommended for the user to scale their`
Packit	67cb25	parameter vector :math:`x` so that each parameter spans roughly the
Packit	67cb25	same range, say :math:`[-1,1]`. The solution vector can be backscaled
Packit	67cb25	`to recover the original units of the problem.`
Packit	67cb25
Packit	67cb25	`Examples`
Packit	67cb25	`========`
Packit	67cb25
Packit	67cb25	`The following example programs demonstrate the nonlinear least`
Packit	67cb25	`squares fitting capabilities.`
Packit	67cb25
Packit	67cb25	`Exponential Fitting Example`
Packit	67cb25	`---------------------------`
Packit	67cb25
Packit	67cb25	`The following example program fits a weighted exponential model with`
Packit	67cb25	background to experimental data, :math:`Y = A \exp(-\lambda t) + b`. The
Packit	67cb25	first part of the program sets up the functions :func:`expb_f` and
Packit	67cb25	:func:`expb_df` to calculate the model and its Jacobian. The appropriate
Packit	67cb25	`fitting function is given by,`
Packit	67cb25
Packit	67cb25	`.. math:: f_i = (A \exp(-\lambda t_i) + b) - y_i`
Packit	67cb25
Packit	67cb25	where we have chosen :math:`t_i = i T / (N - 1)`, where :math:`N` is the number
Packit	67cb25	of data points fitted, so that :math:`t_i \in [0, T]`. The Jacobian matrix :math:`J` is
Packit	67cb25	`the derivative of these functions with respect to the three parameters`
Packit	67cb25	(:math:`A`, :math:`\lambda`, :math:`b`). It is given by,
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: J_{ij} = {\partial f_i \over \partial x_j}`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J_{ij} = d f_i / d x_j`
Packit	67cb25
Packit	67cb25	where :math:`x_0 = A`, :math:`x_1 = \lambda` and :math:`x_2 = b`.
Packit	67cb25	The :math:`i`-th row of the Jacobian is therefore
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`J_{i\cdot} =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{ccc}`
Packit	67cb25	`\exp(-\lambda t_i) & -t_i A \exp(-\lambda t_i) & 1`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J(i,:) = [ \exp(-\lambda t_i) ; -t_i A \exp(-\lambda t_i) ; 1 ]`
Packit	67cb25
Packit	67cb25	`The main part of the program sets up a Levenberg-Marquardt solver and`
Packit	67cb25	`some simulated random data. The data uses the known parameters`
Packit	67cb25	`(5.0,0.1,1.0) combined with Gaussian noise (standard deviation = 0.1)`
Packit	67cb25	with a maximum time :math:`T = 40` and :math:`N = 100` timesteps.
Packit	67cb25	`The initial guess for the parameters is`
Packit	67cb25	`chosen as (1.0, 1.0, 0.0). The iteration terminates when the relative`
Packit	67cb25	change in x is smaller than :math:`10^{-8}`, or when the magnitude of
Packit	67cb25	the gradient falls below :math:`10^{-8}`. Here are the results of running
Packit	67cb25	`the program::`
Packit	67cb25
Packit	67cb25	`iter 0: A = 1.0000, lambda = 1.0000, b = 0.0000, cond(J) = inf, \|f(x)\| = 100.8779`
Packit	67cb25	`iter 1: A = 1.2692, lambda = 0.3924, b = 0.0443, cond(J) = 69.5973, \|f(x)\| = 97.1734`
Packit	67cb25	`iter 2: A = 1.6749, lambda = 0.1685, b = 0.1072, cond(J) = 29.5220, \|f(x)\| = 88.6636`
Packit	67cb25	`iter 3: A = 2.5579, lambda = 0.0544, b = 0.2552, cond(J) = 22.9334, \|f(x)\| = 42.7765`
Packit	67cb25	`iter 4: A = 3.0167, lambda = 0.0472, b = 0.3704, cond(J) = 120.2912, \|f(x)\| = 23.0102`
Packit	67cb25	`iter 5: A = 3.3590, lambda = 0.0455, b = 0.4321, cond(J) = 266.5620, \|f(x)\| = 16.0680`
Packit	67cb25	`iter 6: A = 3.6552, lambda = 0.0479, b = 0.4426, cond(J) = 343.8946, \|f(x)\| = 14.6421`
Packit	67cb25	`iter 7: A = 3.9546, lambda = 0.0532, b = 0.4897, cond(J) = 301.4985, \|f(x)\| = 13.5266`
Packit	67cb25	`iter 8: A = 4.1421, lambda = 0.0633, b = 0.6783, cond(J) = 203.9164, \|f(x)\| = 12.3149`
Packit	67cb25	`iter 9: A = 4.3752, lambda = 0.0800, b = 0.9228, cond(J) = 158.2267, \|f(x)\| = 11.2475`
Packit	67cb25	`iter 10: A = 4.6371, lambda = 0.0891, b = 0.9588, cond(J) = 136.6189, \|f(x)\| = 10.5457`
Packit	67cb25	`iter 11: A = 4.7684, lambda = 0.0937, b = 0.9860, cond(J) = 125.4740, \|f(x)\| = 10.4753`
Packit	67cb25	`iter 12: A = 4.7977, lambda = 0.0948, b = 0.9917, cond(J) = 120.1098, \|f(x)\| = 10.4723`
Packit	67cb25	`iter 13: A = 4.8006, lambda = 0.0949, b = 0.9924, cond(J) = 118.9113, \|f(x)\| = 10.4723`
Packit	67cb25	`iter 14: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7661, \|f(x)\| = 10.4723`
Packit	67cb25	`iter 15: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7550, \|f(x)\| = 10.4723`
Packit	67cb25	`iter 16: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, \|f(x)\| = 10.4723`
Packit	67cb25	`iter 17: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, \|f(x)\| = 10.4723`
Packit	67cb25	`iter 18: A = 4.8008, lambda = 0.0949, b = 0.9925, cond(J) = 118.7543, \|f(x)\| = 10.4723`
Packit	67cb25	`summary from method 'trust-region/levenberg-marquardt'`
Packit	67cb25	`number of iterations: 18`
Packit	67cb25	`function evaluations: 25`
Packit	67cb25	`Jacobian evaluations: 19`
Packit	67cb25	`reason for stopping: small step size`
Packit	67cb25	`initial \|f(x)\| = 100.877904`
Packit	67cb25	`final \|f(x)\| = 10.472268`
Packit	67cb25	`chisq/dof = 1.1306`
Packit	67cb25	`A = 4.80085 +/- 0.17652`
Packit	67cb25	`lambda = 0.09488 +/- 0.00527`
Packit	67cb25	`b = 0.99249 +/- 0.04419`
Packit	67cb25	`status = success`
Packit	67cb25
Packit	67cb25	`The approximate values of the parameters are found correctly, and the`
Packit	67cb25	`chi-squared value indicates a good fit (the chi-squared per degree of`
Packit	67cb25	`freedom is approximately 1). In this case the errors on the parameters`
Packit	67cb25	`can be estimated from the square roots of the diagonal elements of the`
Packit	67cb25	`covariance matrix. If the chi-squared value shows a poor fit (i.e.`
Packit	67cb25	:math:`\chi^2/(n-p) \gg 1`
Packit	67cb25	`then the error estimates obtained from the`
Packit	67cb25	`covariance matrix will be too small. In the example program the error estimates`
Packit	67cb25	are multiplied by :math:`\sqrt{\chi^2/(n-p)}`
Packit	67cb25	`in this case, a common way of increasing the`
Packit	67cb25	`errors for a poor fit. Note that a poor fit will result from the use`
Packit	67cb25	`of an inappropriate model, and the scaled error estimates may then`
Packit	67cb25	`be outside the range of validity for Gaussian errors.`
Packit	67cb25
Packit	67cb25	Additionally, we see that the condition number of :math:`J(x)` stays
Packit	67cb25	`reasonably small throughout the iteration. This indicates we could`
Packit	67cb25	`safely switch to the Cholesky solver for speed improvement,`
Packit	67cb25	`although this particular system is too small to really benefit.`
Packit	67cb25
Packit	67cb25	:numref:`fig_fit-exp` shows the fitted curve with the original data.
Packit	67cb25
Packit	67cb25	`.. _fig_fit-exp:`
Packit	67cb25
Packit	67cb25	`.. figure:: /images/fit-exp.png`
Packit	67cb25	`:scale: 60%`
Packit	67cb25
Packit	67cb25	`Exponential fitted curve with data`
Packit	67cb25
Packit	67cb25	`.. include:: examples/nlfit.c`
Packit	67cb25	`:code:`
Packit	67cb25
Packit	67cb25	`Geodesic Acceleration Example 1`
Packit	67cb25	`-------------------------------`
Packit	67cb25
Packit	67cb25	`The following example program minimizes a modified Rosenbrock function,`
Packit	67cb25	`which is characterized by a narrow canyon with steep walls. The`
Packit	67cb25	`starting point is selected high on the canyon wall, so the solver`
Packit	67cb25	`must first find the canyon bottom and then navigate to the minimum.`
Packit	67cb25	`The problem is solved both with and without using geodesic acceleration`
Packit	67cb25	`for comparison. The cost function is given by`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= {1 \over 2} (f_1^2 + f_2^2) \\`
Packit	67cb25	`f_1 &= 100 \left( x_2 - x_1^2 \right) \\`
Packit	67cb25	`f_2 &= 1 - x_1`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`Phi(x) = 1/2 (f1^2 + f2^2)`
Packit	67cb25	`f1 = 100 ( x2 - x1^2 )`
Packit	67cb25	`f2 = 1 - x1`
Packit	67cb25
Packit	67cb25	`The Jacobian matrix is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`J =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{cc}`
Packit	67cb25	`{\partial f_1 \over \partial x_1} & {\partial f_1 \over \partial x_2} \\`
Packit	67cb25	`{\partial f_2 \over \partial x_1} & {\partial f_2 \over \partial x_2}`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right) =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{cc}`
Packit	67cb25	`-200 x_1 & 100 \\`
Packit	67cb25	`-1 & 0`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J = [ -200*x1 100 ]`
Packit	67cb25	`[ -1 0 ]`
Packit	67cb25
Packit	67cb25	`In order to use geodesic acceleration, the user must provide`
Packit	67cb25	`the second directional derivative of each residual in the`
Packit	67cb25	`velocity direction,`
Packit	67cb25	:math:`D_v^2 f_i = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f_i`.
Packit	67cb25	The velocity vector :math:`v` is provided by the solver. For this example,
Packit	67cb25	`these derivatives are`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`f_{vv} =`
Packit	67cb25	`D_v^2`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`f_1 \\`
Packit	67cb25	`f_2`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right) =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`-200 v_1^2 \\`
Packit	67cb25	`0`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`fvv = [ -200 v1^2 ]`
Packit	67cb25	`[ 0 ]`
Packit	67cb25
Packit	67cb25	`The solution of this minimization problem is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`x^{*} &=`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`1 \\`
Packit	67cb25	`1`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right) \\`
Packit	67cb25	`\Phi(x^{*}) &= 0`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`x* = [ 1 ; 1 ]`
Packit	67cb25	`Phi(x*) = 0`
Packit	67cb25
Packit	67cb25	`The program output is shown below::`
Packit	67cb25
Packit	67cb25	`=== Solving system without acceleration ===`
Packit	67cb25	`NITER = 53`
Packit	67cb25	`NFEV = 56`
Packit	67cb25	`NJEV = 54`
Packit	67cb25	`NAEV = 0`
Packit	67cb25	`initial cost = 2.250225000000e+04`
Packit	67cb25	`final cost = 6.674986031430e-18`
Packit	67cb25	`final x = (9.999999974165e-01, 9.999999948328e-01)`
Packit	67cb25	`final cond(J) = 6.000096055094e+02`
Packit	67cb25	`=== Solving system with acceleration ===`
Packit	67cb25	`NITER = 15`
Packit	67cb25	`NFEV = 17`
Packit	67cb25	`NJEV = 16`
Packit	67cb25	`NAEV = 16`
Packit	67cb25	`initial cost = 2.250225000000e+04`
Packit	67cb25	`final cost = 7.518932873279e-19`
Packit	67cb25	`final x = (9.999999991329e-01, 9.999999982657e-01)`
Packit	67cb25	`final cond(J) = 6.000097233278e+02`
Packit	67cb25
Packit	67cb25	`.. _fig_nlfit2:`
Packit	67cb25
Packit	67cb25	`.. figure:: /images/nlfit2.png`
Packit	67cb25
Packit	67cb25	`Paths taken by solver for Rosenbrock function`
Packit	67cb25
Packit	67cb25	`We can see that enabling geodesic acceleration requires less`
Packit	67cb25	`than a third of the number of Jacobian evaluations in order to locate`
Packit	67cb25	the minimum. The path taken by both methods is shown in :numref:`fig_nlfit2`.
Packit	67cb25	`The contours show the cost function`
Packit	67cb25	:math:`\Phi(x_1,x_2)`. We see that both methods quickly
Packit	67cb25	`find the canyon bottom, but the geodesic acceleration method`
Packit	67cb25	`navigates along the bottom to the solution with significantly`
Packit	67cb25	`fewer iterations.`
Packit	67cb25
Packit	67cb25	`The program is given below.`
Packit	67cb25
Packit	67cb25	`.. include:: examples/nlfit2.c`
Packit	67cb25	`:code:`
Packit	67cb25
Packit	67cb25	`Geodesic Acceleration Example 2`
Packit	67cb25	`-------------------------------`
Packit	67cb25
Packit	67cb25	`The following example fits a set of data to a Gaussian model`
Packit	67cb25	`using the Levenberg-Marquardt method with geodesic acceleration.`
Packit	67cb25	`The cost function is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= {1 \over 2} \sum_i f_i^2 \\`
Packit	67cb25	`f_i &= y_i - Y(a,b,c;t_i)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`Phi(x) = 1/2 \sum_i f_i^2`
Packit	67cb25	`f_i = y_i - Y(a,b,c;t_i)`
Packit	67cb25
Packit	67cb25	where :math:`y_i` is the measured data point at time :math:`t_i`, and
Packit	67cb25	`the model is specified by`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`Y(a,b,c;t) = a \exp{`
Packit	67cb25	`\left[`
Packit	67cb25	`-{1 \over 2}`
Packit	67cb25	`\left(`
Packit	67cb25	`{ t - b \over c }`
Packit	67cb25	`\right)^2`
Packit	67cb25	`\right]`
Packit	67cb25	`}`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`Y(a,b,c;t) = a exp(-1/2 ((t-b)/c)^2)`
Packit	67cb25
Packit	67cb25	The parameters :math:`a,b,c` represent the amplitude, mean, and width of the Gaussian
Packit	67cb25	respectively. The program below generates the :math:`y_i` data on :math:`[0,1]` using
Packit	67cb25	the values :math:`a = 5`, :math:`b = 0.4`, :math:`c = 0.15` and adding random noise.
Packit	67cb25	The :math:`i`-th row of the Jacobian is
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`J_{i,:} =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{ccc}`
Packit	67cb25	`{\partial f_i \over \partial a} & {\partial f_i \over \partial b} & {\partial f_i \over \partial c}`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right) =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{ccc}`
Packit	67cb25	`-e_i & -{a \over c} z_i e_i & -{a \over c} z_i^2 e_i`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J(i,:) = ( -e_i -(a/c)z_ie_i -(a/c)z_i^2e_i )`
Packit	67cb25
Packit	67cb25	`where`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`z_i &= { t_i - b \over c} \\`
Packit	67cb25	`e_i &= \exp{\left( -{1 \over 2} z_i^2 \right)}`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`z_i = (t_i - b) / c`
Packit	67cb25	`e_i = \exp(-1/2 z_i^2)`
Packit	67cb25
Packit	67cb25	`In order to use geodesic acceleration, we need the second directional derivative`
Packit	67cb25	`of the residuals in the velocity direction,`
Packit	67cb25	:math:`D_v^2 f_i = \sum_{\alpha\beta} v_{\alpha} v_{\beta} \partial_{\alpha} \partial_{\beta} f_i`,
Packit	67cb25	where :math:`v` is provided by the solver. To compute this, it is helpful to make a table of
Packit	67cb25	all second derivatives of the residuals :math:`f_i` with respect to each combination of model parameters.
Packit	67cb25	`This table is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\begin{array}{cccc}`
Packit	67cb25	`& {\partial \over \partial a} & {\partial \over \partial b} & {\partial \over \partial c} \cr`
Packit	67cb25	`{\partial \over \partial a} & 0 & -{z_i \over c} e_i & -{z_i^2 \over c} e_i \cr`
Packit	67cb25	`{\partial \over \partial b} & & {a \over c^2} \left( 1 - z_i^2 \right) e_i & {a \over c^2} z_i \left( 2 - z_i^2 \right) e_i \cr`
Packit	67cb25	`{\partial \over \partial c} & & & {a \over c^2} z_i^2 \left( 3 - z_i^2 \right) e_i`
Packit	67cb25	`\end{array}`
Packit	67cb25
Packit	67cb25	`The lower half of the table is omitted since it is symmetric. Then, the second directional derivative`
Packit	67cb25	of :math:`f_i` is
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math:: D_v^2 f_i = v_a^2 \partial_a^2 f_i + 2 v_a v_b \partial_a \partial_b f_i + 2 v_a v_c \partial_a \partial_c f_i + v_b^2 \partial_b^2 f_i + 2 v_b v_c \partial_b \partial_c f_i + v_c^2 \partial_c^2 f_i`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`D_v^2 f_i = v_a^2 (d/da)^2 f_i + 2 v_a v_b (d/da) (d/db) f_i + 2 v_a v_c (d/da) (d/dc) f_i + v_b^2 (d/db)^2 f_i + 2 v_b v_c (d/db) (d/dc) f_i + v_c^2 (d/dc)^2 f_i`
Packit	67cb25
Packit	67cb25	`The factors of 2 come from the symmetry of the mixed second partial derivatives.`
Packit	67cb25	The iteration is started using the initial guess :math:`a = 1, b = 0, c = 1`.
Packit	67cb25	`The program output is shown below::`
Packit	67cb25
Packit	67cb25	`iter 0: a = 1.0000, b = 0.0000, c = 1.0000, \|a\|/\|v\| = 0.0000 cond(J) = inf, \|f(x)\| = 35.4785`
Packit	67cb25	`iter 1: a = 1.5708, b = 0.5321, c = 0.5219, \|a\|/\|v\| = 0.3093 cond(J) = 29.0443, \|f(x)\| = 31.1042`
Packit	67cb25	`iter 2: a = 1.7387, b = 0.4040, c = 0.4568, \|a\|/\|v\| = 0.1199 cond(J) = 3.5256, \|f(x)\| = 28.7217`
Packit	67cb25	`iter 3: a = 2.2340, b = 0.3829, c = 0.3053, \|a\|/\|v\| = 0.3308 cond(J) = 4.5121, \|f(x)\| = 23.8074`
Packit	67cb25	`iter 4: a = 3.2275, b = 0.3952, c = 0.2243, \|a\|/\|v\| = 0.2784 cond(J) = 8.6499, \|f(x)\| = 15.6003`
Packit	67cb25	`iter 5: a = 4.3347, b = 0.3974, c = 0.1752, \|a\|/\|v\| = 0.2029 cond(J) = 15.1732, \|f(x)\| = 7.5908`
Packit	67cb25	`iter 6: a = 4.9352, b = 0.3992, c = 0.1536, \|a\|/\|v\| = 0.1001 cond(J) = 26.6621, \|f(x)\| = 4.8402`
Packit	67cb25	`iter 7: a = 5.0716, b = 0.3994, c = 0.1498, \|a\|/\|v\| = 0.0166 cond(J) = 34.6922, \|f(x)\| = 4.7103`
Packit	67cb25	`iter 8: a = 5.0828, b = 0.3994, c = 0.1495, \|a\|/\|v\| = 0.0012 cond(J) = 36.5422, \|f(x)\| = 4.7095`
Packit	67cb25	`iter 9: a = 5.0831, b = 0.3994, c = 0.1495, \|a\|/\|v\| = 0.0000 cond(J) = 36.6929, \|f(x)\| = 4.7095`
Packit	67cb25	`iter 10: a = 5.0831, b = 0.3994, c = 0.1495, \|a\|/\|v\| = 0.0000 cond(J) = 36.6975, \|f(x)\| = 4.7095`
Packit	67cb25	`iter 11: a = 5.0831, b = 0.3994, c = 0.1495, \|a\|/\|v\| = 0.0000 cond(J) = 36.6976, \|f(x)\| = 4.7095`
Packit	67cb25	`NITER = 11`
Packit	67cb25	`NFEV = 18`
Packit	67cb25	`NJEV = 12`
Packit	67cb25	`NAEV = 17`
Packit	67cb25	`initial cost = 1.258724737288e+03`
Packit	67cb25	`final cost = 2.217977560180e+01`
Packit	67cb25	`final x = (5.083101559156e+00, 3.994484109594e-01, 1.494898e-01)`
Packit	67cb25	`final cond(J) = 3.669757713403e+01`
Packit	67cb25
Packit	67cb25	`We see the method converges after 11 iterations. For comparison the standard`
Packit	67cb25	`Levenberg-Marquardt method requires 26 iterations and so the Gaussian fitting`
Packit	67cb25	`problem benefits substantially from the geodesic acceleration correction. The`
Packit	67cb25	column marked :code:`\|a\|/\|v\|` above shows the ratio of the acceleration term
Packit	67cb25	`to the velocity term as the iteration progresses. Larger values of this`
Packit	67cb25	`ratio indicate that the geodesic acceleration correction term is contributing`
Packit	67cb25	`substantial information to the solver relative to the standard LM velocity step.`
Packit	67cb25
Packit	67cb25	The data and fitted model are shown in :numref:`fig_nlfit2b`.
Packit	67cb25
Packit	67cb25	`.. _fig_nlfit2b:`
Packit	67cb25
Packit	67cb25	`.. figure:: /images/nlfit2b.png`
Packit	67cb25
Packit	67cb25	`Gaussian model fitted to data`
Packit	67cb25
Packit	67cb25	`The program is given below.`
Packit	67cb25
Packit	67cb25	`.. include:: examples/nlfit2b.c`
Packit	67cb25	`:code:`
Packit	67cb25
Packit	67cb25	`Comparing TRS Methods Example`
Packit	67cb25	`-----------------------------`
Packit	67cb25
Packit	67cb25	`The following program compares all available nonlinear least squares`
Packit	67cb25	`trust-region subproblem (TRS) methods on the Branin function, a common`
Packit	67cb25	`optimization test problem. The cost function is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= {1 \over 2} (f_1^2 + f_2^2) \\`
Packit	67cb25	`f_1 &= x_2 + a_1 x_1^2 + a_2 x_1 + a_3 \\`
Packit	67cb25	`f_2 &= \sqrt{a_4} \sqrt{1 + (1 - a_5) \cos{x_1}}`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= 1/2 (f_1^2 + f_2^2)`
Packit	67cb25	`f_1 &= x_2 + a_1 x_1^2 + a_2 x_1 + a_3`
Packit	67cb25	`f_2 &= sqrt(a_4) sqrt(1 + (1 - a_5) cos(x_1))`
Packit	67cb25
Packit	67cb25	with :math:`a_1 = -{5.1 \over 4 \pi^2}, a_2 = {5 \over \pi}, a_3 = -6, a_4 = 10, a_5 = {1 \over 8\pi}`.
Packit	67cb25	`There are three minima of this function in the range`
Packit	67cb25	:math:`(x_1,x_2) \in [-5,15] \times [-5,15]`. The program
Packit	67cb25	below uses the starting point :math:`(x_1,x_2) = (6,14.5)`
Packit	67cb25	`and calculates the solution with all available nonlinear`
Packit	67cb25	`least squares TRS methods. The program output is shown below::`
Packit	67cb25
Packit	67cb25	`Method NITER NFEV NJEV Initial Cost Final cost Final cond(J) Final x`
Packit	67cb25	`levenberg-marquardt 20 27 21 1.9874e+02 3.9789e-01 6.1399e+07 (-3.14e+00, 1.23e+01)`
Packit	67cb25	`levenberg-marquardt+accel 27 36 28 1.9874e+02 3.9789e-01 1.4465e+07 (3.14e+00, 2.27e+00)`
Packit	67cb25	`dogleg 23 64 23 1.9874e+02 3.9789e-01 5.0692e+08 (3.14e+00, 2.28e+00)`
Packit	67cb25	`double-dogleg 24 69 24 1.9874e+02 3.9789e-01 3.4879e+07 (3.14e+00, 2.27e+00)`
Packit	67cb25	`2D-subspace 23 54 24 1.9874e+02 3.9789e-01 2.5142e+07 (3.14e+00, 2.27e+00)`
Packit	67cb25
Packit	67cb25	`The first row of output above corresponds to standard Levenberg-Marquardt, while`
Packit	67cb25	`the second row includes geodesic acceleration. We see that the standard LM method`
Packit	67cb25	converges to the minimum at :math:`(-\pi,12.275)` and also uses the least number
Packit	67cb25	`of iterations and Jacobian evaluations. All other methods converge to the minimum`
Packit	67cb25	:math:`(\pi,2.275)` and perform similarly in terms of number of Jacobian evaluations.
Packit	67cb25	We see that :math:`J` is fairly ill-conditioned
Packit	67cb25	`at both minima, indicating that the QR (or SVD) solver is the best choice for this problem.`
Packit	67cb25	`Since there are only two parameters in this optimization problem, we can easily`
Packit	67cb25	visualize the paths taken by each method, which are shown in :numref:`fig_nlfit3`.
Packit	67cb25	The figure shows contours of the cost function :math:`\Phi(x_1,x_2)` which exhibits
Packit	67cb25	three global minima in the range :math:`[-5,15] \times [-5,15]`. The paths taken
Packit	67cb25	`by each solver are shown as colored lines.`
Packit	67cb25
Packit	67cb25	`.. _fig_nlfit3:`
Packit	67cb25
Packit	67cb25	`.. figure:: /images/nlfit3.png`
Packit	67cb25	`:scale: 60%`
Packit	67cb25
Packit	67cb25	`Paths taken for different TRS methods for the Branin function`
Packit	67cb25
Packit	67cb25	`The program is given below.`
Packit	67cb25
Packit	67cb25	`.. include:: examples/nlfit3.c`
Packit	67cb25	`:code:`
Packit	67cb25
Packit	67cb25	`Large Nonlinear Least Squares Example`
Packit	67cb25	`-------------------------------------`
Packit	67cb25
Packit	67cb25	`The following program illustrates the large nonlinear least`
Packit	67cb25	`squares solvers on a system with significant sparse structure`
Packit	67cb25	`in the Jacobian. The cost function is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= {1 \over 2} \sum_{i=1}^{p+1} f_i^2 \\`
Packit	67cb25	`f_i &= \sqrt{\alpha} (x_i - 1), \quad 1 \le i \le p \\`
Packit	67cb25	`f_{p+1} &= \|\|x\|\|^2 - {1 \over 4}`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`\Phi(x) &= 1/2 \sum_{i=1}^{p+1} f_i^2`
Packit	67cb25	`f_i &= \sqrt{\alpha} (x_i - 1), 1 \le i \le p`
Packit	67cb25	`f_{p+1} &= \|\|x\|\|^2 - 1/4`
Packit	67cb25
Packit	67cb25	with :math:`\alpha = 10^{-5}`. The residual :math:`f_{p+1}` imposes a constraint on the :math:`p`
Packit	67cb25	parameters :math:`x`, to ensure that :math:`\|\|x\|\|^2 \approx {1 \over 4}`.
Packit	67cb25	The :math:`(p+1)`-by-:math:`p` Jacobian for this system is
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`J(x) =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`\sqrt{\alpha} I_p \\`
Packit	67cb25	`2 x^T`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`J(x) = [ \sqrt{alpha} I_p; 2 x^T ]`
Packit	67cb25
Packit	67cb25	`and the normal equations matrix is`
Packit	67cb25
Packit	67cb25	`.. math:: J^T J = \alpha I_p + 4 x x^T`
Packit	67cb25
Packit	67cb25	Finally, the second directional derivative of :math:`f` for the
Packit	67cb25	`geodesic acceleration method is`
Packit	67cb25
Packit	67cb25	`.. only:: not texinfo`
Packit	67cb25
Packit	67cb25	`.. math::`
Packit	67cb25
Packit	67cb25	`f_{vv} = D_v^2 f =`
Packit	67cb25	`\left(`
Packit	67cb25	`\begin{array}{c}`
Packit	67cb25	`0 \\`
Packit	67cb25	`2 \|\|v\|\|^2`
Packit	67cb25	`\end{array}`
Packit	67cb25	`\right)`
Packit	67cb25
Packit	67cb25	`.. only:: texinfo`
Packit	67cb25
Packit	67cb25	`::`
Packit	67cb25
Packit	67cb25	`fvv = [ 0 ]`
Packit	67cb25	`[ 2 \|\|v\|\|^2 ]`
Packit	67cb25
Packit	67cb25	Since the upper :math:`p`-by-:math:`p` block of :math:`J` is diagonal,
Packit	67cb25	`this sparse structure should be exploited in the nonlinear solver.`
Packit	67cb25	For comparison, the following program solves the system for :math:`p = 2000`
Packit	67cb25	`using the dense direct Cholesky solver based on the normal equations matrix`
Packit	67cb25	:math:`J^T J`, as well as the iterative Steihaug-Toint solver, based on
Packit	67cb25	sparse matrix-vector products :math:`J u` and :math:`J^T u`. The
Packit	67cb25	`program output is shown below::`
Packit	67cb25
Packit	67cb25	`Method NITER NFEV NJUEV NJTJEV NAEV Init Cost Final cost cond(J) Final \|x\|^2 Time (s)`
Packit	67cb25	`levenberg-marquardt 25 31 26 26 0 7.1218e+18 1.9555e-02 447.50 2.5044e-01 46.28`
Packit	67cb25	`levenberg-marquardt+accel 22 23 45 23 22 7.1218e+18 1.9555e-02 447.64 2.5044e-01 33.92`
Packit	67cb25	`dogleg 37 87 36 36 0 7.1218e+18 1.9555e-02 447.59 2.5044e-01 56.05`
Packit	67cb25	`double-dogleg 35 88 34 34 0 7.1218e+18 1.9555e-02 447.62 2.5044e-01 52.65`
Packit	67cb25	`2D-subspace 37 88 36 36 0 7.1218e+18 1.9555e-02 447.71 2.5044e-01 59.75`
Packit	67cb25	`steihaug-toint 35 88 345 0 0 7.1218e+18 1.9555e-02 inf 2.5044e-01 0.09`
Packit	67cb25
Packit	67cb25	The first five rows use methods based on factoring the dense :math:`J^T J` matrix
Packit	67cb25	`while the last row uses the iterative Steihaug-Toint method. While the number`
Packit	67cb25	`of Jacobian matrix-vector products (NJUEV) is less for the dense methods, the added time`
Packit	67cb25	to construct and factor the :math:`J^T J` matrix (NJTJEV) results in a much larger runtime than the
Packit	67cb25	`iterative method (see last column).`
Packit	67cb25
Packit	67cb25	`The program is given below.`
Packit	67cb25
Packit	67cb25	`.. include:: examples/nlfit4.c`
Packit	67cb25	`:code:`
Packit	67cb25
Packit	67cb25	`References and Further Reading`
Packit	67cb25	`==============================`
Packit	67cb25
Packit	67cb25	`The following publications are relevant to the algorithms described`
Packit	67cb25	`in this section,`
Packit	67cb25
Packit	67cb25	`* J.J. \|More\|, *The Levenberg-Marquardt Algorithm: Implementation and`
Packit	67cb25	`Theory*, Lecture Notes in Mathematics, v630 (1978), ed G. Watson.`
Packit	67cb25
Packit	67cb25	`* H. B. Nielsen, "Damping Parameter in Marquardt's Method",`
Packit	67cb25	`IMM Department of Mathematical Modeling, DTU, Tech. Report IMM-REP-1999-05`
Packit	67cb25	`(1999).`
Packit	67cb25
Packit	67cb25	`* K. Madsen and H. B. Nielsen, "Introduction to Optimization and Data`
Packit	67cb25	`Fitting", IMM Department of Mathematical Modeling, DTU, 2010.`
Packit	67cb25
Packit	67cb25	`* J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained`
Packit	67cb25	`Optimization and Nonlinear Equations, SIAM, 1996.`
Packit	67cb25
Packit	67cb25	`* M. K. Transtrum, B. B. Machta, and J. P. Sethna,`
Packit	67cb25	`Geometry of nonlinear least squares with applications to sloppy models and optimization,`
Packit	67cb25	`Phys. Rev. E 83, 036701, 2011.`
Packit	67cb25
Packit	67cb25	`* M. K. Transtrum and J. P. Sethna, Improvements to the Levenberg-Marquardt`
Packit	67cb25	`algorithm for nonlinear least-squares minimization, arXiv:1201.5885, 2012.`
Packit	67cb25
Packit	67cb25	`* J.J. \|More\|, B.S. Garbow, K.E. Hillstrom, "Testing Unconstrained`
Packit	67cb25	`Optimization Software", ACM Transactions on Mathematical Software, Vol`
Packit	67cb25	`7, No 1 (1981), p 17--41.`
Packit	67cb25
Packit	67cb25	`* H. B. Nielsen, "UCTP Test Problems for Unconstrained Optimization",`
Packit	67cb25	`IMM Department of Mathematical Modeling, DTU, Tech. Report IMM-REP-2000-17`
Packit	67cb25	`(2000).`

source-git / gsl

Source Code

Blame doc/nls.rst