|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: fitting
|
|
Packit |
67cb25 |
single: least squares fit
|
|
Packit |
67cb25 |
single: regression, least squares
|
|
Packit |
67cb25 |
single: weighted linear fits
|
|
Packit |
67cb25 |
single: unweighted linear fits
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
****************************
|
|
Packit |
67cb25 |
Linear Least-Squares Fitting
|
|
Packit |
67cb25 |
****************************
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This chapter describes routines for performing least squares fits to
|
|
Packit |
67cb25 |
experimental data using linear combinations of functions. The data
|
|
Packit |
67cb25 |
may be weighted or unweighted, i.e. with known or unknown errors. For
|
|
Packit |
67cb25 |
weighted data the functions compute the best fit parameters and their
|
|
Packit |
67cb25 |
associated covariance matrix. For unweighted data the covariance
|
|
Packit |
67cb25 |
matrix is estimated from the scatter of the points, giving a
|
|
Packit |
67cb25 |
variance-covariance matrix.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The functions are divided into separate versions for simple one- or
|
|
Packit |
67cb25 |
two-parameter regression and multiple-parameter fits.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _sec_lls-overview:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Overview
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Least-squares fits are found by minimizing :math:`\chi^2`
|
|
Packit |
67cb25 |
(chi-squared), the weighted sum of squared residuals over :math:`n`
|
|
Packit |
67cb25 |
experimental datapoints :math:`(x_i, y_i)` for the model :math:`Y(c,x)`,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = \sum_i w_i (y_i - Y(c, x_i))^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The :math:`p` parameters of the model are :math:`c = \{c_0, c_1, \dots\}`.
|
|
Packit |
67cb25 |
The weight factors :math:`w_i` are given by :math:`w_i = 1/\sigma_i^2`
|
|
Packit |
67cb25 |
where :math:`\sigma_i` is the experimental error on the data-point
|
|
Packit |
67cb25 |
:math:`y_i`. The errors are assumed to be
|
|
Packit |
67cb25 |
Gaussian and uncorrelated.
|
|
Packit |
67cb25 |
For unweighted data the chi-squared sum is computed without any weight factors.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: covariance matrix, linear fits
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The fitting routines return the best-fit parameters :math:`c` and their
|
|
Packit |
67cb25 |
:math:`p \times p` covariance matrix. The covariance matrix measures the
|
|
Packit |
67cb25 |
statistical errors on the best-fit parameters resulting from the
|
|
Packit |
67cb25 |
errors on the data, :math:`\sigma_i`, and is defined
|
|
Packit |
67cb25 |
as
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: C_{ab} = \langle \delta c_a \delta c_b \rangle
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
C_{ab} = <\delta c_a \delta c_b>
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`\langle \, \rangle`
|
|
Packit |
67cb25 |
denotes an average over the Gaussian error distributions of the underlying datapoints.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The covariance matrix is calculated by error propagation from the data
|
|
Packit |
67cb25 |
errors :math:`\sigma_i`. The change in a fitted parameter :math:`\delta c_a`
|
|
Packit |
67cb25 |
caused by a small change in the data :math:`\delta y_i` is given
|
|
Packit |
67cb25 |
by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \delta c_a = \sum_i {\partial c_a \over \partial y_i} \delta y_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\delta c_a = \sum_i (dc_a/dy_i) \delta y_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
allowing the covariance matrix to be written in terms of the errors on the data,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
C_{ab} = \sum_{i,j} {\partial c_a \over \partial y_i}
|
|
Packit |
67cb25 |
{\partial c_b \over \partial y_j}
|
|
Packit |
67cb25 |
\langle \delta y_i \delta y_j \rangle
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
C_{ab} = \sum_{i,j} (dc_a/dy_i) (dc_b/dy_j) <\delta y_i \delta y_j>
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For uncorrelated data the fluctuations of the underlying datapoints satisfy
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \langle \delta y_i \delta y_j \rangle = \sigma_i^2 \delta_{ij}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
<\delta y_i \delta y_j> = \sigma_i^2 \delta_{ij}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
giving a corresponding parameter covariance matrix of
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: C_{ab} = \sum_{i} {1 \over w_i} {\partial c_a \over \partial y_i} {\partial c_b \over \partial y_i}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
C_{ab} = \sum_i (1/w_i) (dc_a/dy_i) (dc_b/dy_i)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: variance-covariance matrix, linear fits
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When computing the covariance matrix for unweighted data, i.e. data with unknown errors,
|
|
Packit |
67cb25 |
the weight factors :math:`w_i` in this sum are replaced by the single estimate
|
|
Packit |
67cb25 |
:math:`w = 1/\sigma^2`, where :math:`\sigma^2` is the computed variance of the
|
|
Packit |
67cb25 |
residuals about the best-fit model, :math:`\sigma^2 = \sum (y_i - Y(c,x_i))^2 / (n-p)`.
|
|
Packit |
67cb25 |
This is referred to as the *variance-covariance matrix*.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The standard deviations of the best-fit parameters are given by the
|
|
Packit |
67cb25 |
square root of the corresponding diagonal elements of
|
|
Packit |
67cb25 |
the covariance matrix, :math:`\sigma_{c_a} = \sqrt{C_{aa}}`.
|
|
Packit |
67cb25 |
The correlation coefficient of the fit parameters :math:`c_a` and :math:`c_b`
|
|
Packit |
67cb25 |
is given by :math:`\rho_{ab} = C_{ab} / \sqrt{C_{aa} C_{bb}}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index:: linear regression
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Linear regression
|
|
Packit |
67cb25 |
=================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The functions in this section are used to fit simple one or two
|
|
Packit |
67cb25 |
parameter linear regression models. The functions are declared in
|
|
Packit |
67cb25 |
the header file :file:`gsl_fit.h`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Linear regression with a constant term
|
|
Packit |
67cb25 |
--------------------------------------
|
|
Packit |
67cb25 |
The functions described in this section can be used to perform
|
|
Packit |
67cb25 |
least-squares fits to a straight line model, :math:`Y(c,x) = c_0 + c_1 x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: covariance matrix, from linear regression
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_fit_linear (const double * x, const size_t xstride, const double * y, const size_t ystride, size_t n, double * c0, double * c1, double * cov00, double * cov01, double * cov11, double * sumsq)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit linear regression coefficients
|
|
Packit |
67cb25 |
(:data:`c0`, :data:`c1`) of the model :math:`Y = c_0 + c_1 X` for the dataset
|
|
Packit |
67cb25 |
(:data:`x`, :data:`y`), two vectors of length :data:`n` with strides
|
|
Packit |
67cb25 |
:data:`xstride` and :data:`ystride`. The errors on :data:`y` are assumed unknown so
|
|
Packit |
67cb25 |
the variance-covariance matrix for the
|
|
Packit |
67cb25 |
parameters (:data:`c0`, :data:`c1`) is estimated from the scatter of the
|
|
Packit |
67cb25 |
points around the best-fit line and returned via the parameters
|
|
Packit |
67cb25 |
(:data:`cov00`, :data:`cov01`, :data:`cov11`).
|
|
Packit |
67cb25 |
The sum of squares of the residuals from the best-fit line is returned
|
|
Packit |
67cb25 |
in :data:`sumsq`. Note: the correlation coefficient of the data can be computed using
|
|
Packit |
67cb25 |
:func:`gsl_stats_correlation`, it does not depend on the fit.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_fit_wlinear (const double * x, const size_t xstride, const double * w, const size_t wstride, const double * y, const size_t ystride, size_t n, double * c0, double * c1, double * cov00, double * cov01, double * cov11, double * chisq)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit linear regression coefficients
|
|
Packit |
67cb25 |
(:data:`c0`, :data:`c1`) of the model :math:`Y = c_0 + c_1 X` for the weighted
|
|
Packit |
67cb25 |
dataset (:data:`x`, :data:`y`), two vectors of length :data:`n` with strides
|
|
Packit |
67cb25 |
:data:`xstride` and :data:`ystride`. The vector :data:`w`, of length :data:`n`
|
|
Packit |
67cb25 |
and stride :data:`wstride`, specifies the weight of each datapoint. The
|
|
Packit |
67cb25 |
weight is the reciprocal of the variance for each datapoint in :data:`y`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The covariance matrix for the parameters (:data:`c0`, :data:`c1`) is
|
|
Packit |
67cb25 |
computed using the weights and returned via the parameters
|
|
Packit |
67cb25 |
(:data:`cov00`, :data:`cov01`, :data:`cov11`). The weighted sum of squares
|
|
Packit |
67cb25 |
of the residuals from the best-fit line, :math:`\chi^2`, is returned in
|
|
Packit |
67cb25 |
:data:`chisq`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_fit_linear_est (double x, double c0, double c1, double cov00, double cov01, double cov11, double * y, double * y_err)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function uses the best-fit linear regression coefficients
|
|
Packit |
67cb25 |
:data:`c0`, :data:`c1` and their covariance
|
|
Packit |
67cb25 |
:data:`cov00`, :data:`cov01`, :data:`cov11` to compute the fitted function
|
|
Packit |
67cb25 |
:data:`y` and its standard deviation :data:`y_err` for the model :math:`Y = c_0 + c_1 X`
|
|
Packit |
67cb25 |
at the point :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Linear regression without a constant term
|
|
Packit |
67cb25 |
-----------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The functions described in this section can be used to perform
|
|
Packit |
67cb25 |
least-squares fits to a straight line model without a constant term,
|
|
Packit |
67cb25 |
:math:`Y = c_1 X`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_fit_mul (const double * x, const size_t xstride, const double * y, const size_t ystride, size_t n, double * c1, double * cov11, double * sumsq)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit linear regression coefficient
|
|
Packit |
67cb25 |
:data:`c1` of the model :math:`Y = c_1 X` for the datasets (:data:`x`,
|
|
Packit |
67cb25 |
:data:`y`), two vectors of length :data:`n` with strides :data:`xstride` and
|
|
Packit |
67cb25 |
:data:`ystride`. The errors on :data:`y` are assumed unknown so the
|
|
Packit |
67cb25 |
variance of the parameter :data:`c1` is estimated from
|
|
Packit |
67cb25 |
the scatter of the points around the best-fit line and returned via the
|
|
Packit |
67cb25 |
parameter :data:`cov11`. The sum of squares of the residuals from the
|
|
Packit |
67cb25 |
best-fit line is returned in :data:`sumsq`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_fit_wmul (const double * x, const size_t xstride, const double * w, const size_t wstride, const double * y, const size_t ystride, size_t n, double * c1, double * cov11, double * sumsq)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit linear regression coefficient
|
|
Packit |
67cb25 |
:data:`c1` of the model :math:`Y = c_1 X` for the weighted datasets
|
|
Packit |
67cb25 |
(:data:`x`, :data:`y`), two vectors of length :data:`n` with strides
|
|
Packit |
67cb25 |
:data:`xstride` and :data:`ystride`. The vector :data:`w`, of length :data:`n`
|
|
Packit |
67cb25 |
and stride :data:`wstride`, specifies the weight of each datapoint. The
|
|
Packit |
67cb25 |
weight is the reciprocal of the variance for each datapoint in :data:`y`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The variance of the parameter :data:`c1` is computed using the weights
|
|
Packit |
67cb25 |
and returned via the parameter :data:`cov11`. The weighted sum of
|
|
Packit |
67cb25 |
squares of the residuals from the best-fit line, :math:`\chi^2`, is
|
|
Packit |
67cb25 |
returned in :data:`chisq`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_fit_mul_est (double x, double c1, double cov11, double * y, double * y_err)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function uses the best-fit linear regression coefficient :data:`c1`
|
|
Packit |
67cb25 |
and its covariance :data:`cov11` to compute the fitted function
|
|
Packit |
67cb25 |
:data:`y` and its standard deviation :data:`y_err` for the model :math:`Y = c_1 X`
|
|
Packit |
67cb25 |
at the point :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: multi-parameter regression
|
|
Packit |
67cb25 |
single: fits, multi-parameter linear
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Multi-parameter regression
|
|
Packit |
67cb25 |
==========================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This section describes routines which perform least squares fits
|
|
Packit |
67cb25 |
to a linear model by minimizing the cost function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = \sum_i w_i (y_i - \sum_j X_{ij} c_j)^2 = || y - Xc ||_W^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`y` is a vector of :math:`n` observations, :math:`X` is an
|
|
Packit |
67cb25 |
:math:`n`-by-:math:`p` matrix of predictor variables, :math:`c`
|
|
Packit |
67cb25 |
is a vector of the :math:`p` unknown best-fit parameters to be estimated,
|
|
Packit |
67cb25 |
and :math:`||r||_W^2 = r^T W r`.
|
|
Packit |
67cb25 |
The matrix :math:`W = \diag(w_1,w_2,...,w_n)`
|
|
Packit |
67cb25 |
defines the weights or uncertainties of the observation vector.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This formulation can be used for fits to any number of functions and/or
|
|
Packit |
67cb25 |
variables by preparing the :math:`n`-by-:math:`p` matrix :math:`X`
|
|
Packit |
67cb25 |
appropriately. For example, to fit to a :math:`p`-th order polynomial in
|
|
Packit |
67cb25 |
:data:`x`, use the following matrix,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: X_{ij} = x_i^j
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where the index :math:`i` runs over the observations and the index
|
|
Packit |
67cb25 |
:math:`j` runs from 0 to :math:`p-1`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
To fit to a set of :math:`p` sinusoidal functions with fixed frequencies
|
|
Packit |
67cb25 |
:math:`\omega_1`, :math:`\omega_2`, :math:`\ldots`, :math:`\omega_p`, use,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: X_{ij} = \sin(\omega_j x_i)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
To fit to :math:`p` independent variables :math:`x_1`, :math:`x_2`, :math:`\ldots`,
|
|
Packit |
67cb25 |
:math:`x_p`, use,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: X_{ij} = x_j(i)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`x_j(i)` is the :math:`i`-th value of the predictor variable
|
|
Packit |
67cb25 |
:math:`x_j`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The solution of the general linear least-squares system requires an
|
|
Packit |
67cb25 |
additional working space for intermediate results, such as the singular
|
|
Packit |
67cb25 |
value decomposition of the matrix :math:`X`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions are declared in the header file :file:`gsl_multifit.h`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_linear_workspace
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This workspace contains internal variables for fitting multi-parameter models.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multifit_linear_workspace * gsl_multifit_linear_alloc (const size_t n, const size_t p)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function allocates a workspace for fitting a model to a maximum of :data:`n`
|
|
Packit |
67cb25 |
observations using a maximum of :data:`p` parameters. The user may later supply
|
|
Packit |
67cb25 |
a smaller least squares system if desired. The size of the workspace is
|
|
Packit |
67cb25 |
:math:`O(np + p^2)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_multifit_linear_free (gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function frees the memory associated with the workspace :data:`w`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_svd (const gsl_matrix * X, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function performs a singular value decomposition of the
|
|
Packit |
67cb25 |
matrix :data:`X` and stores the SVD factors internally in :data:`work`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_bsvd (const gsl_matrix * X, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function performs a singular value decomposition of the
|
|
Packit |
67cb25 |
matrix :data:`X` and stores the SVD factors internally in :data:`work`.
|
|
Packit |
67cb25 |
The matrix :data:`X` is first balanced by applying column scaling
|
|
Packit |
67cb25 |
factors to improve the accuracy of the singular values.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear (const gsl_matrix * X, const gsl_vector * y, gsl_vector * c, gsl_matrix * cov, double * chisq, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit parameters :data:`c` of the model
|
|
Packit |
67cb25 |
:math:`y = X c` for the observations :data:`y` and the matrix of
|
|
Packit |
67cb25 |
predictor variables :data:`X`, using the preallocated workspace provided
|
|
Packit |
67cb25 |
in :data:`work`. The :math:`p`-by-:math:`p` variance-covariance matrix of the model parameters
|
|
Packit |
67cb25 |
:data:`cov` is set to :math:`\sigma^2 (X^T X)^{-1}`, where :math:`\sigma` is
|
|
Packit |
67cb25 |
the standard deviation of the fit residuals.
|
|
Packit |
67cb25 |
The sum of squares of the residuals from the best-fit,
|
|
Packit |
67cb25 |
:math:`\chi^2`, is returned in :data:`chisq`. If the coefficient of
|
|
Packit |
67cb25 |
determination is desired, it can be computed from the expression
|
|
Packit |
67cb25 |
:math:`R^2 = 1 - \chi^2 / TSS`, where the total sum of squares (TSS) of
|
|
Packit |
67cb25 |
the observations :data:`y` may be computed from :func:`gsl_stats_tss`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The best-fit is found by singular value decomposition of the matrix
|
|
Packit |
67cb25 |
:data:`X` using the modified Golub-Reinsch SVD algorithm, with column
|
|
Packit |
67cb25 |
scaling to improve the accuracy of the singular values. Any components
|
|
Packit |
67cb25 |
which have zero singular value (to machine precision) are discarded
|
|
Packit |
67cb25 |
from the fit.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_tsvd (const gsl_matrix * X, const gsl_vector * y, const double tol, gsl_vector * c, gsl_matrix * cov, double * chisq, size_t * rank, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit parameters :data:`c` of the model
|
|
Packit |
67cb25 |
:math:`y = X c` for the observations :data:`y` and the matrix of
|
|
Packit |
67cb25 |
predictor variables :data:`X`, using a truncated SVD expansion.
|
|
Packit |
67cb25 |
Singular values which satisfy :math:`s_i \le tol \times s_0`
|
|
Packit |
67cb25 |
are discarded from the fit, where :math:`s_0` is the largest singular value.
|
|
Packit |
67cb25 |
The :math:`p`-by-:math:`p` variance-covariance matrix of the model parameters
|
|
Packit |
67cb25 |
:data:`cov` is set to :math:`\sigma^2 (X^T X)^{-1}`, where :math:`\sigma` is
|
|
Packit |
67cb25 |
the standard deviation of the fit residuals.
|
|
Packit |
67cb25 |
The sum of squares of the residuals from the best-fit,
|
|
Packit |
67cb25 |
:math:`\chi^2`, is returned in :data:`chisq`. The effective rank
|
|
Packit |
67cb25 |
(number of singular values used in solution) is returned in :data:`rank`.
|
|
Packit |
67cb25 |
If the coefficient of
|
|
Packit |
67cb25 |
determination is desired, it can be computed from the expression
|
|
Packit |
67cb25 |
:math:`R^2 = 1 - \chi^2 / TSS`, where the total sum of squares (TSS) of
|
|
Packit |
67cb25 |
the observations :data:`y` may be computed from :func:`gsl_stats_tss`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_wlinear (const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, gsl_vector * c, gsl_matrix * cov, double * chisq, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit parameters :data:`c` of the weighted
|
|
Packit |
67cb25 |
model :math:`y = X c` for the observations :data:`y` with weights :data:`w`
|
|
Packit |
67cb25 |
and the matrix of predictor variables :data:`X`, using the preallocated
|
|
Packit |
67cb25 |
workspace provided in :data:`work`. The :math:`p`-by-:math:`p` covariance matrix of the model
|
|
Packit |
67cb25 |
parameters :data:`cov` is computed as :math:`(X^T W X)^{-1}`. The weighted
|
|
Packit |
67cb25 |
sum of squares of the residuals from the best-fit, :math:`\chi^2`, is
|
|
Packit |
67cb25 |
returned in :data:`chisq`. If the coefficient of determination is
|
|
Packit |
67cb25 |
desired, it can be computed from the expression :math:`R^2 = 1 - \chi^2 / WTSS`,
|
|
Packit |
67cb25 |
where the weighted total sum of squares (WTSS) of the
|
|
Packit |
67cb25 |
observations :data:`y` may be computed from :func:`gsl_stats_wtss`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_wlinear_tsvd (const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, const double tol, gsl_vector * c, gsl_matrix * cov, double * chisq, size_t * rank, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit parameters :data:`c` of the weighted
|
|
Packit |
67cb25 |
model :math:`y = X c` for the observations :data:`y` with weights :data:`w`
|
|
Packit |
67cb25 |
and the matrix of predictor variables :data:`X`, using a truncated SVD expansion.
|
|
Packit |
67cb25 |
Singular values which satisfy :math:`s_i \le tol \times s_0`
|
|
Packit |
67cb25 |
are discarded from the fit, where :math:`s_0` is the largest singular value.
|
|
Packit |
67cb25 |
The :math:`p`-by-:math:`p` covariance matrix of the model
|
|
Packit |
67cb25 |
parameters :data:`cov` is computed as :math:`(X^T W X)^{-1}`. The weighted
|
|
Packit |
67cb25 |
sum of squares of the residuals from the best-fit, :math:`\chi^2`, is
|
|
Packit |
67cb25 |
returned in :data:`chisq`. The effective rank of the system (number of
|
|
Packit |
67cb25 |
singular values used in the solution) is returned in :data:`rank`.
|
|
Packit |
67cb25 |
If the coefficient of determination is
|
|
Packit |
67cb25 |
desired, it can be computed from the expression :math:`R^2 = 1 - \chi^2 / WTSS`,
|
|
Packit |
67cb25 |
where the weighted total sum of squares (WTSS) of the
|
|
Packit |
67cb25 |
observations :data:`y` may be computed from :func:`gsl_stats_wtss`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_est (const gsl_vector * x, const gsl_vector * c, const gsl_matrix * cov, double * y, double * y_err)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function uses the best-fit multilinear regression coefficients
|
|
Packit |
67cb25 |
:data:`c` and their covariance matrix
|
|
Packit |
67cb25 |
:data:`cov` to compute the fitted function value
|
|
Packit |
67cb25 |
:data:`y` and its standard deviation :data:`y_err` for the model :math:`y = x.c`
|
|
Packit |
67cb25 |
at the point :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_residuals (const gsl_matrix * X, const gsl_vector * y, const gsl_vector * c, gsl_vector * r)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the vector of residuals :math:`r = y - X c` for
|
|
Packit |
67cb25 |
the observations :data:`y`, coefficients :data:`c` and matrix of predictor
|
|
Packit |
67cb25 |
variables :data:`X`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: size_t gsl_multifit_linear_rank (const double tol, const gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the rank of the matrix :math:`X` which must first have its
|
|
Packit |
67cb25 |
singular value decomposition computed. The rank is computed by counting the number
|
|
Packit |
67cb25 |
of singular values :math:`\sigma_j` which satisfy :math:`\sigma_j > tol \times \sigma_0`,
|
|
Packit |
67cb25 |
where :math:`\sigma_0` is the largest singular value.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: ridge regression
|
|
Packit |
67cb25 |
single: Tikhonov regression
|
|
Packit |
67cb25 |
single: regression, ridge
|
|
Packit |
67cb25 |
single: regression, Tikhonov
|
|
Packit |
67cb25 |
single: least squares, regularized
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _sec_regularized-regression:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Regularized regression
|
|
Packit |
67cb25 |
======================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Ordinary weighted least squares models seek a solution vector :math:`c`
|
|
Packit |
67cb25 |
which minimizes the residual
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = || y - Xc ||_W^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`y` is the :math:`n`-by-:math:`1` observation vector,
|
|
Packit |
67cb25 |
:math:`X` is the :math:`n`-by-:math:`p` design matrix, :math:`c` is
|
|
Packit |
67cb25 |
the :math:`p`-by-:math:`1` solution vector,
|
|
Packit |
67cb25 |
:math:`W = \diag(w_1,...,w_n)` is the data weighting matrix,
|
|
Packit |
67cb25 |
and :math:`||r||_W^2 = r^T W r`.
|
|
Packit |
67cb25 |
In cases where the least squares matrix :math:`X` is ill-conditioned,
|
|
Packit |
67cb25 |
small perturbations (ie: noise) in the observation vector could lead to
|
|
Packit |
67cb25 |
widely different solution vectors :math:`c`.
|
|
Packit |
67cb25 |
One way of dealing with ill-conditioned matrices is to use a "truncated SVD"
|
|
Packit |
67cb25 |
in which small singular values, below some given tolerance, are discarded
|
|
Packit |
67cb25 |
from the solution. The truncated SVD method is available using the functions
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_tsvd` and :func:`gsl_multifit_wlinear_tsvd`. Another way
|
|
Packit |
67cb25 |
to help solve ill-posed problems is to include a regularization term in the least squares
|
|
Packit |
67cb25 |
minimization
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = || y - Xc ||_W^2 + \lambda^2 || L c ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
for a suitably chosen regularization parameter :math:`\lambda` and
|
|
Packit |
67cb25 |
matrix :math:`L`. This type of regularization is known as Tikhonov, or ridge,
|
|
Packit |
67cb25 |
regression. In some applications, :math:`L` is chosen as the identity matrix, giving
|
|
Packit |
67cb25 |
preference to solution vectors :math:`c` with smaller norms.
|
|
Packit |
67cb25 |
Including this regularization term leads to the explicit "normal equations" solution
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: c = \left( X^T W X + \lambda^2 L^T L \right)^{-1} X^T W y
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
c = ( X^T W X + \lambda^2 L^T L )^-1 X^T W y
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
which reduces to the ordinary least squares solution when :math:`L = 0`.
|
|
Packit |
67cb25 |
In practice, it is often advantageous to transform a regularized least
|
|
Packit |
67cb25 |
squares system into the form
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = || \tilde{y} - \tilde{X} \tilde{c} ||^2 + \lambda^2 || \tilde{c} ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\chi^2 = || y~ - X~ c~ ||^2 + \lambda^2 || c~ ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is known as the Tikhonov "standard form" and has the normal equations solution
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \tilde{c} = \left( \tilde{X}^T \tilde{X} + \lambda^2 I \right)^{-1} \tilde{X}^T \tilde{y}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\tilde{c} = ( \tilde{X}^T \tilde{X} + \lambda^2 I )^{-1} \tilde{X}^T \tilde{y}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For an :math:`m`-by-:math:`p` matrix :math:`L` which is full rank and has :math:`m >= p` (ie: :math:`L` is
|
|
Packit |
67cb25 |
square or has more rows than columns), we can calculate the "thin" QR decomposition of :math:`L`, and
|
|
Packit |
67cb25 |
note that :math:`||L c|| = ||R c||` since the :math:`Q` factor will not change the norm. Since
|
|
Packit |
67cb25 |
:math:`R` is :math:`p`-by-:math:`p`, we can then use the transformation
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\tilde{X} &= W^{1 \over 2} X R^{-1} \\
|
|
Packit |
67cb25 |
\tilde{y} &= W^{1 \over 2} y \\
|
|
Packit |
67cb25 |
\tilde{c} &= R c
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
X~ = sqrt(W) X R^-1
|
|
Packit |
67cb25 |
y~ = sqrt(W) y
|
|
Packit |
67cb25 |
c~ = R c
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
to achieve the standard form. For a rectangular matrix :math:`L` with :math:`m < p`,
|
|
Packit |
67cb25 |
a more sophisticated approach is needed (see Hansen 1998, chapter 2.3).
|
|
Packit |
67cb25 |
In practice, the normal equations solution above is not desirable due to
|
|
Packit |
67cb25 |
numerical instabilities, and so the system is solved using the
|
|
Packit |
67cb25 |
singular value decomposition of the matrix :math:`\tilde{X}`.
|
|
Packit |
67cb25 |
The matrix :math:`L` is often chosen as the identity matrix, or as a first
|
|
Packit |
67cb25 |
or second finite difference operator, to ensure a smoothly varying
|
|
Packit |
67cb25 |
coefficient vector :math:`c`, or as a diagonal matrix to selectively damp
|
|
Packit |
67cb25 |
each model parameter differently. If :math:`L \ne I`, the user must first
|
|
Packit |
67cb25 |
convert the least squares problem to standard form using
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_stdform1` or :func:`gsl_multifit_linear_stdform2`,
|
|
Packit |
67cb25 |
solve the system, and then backtransform the solution vector to recover
|
|
Packit |
67cb25 |
the solution of the original problem (see
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_genform1` and :func:`gsl_multifit_linear_genform2`).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
In many regularization problems, care must be taken when choosing
|
|
Packit |
67cb25 |
the regularization parameter :math:`\lambda`. Since both the
|
|
Packit |
67cb25 |
residual norm :math:`||y - X c||` and solution norm :math:`||L c||`
|
|
Packit |
67cb25 |
are being minimized, the parameter :math:`\lambda` represents
|
|
Packit |
67cb25 |
a tradeoff between minimizing either the residuals or the
|
|
Packit |
67cb25 |
solution vector. A common tool for visualizing the comprimise between
|
|
Packit |
67cb25 |
the minimization of these two quantities is known as the L-curve.
|
|
Packit |
67cb25 |
The L-curve is a log-log plot of the residual norm :math:`||y - X c||`
|
|
Packit |
67cb25 |
on the horizontal axis and the solution norm :math:`||L c||` on the
|
|
Packit |
67cb25 |
vertical axis. This curve nearly always as an :math:`L` shaped
|
|
Packit |
67cb25 |
appearance, with a distinct corner separating the horizontal
|
|
Packit |
67cb25 |
and vertical sections of the curve. The regularization parameter
|
|
Packit |
67cb25 |
corresponding to this corner is often chosen as the optimal
|
|
Packit |
67cb25 |
value. GSL provides routines to calculate the L-curve for all
|
|
Packit |
67cb25 |
relevant regularization parameters as well as locating the corner.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Another method of choosing the regularization parameter is known
|
|
Packit |
67cb25 |
as Generalized Cross Validation (GCV). This method is based on
|
|
Packit |
67cb25 |
the idea that if an arbitrary element :math:`y_i` is left out of the
|
|
Packit |
67cb25 |
right hand side, the resulting regularized solution should predict this element
|
|
Packit |
67cb25 |
accurately. This leads to choosing the parameter :math:`\lambda`
|
|
Packit |
67cb25 |
which minimizes the GCV function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: G(\lambda) = {||y - X c_{\lambda}||^2 \over \textrm{Tr}(I_n - X X_{\lambda}^I)^2}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
G(\lambda) = (||y - X c_{\lambda}||^2) / Tr(I_n - X X^I)^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`X_{\lambda}^I` is the matrix which relates the solution :math:`c_{\lambda}`
|
|
Packit |
67cb25 |
to the right hand side :math:`y`, ie: :math:`c_{\lambda} = X_{\lambda}^I y`. GSL
|
|
Packit |
67cb25 |
provides routines to compute the GCV curve and its minimum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For most applications, the steps required to solve a regularized least
|
|
Packit |
67cb25 |
squares problem are as follows:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Construct the least squares system (:math:`X`, :math:`y`, :math:`W`, :math:`L`)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Transform the system to standard form (:math:`\tilde{X}`, :math:`\tilde{y}`). This
|
|
Packit |
67cb25 |
step can be skipped if :math:`L = I_p` and :math:`W = I_n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Calculate the SVD of :math:`\tilde{X}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Determine an appropriate regularization parameter :math:`\lambda` (using for example
|
|
Packit |
67cb25 |
L-curve or GCV analysis).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Solve the standard form system using the chosen :math:`\lambda` and the SVD of :math:`\tilde{X}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Backtransform the standard form solution :math:`\tilde{c}` to recover the
|
|
Packit |
67cb25 |
original solution vector :math:`c`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_stdform1 (const gsl_vector * L, const gsl_matrix * X, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
int gsl_multifit_linear_wstdform1 (const gsl_vector * L, const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions define a regularization matrix
|
|
Packit |
67cb25 |
:math:`L = \diag(l_0,l_1,...,l_{p-1})`.
|
|
Packit |
67cb25 |
The diagonal matrix element :math:`l_i` is provided by the
|
|
Packit |
67cb25 |
:math:`i`-th element of the input vector :data:`L`.
|
|
Packit |
67cb25 |
The :math:`n`-by-:math:`p` least squares matrix :data:`X` and
|
|
Packit |
67cb25 |
vector :data:`y` of length :math:`n` are then
|
|
Packit |
67cb25 |
converted to standard form as described above and the parameters
|
|
Packit |
67cb25 |
(:math:`\tilde{X}`, :math:`\tilde{y}`) are stored in :data:`Xs` and :data:`ys`
|
|
Packit |
67cb25 |
on output. :data:`Xs` and :data:`ys` have the same dimensions as
|
|
Packit |
67cb25 |
:data:`X` and :data:`y`. Optional data weights may be supplied in the
|
|
Packit |
67cb25 |
vector :data:`w` of length :math:`n`. In order to apply this transformation,
|
|
Packit |
67cb25 |
:math:`L^{-1}` must exist and so none of the :math:`l_i`
|
|
Packit |
67cb25 |
may be zero. After the standard form system has been solved,
|
|
Packit |
67cb25 |
use :func:`gsl_multifit_linear_genform1` to recover the original solution vector.
|
|
Packit |
67cb25 |
It is allowed to have :data:`X` = :data:`Xs` and :data:`y` = :data:`ys` for an in-place transform.
|
|
Packit |
67cb25 |
In order to perform a weighted regularized fit with :math:`L = I`, the user may
|
|
Packit |
67cb25 |
call :func:`gsl_multifit_linear_applyW` to convert to standard form.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_L_decomp (gsl_matrix * L, gsl_vector * tau)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function factors the :math:`m`-by-:math:`p` regularization matrix
|
|
Packit |
67cb25 |
:data:`L` into a form needed for the later transformation to standard form. :data:`L`
|
|
Packit |
67cb25 |
may have any number of rows :math:`m`. If :math:`m \ge p` the QR decomposition of
|
|
Packit |
67cb25 |
:data:`L` is computed and stored in :data:`L` on output. If :math:`m < p`, the QR decomposition
|
|
Packit |
67cb25 |
of :math:`L^T` is computed and stored in :data:`L` on output. On output,
|
|
Packit |
67cb25 |
the Householder scalars are stored in the vector :data:`tau` of size :math:`MIN(m,p)`.
|
|
Packit |
67cb25 |
These outputs will be used by :func:`gsl_multifit_linear_wstdform2` to complete the
|
|
Packit |
67cb25 |
transformation to standard form.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_stdform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_matrix * X, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_matrix * M, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
int gsl_multifit_linear_wstdform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_matrix * M, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions convert the least squares system (:data:`X`, :data:`y`, :data:`W`, :math:`L`) to standard
|
|
Packit |
67cb25 |
form (:math:`\tilde{X}`, :math:`\tilde{y}`) which are stored in :data:`Xs` and :data:`ys`
|
|
Packit |
67cb25 |
respectively. The :math:`m`-by-:math:`p` regularization matrix :data:`L` is specified by the inputs
|
|
Packit |
67cb25 |
:data:`LQR` and :data:`Ltau`, which are outputs from :func:`gsl_multifit_linear_L_decomp`.
|
|
Packit |
67cb25 |
The dimensions of the standard form parameters (:math:`\tilde{X}`, :math:`\tilde{y}`)
|
|
Packit |
67cb25 |
depend on whether :math:`m` is larger or less than :math:`p`. For :math:`m \ge p`,
|
|
Packit |
67cb25 |
:data:`Xs` is :math:`n`-by-:math:`p`, :data:`ys` is :math:`n`-by-1, and :data:`M` is
|
|
Packit |
67cb25 |
not used. For :math:`m < p`, :data:`Xs` is :math:`(n - p + m)`-by-:math:`m`,
|
|
Packit |
67cb25 |
:data:`ys` is :math:`(n - p + m)`-by-1, and :data:`M` is additional :math:`n`-by-:math:`p` workspace,
|
|
Packit |
67cb25 |
which is required to recover the original solution vector after the system has been
|
|
Packit |
67cb25 |
solved (see :func:`gsl_multifit_linear_genform2`). Optional data weights may be supplied in the
|
|
Packit |
67cb25 |
vector :data:`w` of length :math:`n`, where :math:`W = \diag(w)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_solve (const double lambda, const gsl_matrix * Xs, const gsl_vector * ys, gsl_vector * cs, double * rnorm, double * snorm, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the regularized best-fit parameters :math:`\tilde{c}`
|
|
Packit |
67cb25 |
which minimize the cost function
|
|
Packit |
67cb25 |
:math:`\chi^2 = || \tilde{y} - \tilde{X} \tilde{c} ||^2 + \lambda^2 || \tilde{c} ||^2`
|
|
Packit |
67cb25 |
which is in standard form. The least squares system must therefore be converted
|
|
Packit |
67cb25 |
to standard form prior to calling this function.
|
|
Packit |
67cb25 |
The observation vector :math:`\tilde{y}` is provided in :data:`ys` and the matrix of
|
|
Packit |
67cb25 |
predictor variables :math:`\tilde{X}` in :data:`Xs`. The solution vector :math:`\tilde{c}` is
|
|
Packit |
67cb25 |
returned in :data:`cs`, which has length min(:math:`m,p`). The SVD of :data:`Xs` must be computed prior
|
|
Packit |
67cb25 |
to calling this function, using :func:`gsl_multifit_linear_svd`.
|
|
Packit |
67cb25 |
The regularization parameter :math:`\lambda` is provided in :data:`lambda`.
|
|
Packit |
67cb25 |
The residual norm :math:`|| \tilde{y} - \tilde{X} \tilde{c} || = ||y - X c||_W`
|
|
Packit |
67cb25 |
is returned in :data:`rnorm`.
|
|
Packit |
67cb25 |
The solution norm :math:`|| \tilde{c} || = ||L c||` is returned in
|
|
Packit |
67cb25 |
:data:`snorm`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_genform1 (const gsl_vector * L, const gsl_vector * cs, gsl_vector * c, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
After a regularized system has been solved with
|
|
Packit |
67cb25 |
:math:`L = \diag(\l_0,\l_1,...,\l_{p-1})`,
|
|
Packit |
67cb25 |
this function backtransforms the standard form solution vector :data:`cs`
|
|
Packit |
67cb25 |
to recover the solution vector of the original problem :data:`c`. The
|
|
Packit |
67cb25 |
diagonal matrix elements :math:`l_i` are provided in
|
|
Packit |
67cb25 |
the vector :data:`L`. It is allowed to have :data:`c` = :data:`cs` for an
|
|
Packit |
67cb25 |
in-place transform.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_genform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_matrix * X, const gsl_vector * y, const gsl_vector * cs, const gsl_matrix * M, gsl_vector * c, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
int gsl_multifit_linear_wgenform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, const gsl_vector * cs, const gsl_matrix * M, gsl_vector * c, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
After a regularized system has been solved with a general rectangular matrix :math:`L`,
|
|
Packit |
67cb25 |
specified by (:data:`LQR`, :data:`Ltau`), this function backtransforms the standard form solution :data:`cs`
|
|
Packit |
67cb25 |
to recover the solution vector of the original problem, which is stored in :data:`c`,
|
|
Packit |
67cb25 |
of length :math:`p`. The original least squares matrix and observation vector are provided in
|
|
Packit |
67cb25 |
:data:`X` and :data:`y` respectively. :data:`M` is the matrix computed by
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_stdform2`. For weighted fits, the weight vector
|
|
Packit |
67cb25 |
:data:`w` must also be supplied.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_applyW (const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, gsl_matrix * WX, gsl_vector * Wy)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
For weighted least squares systems with :math:`L = I`, this function may be used to
|
|
Packit |
67cb25 |
convert the system to standard form by applying the weight matrix :math:`W = \diag(w)`
|
|
Packit |
67cb25 |
to the least squares matrix :data:`X` and observation vector :data:`y`. On output, :data:`WX`
|
|
Packit |
67cb25 |
is equal to :math:`W^{1/2} X` and :data:`Wy` is equal to :math:`W^{1/2} y`. It is allowed
|
|
Packit |
67cb25 |
for :data:`WX` = :data:`X` and :data:`Wy` = :data:`y` for an in-place transform.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_lcurve (const gsl_vector * y, gsl_vector * reg_param, gsl_vector * rho, gsl_vector * eta, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the L-curve for a least squares system
|
|
Packit |
67cb25 |
using the right hand side vector :data:`y` and the SVD decomposition
|
|
Packit |
67cb25 |
of the least squares matrix :data:`X`, which must be provided
|
|
Packit |
67cb25 |
to :func:`gsl_multifit_linear_svd` prior to
|
|
Packit |
67cb25 |
calling this function. The output vectors :data:`reg_param`,
|
|
Packit |
67cb25 |
:data:`rho`, and :data:`eta` must all be the same size, and will
|
|
Packit |
67cb25 |
contain the regularization parameters :math:`\lambda_i`, residual norms
|
|
Packit |
67cb25 |
:math:`||y - X c_i||`, and solution norms :math:`|| L c_i ||`
|
|
Packit |
67cb25 |
which compose the L-curve, where :math:`c_i` is the regularized
|
|
Packit |
67cb25 |
solution vector corresponding to :math:`\lambda_i`.
|
|
Packit |
67cb25 |
The user may determine the number of points on the L-curve by
|
|
Packit |
67cb25 |
adjusting the size of these input arrays. The regularization
|
|
Packit |
67cb25 |
parameters :math:`\lambda_i` are estimated from the singular values
|
|
Packit |
67cb25 |
of :data:`X`, and chosen to represent the most relevant portion of
|
|
Packit |
67cb25 |
the L-curve.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_lcorner (const gsl_vector * rho, const gsl_vector * eta, size_t * idx)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function attempts to locate the corner of the L-curve
|
|
Packit |
67cb25 |
:math:`(||y - X c||, ||L c||)` defined by the :data:`rho` and :data:`eta`
|
|
Packit |
67cb25 |
input arrays respectively. The corner is defined as the point of maximum
|
|
Packit |
67cb25 |
curvature of the L-curve in log-log scale. The :data:`rho` and :data:`eta`
|
|
Packit |
67cb25 |
arrays can be outputs of :func:`gsl_multifit_linear_lcurve`. The
|
|
Packit |
67cb25 |
algorithm used simply fits a circle to 3 consecutive points on the L-curve
|
|
Packit |
67cb25 |
and uses the circle's radius to determine the curvature at
|
|
Packit |
67cb25 |
the middle point. Therefore, the input array sizes must be
|
|
Packit |
67cb25 |
:math:`\ge 3`. With more points provided for the L-curve, a better
|
|
Packit |
67cb25 |
estimate of the curvature can be obtained. The array index
|
|
Packit |
67cb25 |
corresponding to maximum curvature (ie: the corner) is returned
|
|
Packit |
67cb25 |
in :data:`idx`. If the input arrays contain colinear points,
|
|
Packit |
67cb25 |
this function could fail and return :macro:`GSL_EINVAL`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_lcorner2 (const gsl_vector * reg_param, const gsl_vector * eta, size_t * idx)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function attempts to locate the corner of an alternate L-curve
|
|
Packit |
67cb25 |
:math:`(\lambda^2, ||L c||^2)` studied by Rezghi and Hosseini, 2009.
|
|
Packit |
67cb25 |
This alternate L-curve can provide better estimates of the
|
|
Packit |
67cb25 |
regularization parameter for smooth solution vectors. The regularization
|
|
Packit |
67cb25 |
parameters :math:`\lambda` and solution norms :math:`||L c||` are provided
|
|
Packit |
67cb25 |
in the :data:`reg_param` and :data:`eta` input arrays respectively. The
|
|
Packit |
67cb25 |
corner is defined as the point of maximum curvature of this
|
|
Packit |
67cb25 |
alternate L-curve in linear scale. The :data:`reg_param` and :data:`eta`
|
|
Packit |
67cb25 |
arrays can be outputs of :func:`gsl_multifit_linear_lcurve`. The
|
|
Packit |
67cb25 |
algorithm used simply fits a circle to 3 consecutive points on the L-curve
|
|
Packit |
67cb25 |
and uses the circle's radius to determine the curvature at
|
|
Packit |
67cb25 |
the middle point. Therefore, the input array sizes must be
|
|
Packit |
67cb25 |
:math:`\ge 3`. With more points provided for the L-curve, a better
|
|
Packit |
67cb25 |
estimate of the curvature can be obtained. The array index
|
|
Packit |
67cb25 |
corresponding to maximum curvature (ie: the corner) is returned
|
|
Packit |
67cb25 |
in :data:`idx`. If the input arrays contain colinear points,
|
|
Packit |
67cb25 |
this function could fail and return :macro:`GSL_EINVAL`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_gcv_init(const gsl_vector * y, gsl_vector * reg_param, gsl_vector * UTy, double * delta0, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function performs some initialization in preparation for computing
|
|
Packit |
67cb25 |
the GCV curve and its minimum. The right hand side vector is provided
|
|
Packit |
67cb25 |
in :data:`y`. On output, :data:`reg_param` is set to a vector of regularization
|
|
Packit |
67cb25 |
parameters in decreasing order and may be of any size. The vector
|
|
Packit |
67cb25 |
:data:`UTy` of size :math:`p` is set to :math:`U^T y`. The parameter
|
|
Packit |
67cb25 |
:data:`delta0` is needed for subsequent steps of the GCV calculation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_gcv_curve(const gsl_vector * reg_param, const gsl_vector * UTy, const double delta0, gsl_vector * G, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This funtion calculates the GCV curve :math:`G(\lambda)` and stores it in
|
|
Packit |
67cb25 |
:data:`G` on output, which must be the same size as :data:`reg_param`. The
|
|
Packit |
67cb25 |
inputs :data:`reg_param`, :data:`UTy` and :data:`delta0` are computed in
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_gcv_init`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_gcv_min(const gsl_vector * reg_param, const gsl_vector * UTy, const gsl_vector * G, const double delta0, double * lambda, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the value of the regularization parameter
|
|
Packit |
67cb25 |
which minimizes the GCV curve :math:`G(\lambda)` and stores it in
|
|
Packit |
67cb25 |
:data:`lambda`. The input :data:`G` is calculated by
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_gcv_curve` and the inputs
|
|
Packit |
67cb25 |
:data:`reg_param`, :data:`UTy` and :data:`delta0` are computed by
|
|
Packit |
67cb25 |
:func:`gsl_multifit_linear_gcv_init`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: double gsl_multifit_linear_gcv_calc(const double lambda, const gsl_vector * UTy, const double delta0, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the value of the GCV curve :math:`G(\lambda)` corresponding
|
|
Packit |
67cb25 |
to the input :data:`lambda`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_gcv(const gsl_vector * y, gsl_vector * reg_param, gsl_vector * G, double * lambda, double * G_lambda, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function combines the steps :code:`gcv_init`, :code:`gcv_curve`,
|
|
Packit |
67cb25 |
and :code:`gcv_min` defined above into a single function. The input
|
|
Packit |
67cb25 |
:data:`y` is the right hand side vector. On output, :data:`reg_param` and
|
|
Packit |
67cb25 |
:data:`G`, which must be the same size, are set to vectors of
|
|
Packit |
67cb25 |
:math:`\lambda` and :math:`G(\lambda)` values respectively. The
|
|
Packit |
67cb25 |
output :data:`lambda` is set to the optimal value of :math:`\lambda`
|
|
Packit |
67cb25 |
which minimizes the GCV curve. The minimum value of the GCV curve is
|
|
Packit |
67cb25 |
returned in :data:`G_lambda`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_Lk (const size_t p, const size_t k, gsl_matrix * L)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the discrete approximation to the derivative operator :math:`L_k` of
|
|
Packit |
67cb25 |
order :data:`k` on a regular grid of :data:`p` points and stores it in :data:`L`. The dimensions of :data:`L` are
|
|
Packit |
67cb25 |
:math:`(p-k)`-by-:math:`p`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_linear_Lsobolev (const size_t p, const size_t kmax, const gsl_vector * alpha, gsl_matrix * L, gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the regularization matrix :data:`L` corresponding to the
|
|
Packit |
67cb25 |
weighted Sobolov norm
|
|
Packit |
67cb25 |
:math:`||L c||^2 = \sum_k \alpha_k^2 ||L_k c||^2` where :math:`L_k` approximates
|
|
Packit |
67cb25 |
the derivative operator of order :math:`k`. This regularization norm can be useful
|
|
Packit |
67cb25 |
in applications where it is necessary to smooth several derivatives of the solution.
|
|
Packit |
67cb25 |
:data:`p` is the number of model parameters, :data:`kmax` is the highest derivative
|
|
Packit |
67cb25 |
to include in the summation above, and :data:`alpha` is the vector of weights of
|
|
Packit |
67cb25 |
size :data:`kmax` + 1, where :code:`alpha[k]` = :math:`\alpha_k` is the weight
|
|
Packit |
67cb25 |
assigned to the derivative of order :math:`k`. The output matrix :data:`L` is size
|
|
Packit |
67cb25 |
:data:`p`-by-:data:`p` and upper triangular.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: double gsl_multifit_linear_rcond (const gsl_multifit_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the reciprocal condition number of the least squares matrix :math:`X`,
|
|
Packit |
67cb25 |
defined as the ratio of the smallest and largest singular values,
|
|
Packit |
67cb25 |
rcond = :math:`\sigma_{min}/\sigma_{max}`.
|
|
Packit |
67cb25 |
The routine :func:`gsl_multifit_linear_svd` must first be called to compute the SVD
|
|
Packit |
67cb25 |
of :math:`X`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: robust regression
|
|
Packit |
67cb25 |
single: regression, robust
|
|
Packit |
67cb25 |
single: least squares, robust
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Robust linear regression
|
|
Packit |
67cb25 |
========================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Ordinary least squares (OLS) models are often heavily influenced by the presence of outliers.
|
|
Packit |
67cb25 |
Outliers are data points which do not follow the general trend of the other observations,
|
|
Packit |
67cb25 |
although there is strictly no precise definition of an outlier. Robust linear regression
|
|
Packit |
67cb25 |
refers to regression algorithms which are robust to outliers. The most common type of
|
|
Packit |
67cb25 |
robust regression is M-estimation. The general M-estimator minimizes the objective function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \sum_i \rho(e_i) = \sum_i \rho (y_i - Y(c, x_i))
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`e_i = y_i - Y(c, x_i)` is the residual of the ith data point, and
|
|
Packit |
67cb25 |
:math:`\rho(e_i)` is a function which should have the following properties:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* :math:`\rho(e) \ge 0`
|
|
Packit |
67cb25 |
* :math:`\rho(0) = 0`
|
|
Packit |
67cb25 |
* :math:`\rho(-e) = \rho(e)`
|
|
Packit |
67cb25 |
* :math:`\rho(e_1) > \rho(e_2)` for :math:`|e_1| > |e_2|`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The special case of ordinary least squares is given by :math:`\rho(e_i) = e_i^2`.
|
|
Packit |
67cb25 |
Letting :math:`\psi = \rho'` be the derivative of :math:`\rho`, differentiating
|
|
Packit |
67cb25 |
the objective function with respect to the coefficients :math:`c`
|
|
Packit |
67cb25 |
and setting the partial derivatives to zero produces the system of equations
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \sum_i \psi(e_i) X_i = 0
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`X_i` is a vector containing row :math:`i` of the design matrix :math:`X`.
|
|
Packit |
67cb25 |
Next, we define a weight function :math:`w(e) = \psi(e)/e`, and let
|
|
Packit |
67cb25 |
:math:`w_i = w(e_i)`:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \sum_i w_i e_i X_i = 0
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This system of equations is equivalent to solving a weighted ordinary least squares
|
|
Packit |
67cb25 |
problem, minimizing :math:`\chi^2 = \sum_i w_i e_i^2`. The weights however, depend
|
|
Packit |
67cb25 |
on the residuals :math:`e_i`, which depend on the coefficients :math:`c`, which depend
|
|
Packit |
67cb25 |
on the weights. Therefore, an iterative solution is used, called Iteratively Reweighted
|
|
Packit |
67cb25 |
Least Squares (IRLS).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Compute initial estimates of the coefficients :math:`c^{(0)}` using ordinary least squares
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. For iteration :math:`k`, form the residuals :math:`e_i^{(k)} = (y_i - X_i c^{(k-1)})/(t \sigma^{(k)} \sqrt{1 - h_i})`,
|
|
Packit |
67cb25 |
where :math:`t` is a tuning constant depending on the choice of :math:`\psi`, and :math:`h_i` are the
|
|
Packit |
67cb25 |
statistical leverages (diagonal elements of the matrix :math:`X (X^T X)^{-1} X^T`). Including :math:`t`
|
|
Packit |
67cb25 |
and :math:`h_i` in the residual calculation has been shown to improve the convergence of the method.
|
|
Packit |
67cb25 |
The residual standard deviation is approximated as :math:`\sigma^{(k)} = MAD / 0.6745`, where MAD is the
|
|
Packit |
67cb25 |
Median-Absolute-Deviation of the :math:`n-p` largest residuals from the previous iteration.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Compute new weights :math:`w_i^{(k)} = \psi(e_i^{(k)})/e_i^{(k)}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Compute new coefficients :math:`c^{(k)}` by solving the weighted least squares problem with
|
|
Packit |
67cb25 |
weights :math:`w_i^{(k)}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Steps 2 through 4 are iterated until the coefficients converge or until some maximum iteration
|
|
Packit |
67cb25 |
limit is reached. Coefficients are tested for convergence using the critera:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: |c_i^{(k)} - c_i^{(k-1)}| \le \epsilon \times \hbox{max}(|c_i^{(k)}|, |c_i^{(k-1)}|)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
|c_i^(k) - c_i^(k-1)| <= \epsilon * max(|c_i^(k)|, |c_i^(k-1)|)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
for all :math:`0 \le i < p` where :math:`\epsilon` is a small tolerance factor.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The key to this method lies in selecting the function :math:`\psi(e_i)` to assign
|
|
Packit |
67cb25 |
smaller weights to large residuals, and larger weights to smaller residuals. As
|
|
Packit |
67cb25 |
the iteration proceeds, outliers are assigned smaller and smaller weights, eventually
|
|
Packit |
67cb25 |
having very little or no effect on the fitted model.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_robust_workspace
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This workspace is used for robust least squares fitting.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multifit_robust_workspace * gsl_multifit_robust_alloc (const gsl_multifit_robust_type * T, const size_t n, const size_t p)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function allocates a workspace for fitting a model to :data:`n`
|
|
Packit |
67cb25 |
observations using :data:`p` parameters. The size of the workspace
|
|
Packit |
67cb25 |
is :math:`O(np + p^2)`. The type :data:`T` specifies the
|
|
Packit |
67cb25 |
function :math:`\psi` and can be selected from the following choices.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_robust_type
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_default
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies the :data:`gsl_multifit_robust_bisquare` type (see below) and is a good
|
|
Packit |
67cb25 |
general purpose choice for robust regression.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_bisquare
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is Tukey's biweight (bisquare) function and is a good general purpose choice for
|
|
Packit |
67cb25 |
robust regression. The weight function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
w(e) =
|
|
Packit |
67cb25 |
\left\{
|
|
Packit |
67cb25 |
\begin{array}{cc}
|
|
Packit |
67cb25 |
(1 - e^2)^2, & |e| \le 1 \\
|
|
Packit |
67cb25 |
0, & |e| > 1
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
w(e) = { (1 - e^2)^2, |e| <= 1
|
|
Packit |
67cb25 |
{ 0, |e| > 1
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the default tuning constant is :math:`t = 4.685`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_cauchy
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is Cauchy's function, also known as the Lorentzian function.
|
|
Packit |
67cb25 |
This function does not guarantee a unique solution,
|
|
Packit |
67cb25 |
meaning different choices of the coefficient vector :data:`c`
|
|
Packit |
67cb25 |
could minimize the objective function. Therefore this option should
|
|
Packit |
67cb25 |
be used with care. The weight function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: w(e) = {1 \over 1 + e^2}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
w(e) = 1 / (1 + e^2)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the default tuning constant is :math:`t = 2.385`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_fair
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This is the fair :math:`\rho` function, which guarantees a unique solution and
|
|
Packit |
67cb25 |
has continuous derivatives to three orders. The weight function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: w(e) = {1 \over 1 + |e|}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
w(e) = 1 / (1 + |e|)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the default tuning constant is :math:`t = 1.400`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_huber
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies Huber's :math:`\rho` function, which is a parabola in the vicinity of zero and
|
|
Packit |
67cb25 |
increases linearly for a given threshold :math:`|e| > t`. This function is also considered
|
|
Packit |
67cb25 |
an excellent general purpose robust estimator, however, occasional difficulties can
|
|
Packit |
67cb25 |
be encountered due to the discontinuous first derivative of the :math:`\psi` function.
|
|
Packit |
67cb25 |
The weight function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
w(e) =
|
|
Packit |
67cb25 |
\left\{
|
|
Packit |
67cb25 |
\begin{array}{cc}
|
|
Packit |
67cb25 |
1, & |e| \le 1 \\
|
|
Packit |
67cb25 |
{1 \over |e|}, & |e| > 1
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
w(e) = 1/max(1,|e|)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the default tuning constant is :math:`t = 1.345`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_ols
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies the ordinary least squares solution, which can be useful for quickly
|
|
Packit |
67cb25 |
checking the difference between the various robust and OLS solutions. The
|
|
Packit |
67cb25 |
weight function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: w(e) = 1
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the default tuning constant is :math:`t = 1`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multifit_robust_welsch
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies the Welsch function which can perform well in cases where the
|
|
Packit |
67cb25 |
residuals have an exponential distribution. The weight function is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: w(e) = \exp{(-e^2)}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the default tuning constant is :math:`t = 2.985`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_multifit_robust_free (gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function frees the memory associated with the workspace :data:`w`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: const char * gsl_multifit_robust_name (const gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the name of the robust type :data:`T` specified to
|
|
Packit |
67cb25 |
:func:`gsl_multifit_robust_alloc`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_robust_tune (const double tune, gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sets the tuning constant :math:`t` used to adjust the residuals at each
|
|
Packit |
67cb25 |
iteration to :data:`tune`. Decreasing the tuning constant increases the downweight
|
|
Packit |
67cb25 |
assigned to large residuals, while increasing the tuning constant decreases the
|
|
Packit |
67cb25 |
downweight assigned to large residuals.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_robust_maxiter (const size_t maxiter, gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function sets the maximum number of iterations in the iteratively
|
|
Packit |
67cb25 |
reweighted least squares algorithm to :data:`maxiter`. By default,
|
|
Packit |
67cb25 |
this value is set to 100 by :func:`gsl_multifit_robust_alloc`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_robust_weights (const gsl_vector * r, gsl_vector * wts, gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function assigns weights to the vector :data:`wts` using the residual vector
|
|
Packit |
67cb25 |
:data:`r` and previously specified weighting function. The output weights are given
|
|
Packit |
67cb25 |
by :math:`wts_i = w(r_i / (t \sigma))`, where the weighting functions :math:`w` are
|
|
Packit |
67cb25 |
detailed in :func:`gsl_multifit_robust_alloc`. :math:`\sigma` is an estimate of the
|
|
Packit |
67cb25 |
residual standard deviation based on the Median-Absolute-Deviation and :math:`t`
|
|
Packit |
67cb25 |
is the tuning constant. This function is useful if the user wishes to implement
|
|
Packit |
67cb25 |
their own robust regression rather than using
|
|
Packit |
67cb25 |
the supplied :func:`gsl_multifit_robust` routine below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_robust (const gsl_matrix * X, const gsl_vector * y, gsl_vector * c, gsl_matrix * cov, gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the best-fit parameters :data:`c` of the model
|
|
Packit |
67cb25 |
:math:`y = X c` for the observations :data:`y` and the matrix of
|
|
Packit |
67cb25 |
predictor variables :data:`X`, attemping to reduce the influence
|
|
Packit |
67cb25 |
of outliers using the algorithm outlined above.
|
|
Packit |
67cb25 |
The :math:`p`-by-:math:`p` variance-covariance matrix of the model parameters
|
|
Packit |
67cb25 |
:data:`cov` is estimated as :math:`\sigma^2 (X^T X)^{-1}`, where :math:`\sigma` is
|
|
Packit |
67cb25 |
an approximation of the residual standard deviation using the theory of robust
|
|
Packit |
67cb25 |
regression. Special care must be taken when estimating :math:`\sigma` and
|
|
Packit |
67cb25 |
other statistics such as :math:`R^2`, and so these
|
|
Packit |
67cb25 |
are computed internally and are available by calling the function
|
|
Packit |
67cb25 |
:func:`gsl_multifit_robust_statistics`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
If the coefficients do not converge within the maximum iteration
|
|
Packit |
67cb25 |
limit, the function returns :macro:`GSL_EMAXITER`. In this case,
|
|
Packit |
67cb25 |
the current estimates of the coefficients and covariance matrix
|
|
Packit |
67cb25 |
are returned in :data:`c` and :data:`cov` and the internal fit statistics
|
|
Packit |
67cb25 |
are computed with these estimates.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_robust_est (const gsl_vector * x, const gsl_vector * c, const gsl_matrix * cov, double * y, double * y_err)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function uses the best-fit robust regression coefficients
|
|
Packit |
67cb25 |
:data:`c` and their covariance matrix
|
|
Packit |
67cb25 |
:data:`cov` to compute the fitted function value
|
|
Packit |
67cb25 |
:data:`y` and its standard deviation :data:`y_err` for the model :math:`y = x \cdot c`
|
|
Packit |
67cb25 |
at the point :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multifit_robust_residuals (const gsl_matrix * X, const gsl_vector * y, const gsl_vector * c, gsl_vector * r, gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the vector of studentized residuals
|
|
Packit |
67cb25 |
:math:`r_i = {y_i - (X c)_i \over \sigma \sqrt{1 - h_i}}` for
|
|
Packit |
67cb25 |
the observations :data:`y`, coefficients :data:`c` and matrix of predictor
|
|
Packit |
67cb25 |
variables :data:`X`. The routine :func:`gsl_multifit_robust` must
|
|
Packit |
67cb25 |
first be called to compute the statisical leverages :math:`h_i` of
|
|
Packit |
67cb25 |
the matrix :data:`X` and residual standard deviation estimate :math:`\sigma`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multifit_robust_stats gsl_multifit_robust_statistics (const gsl_multifit_robust_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns a structure containing relevant statistics from a robust regression.
|
|
Packit |
67cb25 |
The function :func:`gsl_multifit_robust` must be called first to perform the regression and
|
|
Packit |
67cb25 |
calculate these statistics. The returned :type:`gsl_multifit_robust_stats` structure
|
|
Packit |
67cb25 |
contains the following fields.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multifit_robust_stats
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double sigma_ols`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the standard deviation of the residuals as computed from ordinary
|
|
Packit |
67cb25 |
least squares (OLS).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double sigma_mad`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains an estimate of the standard deviation of the final residuals using
|
|
Packit |
67cb25 |
the Median-Absolute-Deviation statistic
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double sigma_rob`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains an estimate of the standard deviation of the final residuals
|
|
Packit |
67cb25 |
from the theory of robust regression (see Street et al, 1988).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double sigma`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains an estimate of the standard deviation of the final residuals
|
|
Packit |
67cb25 |
by attemping to reconcile :code:`sigma_rob` and :code:`sigma_ols`
|
|
Packit |
67cb25 |
in a reasonable way.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double Rsq`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the :math:`R^2` coefficient of determination statistic using
|
|
Packit |
67cb25 |
the estimate :code:`sigma`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double adj_Rsq`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the adjusted :math:`R^2` coefficient of determination statistic
|
|
Packit |
67cb25 |
using the estimate :code:`sigma`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double rmse`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the root mean squared error of the final residuals
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`double sse`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the residual sum of squares taking into account the robust
|
|
Packit |
67cb25 |
covariance matrix.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t dof`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the number of degrees of freedom :math:`n - p`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`size_t numit`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Upon successful convergence, this contains the number of iterations performed
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`gsl_vector * weights`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the final weight vector of length :data:`n`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:code:`gsl_vector * r`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This contains the final residual vector of length :data:`n`, :math:`r = y - X c`
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: large dense linear least squares
|
|
Packit |
67cb25 |
single: linear least squares, large
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Large dense linear systems
|
|
Packit |
67cb25 |
==========================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This module is concerned with solving large dense least squares systems
|
|
Packit |
67cb25 |
:math:`X c = y` where the :math:`n`-by-:math:`p` matrix
|
|
Packit |
67cb25 |
:math:`X` has :math:`n >> p` (ie: many more rows than columns).
|
|
Packit |
67cb25 |
This type of matrix is called a "tall skinny" matrix, and for
|
|
Packit |
67cb25 |
some applications, it may not be possible to fit the
|
|
Packit |
67cb25 |
entire matrix in memory at once to use the standard SVD approach.
|
|
Packit |
67cb25 |
Therefore, the algorithms in this module are designed to allow
|
|
Packit |
67cb25 |
the user to construct smaller blocks of the matrix :math:`X` and
|
|
Packit |
67cb25 |
accumulate those blocks into the larger system one at a time. The
|
|
Packit |
67cb25 |
algorithms in this module never need to store the entire matrix
|
|
Packit |
67cb25 |
:math:`X` in memory. The large linear least squares routines
|
|
Packit |
67cb25 |
support data weights and Tikhonov regularization, and are
|
|
Packit |
67cb25 |
designed to minimize the residual
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = || y - Xc ||_W^2 + \lambda^2 || L c ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`y` is the :math:`n`-by-:math:`1` observation vector,
|
|
Packit |
67cb25 |
:math:`X` is the :math:`n`-by-:math:`p` design matrix, :math:`c` is
|
|
Packit |
67cb25 |
the :math:`p`-by-:math:`1` solution vector,
|
|
Packit |
67cb25 |
:math:`W = \diag(w_1,...,w_n)` is the data weighting matrix,
|
|
Packit |
67cb25 |
:math:`L` is an :math:`m`-by-:math:`p` regularization matrix,
|
|
Packit |
67cb25 |
:math:`\lambda` is a regularization parameter,
|
|
Packit |
67cb25 |
and :math:`||r||_W^2 = r^T W r`. In the discussion which follows,
|
|
Packit |
67cb25 |
we will assume that the system has been converted into Tikhonov
|
|
Packit |
67cb25 |
standard form,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = || \tilde{y} - \tilde{X} \tilde{c} ||^2 + \lambda^2 || \tilde{c} ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\chi^2 = || y~ - X~ c~ ||^2 + \lambda^2 || c~ ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and we will drop the tilde characters from the various parameters.
|
|
Packit |
67cb25 |
For a discussion of the transformation to standard form,
|
|
Packit |
67cb25 |
see :ref:`sec_regularized-regression`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The basic idea is to partition the matrix :math:`X` and observation
|
|
Packit |
67cb25 |
vector :math:`y` as
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
X_1 \\
|
|
Packit |
67cb25 |
X_2 \\
|
|
Packit |
67cb25 |
X_3 \\
|
|
Packit |
67cb25 |
\vdots \\
|
|
Packit |
67cb25 |
X_k
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
c =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
y_1 \\
|
|
Packit |
67cb25 |
y_2 \\
|
|
Packit |
67cb25 |
y_3 \\
|
|
Packit |
67cb25 |
\vdots \\
|
|
Packit |
67cb25 |
y_k
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[ X_1 ] c = [ y_1 ]
|
|
Packit |
67cb25 |
[ X_2 ] [ y_2 ]
|
|
Packit |
67cb25 |
[ X_3 ] [ y_3 ]
|
|
Packit |
67cb25 |
[ ... ] [ ... ]
|
|
Packit |
67cb25 |
[ X_k ] [ y_k ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
into :math:`k` blocks, where each block (:math:`X_i,y_i`) may have
|
|
Packit |
67cb25 |
any number of rows, but each :math:`X_i` has :math:`p` columns.
|
|
Packit |
67cb25 |
The sections below describe the methods available for solving
|
|
Packit |
67cb25 |
this partitioned system. The functions are declared in
|
|
Packit |
67cb25 |
the header file :file:`gsl_multilarge.h`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: large linear least squares, normal equations
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Normal Equations Approach
|
|
Packit |
67cb25 |
-------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The normal equations approach to the large linear least squares
|
|
Packit |
67cb25 |
problem described above is popular due to its speed and simplicity.
|
|
Packit |
67cb25 |
Since the normal equations solution to the problem is given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: c = \left( X^T X + \lambda^2 I \right)^{-1} X^T y
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
c = ( X^T X + \lambda^2 I )^-1 X^T y
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
only the :math:`p`-by-:math:`p` matrix :math:`X^T X` and
|
|
Packit |
67cb25 |
:math:`p`-by-1 vector :math:`X^T y` need to be stored. Using
|
|
Packit |
67cb25 |
the partition scheme described above, these are given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
X^T X &= \sum_i X_i^T X_i \\
|
|
Packit |
67cb25 |
X^T y &= \sum_i X_i^T y_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
X^T X = \sum_i X_i^T X_i
|
|
Packit |
67cb25 |
X^T y = \sum_i X_i^T y_i
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Since the matrix :math:`X^T X` is symmetric, only half of it
|
|
Packit |
67cb25 |
needs to be calculated. Once all of the blocks :math:`(X_i,y_i)`
|
|
Packit |
67cb25 |
have been accumulated into the final :math:`X^T X` and :math:`X^T y`,
|
|
Packit |
67cb25 |
the system can be solved with a Cholesky factorization of the
|
|
Packit |
67cb25 |
:math:`X^T X` matrix. The :math:`X^T X` matrix is first transformed via
|
|
Packit |
67cb25 |
a diagonal scaling transformation to attempt to reduce its condition
|
|
Packit |
67cb25 |
number as much as possible to recover a more accurate solution vector.
|
|
Packit |
67cb25 |
The normal equations approach is the fastest method for solving the
|
|
Packit |
67cb25 |
large least squares problem, and is accurate for well-conditioned
|
|
Packit |
67cb25 |
matrices :math:`X`. However, for ill-conditioned matrices, as is often
|
|
Packit |
67cb25 |
the case for large systems, this method can suffer from numerical
|
|
Packit |
67cb25 |
instabilities (see Trefethen and Bau, 1997). The number of operations
|
|
Packit |
67cb25 |
for this method is :math:`O(np^2 + {1 \over 3}p^3)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: large linear least squares, TSQR
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Tall Skinny QR (TSQR) Approach
|
|
Packit |
67cb25 |
------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
An algorithm which has better numerical stability for ill-conditioned
|
|
Packit |
67cb25 |
problems is known as the Tall Skinny QR (TSQR) method. This method
|
|
Packit |
67cb25 |
is based on computing the thin QR decomposition of the least squares
|
|
Packit |
67cb25 |
matrix :math:`X = Q R`, where :math:`Q` is an :math:`n`-by-:math:`p` matrix
|
|
Packit |
67cb25 |
with orthogonal columns, and :math:`R` is a :math:`p`-by-:math:`p`
|
|
Packit |
67cb25 |
upper triangular matrix. Once these factors are calculated, the
|
|
Packit |
67cb25 |
residual becomes
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \chi^2 = || Q^T y - R c ||^2 + \lambda^2 || c ||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
which can be written as the matrix equation
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
R \\
|
|
Packit |
67cb25 |
\lambda I
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right) c =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
Q^T y \\
|
|
Packit |
67cb25 |
0
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[ R ; \lambda I ] c = [ Q^T b ; 0 ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The matrix on the left hand side is now a much
|
|
Packit |
67cb25 |
smaller :math:`2p`-by-:math:`p` matrix which can
|
|
Packit |
67cb25 |
be solved with a standard SVD approach. The
|
|
Packit |
67cb25 |
:math:`Q` matrix is just as large as the original
|
|
Packit |
67cb25 |
matrix :math:`X`, however it does not need to be
|
|
Packit |
67cb25 |
explicitly constructed. The TSQR algorithm
|
|
Packit |
67cb25 |
computes only the :math:`p`-by-:math:`p` matrix
|
|
Packit |
67cb25 |
:math:`R` and the :math:`p`-by-1 vector :math:`Q^T y`,
|
|
Packit |
67cb25 |
and updates these quantities as new blocks
|
|
Packit |
67cb25 |
are added to the system. Each time a new block of rows
|
|
Packit |
67cb25 |
(:math:`X_i,y_i`) is added, the algorithm performs a QR decomposition
|
|
Packit |
67cb25 |
of the matrix
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
R_{i-1} \\
|
|
Packit |
67cb25 |
X_i
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[ R_(i-1) ; X_i ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`R_{i-1}` is the upper triangular
|
|
Packit |
67cb25 |
:math:`R` factor for the matrix
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{c}
|
|
Packit |
67cb25 |
X_1 \\
|
|
Packit |
67cb25 |
\vdots \\
|
|
Packit |
67cb25 |
X_{i-1}
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
[ X_1 ; ... ; X_(i-1) ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This QR decomposition is done efficiently taking into account
|
|
Packit |
67cb25 |
the sparse structure of :math:`R_{i-1}`. See Demmel et al, 2008 for
|
|
Packit |
67cb25 |
more details on how this is accomplished. The number
|
|
Packit |
67cb25 |
of operations for this method is :math:`O(2np^2 - {2 \over 3}p^3)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: large linear least squares, steps
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Large Dense Linear Systems Solution Steps
|
|
Packit |
67cb25 |
-----------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The typical steps required to solve large regularized linear least
|
|
Packit |
67cb25 |
squares problems are as follows:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Choose the regularization matrix :math:`L`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Construct a block of rows of the least squares matrix, right
|
|
Packit |
67cb25 |
hand side vector, and weight vector (:math:`X_i`, :math:`y_i`, :math:`w_i`).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Transform the block to standard form (:math:`\tilde{X_i}`, :math:`\tilde{y_i}`). This
|
|
Packit |
67cb25 |
step can be skipped if :math:`L = I` and :math:`W = I`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Accumulate the standard form block (:math:`\tilde{X_i}`, :math:`\tilde{y_i}`) into
|
|
Packit |
67cb25 |
the system.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Repeat steps 2-4 until the entire matrix and right hand side vector have
|
|
Packit |
67cb25 |
been accumulated.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Determine an appropriate regularization parameter :math:`\lambda` (using for example
|
|
Packit |
67cb25 |
L-curve analysis).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Solve the standard form system using the chosen :math:`\lambda`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
#. Backtransform the standard form solution :math:`\tilde{c}` to recover the
|
|
Packit |
67cb25 |
original solution vector :math:`c`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: large linear least squares, routines
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Large Dense Linear Least Squares Routines
|
|
Packit |
67cb25 |
-----------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multilarge_linear_workspace
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This workspace contains parameters for solving large linear least squares problems.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_multilarge_linear_workspace * gsl_multilarge_linear_alloc (const gsl_multilarge_linear_type * T, const size_t p)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function allocates a workspace for solving large linear least squares
|
|
Packit |
67cb25 |
systems. The least squares matrix :math:`X` has :data:`p` columns,
|
|
Packit |
67cb25 |
but may have any number of rows.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_multilarge_linear_type
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameter :data:`T` specifies
|
|
Packit |
67cb25 |
the method to be used for solving the large least squares system
|
|
Packit |
67cb25 |
and may be selected from the following choices
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multilarge_linear_normal
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies the normal equations approach for
|
|
Packit |
67cb25 |
solving the least squares system. This method is suitable
|
|
Packit |
67cb25 |
in cases where performance is critical and it is known that the
|
|
Packit |
67cb25 |
least squares matrix :math:`X` is well conditioned. The size
|
|
Packit |
67cb25 |
of this workspace is :math:`O(p^2)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_multilarge_linear_tsqr
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This specifies the sequential Tall Skinny QR (TSQR) approach for
|
|
Packit |
67cb25 |
solving the least squares system. This method is a good
|
|
Packit |
67cb25 |
general purpose choice for large systems, but requires about
|
|
Packit |
67cb25 |
twice as many operations as the normal equations method for
|
|
Packit |
67cb25 |
:math:`n >> p`. The size of this workspace is :math:`O(p^2)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void gsl_multilarge_linear_free (gsl_multilarge_linear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function frees the memory associated with the
|
|
Packit |
67cb25 |
workspace :data:`w`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: const char * gsl_multilarge_linear_name (gsl_multilarge_linear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns a string pointer to the name
|
|
Packit |
67cb25 |
of the multilarge solver.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_reset (gsl_multilarge_linear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function resets the workspace :data:`w` so
|
|
Packit |
67cb25 |
it can begin to accumulate a new least squares
|
|
Packit |
67cb25 |
system.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_stdform1 (const gsl_vector * L, const gsl_matrix * X, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
int gsl_multilarge_linear_wstdform1 (const gsl_vector * L, const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions define a regularization matrix
|
|
Packit |
67cb25 |
:math:`L = \diag(l_0,l_1,...,l_{p-1})`.
|
|
Packit |
67cb25 |
The diagonal matrix element :math:`l_i` is provided by the
|
|
Packit |
67cb25 |
:math:`i`-th element of the input vector :data:`L`.
|
|
Packit |
67cb25 |
The block (:data:`X`, :data:`y`) is converted to standard form and
|
|
Packit |
67cb25 |
the parameters (:math:`\tilde{X}`, :math:`\tilde{y}`) are stored in :data:`Xs`
|
|
Packit |
67cb25 |
and :data:`ys` on output. :data:`Xs` and :data:`ys` have the same dimensions as
|
|
Packit |
67cb25 |
:data:`X` and :data:`y`. Optional data weights may be supplied in the
|
|
Packit |
67cb25 |
vector :data:`w`. In order to apply this transformation,
|
|
Packit |
67cb25 |
:math:`L^{-1}` must exist and so none of the :math:`l_i`
|
|
Packit |
67cb25 |
may be zero. After the standard form system has been solved,
|
|
Packit |
67cb25 |
use :func:`gsl_multilarge_linear_genform1` to recover the original solution vector.
|
|
Packit |
67cb25 |
It is allowed to have :data:`X` = :data:`Xs` and :data:`y` = :data:`ys` for an
|
|
Packit |
67cb25 |
in-place transform.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_L_decomp (gsl_matrix * L, gsl_vector * tau)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function calculates the QR decomposition of the :math:`m`-by-:math:`p`
|
|
Packit |
67cb25 |
regularization matrix :data:`L`. :data:`L` must have :math:`m \ge p`. On output,
|
|
Packit |
67cb25 |
the Householder scalars are stored in the vector :data:`tau` of size :math:`p`.
|
|
Packit |
67cb25 |
These outputs will be used by :func:`gsl_multilarge_linear_wstdform2` to complete the
|
|
Packit |
67cb25 |
transformation to standard form.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_stdform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_matrix * X, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
int gsl_multilarge_linear_wstdform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_matrix * X, const gsl_vector * w, const gsl_vector * y, gsl_matrix * Xs, gsl_vector * ys, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions convert a block of rows (:data:`X`, :data:`y`, :data:`w`) to standard
|
|
Packit |
67cb25 |
form (:math:`\tilde{X}`, :math:`\tilde{y}`) which are stored in :data:`Xs` and :data:`ys`
|
|
Packit |
67cb25 |
respectively. :data:`X`, :data:`y`, and :data:`w` must all have the same number of rows.
|
|
Packit |
67cb25 |
The :math:`m`-by-:math:`p` regularization matrix :data:`L` is specified by the inputs
|
|
Packit |
67cb25 |
:data:`LQR` and :data:`Ltau`, which are outputs from :func:`gsl_multilarge_linear_L_decomp`.
|
|
Packit |
67cb25 |
:data:`Xs` and :data:`ys` have the same dimensions as :data:`X` and :data:`y`. After the
|
|
Packit |
67cb25 |
standard form system has been solved, use :func:`gsl_multilarge_linear_genform2` to
|
|
Packit |
67cb25 |
recover the original solution vector. Optional data weights may be supplied in the
|
|
Packit |
67cb25 |
vector :data:`w`, where :math:`W = \diag(w)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_accumulate (gsl_matrix * X, gsl_vector * y, gsl_multilarge_linear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function accumulates the standard form block (:math:`X,y`) into the
|
|
Packit |
67cb25 |
current least squares system. :data:`X` and :data:`y` have the same number
|
|
Packit |
67cb25 |
of rows, which can be arbitrary. :data:`X` must have :math:`p` columns.
|
|
Packit |
67cb25 |
For the TSQR method, :data:`X` and :data:`y` are destroyed on output.
|
|
Packit |
67cb25 |
For the normal equations method, they are both unchanged.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_solve (const double lambda, gsl_vector * c, double * rnorm, double * snorm, gsl_multilarge_linear_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
After all blocks (:math:`X_i,y_i`) have been accumulated into
|
|
Packit |
67cb25 |
the large least squares system, this function will compute
|
|
Packit |
67cb25 |
the solution vector which is stored in :data:`c` on output.
|
|
Packit |
67cb25 |
The regularization parameter :math:`\lambda` is provided in
|
|
Packit |
67cb25 |
:data:`lambda`. On output, :data:`rnorm` contains the residual norm
|
|
Packit |
67cb25 |
:math:`||y - X c||_W` and :data:`snorm` contains the solution
|
|
Packit |
67cb25 |
norm :math:`||L c||`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_genform1 (const gsl_vector * L, const gsl_vector * cs, gsl_vector * c, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
After a regularized system has been solved with
|
|
Packit |
67cb25 |
:math:`L = \diag(\l_0,\l_1,...,\l_{p-1})`,
|
|
Packit |
67cb25 |
this function backtransforms the standard form solution vector :data:`cs`
|
|
Packit |
67cb25 |
to recover the solution vector of the original problem :data:`c`. The
|
|
Packit |
67cb25 |
diagonal matrix elements :math:`l_i` are provided in
|
|
Packit |
67cb25 |
the vector :data:`L`. It is allowed to have :data:`c` = :data:`cs` for an
|
|
Packit |
67cb25 |
in-place transform.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_genform2 (const gsl_matrix * LQR, const gsl_vector * Ltau, const gsl_vector * cs, gsl_vector * c, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
After a regularized system has been solved with a regularization matrix :math:`L`,
|
|
Packit |
67cb25 |
specified by (:data:`LQR`, :data:`Ltau`), this function backtransforms the standard form
|
|
Packit |
67cb25 |
solution :data:`cs`
|
|
Packit |
67cb25 |
to recover the solution vector of the original problem, which is stored in :data:`c`,
|
|
Packit |
67cb25 |
of length :math:`p`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_lcurve (gsl_vector * reg_param, gsl_vector * rho, gsl_vector * eta, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the L-curve for a large least squares system
|
|
Packit |
67cb25 |
after it has been fully accumulated into the workspace :data:`work`.
|
|
Packit |
67cb25 |
The output vectors :data:`reg_param`, :data:`rho`, and :data:`eta` must all
|
|
Packit |
67cb25 |
be the same size, and will contain the regularization parameters
|
|
Packit |
67cb25 |
:math:`\lambda_i`, residual norms :math:`||y - X c_i||`, and solution
|
|
Packit |
67cb25 |
norms :math:`|| L c_i ||` which compose the L-curve, where :math:`c_i`
|
|
Packit |
67cb25 |
is the regularized solution vector corresponding to :math:`\lambda_i`.
|
|
Packit |
67cb25 |
The user may determine the number of points on the L-curve by
|
|
Packit |
67cb25 |
adjusting the size of these input arrays. For the TSQR method,
|
|
Packit |
67cb25 |
the regularization parameters :math:`\lambda_i` are estimated from the
|
|
Packit |
67cb25 |
singular values of the triangular :math:`R` factor. For the normal
|
|
Packit |
67cb25 |
equations method, they are estimated from the eigenvalues of the
|
|
Packit |
67cb25 |
:math:`X^T X` matrix.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_multilarge_linear_rcond (double * rcond, gsl_multilarge_linear_workspace * work)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the reciprocal condition number, stored in
|
|
Packit |
67cb25 |
:data:`rcond`, of the least squares matrix after it has been accumulated
|
|
Packit |
67cb25 |
into the workspace :data:`work`. For the TSQR algorithm, this is
|
|
Packit |
67cb25 |
accomplished by calculating the SVD of the :math:`R` factor, which
|
|
Packit |
67cb25 |
has the same singular values as the matrix :math:`X`. For the normal
|
|
Packit |
67cb25 |
equations method, this is done by computing the eigenvalues of
|
|
Packit |
67cb25 |
:math:`X^T X`, which could be inaccurate for ill-conditioned matrices
|
|
Packit |
67cb25 |
:math:`X`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index:: least squares troubleshooting
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Troubleshooting
|
|
Packit |
67cb25 |
===============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When using models based on polynomials, care should be taken when constructing the design matrix
|
|
Packit |
67cb25 |
:math:`X`. If the :math:`x` values are large, then the matrix :math:`X` could be ill-conditioned
|
|
Packit |
67cb25 |
since its columns are powers of :math:`x`, leading to unstable least-squares solutions.
|
|
Packit |
67cb25 |
In this case it can often help to center and scale the :math:`x` values using the mean and standard deviation:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: x' = {x - \mu(x) \over \sigma(x)}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
x' = (x - mu)/sigma
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and then construct the :math:`X` matrix using the transformed values :math:`x'`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Examples
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The example programs in this section demonstrate the various linear regression methods.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Simple Linear Regression Example
|
|
Packit |
67cb25 |
--------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following program computes a least squares straight-line fit to a
|
|
Packit |
67cb25 |
simple dataset, and outputs the best-fit line and its
|
|
Packit |
67cb25 |
associated one standard-deviation error bars.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/fitting.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following commands extract the data from the output of the program
|
|
Packit |
67cb25 |
and display it using the GNU plotutils "graph" utility::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
$ ./demo > tmp
|
|
Packit |
67cb25 |
$ more tmp
|
|
Packit |
67cb25 |
# best fit: Y = -106.6 + 0.06 X
|
|
Packit |
67cb25 |
# covariance matrix:
|
|
Packit |
67cb25 |
# [ 39602, -19.9
|
|
Packit |
67cb25 |
# -19.9, 0.01]
|
|
Packit |
67cb25 |
# chisq = 0.8
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
$ for n in data fit hi lo ;
|
|
Packit |
67cb25 |
do
|
|
Packit |
67cb25 |
grep "^$n" tmp | cut -d: -f2 > $n ;
|
|
Packit |
67cb25 |
done
|
|
Packit |
67cb25 |
$ graph -T X -X x -Y y -y 0 20 -m 0 -S 2 -Ie data
|
|
Packit |
67cb25 |
-S 0 -I a -m 1 fit -m 2 hi -m 2 lo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The result is shown in :numref:`fig_fit-wlinear`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_fit-wlinear:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/fit-wlinear.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Straight line fit with 1-:math:`\sigma` error bars
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Multi-parameter Linear Regression Example
|
|
Packit |
67cb25 |
-----------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following program performs a quadratic fit :math:`y = c_0 + c_1 x + c_2 x^2`
|
|
Packit |
67cb25 |
to a weighted dataset using the generalised linear fitting function
|
|
Packit |
67cb25 |
:func:`gsl_multifit_wlinear`. The model matrix :math:`X` for a quadratic
|
|
Packit |
67cb25 |
fit is given by,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
X =
|
|
Packit |
67cb25 |
\left(
|
|
Packit |
67cb25 |
\begin{array}{ccc}
|
|
Packit |
67cb25 |
1 & x_0 & x_0^2 \\
|
|
Packit |
67cb25 |
1 & x_1 & x_1^2 \\
|
|
Packit |
67cb25 |
1 & x_2 & x_2^2 \\
|
|
Packit |
67cb25 |
\dots & \dots & \dots
|
|
Packit |
67cb25 |
\end{array}
|
|
Packit |
67cb25 |
\right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
X = [ 1 , x_0 , x_0^2 ;
|
|
Packit |
67cb25 |
1 , x_1 , x_1^2 ;
|
|
Packit |
67cb25 |
1 , x_2 , x_2^2 ;
|
|
Packit |
67cb25 |
... , ... , ... ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where the column of ones corresponds to the constant term :math:`c_0`.
|
|
Packit |
67cb25 |
The two remaining columns corresponds to the terms :math:`c_1 x` and
|
|
Packit |
67cb25 |
:math:`c_2 x^2`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program reads :data:`n` lines of data in the format (:data:`x`, :data:`y`,
|
|
Packit |
67cb25 |
:data:`err`) where :data:`err` is the error (standard deviation) in the
|
|
Packit |
67cb25 |
value :data:`y`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/fitting2.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A suitable set of data for fitting can be generated using the following
|
|
Packit |
67cb25 |
program. It outputs a set of points with gaussian errors from the curve
|
|
Packit |
67cb25 |
:math:`y = e^x` in the region :math:`0 < x < 2`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/fitting3.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The data can be prepared by running the resulting executable program::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
$ GSL_RNG_TYPE=mt19937_1999 ./generate > exp.dat
|
|
Packit |
67cb25 |
$ more exp.dat
|
|
Packit |
67cb25 |
0.1 0.97935 0.110517
|
|
Packit |
67cb25 |
0.2 1.3359 0.12214
|
|
Packit |
67cb25 |
0.3 1.52573 0.134986
|
|
Packit |
67cb25 |
0.4 1.60318 0.149182
|
|
Packit |
67cb25 |
0.5 1.81731 0.164872
|
|
Packit |
67cb25 |
0.6 1.92475 0.182212
|
|
Packit |
67cb25 |
....
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
To fit the data use the previous program, with the number of data points
|
|
Packit |
67cb25 |
given as the first argument. In this case there are 19 data points::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
$ ./fit 19 < exp.dat
|
|
Packit |
67cb25 |
0.1 0.97935 +/- 0.110517
|
|
Packit |
67cb25 |
0.2 1.3359 +/- 0.12214
|
|
Packit |
67cb25 |
...
|
|
Packit |
67cb25 |
# best fit: Y = 1.02318 + 0.956201 X + 0.876796 X^2
|
|
Packit |
67cb25 |
# covariance matrix:
|
|
Packit |
67cb25 |
[ +1.25612e-02, -3.64387e-02, +1.94389e-02
|
|
Packit |
67cb25 |
-3.64387e-02, +1.42339e-01, -8.48761e-02
|
|
Packit |
67cb25 |
+1.94389e-02, -8.48761e-02, +5.60243e-02 ]
|
|
Packit |
67cb25 |
# chisq = 23.0987
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameters of the quadratic fit match the coefficients of the
|
|
Packit |
67cb25 |
expansion of :math:`e^x`, taking into account the errors on the
|
|
Packit |
67cb25 |
parameters and the :math:`O(x^3)` difference between the exponential and
|
|
Packit |
67cb25 |
quadratic functions for the larger values of :math:`x`. The errors on
|
|
Packit |
67cb25 |
the parameters are given by the square-root of the corresponding
|
|
Packit |
67cb25 |
diagonal elements of the covariance matrix. The chi-squared per degree
|
|
Packit |
67cb25 |
of freedom is 1.4, indicating a reasonable fit to the data.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
:numref:`fig_fit-wlinear2` shows the resulting fit.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_fit-wlinear2:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/fit-wlinear2.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Weighted fit example with error bars
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Regularized Linear Regression Example 1
|
|
Packit |
67cb25 |
---------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The next program demonstrates the difference between ordinary and
|
|
Packit |
67cb25 |
regularized least squares when the design matrix is near-singular.
|
|
Packit |
67cb25 |
In this program, we generate two random normally distributed variables
|
|
Packit |
67cb25 |
:math:`u` and :math:`v`, with :math:`v = u + noise` so that :math:`u`
|
|
Packit |
67cb25 |
and :math:`v` are nearly colinear. We then set a third dependent
|
|
Packit |
67cb25 |
variable :math:`y = u + v + noise` and solve for the coefficients
|
|
Packit |
67cb25 |
:math:`c_1,c_2` of the model :math:`Y(c_1,c_2) = c_1 u + c_2 v`.
|
|
Packit |
67cb25 |
Since :math:`u \approx v`, the design matrix :math:`X` is nearly
|
|
Packit |
67cb25 |
singular, leading to unstable ordinary least squares solutions.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Here is the program output::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
matrix condition number = 1.025113e+04
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Unregularized fit ===
|
|
Packit |
67cb25 |
best fit: y = -43.6588 u + 45.6636 v
|
|
Packit |
67cb25 |
residual norm = 31.6248
|
|
Packit |
67cb25 |
solution norm = 63.1764
|
|
Packit |
67cb25 |
chisq/dof = 1.00213
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Regularized fit (L-curve) ===
|
|
Packit |
67cb25 |
optimal lambda: 4.51103
|
|
Packit |
67cb25 |
best fit: y = 1.00113 u + 1.0032 v
|
|
Packit |
67cb25 |
residual norm = 31.6547
|
|
Packit |
67cb25 |
solution norm = 1.41728
|
|
Packit |
67cb25 |
chisq/dof = 1.04499
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Regularized fit (GCV) ===
|
|
Packit |
67cb25 |
optimal lambda: 0.0232029
|
|
Packit |
67cb25 |
best fit: y = -19.8367 u + 21.8417 v
|
|
Packit |
67cb25 |
residual norm = 31.6332
|
|
Packit |
67cb25 |
solution norm = 29.5051
|
|
Packit |
67cb25 |
chisq/dof = 1.00314
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
We see that the ordinary least squares solution is completely wrong,
|
|
Packit |
67cb25 |
while the L-curve regularized method with the optimal
|
|
Packit |
67cb25 |
:math:`\lambda = 4.51103` finds the correct solution
|
|
Packit |
67cb25 |
:math:`c_1 \approx c_2 \approx 1`. The GCV regularized method finds
|
|
Packit |
67cb25 |
a regularization parameter :math:`\lambda = 0.0232029` which is too
|
|
Packit |
67cb25 |
small to give an accurate solution, although it performs better than OLS.
|
|
Packit |
67cb25 |
The L-curve and its computed corner, as well as the GCV curve and its
|
|
Packit |
67cb25 |
minimum are plotted in :numref:`fig_regularized`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_regularized:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/regularized.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
L-curve and GCV curve for example program.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/fitreg.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Regularized Linear Regression Example 2
|
|
Packit |
67cb25 |
---------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example program minimizes the cost function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: ||y - X c||^2 + \lambda^2 ||x||^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`X` is the :math:`10`-by-:math:`8` Hilbert matrix whose
|
|
Packit |
67cb25 |
entries are given by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: X_{ij} = {1 \over i + j - 1}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
X_{ij} = 1 / (i + j - 1)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
and the right hand side vector is given by
|
|
Packit |
67cb25 |
:math:`y = [1,-1,1,-1,1,-1,1,-1,1,-1]^T`. Solutions
|
|
Packit |
67cb25 |
are computed for :math:`\lambda = 0` (unregularized) as
|
|
Packit |
67cb25 |
well as for optimal parameters :math:`\lambda` chosen by
|
|
Packit |
67cb25 |
analyzing the L-curve and GCV curve.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Here is the program output::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
matrix condition number = 3.565872e+09
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Unregularized fit ===
|
|
Packit |
67cb25 |
residual norm = 2.15376
|
|
Packit |
67cb25 |
solution norm = 2.92217e+09
|
|
Packit |
67cb25 |
chisq/dof = 2.31934
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Regularized fit (L-curve) ===
|
|
Packit |
67cb25 |
optimal lambda: 7.11407e-07
|
|
Packit |
67cb25 |
residual norm = 2.60386
|
|
Packit |
67cb25 |
solution norm = 424507
|
|
Packit |
67cb25 |
chisq/dof = 3.43565
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
=== Regularized fit (GCV) ===
|
|
Packit |
67cb25 |
optimal lambda: 1.72278
|
|
Packit |
67cb25 |
residual norm = 3.1375
|
|
Packit |
67cb25 |
solution norm = 0.139357
|
|
Packit |
67cb25 |
chisq/dof = 4.95076
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Here we see the unregularized solution results in a large solution
|
|
Packit |
67cb25 |
norm due to the ill-conditioned matrix. The L-curve solution finds
|
|
Packit |
67cb25 |
a small value of :math:`\lambda = 7.11e-7` which still results in
|
|
Packit |
67cb25 |
a badly conditioned system and a large solution norm. The GCV method
|
|
Packit |
67cb25 |
finds a parameter :math:`\lambda = 1.72` which results in a well-conditioned
|
|
Packit |
67cb25 |
system and small solution norm.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The L-curve and its computed corner, as well as the GCV curve and its
|
|
Packit |
67cb25 |
minimum are plotted in :numref:`fig_regularized2`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_regularized2:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/regularized2.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
L-curve and GCV curve for example program.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/fitreg2.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Robust Linear Regression Example
|
|
Packit |
67cb25 |
--------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The next program demonstrates the advantage of robust least squares on
|
|
Packit |
67cb25 |
a dataset with outliers. The program generates linear :math:`(x,y)`
|
|
Packit |
67cb25 |
data pairs on the line :math:`y = 1.45 x + 3.88`, adds some random
|
|
Packit |
67cb25 |
noise, and inserts 3 outliers into the dataset. Both the robust
|
|
Packit |
67cb25 |
and ordinary least squares (OLS) coefficients are computed for
|
|
Packit |
67cb25 |
comparison.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/robfit.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The output from the program is shown in :numref:`fig_robust`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_robust:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/robust.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Linear fit to dataset with outliers.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Large Dense Linear Regression Example
|
|
Packit |
67cb25 |
-------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following program demostrates the large dense linear least squares
|
|
Packit |
67cb25 |
solvers. This example is adapted from Trefethen and Bau,
|
|
Packit |
67cb25 |
and fits the function :math:`f(t) = \exp{(\sin^3{(10t)}})` on
|
|
Packit |
67cb25 |
the interval :math:`[0,1]` with a degree 15 polynomial. The
|
|
Packit |
67cb25 |
program generates :math:`n = 50000` equally spaced points
|
|
Packit |
67cb25 |
:math:`t_i` on this interval, calculates the function value
|
|
Packit |
67cb25 |
and adds random noise to determine the observation value
|
|
Packit |
67cb25 |
:math:`y_i`. The entries of the least squares matrix are
|
|
Packit |
67cb25 |
:math:`X_{ij} = t_i^j`, representing a polynomial fit. The
|
|
Packit |
67cb25 |
matrix is highly ill-conditioned, with a condition number
|
|
Packit |
67cb25 |
of about :math:`1.4 \cdot 10^{11}`. The program accumulates the
|
|
Packit |
67cb25 |
matrix into the least squares system in 5 blocks, each with
|
|
Packit |
67cb25 |
10000 rows. This way the full matrix :math:`X` is never
|
|
Packit |
67cb25 |
stored in memory. We solve the system with both the
|
|
Packit |
67cb25 |
normal equations and TSQR methods. The results are shown
|
|
Packit |
67cb25 |
in :numref:`fig_multilarge`. In the top left plot, we see the unregularized
|
|
Packit |
67cb25 |
normal equations solution has larger error than TSQR due to
|
|
Packit |
67cb25 |
the ill-conditioning of the matrix. In the bottom left plot,
|
|
Packit |
67cb25 |
we show the L-curve, which exhibits multiple corners.
|
|
Packit |
67cb25 |
In the top right panel, we plot a regularized solution using
|
|
Packit |
67cb25 |
:math:`\lambda = 10^{-6}`. The TSQR and normal solutions now agree,
|
|
Packit |
67cb25 |
however they are unable to provide a good fit due to the damping.
|
|
Packit |
67cb25 |
This indicates that for some ill-conditioned
|
|
Packit |
67cb25 |
problems, regularizing the normal equations does not improve the
|
|
Packit |
67cb25 |
solution. This is further illustrated in the bottom right panel,
|
|
Packit |
67cb25 |
where we plot the L-curve calculated from the normal equations.
|
|
Packit |
67cb25 |
The curve agrees with the TSQR curve for larger damping parameters,
|
|
Packit |
67cb25 |
but for small :math:`\lambda`, the normal equations approach cannot
|
|
Packit |
67cb25 |
provide accurate solution vectors leading to numerical
|
|
Packit |
67cb25 |
inaccuracies in the left portion of the curve.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_multilarge:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/multilarge.png
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/largefit.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
References and Further Reading
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A summary of formulas and techniques for least squares fitting can be
|
|
Packit |
67cb25 |
found in the "Statistics" chapter of the Annual Review of Particle
|
|
Packit |
67cb25 |
Physics prepared by the Particle Data Group,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* *Review of Particle Properties*,
|
|
Packit |
67cb25 |
R.M. Barnett et al., Physical Review D54, 1 (1996)
|
|
Packit |
67cb25 |
http://pdg.lbl.gov
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The Review of Particle Physics is available online at the website given
|
|
Packit |
67cb25 |
above.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: NIST Statistical Reference Datasets
|
|
Packit |
67cb25 |
single: Statistical Reference Datasets (StRD)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The tests used to prepare these routines are based on the NIST
|
|
Packit |
67cb25 |
Statistical Reference Datasets. The datasets and their documentation are
|
|
Packit |
67cb25 |
available from NIST at the following website,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
http://www.nist.gov/itl/div898/strd/index.html
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
More information on Tikhonov regularization can be found in
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Hansen, P. C. (1998), Rank-Deficient and Discrete Ill-Posed Problems:
|
|
Packit |
67cb25 |
Numerical Aspects of Linear Inversion. SIAM Monogr. on Mathematical
|
|
Packit |
67cb25 |
Modeling and Computation, Society for Industrial and Applied Mathematics
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* M. Rezghi and S. M. Hosseini (2009), A new variant of L-curve for
|
|
Packit |
67cb25 |
Tikhonov regularization, Journal of Computational and Applied Mathematics,
|
|
Packit |
67cb25 |
Volume 231, Issue 2, pages 914-924.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The GSL implementation of robust linear regression closely follows the publications
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* DuMouchel, W. and F. O'Brien (1989), "Integrating a robust
|
|
Packit |
67cb25 |
option into a multiple regression computing environment,"
|
|
Packit |
67cb25 |
Computer Science and Statistics: Proceedings of the 21st
|
|
Packit |
67cb25 |
Symposium on the Interface, American Statistical Association
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Street, J.O., R.J. Carroll, and D. Ruppert (1988), "A note on
|
|
Packit |
67cb25 |
computing robust regression estimates via iteratively
|
|
Packit |
67cb25 |
reweighted least squares," The American Statistician, v. 42,
|
|
Packit |
67cb25 |
pp. 152-154.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
More information about the normal equations and TSQR approach for solving
|
|
Packit |
67cb25 |
large linear least squares systems can be found in the publications
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Trefethen, L. N. and Bau, D. (1997), "Numerical Linear Algebra", SIAM.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* Demmel, J., Grigori, L., Hoemmen, M. F., and Langou, J.
|
|
Packit |
67cb25 |
"Communication-optimal parallel and sequential QR and LU factorizations",
|
|
Packit |
67cb25 |
UCB Technical Report No. UCB/EECS-2008-89, 2008.
|