Blame doc/movstat.rst

Packit 67cb25
.. index::
Packit 67cb25
   single: moving window statistics
Packit 67cb25
   single: statistics, moving window
Packit 67cb25
   single: online statistics
Packit 67cb25
Packit 67cb25
************************
Packit 67cb25
Moving Window Statistics
Packit 67cb25
************************
Packit 67cb25
Packit 67cb25
This chapter describes routines for computing *moving window
Packit 67cb25
statistics* (also called rolling statistics and running statistics),
Packit 67cb25
using a window around a sample which is used to calculate various
Packit 67cb25
local statistical properties of an input data stream. The window is
Packit 67cb25
then slid forward by one sample to process the next data point and so on.
Packit 67cb25
Packit 67cb25
The functions described in this chapter are declared in the header file
Packit 67cb25
:file:`gsl_movstat.h`.
Packit 67cb25
Packit 67cb25
Introduction
Packit 67cb25
============
Packit 67cb25
Packit 67cb25
This chapter is concerned with calculating various statistics from
Packit 67cb25
subsets of a given dataset. The main idea is to compute statistics
Packit 67cb25
in the vicinity of a given data sample by defining a *window* which
Packit 67cb25
includes the sample itself as well as some specified number of samples
Packit 67cb25
before and after the sample in question. For a sample :math:`x_i`, we
Packit 67cb25
define a window :math:`W_i^{H,J}` as
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: W_i^{H,J} = \left\{ x_{i-H}, \dots, x_i, \dots, x_{i+J} \right\}
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      W_i^{H,J} = {x_{i-H},...,x_i,...,x_{i+J}}
Packit 67cb25
Packit 67cb25
The parameters :math:`H` and :math:`J` are non-negative integers specifying
Packit 67cb25
the number of samples to include before and after the sample :math:`x_i`.
Packit 67cb25
Statistics such as the mean and standard deviation of the window :math:`W_i^{H,J}`
Packit 67cb25
may be computed, and then the window is shifted forward by one sample to
Packit 67cb25
focus on :math:`x_{i+1}`. The total number of samples in the window is
Packit 67cb25
:math:`K = H + J + 1`. To define a symmetric window centered on :math:`x_i`,
Packit 67cb25
one would set :math:`H = J = \left\lfloor K / 2 \right\rfloor`.
Packit 67cb25
Packit 67cb25
Handling Endpoints
Packit 67cb25
==================
Packit 67cb25
Packit 67cb25
When processing samples near the ends of the input signal, there will not
Packit 67cb25
be enough samples to fill the window :math:`W_i^{H,J}` defined above.
Packit 67cb25
Therefore the user must specify how to construct the windows near the end points.
Packit 67cb25
This is done by passing an input argument of type :type:`gsl_movstat_end_t`:
Packit 67cb25
Packit 67cb25
.. type:: gsl_movstat_end_t
Packit 67cb25
Packit 67cb25
   This data type specifies how to construct windows near end points and can
Packit 67cb25
   be selected from the following choices:
Packit 67cb25
Packit 67cb25
   .. macro:: GSL_MOVSTAT_END_PADZERO
Packit 67cb25
Packit 67cb25
      With this option, a full window of length :math:`K` will be constructed
Packit 67cb25
      by inserting zeros into the window near the signal end points. Effectively,
Packit 67cb25
      the input signal is modified to
Packit 67cb25
Packit 67cb25
      .. only:: not texinfo
Packit 67cb25
Packit 67cb25
         .. math:: \tilde{x} = \{ \underbrace{0, \dots, 0}_{H \textrm{ zeros}}, x_1, x_2, \dots, x_{n-1}, x_n, \underbrace{0, \dots, 0}_{J \textrm{ zeros} } \}
Packit 67cb25
Packit 67cb25
      .. only:: texinfo
Packit 67cb25
Packit 67cb25
         ::
Packit 67cb25
Packit 67cb25
            x~ = {0, ..., 0, x_1, x_2, ..., x_{n-1}, x_n, 0, ..., 0}
Packit 67cb25
Packit 67cb25
      to ensure a well-defined window for all :math:`x_i`.
Packit 67cb25
Packit 67cb25
   .. macro:: GSL_MOVSTAT_END_PADVALUE
Packit 67cb25
Packit 67cb25
      With this option, a full window of length :math:`K` will be constructed
Packit 67cb25
      by padding the window with the first and last sample in the input signal.
Packit 67cb25
      Effectively, the input signal is modified to
Packit 67cb25
Packit 67cb25
      .. only:: not texinfo
Packit 67cb25
Packit 67cb25
         .. math:: \tilde{x} = \{ \underbrace{x_1, \dots, x_1}_{H}, x_1, x_2, \dots, x_{n-1}, x_n, \underbrace{x_n, \dots, x_n}_{J} \}
Packit 67cb25
Packit 67cb25
      .. only:: texinfo
Packit 67cb25
Packit 67cb25
         ::
Packit 67cb25
Packit 67cb25
            x~ = {x_1, ..., x_1, x_1, x_2, ..., x_{n-1}, x_n, x_n, ..., x_n}
Packit 67cb25
Packit 67cb25
   .. macro:: GSL_MOVSTAT_END_TRUNCATE
Packit 67cb25
Packit 67cb25
      With this option, no padding is performed, and the windows are simply truncated
Packit 67cb25
      as the end points are approached.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving window, allocation
Packit 67cb25
Packit 67cb25
Allocation for Moving Window Statistics
Packit 67cb25
=======================================
Packit 67cb25
Packit 67cb25
.. type:: gsl_movstat_workspace
Packit 67cb25
Packit 67cb25
   The moving window statistical routines use a common workspace.
Packit 67cb25
Packit 67cb25
.. function:: gsl_movstat_workspace * gsl_movstat_alloc(const size_t K)
Packit 67cb25
Packit 67cb25
   This function allocates a workspace for computing symmetric, centered moving statistics with a window
Packit 67cb25
   length of :math:`K` samples. In this case, :math:`H = J = \left\lfloor K/2 \right\rfloor`. The size of the workspace
Packit 67cb25
   is :math:`O(7K)`.
Packit 67cb25
Packit 67cb25
.. function:: gsl_movstat_workspace * gsl_movstat_alloc2(const size_t H, const size_t J)
Packit 67cb25
Packit 67cb25
   This function allocates a workspace for computing moving statistics using a window with :math:`H`
Packit 67cb25
   samples prior to the current sample, and :math:`J` samples after the current sample. The
Packit 67cb25
   total window size is :math:`K = H + J + 1`. The size of the workspace is :math:`O(7K)`.
Packit 67cb25
Packit 67cb25
.. function:: void * gsl_movstat_free(gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function frees the memory associated with :data:`w`.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving mean
Packit 67cb25
   single: rolling mean
Packit 67cb25
Packit 67cb25
Moving Mean
Packit 67cb25
===========
Packit 67cb25
Packit 67cb25
The moving window mean calculates the mean of the values of each window :math:`W_i^{H,J}`.
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: \hat{\mu}_i = \frac{1}{\left| W_i^{H,J} \right|} \sum_{x_m \in W_i^{H,J}} x_m
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \hat{\mu}_i = 1/| W_i^{H,J} | \sum_{x_m \in W_i^{H,J}} x_m
Packit 67cb25
Packit 67cb25
Here, :math:`\left| W_i^{H,J} \right|` represents the number of elements in the window
Packit 67cb25
:math:`W_i^{H,J}`. This will normally be :math:`K`, unless the :macro:`GSL_MOVSTAT_END_TRUNCATE`
Packit 67cb25
option is selected, in which case it could be less than :math:`K` near the signal end points.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_mean(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving window mean of the input vector :data:`x`, storing
Packit 67cb25
   the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
Packit 67cb25
   for an in-place moving mean.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving variance
Packit 67cb25
   single: moving standard deviation
Packit 67cb25
   single: rolling variance
Packit 67cb25
   single: rolling standard deviation
Packit 67cb25
Packit 67cb25
Moving Variance and Standard Deviation
Packit 67cb25
======================================
Packit 67cb25
Packit 67cb25
The moving window variance calculates the *sample variance* of the values of each window :math:`W_i^{H,J}`,
Packit 67cb25
defined by
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
   .. math:: \hat{\sigma}_i^2 = \frac{1}{\left( \left| W_i^{H,J} \right| - 1 \right)} \sum_{x_m \in W_i^{H,J}} \left( x_m - \hat{\mu}_i \right)^2
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      \hat{\sigma}_i^2 = 1/(|W_i^{H,J}| - 1) \sum_{x_m \in W_i^{H,J}} ( x_m - \hat{\mu}_i )^2
Packit 67cb25
Packit 67cb25
where :math:`\hat{\mu}_i` is the mean of :math:`W_i^{H,J}` defined above. The standard deviation :math:`\hat{\sigma}_i`
Packit 67cb25
is the square root of the variance.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_variance(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving window variance of the input vector :data:`x`, storing
Packit 67cb25
   the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
Packit 67cb25
   for an in-place moving variance.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_sd(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving window standard deviation of the input vector :data:`x`, storing
Packit 67cb25
   the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
Packit 67cb25
   for an in-place moving standard deviation.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving minimum
Packit 67cb25
   single: moving maximum
Packit 67cb25
   single: rolling minimum
Packit 67cb25
   single: rolling maximum
Packit 67cb25
Packit 67cb25
Moving Minimum and Maximum
Packit 67cb25
==========================
Packit 67cb25
Packit 67cb25
The moving minimum/maximum calculates the minimum and maximum values of
Packit 67cb25
each window :math:`W_i^{H,J}`.
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
    .. math::
Packit 67cb25
Packit 67cb25
       y_i^{min} &= \min \left( W_i^{H,J} \right) \\
Packit 67cb25
       y_i^{max} &= \max \left( W_i^{H,J} \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      y_i^{min} = \min W_i^{H,J}
Packit 67cb25
      y_i^{max} = \max W_i^{H,J}
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_min(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving minimum of the input vector :data:`x`, storing
Packit 67cb25
   the result in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled.
Packit 67cb25
   It is allowed to have :data:`x` = :data:`y` for an in-place moving minimum.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_max(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving maximum of the input vector :data:`x`, storing
Packit 67cb25
   the result in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled.
Packit 67cb25
   It is allowed to have :data:`x` = :data:`y` for an in-place moving maximum.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_minmax(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y_min, gsl_vector * y_max, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving minimum and maximum of the input vector :data:`x`, storing
Packit 67cb25
   the window minimums in :data:`y_min` and the window maximums in :data:`y_max`.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving sum
Packit 67cb25
   single: rolling sum
Packit 67cb25
Packit 67cb25
Moving Sum
Packit 67cb25
==========
Packit 67cb25
Packit 67cb25
The moving window sum calculates the sum of the values of each window :math:`W_i^{H,J}`.
Packit 67cb25
Packit 67cb25
.. math:: y_i = \sum_{x_m \in W_i^{H,J}} x_m
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_sum(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving window sum of the input vector :data:`x`, storing
Packit 67cb25
   the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
Packit 67cb25
   for an in-place moving sum.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving median
Packit 67cb25
   single: rolling median
Packit 67cb25
Packit 67cb25
Moving Median
Packit 67cb25
=============
Packit 67cb25
Packit 67cb25
The moving median calculates the median of the window :math:`W_i^{H,J}` for
Packit 67cb25
each sample :math:`x_i`:
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
    .. math:: y_i = \textrm{median} \left( W_i^{H,J} \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      y_i = median(W_i^{H,J})
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_median(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving median of the input vector :data:`x`, storing
Packit 67cb25
   the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
Packit 67cb25
   the ends of the input should be handled. It is allowed for
Packit 67cb25
   :data:`x` = :data:`y` for an in-place moving window median.
Packit 67cb25
Packit 67cb25
Robust Scale Estimation
Packit 67cb25
=======================
Packit 67cb25
Packit 67cb25
A common problem in statistics is to quantify the dispersion (also known as the variability, scatter, and spread) of
Packit 67cb25
a set of data. Often this is done by calculating the variance or standard deviation. However these statistics
Packit 67cb25
are strongly influenced by outliers, and can often provide erroneous results when even a small number of outliers
Packit 67cb25
are present.
Packit 67cb25
Packit 67cb25
Several useful statistics have emerged to provide robust estimates of scale which are not as susceptible to data outliers.
Packit 67cb25
A few of these statistical scale estimators are described below.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving median absolute deviation
Packit 67cb25
   single: rolling median absolute deviation
Packit 67cb25
Packit 67cb25
Moving MAD
Packit 67cb25
----------
Packit 67cb25
Packit 67cb25
The median absolute deviation (MAD) for the window :math:`W_i^{H,J}` is defined
Packit 67cb25
to be the median of the absolute deviations from the window's median:
Packit 67cb25
Packit 67cb25
.. only:: not texinfo
Packit 67cb25
Packit 67cb25
    .. math:: MAD_i = 1.4826 \times \textrm{median} \left( \left| W_i^{H,J} - \textrm{median} \left( W_i^{H,J} \right) \right| \right)
Packit 67cb25
Packit 67cb25
.. only:: texinfo
Packit 67cb25
Packit 67cb25
   ::
Packit 67cb25
Packit 67cb25
      MAD_i = 1.4826 * median[ |W_i^{H,J} - median(W_i^{H,J})| ]
Packit 67cb25
Packit 67cb25
The factor of :math:`1.4826` makes the MAD an unbiased estimator of the standard deviation
Packit 67cb25
for Gaussian data. The MAD has an efficiency of 37%.  See :ref:`here <sec_mad-statistic>` for more information.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_mad0(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xmedian, gsl_vector * xmad, gsl_movstat_workspace * w)
Packit 67cb25
              int gsl_movstat_mad(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xmedian, gsl_vector * xmad, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   These functions compute the moving MAD of the input vector :data:`x` and store the result
Packit 67cb25
   in :data:`xmad`. The medians of each window :math:`W_i^{H,J}` are stored in :data:`xmedian`
Packit 67cb25
   on output. The inputs :data:`x`, :data:`xmedian`, and :data:`xmad` must all be the same length.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
   The function :code:`mad0` does not include the scale factor of :math:`1.4826`, while the
Packit 67cb25
   function :code:`mad` does include this factor.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving quantile range
Packit 67cb25
   single: rolling quantile range
Packit 67cb25
Packit 67cb25
Moving QQR
Packit 67cb25
----------
Packit 67cb25
Packit 67cb25
The q-quantile range (QQR) is the difference between the :math:`(1-q)` and :math:`q` quantiles
Packit 67cb25
of a set of data,
Packit 67cb25
Packit 67cb25
.. math:: QQR = Q_{1-q} - Q_q
Packit 67cb25
Packit 67cb25
The case :math:`q = 0.25` corresponds to the well-known *interquartile range (IQR)*, which
Packit 67cb25
is the difference between the 75th and 25th percentiles of a set of data. The QQR is
Packit 67cb25
a *trimmed estimator*, the main idea being to discard the largest and smallest values in
Packit 67cb25
a data window and compute a scale estimate from the remaining middle values. In the case
Packit 67cb25
of the IQR, the largest and smallest 25% of the data are discarded and the scale is
Packit 67cb25
estimated from the remaining (middle) 50%.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_qqr(const gsl_movstat_end_t endtype, const gsl_vector * x, const double q, gsl_vector * xqqr, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving QQR of the input vector :data:`x` and stores the q-quantile ranges
Packit 67cb25
   of each window :math:`W_i^{H,J}` in :data:`xqqr`. The quantile parameter :data:`q` must be between
Packit 67cb25
   :math:`0` and :math:`0.5`. The input :math:`q = 0.25` corresponds to the IQR.
Packit 67cb25
   The inputs :data:`x` and :data:`xqqr` must be the same length.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
Packit 67cb25
Moving :math:`S_n`
Packit 67cb25
------------------
Packit 67cb25
Packit 67cb25
The :math:`S_n` statistic proposed by Croux and Rousseeuw is based on pairwise differences between
Packit 67cb25
all samples in the window. It has an efficiency of 58%, significantly higher than the MAD.
Packit 67cb25
See :ref:`here <sec_Sn-statistic>` for more information.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_Sn(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xscale, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving :math:`S_n` of the input vector :data:`x` and stores the output
Packit 67cb25
   in :data:`xscale`. The inputs :data:`x` and :data:`xscale` must be the same length.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
   It is allowed for :data:`x` = :data:`xscale` for an in-place moving window :math:`S_n`.
Packit 67cb25
Packit 67cb25
Moving :math:`Q_n`
Packit 67cb25
------------------
Packit 67cb25
Packit 67cb25
The :math:`Q_n` statistic proposed by Croux and Rousseeuw is loosely based on the Hodges-Lehmann location
Packit 67cb25
estimator. It has a relatively high efficiency of 82%. See :ref:`here <sec_Qn-statistic>` for more information.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_Qn(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xscale, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function computes the moving :math:`Q_n` of the input vector :data:`x` and stores the output
Packit 67cb25
   in :data:`xscale`. The inputs :data:`x` and :data:`xscale` must be the same length.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
   It is allowed for :data:`x` = :data:`xscale` for an in-place moving window :math:`Q_n`.
Packit 67cb25
Packit 67cb25
.. index::
Packit 67cb25
   single: moving window accumulators
Packit 67cb25
   single: rolling window accumulators
Packit 67cb25
Packit 67cb25
User-defined Moving Statistics
Packit 67cb25
==============================
Packit 67cb25
Packit 67cb25
GSL offers an interface for users to define their own moving window statistics
Packit 67cb25
functions, without needing to implement the edge-handling and accumulator
Packit 67cb25
machinery. This can be done by explicitly constructing the windows
Packit 67cb25
:math:`W_i^{H,J}` for a given input signal (:func:`gsl_movstat_fill`), or by calculating a user-defined
Packit 67cb25
function for each window automatically. In order to apply a user-defined
Packit 67cb25
function to each window, users must define a variable of type
Packit 67cb25
:type:`gsl_movstat_function` to pass into :func:`gsl_movstat_apply`.
Packit 67cb25
This structure is defined as follows.
Packit 67cb25
Packit 67cb25
.. type:: gsl_movstat_function
Packit 67cb25
Packit 67cb25
   Structure specifying user-defined moving window statistical function::
Packit 67cb25
Packit 67cb25
     typedef struct
Packit 67cb25
     {
Packit 67cb25
       double (* function) (const size_t n, double x[], void * params);
Packit 67cb25
       void * params;
Packit 67cb25
     } gsl_movstat_function;
Packit 67cb25
Packit 67cb25
   This structure contains a pointer to the user-defined function as well
Packit 67cb25
   as possible parameters to pass to the function.
Packit 67cb25
Packit 67cb25
   .. member:: double (* function) (const size_t n, double x[], void * params)
Packit 67cb25
Packit 67cb25
      This function returns the user-defined statistic of the array :data:`x`
Packit 67cb25
      of length :data:`n`. User-specified parameters are passed in via :data:`params`.
Packit 67cb25
      It is allowed to modify the array :data:`x`.
Packit 67cb25
Packit 67cb25
   .. member:: void * params
Packit 67cb25
Packit 67cb25
      User-specified parameters to be passed into the function.
Packit 67cb25
Packit 67cb25
.. function:: int gsl_movstat_apply(const gsl_movstat_end_t endtype, const gsl_movstat_function * F, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
Packit 67cb25
Packit 67cb25
   This function applies the user-defined moving window statistic specified in :data:`F`
Packit 67cb25
   to the input vector :data:`x`, storing the output in :data:`y`.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
   It is allowed for :data:`x` = :data:`y` for an in-place moving window calculation.
Packit 67cb25
Packit 67cb25
.. function:: size_t gsl_movstat_fill(const gsl_movstat_end_t endtype, const gsl_vector * x, const size_t idx, const size_t H, const size_t J, double * window)
Packit 67cb25
Packit 67cb25
   This function explicitly constructs the sliding window for the input vector :data:`x` which
Packit 67cb25
   is centered on the sample :data:`idx`. On output, the array :data:`window` will contain
Packit 67cb25
   :math:`W_{idx}^{H,J}`. The number of samples to the left and right
Packit 67cb25
   of the sample :data:`idx` are specified by :data:`H` and :data:`J` respectively.
Packit 67cb25
   The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
Packit 67cb25
   The function returns the size of the window.
Packit 67cb25
Packit 67cb25
Accumulators
Packit 67cb25
============
Packit 67cb25
Packit 67cb25
Many of the algorithms of this chapter are based on an accumulator design, which
Packit 67cb25
process the input vector one sample at a time, updating calculations of the
Packit 67cb25
desired statistic for the current window. Each accumulator is stored in the
Packit 67cb25
following structure:
Packit 67cb25
Packit 67cb25
.. type:: gsl_movstat_accum
Packit 67cb25
Packit 67cb25
   Structure specifying accumulator for moving window statistics::
Packit 67cb25
Packit 67cb25
     typedef struct
Packit 67cb25
     {
Packit 67cb25
       size_t (* size) (const size_t n);
Packit 67cb25
       int (* init) (const size_t n, void * vstate);
Packit 67cb25
       int (* insert) (const double x, void * vstate);
Packit 67cb25
       int (* delete) (void * vstate);
Packit 67cb25
       int (* get) (void * params, double * result, const void * vstate);
Packit 67cb25
     } gsl_movstat_accum;
Packit 67cb25
Packit 67cb25
   The structure contains function pointers responsible for performing
Packit 67cb25
   different tasks for the accumulator.
Packit 67cb25
Packit 67cb25
   .. member:: size_t (* size) (const size_t n)
Packit 67cb25
Packit 67cb25
        This function returns the size of the workspace (in bytes) needed by the accumulator
Packit 67cb25
        for a moving window of length :data:`n`.
Packit 67cb25
Packit 67cb25
   .. member:: int (* init) (const size_t n, void * vstate)
Packit 67cb25
Packit 67cb25
        This function initializes the workspace :data:`vstate` for a moving window of length :data:`n`.
Packit 67cb25
Packit 67cb25
   .. member:: int (* insert) (const double x, void * vstate)
Packit 67cb25
Packit 67cb25
        This function inserts a single sample :data:`x` into the accumulator, updating internal
Packit 67cb25
        calculations of the desired statistic. If the accumulator is full (i.e. :math:`n` samples
Packit 67cb25
        have already been inserted), then the oldest sample is deleted from the accumulator.
Packit 67cb25
Packit 67cb25
   .. member:: int (* delete) (void * vstate)
Packit 67cb25
Packit 67cb25
        This function deletes the oldest sample from the accumulator, updating internal
Packit 67cb25
        calculations of the desired statistic.
Packit 67cb25
Packit 67cb25
   .. member:: int (* get) (void * params, double * result, const void * vstate)
Packit 67cb25
Packit 67cb25
        This function stores the desired statistic for the current window in
Packit 67cb25
        :data:`result`. The input :data:`params` specifies optional parameters
Packit 67cb25
        for calculating the statistic.
Packit 67cb25
Packit 67cb25
The following accumulators of type :type:`gsl_movstat_accum` are defined by GSL to perform moving window statistics
Packit 67cb25
calculations.
Packit 67cb25
Packit 67cb25
.. var:: gsl_movstat_accum_min
Packit 67cb25
         gsl_movstat_accum_max
Packit 67cb25
         gsl_movstat_accum_minmax
Packit 67cb25
Packit 67cb25
   These accumulators calculate moving window minimum/maximums efficiently, using
Packit 67cb25
   the algorithm of D. Lemire.
Packit 67cb25
Packit 67cb25
.. var:: gsl_movstat_accum_mean
Packit 67cb25
         gsl_movstat_accum_sd
Packit 67cb25
         gsl_movstat_accum_variance
Packit 67cb25
Packit 67cb25
   These accumulators calculate the moving window mean, standard deviation, and variance,
Packit 67cb25
   using the algorithm of B. P. Welford.
Packit 67cb25
Packit 67cb25
.. var:: gsl_movstat_median
Packit 67cb25
Packit 67cb25
   This accumulator calculates the moving window median using the min/max heap algorithm
Packit 67cb25
   of Härdle and Steiger.
Packit 67cb25
Packit 67cb25
.. var:: gsl_movstat_accum_Sn
Packit 67cb25
         gsl_movstat_accum_Qn
Packit 67cb25
Packit 67cb25
   These accumulators calculate the moving window :math:`S_n` and :math:`Q_n` statistics
Packit 67cb25
   developed by Croux and Rousseeuw.
Packit 67cb25
Packit 67cb25
.. var:: gsl_movstat_accum_sum
Packit 67cb25
Packit 67cb25
   This accumulator calculates the moving window sum.
Packit 67cb25
Packit 67cb25
.. var:: gsl_movstat_accum_qqr
Packit 67cb25
Packit 67cb25
   This accumulator calculates the moving window q-quantile range.
Packit 67cb25
Packit 67cb25
Examples
Packit 67cb25
========
Packit 67cb25
Packit 67cb25
Example 1
Packit 67cb25
---------
Packit 67cb25
Packit 67cb25
The following example program computes the moving mean, minimum and maximum of a noisy
Packit 67cb25
sinusoid signal of length :math:`N = 500` with a symmetric moving window of size :math:`K = 11`.
Packit 67cb25
Packit 67cb25
.. _fig_movstat1:
Packit 67cb25
Packit 67cb25
.. figure:: /images/movstat1.png
Packit 67cb25
   :scale: 60%
Packit 67cb25
Packit 67cb25
   Original signal time series (gray) with moving mean (green), moving minimum (blue),
Packit 67cb25
   and moving maximum (orange).
Packit 67cb25
Packit 67cb25
The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/movstat1.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
Example 2: Robust Scale
Packit 67cb25
-----------------------
Packit 67cb25
Packit 67cb25
The following example program analyzes a time series of length :math:`N = 1000` composed
Packit 67cb25
of Gaussian random variates with zero mean whose standard deviation changes in a piecewise constant fashion
Packit 67cb25
as shown in the table below.
Packit 67cb25
Packit 67cb25
============ ==============
Packit 67cb25
Sample Range :math:`\sigma`
Packit 67cb25
============ ==============
Packit 67cb25
1-200        1.0
Packit 67cb25
201-450      5.0
Packit 67cb25
451-600      1.0
Packit 67cb25
601-850      3.0
Packit 67cb25
851-1000     5.0
Packit 67cb25
============ ==============
Packit 67cb25
Packit 67cb25
Additionally, about 1% of the samples are perturbed to represent outliers by adding
Packit 67cb25
:math:`\pm 15` to the random Gaussian variate.
Packit 67cb25
The program calculates the moving statistics MAD, IQR, :math:`S_n`, :math:`Q_n`, and
Packit 67cb25
the standard deviation using a symmetric moving window of length :math:`K = 41`. The results are shown in
Packit 67cb25
:numref:`fig_movstat2`.
Packit 67cb25
Packit 67cb25
.. _fig_movstat2:
Packit 67cb25
Packit 67cb25
.. figure:: /images/movstat2.png
Packit 67cb25
   :scale: 60%
Packit 67cb25
Packit 67cb25
   Top: time series of piecewise constant variance. Bottom: scale estimates using a moving
Packit 67cb25
   window; the true sigma value is in light blue, MAD in green, IQR in red, :math:`S_n` in yellow, and
Packit 67cb25
   :math:`Q_n` in dark blue. The moving standard deviation is shown in gray.
Packit 67cb25
Packit 67cb25
The robust statistics follow the true standard deviation piecewise changes well, without being
Packit 67cb25
influenced by the outliers. The moving standard deviation (gray curve) is heavily influenced by
Packit 67cb25
the presence of the outliers. The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/movstat2.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
Example 3: User-defined Moving Window
Packit 67cb25
-------------------------------------
Packit 67cb25
Packit 67cb25
This example program illustrates how a user can define their own moving window function to apply
Packit 67cb25
to an input vector. It constructs a random noisy time series of length :math:`N = 1000` with
Packit 67cb25
some outliers added. Then it applies a moving window trimmed mean to the time series with
Packit 67cb25
trim parameter :math:`\alpha = 0.1`. The length of the moving window is :math:`K = 11`, so
Packit 67cb25
the smallest and largest sample of each window is discarded prior to computing the mean.
Packit 67cb25
The results are shown in :numref:`fig_movstat3`.
Packit 67cb25
Packit 67cb25
.. _fig_movstat3:
Packit 67cb25
Packit 67cb25
.. figure:: /images/movstat3.png
Packit 67cb25
   :scale: 60%
Packit 67cb25
Packit 67cb25
   Noisy time series data (black) with moving window trimmed mean (red)
Packit 67cb25
Packit 67cb25
The program is given below.
Packit 67cb25
Packit 67cb25
.. include:: examples/movstat3.c
Packit 67cb25
   :code:
Packit 67cb25
Packit 67cb25
References and Further Reading
Packit 67cb25
==============================
Packit 67cb25
Packit 67cb25
The following publications are relevant to the algorithms described
Packit 67cb25
in this chapter,
Packit 67cb25
Packit 67cb25
* W.Hardle and W. Steiger, *Optimal Median Smoothing*, Appl. Statist., 44 (2), 1995.
Packit 67cb25
Packit 67cb25
* D. Lemire, *Streaming Maximum-Minimum Filter Using No More than Three Comparisons per Element*,
Packit 67cb25
  Nordic Journal of Computing, 13 (4), 2006 (https://arxiv.org/abs/cs/0610046).
Packit 67cb25
Packit 67cb25
* B. P. Welford, *Note on a method for calculating corrected sums of squares and products*,
Packit 67cb25
  Technometrics, 4 (3), 1962.