|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving window statistics
|
|
Packit |
67cb25 |
single: statistics, moving window
|
|
Packit |
67cb25 |
single: online statistics
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
************************
|
|
Packit |
67cb25 |
Moving Window Statistics
|
|
Packit |
67cb25 |
************************
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This chapter describes routines for computing *moving window
|
|
Packit |
67cb25 |
statistics* (also called rolling statistics and running statistics),
|
|
Packit |
67cb25 |
using a window around a sample which is used to calculate various
|
|
Packit |
67cb25 |
local statistical properties of an input data stream. The window is
|
|
Packit |
67cb25 |
then slid forward by one sample to process the next data point and so on.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The functions described in this chapter are declared in the header file
|
|
Packit |
67cb25 |
:file:`gsl_movstat.h`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Introduction
|
|
Packit |
67cb25 |
============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This chapter is concerned with calculating various statistics from
|
|
Packit |
67cb25 |
subsets of a given dataset. The main idea is to compute statistics
|
|
Packit |
67cb25 |
in the vicinity of a given data sample by defining a *window* which
|
|
Packit |
67cb25 |
includes the sample itself as well as some specified number of samples
|
|
Packit |
67cb25 |
before and after the sample in question. For a sample :math:`x_i`, we
|
|
Packit |
67cb25 |
define a window :math:`W_i^{H,J}` as
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: W_i^{H,J} = \left\{ x_{i-H}, \dots, x_i, \dots, x_{i+J} \right\}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
W_i^{H,J} = {x_{i-H},...,x_i,...,x_{i+J}}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The parameters :math:`H` and :math:`J` are non-negative integers specifying
|
|
Packit |
67cb25 |
the number of samples to include before and after the sample :math:`x_i`.
|
|
Packit |
67cb25 |
Statistics such as the mean and standard deviation of the window :math:`W_i^{H,J}`
|
|
Packit |
67cb25 |
may be computed, and then the window is shifted forward by one sample to
|
|
Packit |
67cb25 |
focus on :math:`x_{i+1}`. The total number of samples in the window is
|
|
Packit |
67cb25 |
:math:`K = H + J + 1`. To define a symmetric window centered on :math:`x_i`,
|
|
Packit |
67cb25 |
one would set :math:`H = J = \left\lfloor K / 2 \right\rfloor`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Handling Endpoints
|
|
Packit |
67cb25 |
==================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
When processing samples near the ends of the input signal, there will not
|
|
Packit |
67cb25 |
be enough samples to fill the window :math:`W_i^{H,J}` defined above.
|
|
Packit |
67cb25 |
Therefore the user must specify how to construct the windows near the end points.
|
|
Packit |
67cb25 |
This is done by passing an input argument of type :type:`gsl_movstat_end_t`:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_movstat_end_t
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This data type specifies how to construct windows near end points and can
|
|
Packit |
67cb25 |
be selected from the following choices:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. macro:: GSL_MOVSTAT_END_PADZERO
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
With this option, a full window of length :math:`K` will be constructed
|
|
Packit |
67cb25 |
by inserting zeros into the window near the signal end points. Effectively,
|
|
Packit |
67cb25 |
the input signal is modified to
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \tilde{x} = \{ \underbrace{0, \dots, 0}_{H \textrm{ zeros}}, x_1, x_2, \dots, x_{n-1}, x_n, \underbrace{0, \dots, 0}_{J \textrm{ zeros} } \}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
x~ = {0, ..., 0, x_1, x_2, ..., x_{n-1}, x_n, 0, ..., 0}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
to ensure a well-defined window for all :math:`x_i`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. macro:: GSL_MOVSTAT_END_PADVALUE
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
With this option, a full window of length :math:`K` will be constructed
|
|
Packit |
67cb25 |
by padding the window with the first and last sample in the input signal.
|
|
Packit |
67cb25 |
Effectively, the input signal is modified to
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \tilde{x} = \{ \underbrace{x_1, \dots, x_1}_{H}, x_1, x_2, \dots, x_{n-1}, x_n, \underbrace{x_n, \dots, x_n}_{J} \}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
x~ = {x_1, ..., x_1, x_1, x_2, ..., x_{n-1}, x_n, x_n, ..., x_n}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. macro:: GSL_MOVSTAT_END_TRUNCATE
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
With this option, no padding is performed, and the windows are simply truncated
|
|
Packit |
67cb25 |
as the end points are approached.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving window, allocation
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Allocation for Moving Window Statistics
|
|
Packit |
67cb25 |
=======================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_movstat_workspace
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The moving window statistical routines use a common workspace.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_movstat_workspace * gsl_movstat_alloc(const size_t K)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function allocates a workspace for computing symmetric, centered moving statistics with a window
|
|
Packit |
67cb25 |
length of :math:`K` samples. In this case, :math:`H = J = \left\lfloor K/2 \right\rfloor`. The size of the workspace
|
|
Packit |
67cb25 |
is :math:`O(7K)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: gsl_movstat_workspace * gsl_movstat_alloc2(const size_t H, const size_t J)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function allocates a workspace for computing moving statistics using a window with :math:`H`
|
|
Packit |
67cb25 |
samples prior to the current sample, and :math:`J` samples after the current sample. The
|
|
Packit |
67cb25 |
total window size is :math:`K = H + J + 1`. The size of the workspace is :math:`O(7K)`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: void * gsl_movstat_free(gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function frees the memory associated with :data:`w`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving mean
|
|
Packit |
67cb25 |
single: rolling mean
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving Mean
|
|
Packit |
67cb25 |
===========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The moving window mean calculates the mean of the values of each window :math:`W_i^{H,J}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \hat{\mu}_i = \frac{1}{\left| W_i^{H,J} \right|} \sum_{x_m \in W_i^{H,J}} x_m
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\hat{\mu}_i = 1/| W_i^{H,J} | \sum_{x_m \in W_i^{H,J}} x_m
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Here, :math:`\left| W_i^{H,J} \right|` represents the number of elements in the window
|
|
Packit |
67cb25 |
:math:`W_i^{H,J}`. This will normally be :math:`K`, unless the :macro:`GSL_MOVSTAT_END_TRUNCATE`
|
|
Packit |
67cb25 |
option is selected, in which case it could be less than :math:`K` near the signal end points.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_mean(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving window mean of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
|
|
Packit |
67cb25 |
for an in-place moving mean.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving variance
|
|
Packit |
67cb25 |
single: moving standard deviation
|
|
Packit |
67cb25 |
single: rolling variance
|
|
Packit |
67cb25 |
single: rolling standard deviation
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving Variance and Standard Deviation
|
|
Packit |
67cb25 |
======================================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The moving window variance calculates the *sample variance* of the values of each window :math:`W_i^{H,J}`,
|
|
Packit |
67cb25 |
defined by
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: \hat{\sigma}_i^2 = \frac{1}{\left( \left| W_i^{H,J} \right| - 1 \right)} \sum_{x_m \in W_i^{H,J}} \left( x_m - \hat{\mu}_i \right)^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
\hat{\sigma}_i^2 = 1/(|W_i^{H,J}| - 1) \sum_{x_m \in W_i^{H,J}} ( x_m - \hat{\mu}_i )^2
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
where :math:`\hat{\mu}_i` is the mean of :math:`W_i^{H,J}` defined above. The standard deviation :math:`\hat{\sigma}_i`
|
|
Packit |
67cb25 |
is the square root of the variance.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_variance(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving window variance of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
|
|
Packit |
67cb25 |
for an in-place moving variance.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_sd(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving window standard deviation of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
|
|
Packit |
67cb25 |
for an in-place moving standard deviation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving minimum
|
|
Packit |
67cb25 |
single: moving maximum
|
|
Packit |
67cb25 |
single: rolling minimum
|
|
Packit |
67cb25 |
single: rolling maximum
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving Minimum and Maximum
|
|
Packit |
67cb25 |
==========================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The moving minimum/maximum calculates the minimum and maximum values of
|
|
Packit |
67cb25 |
each window :math:`W_i^{H,J}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
y_i^{min} &= \min \left( W_i^{H,J} \right) \\
|
|
Packit |
67cb25 |
y_i^{max} &= \max \left( W_i^{H,J} \right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
y_i^{min} = \min W_i^{H,J}
|
|
Packit |
67cb25 |
y_i^{max} = \max W_i^{H,J}
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_min(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving minimum of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the result in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled.
|
|
Packit |
67cb25 |
It is allowed to have :data:`x` = :data:`y` for an in-place moving minimum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_max(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving maximum of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the result in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled.
|
|
Packit |
67cb25 |
It is allowed to have :data:`x` = :data:`y` for an in-place moving maximum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_minmax(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y_min, gsl_vector * y_max, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving minimum and maximum of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the window minimums in :data:`y_min` and the window maximums in :data:`y_max`.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving sum
|
|
Packit |
67cb25 |
single: rolling sum
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving Sum
|
|
Packit |
67cb25 |
==========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The moving window sum calculates the sum of the values of each window :math:`W_i^{H,J}`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: y_i = \sum_{x_m \in W_i^{H,J}} x_m
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_sum(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving window sum of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled. It is allowed to have :data:`x` = :data:`y`
|
|
Packit |
67cb25 |
for an in-place moving sum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving median
|
|
Packit |
67cb25 |
single: rolling median
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving Median
|
|
Packit |
67cb25 |
=============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The moving median calculates the median of the window :math:`W_i^{H,J}` for
|
|
Packit |
67cb25 |
each sample :math:`x_i`:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: y_i = \textrm{median} \left( W_i^{H,J} \right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
y_i = median(W_i^{H,J})
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_median(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving median of the input vector :data:`x`, storing
|
|
Packit |
67cb25 |
the output in :data:`y`. The parameter :data:`endtype` specifies how windows near
|
|
Packit |
67cb25 |
the ends of the input should be handled. It is allowed for
|
|
Packit |
67cb25 |
:data:`x` = :data:`y` for an in-place moving window median.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Robust Scale Estimation
|
|
Packit |
67cb25 |
=======================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
A common problem in statistics is to quantify the dispersion (also known as the variability, scatter, and spread) of
|
|
Packit |
67cb25 |
a set of data. Often this is done by calculating the variance or standard deviation. However these statistics
|
|
Packit |
67cb25 |
are strongly influenced by outliers, and can often provide erroneous results when even a small number of outliers
|
|
Packit |
67cb25 |
are present.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Several useful statistics have emerged to provide robust estimates of scale which are not as susceptible to data outliers.
|
|
Packit |
67cb25 |
A few of these statistical scale estimators are described below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving median absolute deviation
|
|
Packit |
67cb25 |
single: rolling median absolute deviation
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving MAD
|
|
Packit |
67cb25 |
----------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The median absolute deviation (MAD) for the window :math:`W_i^{H,J}` is defined
|
|
Packit |
67cb25 |
to be the median of the absolute deviations from the window's median:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: not texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: MAD_i = 1.4826 \times \textrm{median} \left( \left| W_i^{H,J} - \textrm{median} \left( W_i^{H,J} \right) \right| \right)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. only:: texinfo
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
MAD_i = 1.4826 * median[ |W_i^{H,J} - median(W_i^{H,J})| ]
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The factor of :math:`1.4826` makes the MAD an unbiased estimator of the standard deviation
|
|
Packit |
67cb25 |
for Gaussian data. The MAD has an efficiency of 37%. See :ref:`here <sec_mad-statistic>` for more information.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_mad0(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xmedian, gsl_vector * xmad, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
int gsl_movstat_mad(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xmedian, gsl_vector * xmad, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These functions compute the moving MAD of the input vector :data:`x` and store the result
|
|
Packit |
67cb25 |
in :data:`xmad`. The medians of each window :math:`W_i^{H,J}` are stored in :data:`xmedian`
|
|
Packit |
67cb25 |
on output. The inputs :data:`x`, :data:`xmedian`, and :data:`xmad` must all be the same length.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
The function :code:`mad0` does not include the scale factor of :math:`1.4826`, while the
|
|
Packit |
67cb25 |
function :code:`mad` does include this factor.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving quantile range
|
|
Packit |
67cb25 |
single: rolling quantile range
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving QQR
|
|
Packit |
67cb25 |
----------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The q-quantile range (QQR) is the difference between the :math:`(1-q)` and :math:`q` quantiles
|
|
Packit |
67cb25 |
of a set of data,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. math:: QQR = Q_{1-q} - Q_q
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The case :math:`q = 0.25` corresponds to the well-known *interquartile range (IQR)*, which
|
|
Packit |
67cb25 |
is the difference between the 75th and 25th percentiles of a set of data. The QQR is
|
|
Packit |
67cb25 |
a *trimmed estimator*, the main idea being to discard the largest and smallest values in
|
|
Packit |
67cb25 |
a data window and compute a scale estimate from the remaining middle values. In the case
|
|
Packit |
67cb25 |
of the IQR, the largest and smallest 25% of the data are discarded and the scale is
|
|
Packit |
67cb25 |
estimated from the remaining (middle) 50%.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_qqr(const gsl_movstat_end_t endtype, const gsl_vector * x, const double q, gsl_vector * xqqr, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving QQR of the input vector :data:`x` and stores the q-quantile ranges
|
|
Packit |
67cb25 |
of each window :math:`W_i^{H,J}` in :data:`xqqr`. The quantile parameter :data:`q` must be between
|
|
Packit |
67cb25 |
:math:`0` and :math:`0.5`. The input :math:`q = 0.25` corresponds to the IQR.
|
|
Packit |
67cb25 |
The inputs :data:`x` and :data:`xqqr` must be the same length.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving :math:`S_n`
|
|
Packit |
67cb25 |
------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The :math:`S_n` statistic proposed by Croux and Rousseeuw is based on pairwise differences between
|
|
Packit |
67cb25 |
all samples in the window. It has an efficiency of 58%, significantly higher than the MAD.
|
|
Packit |
67cb25 |
See :ref:`here <sec_Sn-statistic>` for more information.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_Sn(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xscale, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving :math:`S_n` of the input vector :data:`x` and stores the output
|
|
Packit |
67cb25 |
in :data:`xscale`. The inputs :data:`x` and :data:`xscale` must be the same length.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
It is allowed for :data:`x` = :data:`xscale` for an in-place moving window :math:`S_n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Moving :math:`Q_n`
|
|
Packit |
67cb25 |
------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The :math:`Q_n` statistic proposed by Croux and Rousseeuw is loosely based on the Hodges-Lehmann location
|
|
Packit |
67cb25 |
estimator. It has a relatively high efficiency of 82%. See :ref:`here <sec_Qn-statistic>` for more information.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_Qn(const gsl_movstat_end_t endtype, const gsl_vector * x, gsl_vector * xscale, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function computes the moving :math:`Q_n` of the input vector :data:`x` and stores the output
|
|
Packit |
67cb25 |
in :data:`xscale`. The inputs :data:`x` and :data:`xscale` must be the same length.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
It is allowed for :data:`x` = :data:`xscale` for an in-place moving window :math:`Q_n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. index::
|
|
Packit |
67cb25 |
single: moving window accumulators
|
|
Packit |
67cb25 |
single: rolling window accumulators
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
User-defined Moving Statistics
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
GSL offers an interface for users to define their own moving window statistics
|
|
Packit |
67cb25 |
functions, without needing to implement the edge-handling and accumulator
|
|
Packit |
67cb25 |
machinery. This can be done by explicitly constructing the windows
|
|
Packit |
67cb25 |
:math:`W_i^{H,J}` for a given input signal (:func:`gsl_movstat_fill`), or by calculating a user-defined
|
|
Packit |
67cb25 |
function for each window automatically. In order to apply a user-defined
|
|
Packit |
67cb25 |
function to each window, users must define a variable of type
|
|
Packit |
67cb25 |
:type:`gsl_movstat_function` to pass into :func:`gsl_movstat_apply`.
|
|
Packit |
67cb25 |
This structure is defined as follows.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_movstat_function
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Structure specifying user-defined moving window statistical function::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
typedef struct
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
double (* function) (const size_t n, double x[], void * params);
|
|
Packit |
67cb25 |
void * params;
|
|
Packit |
67cb25 |
} gsl_movstat_function;
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This structure contains a pointer to the user-defined function as well
|
|
Packit |
67cb25 |
as possible parameters to pass to the function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: double (* function) (const size_t n, double x[], void * params)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the user-defined statistic of the array :data:`x`
|
|
Packit |
67cb25 |
of length :data:`n`. User-specified parameters are passed in via :data:`params`.
|
|
Packit |
67cb25 |
It is allowed to modify the array :data:`x`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: void * params
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
User-specified parameters to be passed into the function.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: int gsl_movstat_apply(const gsl_movstat_end_t endtype, const gsl_movstat_function * F, const gsl_vector * x, gsl_vector * y, gsl_movstat_workspace * w)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function applies the user-defined moving window statistic specified in :data:`F`
|
|
Packit |
67cb25 |
to the input vector :data:`x`, storing the output in :data:`y`.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
It is allowed for :data:`x` = :data:`y` for an in-place moving window calculation.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. function:: size_t gsl_movstat_fill(const gsl_movstat_end_t endtype, const gsl_vector * x, const size_t idx, const size_t H, const size_t J, double * window)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function explicitly constructs the sliding window for the input vector :data:`x` which
|
|
Packit |
67cb25 |
is centered on the sample :data:`idx`. On output, the array :data:`window` will contain
|
|
Packit |
67cb25 |
:math:`W_{idx}^{H,J}`. The number of samples to the left and right
|
|
Packit |
67cb25 |
of the sample :data:`idx` are specified by :data:`H` and :data:`J` respectively.
|
|
Packit |
67cb25 |
The parameter :data:`endtype` specifies how windows near the ends of the input should be handled.
|
|
Packit |
67cb25 |
The function returns the size of the window.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Accumulators
|
|
Packit |
67cb25 |
============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Many of the algorithms of this chapter are based on an accumulator design, which
|
|
Packit |
67cb25 |
process the input vector one sample at a time, updating calculations of the
|
|
Packit |
67cb25 |
desired statistic for the current window. Each accumulator is stored in the
|
|
Packit |
67cb25 |
following structure:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. type:: gsl_movstat_accum
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Structure specifying accumulator for moving window statistics::
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
typedef struct
|
|
Packit |
67cb25 |
{
|
|
Packit |
67cb25 |
size_t (* size) (const size_t n);
|
|
Packit |
67cb25 |
int (* init) (const size_t n, void * vstate);
|
|
Packit |
67cb25 |
int (* insert) (const double x, void * vstate);
|
|
Packit |
67cb25 |
int (* delete) (void * vstate);
|
|
Packit |
67cb25 |
int (* get) (void * params, double * result, const void * vstate);
|
|
Packit |
67cb25 |
} gsl_movstat_accum;
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The structure contains function pointers responsible for performing
|
|
Packit |
67cb25 |
different tasks for the accumulator.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: size_t (* size) (const size_t n)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function returns the size of the workspace (in bytes) needed by the accumulator
|
|
Packit |
67cb25 |
for a moving window of length :data:`n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: int (* init) (const size_t n, void * vstate)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function initializes the workspace :data:`vstate` for a moving window of length :data:`n`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: int (* insert) (const double x, void * vstate)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function inserts a single sample :data:`x` into the accumulator, updating internal
|
|
Packit |
67cb25 |
calculations of the desired statistic. If the accumulator is full (i.e. :math:`n` samples
|
|
Packit |
67cb25 |
have already been inserted), then the oldest sample is deleted from the accumulator.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: int (* delete) (void * vstate)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function deletes the oldest sample from the accumulator, updating internal
|
|
Packit |
67cb25 |
calculations of the desired statistic.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. member:: int (* get) (void * params, double * result, const void * vstate)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This function stores the desired statistic for the current window in
|
|
Packit |
67cb25 |
:data:`result`. The input :data:`params` specifies optional parameters
|
|
Packit |
67cb25 |
for calculating the statistic.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following accumulators of type :type:`gsl_movstat_accum` are defined by GSL to perform moving window statistics
|
|
Packit |
67cb25 |
calculations.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_movstat_accum_min
|
|
Packit |
67cb25 |
gsl_movstat_accum_max
|
|
Packit |
67cb25 |
gsl_movstat_accum_minmax
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These accumulators calculate moving window minimum/maximums efficiently, using
|
|
Packit |
67cb25 |
the algorithm of D. Lemire.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_movstat_accum_mean
|
|
Packit |
67cb25 |
gsl_movstat_accum_sd
|
|
Packit |
67cb25 |
gsl_movstat_accum_variance
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These accumulators calculate the moving window mean, standard deviation, and variance,
|
|
Packit |
67cb25 |
using the algorithm of B. P. Welford.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_movstat_median
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This accumulator calculates the moving window median using the min/max heap algorithm
|
|
Packit |
67cb25 |
of Härdle and Steiger.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_movstat_accum_Sn
|
|
Packit |
67cb25 |
gsl_movstat_accum_Qn
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
These accumulators calculate the moving window :math:`S_n` and :math:`Q_n` statistics
|
|
Packit |
67cb25 |
developed by Croux and Rousseeuw.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_movstat_accum_sum
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This accumulator calculates the moving window sum.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. var:: gsl_movstat_accum_qqr
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This accumulator calculates the moving window q-quantile range.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Examples
|
|
Packit |
67cb25 |
========
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Example 1
|
|
Packit |
67cb25 |
---------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example program computes the moving mean, minimum and maximum of a noisy
|
|
Packit |
67cb25 |
sinusoid signal of length :math:`N = 500` with a symmetric moving window of size :math:`K = 11`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_movstat1:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/movstat1.png
|
|
Packit |
67cb25 |
:scale: 60%
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Original signal time series (gray) with moving mean (green), moving minimum (blue),
|
|
Packit |
67cb25 |
and moving maximum (orange).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/movstat1.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Example 2: Robust Scale
|
|
Packit |
67cb25 |
-----------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following example program analyzes a time series of length :math:`N = 1000` composed
|
|
Packit |
67cb25 |
of Gaussian random variates with zero mean whose standard deviation changes in a piecewise constant fashion
|
|
Packit |
67cb25 |
as shown in the table below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
============ ==============
|
|
Packit |
67cb25 |
Sample Range :math:`\sigma`
|
|
Packit |
67cb25 |
============ ==============
|
|
Packit |
67cb25 |
1-200 1.0
|
|
Packit |
67cb25 |
201-450 5.0
|
|
Packit |
67cb25 |
451-600 1.0
|
|
Packit |
67cb25 |
601-850 3.0
|
|
Packit |
67cb25 |
851-1000 5.0
|
|
Packit |
67cb25 |
============ ==============
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Additionally, about 1% of the samples are perturbed to represent outliers by adding
|
|
Packit |
67cb25 |
:math:`\pm 15` to the random Gaussian variate.
|
|
Packit |
67cb25 |
The program calculates the moving statistics MAD, IQR, :math:`S_n`, :math:`Q_n`, and
|
|
Packit |
67cb25 |
the standard deviation using a symmetric moving window of length :math:`K = 41`. The results are shown in
|
|
Packit |
67cb25 |
:numref:`fig_movstat2`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_movstat2:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/movstat2.png
|
|
Packit |
67cb25 |
:scale: 60%
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Top: time series of piecewise constant variance. Bottom: scale estimates using a moving
|
|
Packit |
67cb25 |
window; the true sigma value is in light blue, MAD in green, IQR in red, :math:`S_n` in yellow, and
|
|
Packit |
67cb25 |
:math:`Q_n` in dark blue. The moving standard deviation is shown in gray.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The robust statistics follow the true standard deviation piecewise changes well, without being
|
|
Packit |
67cb25 |
influenced by the outliers. The moving standard deviation (gray curve) is heavily influenced by
|
|
Packit |
67cb25 |
the presence of the outliers. The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/movstat2.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Example 3: User-defined Moving Window
|
|
Packit |
67cb25 |
-------------------------------------
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
This example program illustrates how a user can define their own moving window function to apply
|
|
Packit |
67cb25 |
to an input vector. It constructs a random noisy time series of length :math:`N = 1000` with
|
|
Packit |
67cb25 |
some outliers added. Then it applies a moving window trimmed mean to the time series with
|
|
Packit |
67cb25 |
trim parameter :math:`\alpha = 0.1`. The length of the moving window is :math:`K = 11`, so
|
|
Packit |
67cb25 |
the smallest and largest sample of each window is discarded prior to computing the mean.
|
|
Packit |
67cb25 |
The results are shown in :numref:`fig_movstat3`.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. _fig_movstat3:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. figure:: /images/movstat3.png
|
|
Packit |
67cb25 |
:scale: 60%
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
Noisy time series data (black) with moving window trimmed mean (red)
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The program is given below.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
.. include:: examples/movstat3.c
|
|
Packit |
67cb25 |
:code:
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
References and Further Reading
|
|
Packit |
67cb25 |
==============================
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
The following publications are relevant to the algorithms described
|
|
Packit |
67cb25 |
in this chapter,
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* W.Hardle and W. Steiger, *Optimal Median Smoothing*, Appl. Statist., 44 (2), 1995.
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* D. Lemire, *Streaming Maximum-Minimum Filter Using No More than Three Comparisons per Element*,
|
|
Packit |
67cb25 |
Nordic Journal of Computing, 13 (4), 2006 (https://arxiv.org/abs/cs/0610046).
|
|
Packit |
67cb25 |
|
|
Packit |
67cb25 |
* B. P. Welford, *Note on a method for calculating corrected sums of squares and products*,
|
|
Packit |
67cb25 |
Technometrics, 4 (3), 1962.
|