.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-numerical_derivatives:

===================
Numeric derivatives
===================

The other extreme from using analytic derivatives is to use numeric
derivatives. The key observation here is that the process of
differentiating a function :math:`f(x)` w.r.t :math:`x` can be written
as the limiting process:

.. math::
   Df(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h}


Forward Differences
===================

Now, of course, one cannot perform the limiting operation numerically
on a computer, so we do the next best thing, which is to choose a
small value of :math:`h` and approximate the derivative as

.. math::
   Df(x) \approx \frac{f(x + h) - f(x)}{h}


The above formula is the simplest and most basic form of numeric
differentiation. It is known as the *Forward Difference* formula.

So how would one go about constructing a numerically differentiated
version of ``Rat43Analytic`` (`Rat43
<http://www.itl.nist.gov/div898/strd/nls/data/ratkowsky3.shtml>`_) in
Ceres Solver? This is done in two steps:

1. Define a *Functor* that, given the parameter values, will evaluate
   the residual for a given :math:`(x,y)`.
2. Construct a :class:`CostFunction` by using
   :class:`NumericDiffCostFunction` to wrap an instance of
   ``Rat43CostFunctor``.

.. code-block:: c++

  struct Rat43CostFunctor {
    Rat43CostFunctor(const double x, const double y) : x_(x), y_(y) {}

    bool operator()(const double* parameters, double* residuals) const {
      const double b1 = parameters[0];
      const double b2 = parameters[1];
      const double b3 = parameters[2];
      const double b4 = parameters[3];
      residuals[0] = b1 * pow(1.0 + exp(b2 - b3 * x_), -1.0 / b4) - y_;
      return true;
    }

    const double x_;
    const double y_;
  };

  CostFunction* cost_function =
    new NumericDiffCostFunction<Rat43CostFunctor, FORWARD, 1, 4>(
      new Rat43CostFunctor(x, y));

This is about the minimum amount of work one can expect to do to
define the cost function. The only thing that the user needs to do is
to make sure that the evaluation of the residual is implemented
correctly and efficiently.

Before going further, it is instructive to get an estimate of the
error in the forward difference formula. We do this by considering the
`Taylor expansion <https://en.wikipedia.org/wiki/Taylor_series>`_ of
:math:`f` near :math:`x`.

.. math::
   \begin{align}
   f(x+h) &= f(x) + h Df(x) + \frac{h^2}{2!} D^2f(x) +
   \frac{h^3}{3!}D^3f(x) + \cdots \\
   Df(x) &= \frac{f(x + h) - f(x)}{h} - \left [\frac{h}{2!}D^2f(x) +
   \frac{h^2}{3!}D^3f(x) + \cdots \right]\\
   Df(x) &= \frac{f(x + h) - f(x)}{h} + O(h)
   \end{align}

i.e., the error in the forward difference formula is
:math:`O(h)` [#f4]_.


Implementation Details
----------------------

:class:`NumericDiffCostFunction` implements a generic algorithm to
numerically differentiate a given functor. While the actual
implementation of :class:`NumericDiffCostFunction` is complicated, the
net result is a :class:`CostFunction` that roughly looks something
like the following:

.. code-block:: c++

  class Rat43NumericDiffForward : public SizedCostFunction<1,4> {
   public:
    Rat43NumericDiffForward(const Rat43CostFunctor* functor)
        : functor_(functor) {}
    virtual ~Rat43NumericDiffForward() {}
    virtual bool Evaluate(double const* const* parameters,
                          double* residuals,
                          double** jacobians) const {
      (*functor_)(parameters[0], residuals);
      if (!jacobians) return true;
      double* jacobian = jacobians[0];
      if (!jacobian) return true;

      const double f = residuals[0];
      double parameters_plus_h[4];
      for (int i = 0; i < 4; ++i) {
        // Perturb one parameter at a time, leaving the others fixed.
        std::copy(parameters[0], parameters[0] + 4, parameters_plus_h);
        const double kRelativeStepSize = 1e-6;
        const double h = std::abs(parameters[0][i]) * kRelativeStepSize;
        parameters_plus_h[i] += h;
        double f_plus;
        (*functor_)(parameters_plus_h, &f_plus);
        // Forward difference approximation to the i'th entry of the
        // Jacobian.
        jacobian[i] = (f_plus - f) / h;
      }
      return true;
    }

   private:
    scoped_ptr<const Rat43CostFunctor> functor_;
  };


Note the choice of the step size :math:`h` in the above code: instead
of an absolute step size, which is the same for all parameters, we
use a relative step size of :math:`\text{kRelativeStepSize} =
10^{-6}`. This gives better derivative estimates than an absolute
step size [#f2]_ [#f3]_. This choice of step size only works for
parameter values that are not close to zero. So the actual
implementation of :class:`NumericDiffCostFunction` uses a more
complex step size selection logic: close to zero, it switches to a
fixed step size.
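
In code, this hybrid policy amounts to falling back to a fixed
minimum step when the relative step becomes too small. The following
is a minimal sketch of the idea; the constant ``kMinStepSize`` and
both constant values are illustrative assumptions, not the exact
logic used by :class:`NumericDiffCostFunction`:

.. code-block:: c++

  #include <algorithm>
  #include <cmath>

  // Step proportional to |x|, but never smaller than a fixed
  // minimum, so that parameters at or near zero still get perturbed.
  double StepSize(double x) {
    const double kRelativeStepSize = 1e-6;
    const double kMinStepSize = 1e-10;  // Illustrative value.
    return std::max(std::abs(x) * kRelativeStepSize, kMinStepSize);
  }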


Central Differences
===================

The :math:`O(h)` error in the Forward Difference formula is okay but
not great. A better method is to use the *Central Difference* formula:

.. math::
   Df(x) \approx \frac{f(x + h) - f(x - h)}{2h}

Notice that if the value of :math:`f(x)` is known, the Forward
Difference formula only requires one extra evaluation, but the Central
Difference formula requires two evaluations, making it twice as
expensive. So is the extra evaluation worth it?

To answer this question, we again compute the error of approximation
in the central difference formula:

.. math::
   \begin{align}
   f(x + h) &= f(x) + h Df(x) + \frac{h^2}{2!}
   D^2f(x) + \frac{h^3}{3!} D^3f(x) + \frac{h^4}{4!} D^4f(x) +
   \cdots\\
   f(x - h) &= f(x) - h Df(x) + \frac{h^2}{2!}
   D^2f(x) - \frac{h^3}{3!} D^3f(x) + \frac{h^4}{4!} D^4f(x) -
   \cdots\\
   Df(x) & = \frac{f(x + h) - f(x - h)}{2h} - \frac{h^2}{3!}
   D^3f(x) - \frac{h^4}{5!}
   D^5f(x) - \cdots \\
   Df(x) & = \frac{f(x + h) - f(x - h)}{2h} + O(h^2)
   \end{align}

The error of the Central Difference formula is :math:`O(h^2)`, i.e.,
the error goes down quadratically whereas the error in the Forward
Difference formula only goes down linearly.

Using central differences instead of forward differences in Ceres
Solver is a simple matter of changing a template argument to
:class:`NumericDiffCostFunction` as follows:

.. code-block:: c++

  CostFunction* cost_function =
    new NumericDiffCostFunction<Rat43CostFunctor, CENTRAL, 1, 4>(
      new Rat43CostFunctor(x, y));

But what do these differences in the error mean in practice? To see
this, consider the problem of evaluating the derivative of the
univariate function

.. math::
   f(x) = \frac{e^x}{\sin x - x^2},

at :math:`x = 1.0`.
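
For reference, applying the quotient rule gives the analytic
derivative:

.. math::
   Df(x) = \frac{e^x \left(\sin x - x^2 - \cos x + 2x\right)}
                {\left(\sin x - x^2\right)^2}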

It is easy to determine that :math:`Df(1.0) =
140.73773557129658`. Using this value as reference, we can now compute
the relative error in the forward and central difference formulae as a
function of the absolute step size and plot them.
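
The following minimal, self-contained program is one way to generate
the data for such a plot. The function and the reference value are
the ones given above; the sweep range and the output format are
illustrative choices:

.. code-block:: c++

  #include <cmath>
  #include <cstdio>

  // f(x) = e^x / (sin(x) - x^2), the test function from above.
  double f(double x) { return std::exp(x) / (std::sin(x) - x * x); }

  int main() {
    const double x = 1.0;
    const double reference = 140.73773557129658;  // Df(1.0).
    // Sweep the absolute step size over many orders of magnitude and
    // print the relative errors of the two difference formulae.
    for (double h = 1e-2; h > 1e-15; h /= 2) {
      const double forward = (f(x + h) - f(x)) / h;
      const double central = (f(x + h) - f(x - h)) / (2 * h);
      std::printf("%e %e %e\n", h,
                  std::abs((forward - reference) / reference),
                  std::abs((central - reference) / reference));
    }
    return 0;
  }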

.. figure:: forward_central_error.png
   :figwidth: 100%
   :align: center

Reading the graph from right to left, a number of things stand out:

1. The graphs for both formulae have two distinct regions. At first,
   starting from a large value of :math:`h`, the error goes down as
   the effect of truncating the Taylor series dominates, but as the
   value of :math:`h` continues to decrease, the error starts
   increasing again as roundoff error starts to dominate the
   computation. So we cannot just keep on reducing the value of
   :math:`h` to get better estimates of :math:`Df`. The fact that we
   are using finite precision arithmetic becomes a limiting factor; a
   quantitative version of this tradeoff is sketched after this list.

2. The Forward Difference formula is not a great method for
   evaluating derivatives. The Central Difference formula converges
   much more quickly to a more accurate estimate of the derivative
   with decreasing step size. So unless the evaluation of
   :math:`f(x)` is so expensive that you absolutely cannot afford the
   extra evaluation required by central differences, **do not use the
   Forward Difference formula**.

3. Neither formula works well for a poorly chosen value of :math:`h`.
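
The tradeoff in observation 1 can be made quantitative. For the
forward difference formula, a standard back-of-the-envelope model
([Press]_, Section 5.7; this is generic floating point analysis, not
anything specific to Ceres Solver) adds the truncation error to the
roundoff error incurred when computing :math:`f(x + h) - f(x)`:

.. math::
   E(h) \approx \frac{h}{2} |D^2f(x)| + \frac{2 \epsilon |f(x)|}{h},

where :math:`\epsilon` is the machine epsilon. Minimizing over
:math:`h` gives :math:`h \sim \sqrt{\epsilon}`, i.e., around
:math:`10^{-8}` in double precision; the analogous computation for
central differences gives :math:`h \sim \sqrt[3]{\epsilon}`.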


Ridders' Method
===============

So, can we get better estimates of :math:`Df` without requiring such
small values of :math:`h` that we start hitting floating point
roundoff errors?

One possible approach is to find a method whose error goes down faster
than :math:`O(h^2)`. This can be done by applying `Richardson
Extrapolation
<https://en.wikipedia.org/wiki/Richardson_extrapolation>`_ to the
problem of differentiation. This is also known as *Ridders' Method*
[Ridders]_.

Let us recall the error in the central difference formula:

.. math::
   \begin{align}
   Df(x) & = \frac{f(x + h) - f(x - h)}{2h} - \frac{h^2}{3!}
   D^3f(x) - \frac{h^4}{5!}
   D^5f(x) - \cdots\\
   & = \frac{f(x + h) - f(x - h)}{2h} + K_2 h^2 + K_4 h^4 + \cdots
   \end{align}

The key thing to note here is that the terms :math:`K_2, K_4, ...`
are independent of :math:`h` and only depend on :math:`x`.

Let us now define:

.. math::

   A(1, m) = \frac{f(x + h/2^{m-1}) - f(x - h/2^{m-1})}{2h/2^{m-1}}.

Then observe that

.. math::

   Df(x) = A(1,1) + K_2 h^2 + K_4 h^4 + \cdots

and

.. math::

   Df(x) = A(1, 2) + K_2 (h/2)^2 + K_4 (h/2)^4 + \cdots

Here we have halved the step size to obtain a second central
differences estimate of :math:`Df(x)`. Combining these two estimates
(multiplying the second by four and subtracting the first, so that
the :math:`K_2` terms cancel), we get:

.. math::

   Df(x) = \frac{4 A(1, 2) - A(1,1)}{4 - 1} + O(h^4)

which is an approximation of :math:`Df(x)` with truncation error that
goes down as :math:`O(h^4)`. But we do not have to stop here. We can
iterate this process to obtain even more accurate estimates as
follows:

.. math::

   A(n, m) = \begin{cases}
    \frac{\displaystyle f(x + h/2^{m-1}) - f(x -
    h/2^{m-1})}{\displaystyle 2h/2^{m-1}} & n = 1 \\
    \frac{\displaystyle 4^{n-1} A(n - 1, m + 1) - A(n - 1, m)}{\displaystyle 4^{n-1} - 1} & n > 1
   \end{cases}

It is straightforward to show that the approximation error in
:math:`A(n, 1)` is :math:`O(h^{2n})`. To see how the above formula can
be implemented in practice to compute :math:`A(n,1)` it is helpful to
structure the computation as the following tableau:

.. math::
   \begin{array}{ccccc}
   A(1,1) & A(1, 2) & A(1, 3) & A(1, 4) & \cdots\\
          & A(2, 1) & A(2, 2) & A(2, 3) & \cdots\\
          &         & A(3, 1) & A(3, 2) & \cdots\\
          &         &         & A(4, 1) & \cdots \\
          &         &         &         & \ddots
   \end{array}

So, to compute :math:`A(n, 1)` for increasing values of :math:`n` we
move from the left to the right, computing one column at a
time. Assuming that the primary cost here is the evaluation of the
function :math:`f(x)`, the cost of computing a new column of the
above tableau is two function evaluations, since evaluating the new
first-row entry :math:`A(1, m)` requires evaluating the central
difference formula for a step size of :math:`2^{1-m}h`, and every
other entry in the column is a cheap arithmetic combination of
entries already computed.
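
To make the column-by-column computation concrete, here is a minimal,
fixed-depth sketch of the tableau in code. It illustrates the
recurrence above and nothing more; in particular, it is not the
adaptive, error-tracking implementation used by Ceres Solver:

.. code-block:: c++

  #include <cmath>
  #include <functional>

  // Computes A(n, 1) by filling Ridders' tableau row by row. The
  // first row holds central difference estimates with successively
  // halved step sizes; each later row applies one round of
  // Richardson extrapolation. Assumes n <= 10.
  double Ridders(const std::function<double(double)>& f, double x,
                 double h, int n) {
    double A[10][10];  // A[i - 1][m - 1] stores A(i, m).
    for (int m = 1; m <= n; ++m) {
      const double step = h / std::pow(2.0, m - 1);
      A[0][m - 1] = (f(x + step) - f(x - step)) / (2.0 * step);
    }
    for (int i = 2; i <= n; ++i) {
      const double c = std::pow(4.0, i - 1);
      for (int m = 1; m <= n - i + 1; ++m) {
        A[i - 1][m - 1] = (c * A[i - 2][m] - A[i - 2][m - 1]) / (c - 1.0);
      }
    }
    return A[n - 1][0];  // A(n, 1)
  }

For example, ``Ridders(f, 1.0, 0.01, 5)`` computes the :math:`A(5, 1)`
entry of the tableau shown next.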

Applying this method to :math:`f(x) = \frac{e^x}{\sin x - x^2}`
starting with a fairly large step size :math:`h = 0.01`, we get:

.. math::
   \begin{array}{rrrrr}
   141.678097131 &140.971663667 &140.796145400 &140.752333523 &140.741384778\\
   &140.736185846 &140.737639311 &140.737729564 &140.737735196\\
   & &140.737736209 &140.737735581 &140.737735571\\
   & & &140.737735571 &140.737735571\\
   & & & &140.737735571\\
   \end{array}

Compared to the *correct* value :math:`Df(1.0) = 140.73773557129658`,
:math:`A(5, 1)` has a relative error of :math:`10^{-13}`. For
comparison, the relative error for the central difference formula with
the same step size (:math:`0.01/2^4 = 0.000625`) is :math:`10^{-5}`.

The above tableau is the basis of Ridders' method for numeric
differentiation. The full implementation is an adaptive scheme that
tracks its own estimation error and stops automatically when the
desired precision is reached. Of course it is more expensive than the
forward and central difference formulae, but is also significantly
more robust and accurate.

Using Ridders' method instead of forward or central differences in
Ceres is again a simple matter of changing a template argument to
:class:`NumericDiffCostFunction` as follows:

.. code-block:: c++

  CostFunction* cost_function =
    new NumericDiffCostFunction<Rat43CostFunctor, RIDDERS, 1, 4>(
      new Rat43CostFunctor(x, y));

The following graph shows the relative error of the three methods as a
function of the absolute step size. For Ridders' method we assume
that the step size for evaluating :math:`A(n,1)` is :math:`2^{1-n}h`.

.. figure:: forward_central_ridders_error.png
   :figwidth: 100%
   :align: center

Using the 10 function evaluations that are needed to compute
:math:`A(5,1)` we are able to approximate :math:`Df(1.0)` about 1000
times better than the best central differences estimate. To put these
numbers in perspective, machine epsilon for double precision
arithmetic is :math:`\approx 2.22 \times 10^{-16}`.

Going back to ``Rat43``, let us also look at the runtime cost of the
various methods for computing numeric derivatives.

==========================  =========
CostFunction                Time (ns)
==========================  =========
Rat43Analytic                     255
Rat43AnalyticOptimized             92
Rat43NumericDiffForward           262
Rat43NumericDiffCentral           517
Rat43NumericDiffRidders          3760
==========================  =========

As expected, Central Differences is about twice as expensive as
Forward Differences, and the remarkable accuracy improvements of
Ridders' method cost an order of magnitude more in runtime.

Recommendations
===============

Numeric differentiation should be used when you cannot compute the
derivatives either analytically or using automatic differentiation.
This is usually the case when you are calling an external library or
function whose analytic form you do not know, or even if you do, you
are not in a position to rewrite it in the manner required to use
:ref:`chapter-automatic_derivatives`.


When using numeric differentiation, use at least Central Differences,
and if execution time is not a concern or the objective function is
such that determining a good static relative step size is hard,
Ridders' method is recommended.

.. rubric:: Footnotes

.. [#f2] `Numerical Differentiation
   <https://en.wikipedia.org/wiki/Numerical_differentiation#Practical_considerations_using_floating_point_arithmetic>`_
.. [#f3] [Press]_ Numerical Recipes, Section 5.7
.. [#f4] In asymptotic error analysis, an error of :math:`O(h^k)`
   means that the absolute value of the error is at most some
   constant times :math:`h^k` when :math:`h` is close enough to
   :math:`0`.