Tree - source-git/ceres-solver

source-git / ceres-solver

Blame docs/source/automatic_derivatives.rst

Blob History Raw

Packit	ea1746	`.. default-domain:: cpp`
Packit	ea1746
Packit	ea1746	`.. cpp:namespace:: ceres`
Packit	ea1746
Packit	ea1746	`.. _chapter-automatic_derivatives:`
Packit	ea1746
Packit	ea1746	`=====================`
Packit	ea1746	`Automatic Derivatives`
Packit	ea1746	`=====================`
Packit	ea1746
Packit	ea1746	`We will now consider automatic differentiation. It is a technique that`
Packit	ea1746	`can compute exact derivatives, fast, while requiring about the same`
Packit	ea1746	`effort from the user as is needed to use numerical differentiation.`
Packit	ea1746
Packit	ea1746	`Don't believe me? Well here goes. The following code fragment`
Packit	ea1746	implements an automatically differentiated ``CostFunction`` for `Rat43
Packit	ea1746	<http://www.itl.nist.gov/div898/strd/nls/data/ratkowsky3.shtml>`_.
Packit	ea1746
Packit	ea1746	`.. code-block:: c++`
Packit	ea1746
Packit	ea1746	`struct Rat43CostFunctor {`
Packit	ea1746	`Rat43CostFunctor(const double x, const double y) : x_(x), y_(y) {}`
Packit	ea1746
Packit	ea1746	`template <typename T>`
Packit	ea1746	`bool operator()(const T* parameters, T* residuals) const {`
Packit	ea1746	`const T b1 = parameters[0];`
Packit	ea1746	`const T b2 = parameters[1];`
Packit	ea1746	`const T b3 = parameters[2];`
Packit	ea1746	`const T b4 = parameters[3];`
Packit	ea1746	`residuals[0] = b1 * pow(1.0 + exp(b2 - b3 * x_), -1.0 / b4) - y_;`
Packit	ea1746	`return true;`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`private:`
Packit	ea1746	`const double x_;`
Packit	ea1746	`const double y_;`
Packit	ea1746	`};`
Packit	ea1746
Packit	ea1746
Packit	ea1746	`CostFunction* cost_function =`
Packit	ea1746	`new AutoDiffCostFunction<Rat43CostFunctor, 1, 4>(`
Packit	ea1746	`new Rat43CostFunctor(x, y));`
Packit	ea1746
Packit	ea1746	`Notice that compared to numeric differentiation, the only difference`
Packit	ea1746	`when defining the functor for use with automatic differentiation is`
Packit	ea1746	the signature of the ``operator()``.
Packit	ea1746
Packit	ea1746	`In the case of numeric differentition it was`
Packit	ea1746
Packit	ea1746	`.. code-block:: c++`
Packit	ea1746
Packit	ea1746	`bool operator()(const double* parameters, double* residuals) const;`
Packit	ea1746
Packit	ea1746	`and for automatic differentiation it is a templated function of the`
Packit	ea1746	`form`
Packit	ea1746
Packit	ea1746	`.. code-block:: c++`
Packit	ea1746
Packit	ea1746	`template <typename T> bool operator()(const T* parameters, T* residuals) const;`
Packit	ea1746
Packit	ea1746
Packit	ea1746	`So what does this small change buy us? The following table compares`
Packit	ea1746	`the time it takes to evaluate the residual and the Jacobian for`
Packit	ea1746	`Rat43` using various methods.
Packit	ea1746
Packit	ea1746	`========================== =========`
Packit	ea1746	`CostFunction Time (ns)`
Packit	ea1746	`========================== =========`
Packit	ea1746	`Rat43Analytic 255`
Packit	ea1746	`Rat43AnalyticOptimized 92`
Packit	ea1746	`Rat43NumericDiffForward 262`
Packit	ea1746	`Rat43NumericDiffCentral 517`
Packit	ea1746	`Rat43NumericDiffRidders 3760`
Packit	ea1746	`Rat43AutomaticDiff 129`
Packit	ea1746	`========================== =========`
Packit	ea1746
Packit	ea1746	`We can get exact derivatives using automatic differentiation`
Packit	ea1746	(``Rat43AutomaticDiff``) with about the same effort that is required
Packit	ea1746	to write the code for numeric differentiation but only :math:`40\%`
Packit	ea1746	`slower than hand optimized analytical derivatives.`
Packit	ea1746
Packit	ea1746	`So how does it work? For this we will have to learn about **Dual`
Packit	ea1746	`Numbers and Jets** .`
Packit	ea1746
Packit	ea1746
Packit	ea1746	`Dual Numbers & Jets`
Packit	ea1746	`===================`
Packit	ea1746
Packit	ea1746	`.. NOTE::`
Packit	ea1746
Packit	ea1746	`Reading this and the next section on implementing Jets is not`
Packit	ea1746	`necessary to use automatic differentiation in Ceres Solver. But`
Packit	ea1746	`knowing the basics of how Jets work is useful when debugging and`
Packit	ea1746	`reasoning about the performance of automatic differentiation.`
Packit	ea1746
Packit	ea1746	`Dual numbers are an extension of the real numbers analogous to complex`
Packit	ea1746	`numbers: whereas complex numbers augment the reals by introducing an`
Packit	ea1746	imaginary unit :math:`\iota` such that :math:`\iota^2 = -1`, dual
Packit	ea1746	numbers introduce an infinitesimal unit :math:`\epsilon` such that
Packit	ea1746	:math:`\epsilon^2 = 0` . A dual number :math:`a + v\epsilon` has two
Packit	ea1746	components, the real component :math:`a` and the infinitesimal
Packit	ea1746	component :math:`v`.
Packit	ea1746
Packit	ea1746	`Surprisingly, this simple change leads to a convenient method for`
Packit	ea1746	`computing exact derivatives without needing to manipulate complicated`
Packit	ea1746	`symbolic expressions.`
Packit	ea1746
Packit	ea1746	`For example, consider the function`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746
Packit	ea1746	`f(x) = x^2 ,`
Packit	ea1746
Packit	ea1746	`Then,`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746
Packit	ea1746	`\begin{align}`
Packit	ea1746	`f(10 + \epsilon) &= (10 + \epsilon)^2\\`
Packit	ea1746	`&= 100 + 20 \epsilon + \epsilon^2\\`
Packit	ea1746	`&= 100 + 20 \epsilon`
Packit	ea1746	`\end{align}`
Packit	ea1746
Packit	ea1746	Observe that the coefficient of :math:`\epsilon` is :math:`Df(10) =
Packit	ea1746	20`. Indeed this generalizes to functions which are not
Packit	ea1746	`polynomial. Consider an arbitrary differentiable function`
Packit	ea1746	:math:`f(x)`. Then we can evaluate :math:`f(x + \epsilon)` by
Packit	ea1746	considering the Taylor expansion of :math:`f` near :math:`x`, which
Packit	ea1746	`gives us the infinite series`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746	`\begin{align}`
Packit	ea1746	`f(x + \epsilon) &= f(x) + Df(x) \epsilon + D^2f(x)`
Packit	ea1746	`\frac{\epsilon^2}{2} + D^3f(x) \frac{\epsilon^3}{6} + \cdots\\`
Packit	ea1746	`f(x + \epsilon) &= f(x) + Df(x) \epsilon`
Packit	ea1746	`\end{align}`
Packit	ea1746
Packit	ea1746	Here we are using the fact that :math:`\epsilon^2 = 0`.
Packit	ea1746
Packit	ea1746	A `Jet <https://en.wikipedia.org/wiki/Jet_(mathematics)>`_ is a
Packit	ea1746	:math:`n`-dimensional dual number, where we augment the real numbers
Packit	ea1746	with :math:`n` infinitesimal units :math:`\epsilon_i,\ i=1,...,n` with
Packit	ea1746	the property that :math:`\forall i, j\ :\epsilon_i\epsilon_j = 0`. Then
Packit	ea1746	a Jet consists of a real part :math:`a` and a :math:`n`-dimensional
Packit	ea1746	infinitesimal part :math:`\mathbf{v}`, i.e.,
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746	`x = a + \sum_j v_{j} \epsilon_j`
Packit	ea1746
Packit	ea1746	`The summation notation gets tedious, so we will also just write`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746	`x = a + \mathbf{v}.`
Packit	ea1746
Packit	ea1746	where the :math:`\epsilon_i`'s are implict. Then, using the same
Packit	ea1746	`Taylor series expansion used above, we can see that:`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746
Packit	ea1746	`f(a + \mathbf{v}) = f(a) + Df(a) \mathbf{v}.`
Packit	ea1746
Packit	ea1746	`Similarly for a multivariate function`
Packit	ea1746	:math:`f:\mathbb{R}^{n}\rightarrow \mathbb{R}^m`, evaluated on
Packit	ea1746	:math:`x_i = a_i + \mathbf{v}_i,\ \forall i = 1,...,n`:
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746	`f(x_1,..., x_n) = f(a_1, ..., a_n) + \sum_i D_i f(a_1, ..., a_n) \mathbf{v}_i`
Packit	ea1746
Packit	ea1746	So if each :math:`\mathbf{v}_i = e_i` were the :math:`i^{\text{th}}`
Packit	ea1746	`standard basis vector, then, the above expression would simplify to`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746	`f(x_1,..., x_n) = f(a_1, ..., a_n) + \sum_i D_i f(a_1, ..., a_n) \epsilon_i`
Packit	ea1746
Packit	ea1746	`and we can extract the coordinates of the Jacobian by inspecting the`
Packit	ea1746	coefficients of :math:`\epsilon_i`.
Packit	ea1746
Packit	ea1746	`Implementing Jets`
Packit	ea1746	`-----------------`
Packit	ea1746
Packit	ea1746	`In order for the above to work in practice, we will need the ability`
Packit	ea1746	to evaluate an arbitrary function :math:`f` not just on real numbers
Packit	ea1746	`but also on dual numbers, but one does not usually evaluate functions`
Packit	ea1746	`by evaluating their Taylor expansions,`
Packit	ea1746
Packit	ea1746	`This is where C++ templates and operator overloading comes into`
Packit	ea1746	`play. The following code fragment has a simple implementation of a`
Packit	ea1746	``Jet`` and some operators/functions that operate on them.
Packit	ea1746
Packit	ea1746	`.. code-block:: c++`
Packit	ea1746
Packit	ea1746	`template<int N> struct Jet {`
Packit	ea1746	`double a;`
Packit	ea1746	`Eigen::Matrix<double, 1, N> v;`
Packit	ea1746	`};`
Packit	ea1746
Packit	ea1746	`template<int N> Jet<N> operator+(const Jet<N>& f, const Jet<N>& g) {`
Packit	ea1746	`return Jet<N>(f.a + g.a, f.v + g.v);`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`template<int N> Jet<N> operator-(const Jet<N>& f, const Jet<N>& g) {`
Packit	ea1746	`return Jet<N>(f.a - g.a, f.v - g.v);`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`template<int N> Jet<N> operator*(const Jet<N>& f, const Jet<N>& g) {`
Packit	ea1746	`return Jet<N>(f.a * g.a, f.a * g.v + f.v * g.a);`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`template<int N> Jet<N> operator/(const Jet<N>& f, const Jet<N>& g) {`
Packit	ea1746	`return Jet<N>(f.a / g.a, f.v / g.a - f.a * g.v / (g.a * g.a));`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`template <int N> Jet<N> exp(const Jet<N>& f) {`
Packit	ea1746	`return Jet<T, N>(exp(f.a), exp(f.a) * f.v);`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`// This is a simple implementation for illustration purposes, the`
Packit	ea1746	`// actual implementation of pow requires careful handling of a number`
Packit	ea1746	`// of corner cases.`
Packit	ea1746	`template <int N> Jet<N> pow(const Jet<N>& f, const Jet<N>& g) {`
Packit	ea1746	`return Jet<N>(pow(f.a, g.a),`
Packit	ea1746	`g.a * pow(f.a, g.a - 1.0) * f.v +`
Packit	ea1746	`pow(f.a, g.a) * log(f.a); * g.v);`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746
Packit	ea1746	`With these overloaded functions in hand, we can now call`
Packit	ea1746	``Rat43CostFunctor`` with an array of Jets instead of doubles. Putting
Packit	ea1746	`that together with appropriately initialized Jets allows us to compute`
Packit	ea1746	`the Jacobian as follows:`
Packit	ea1746
Packit	ea1746	`.. code-block:: c++`
Packit	ea1746
Packit	ea1746	`class Rat43Automatic : public ceres::SizedCostFunction<1,4> {`
Packit	ea1746	`public:`
Packit	ea1746	`Rat43Automatic(const Rat43CostFunctor* functor) : functor_(functor) {}`
Packit	ea1746	`virtual ~Rat43Automatic() {}`
Packit	ea1746	`virtual bool Evaluate(double const* const* parameters,`
Packit	ea1746	`double* residuals,`
Packit	ea1746	`double** jacobians) const {`
Packit	ea1746	`// Just evaluate the residuals if Jacobians are not required.`
Packit	ea1746	`if (!jacobians) return (*functor_)(parameters[0], residuals);`
Packit	ea1746
Packit	ea1746	`// Initialize the Jets`
Packit	ea1746	`ceres::Jet<4> jets[4];`
Packit	ea1746	`for (int i = 0; i < 4; ++i) {`
Packit	ea1746	`jets[i].a = parameters[0][i];`
Packit	ea1746	`jets[i].v.setZero();`
Packit	ea1746	`jets[i].v[i] = 1.0;`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`ceres::Jet<4> result;`
Packit	ea1746	`(*functor_)(jets, &result);`
Packit	ea1746
Packit	ea1746	`// Copy the values out of the Jet.`
Packit	ea1746	`residuals[0] = result.a;`
Packit	ea1746	`for (int i = 0; i < 4; ++i) {`
Packit	ea1746	`jacobians[0][i] = result.v[i];`
Packit	ea1746	`}`
Packit	ea1746	`return true;`
Packit	ea1746	`}`
Packit	ea1746
Packit	ea1746	`private:`
Packit	ea1746	`std::unique_ptr<const Rat43CostFunctor> functor_;`
Packit	ea1746	`};`
Packit	ea1746
Packit	ea1746	Indeed, this is essentially how :class:`AutoDiffCostFunction` works.
Packit	ea1746
Packit	ea1746
Packit	ea1746	`Pitfalls`
Packit	ea1746	`========`
Packit	ea1746
Packit	ea1746	`Automatic differentiation frees the user from the burden of computing`
Packit	ea1746	`and reasoning about the symbolic expressions for the Jacobians, but`
Packit	ea1746	`this freedom comes at a cost. For example consider the following`
Packit	ea1746	`simple functor:`
Packit	ea1746
Packit	ea1746	`.. code-block:: c++`
Packit	ea1746
Packit	ea1746	`struct Functor {`
Packit	ea1746	`template <typename T> bool operator()(const T* x, T* residual) const {`
Packit	ea1746	`residual[0] = 1.0 - sqrt(x[0] * x[0] + x[1] * x[1]);`
Packit	ea1746	`return true;`
Packit	ea1746	`}`
Packit	ea1746	`};`
Packit	ea1746
Packit	ea1746	`Looking at the code for the residual computation, one does not foresee`
Packit	ea1746	`any problems. However, if we look at the analytical expressions for`
Packit	ea1746	`the Jacobian:`
Packit	ea1746
Packit	ea1746	`.. math::`
Packit	ea1746
Packit	ea1746	`y &= 1 - \sqrt{x_0^2 + x_1^2}\\`
Packit	ea1746	`D_1y &= -\frac{x_0}{\sqrt{x_0^2 + x_1^2}},\`
Packit	ea1746	`D_2y = -\frac{x_1}{\sqrt{x_0^2 + x_1^2}}`
Packit	ea1746
Packit	ea1746	we find that it is an indeterminate form at :math:`x_0 = 0, x_1 =
Packit	ea1746	0`.
Packit	ea1746
Packit	ea1746	`There is no single solution to this problem. In some cases one needs`
Packit	ea1746	`to reason explicitly about the points where indeterminacy may occur`
Packit	ea1746	and use alternate expressions using `L'Hopital's rule
Packit	ea1746	<https://en.wikipedia.org/wiki/L'H%C3%B4pital's_rule>`_ (see for
Packit	ea1746	example some of the conversion routines in `rotation.h
Packit	ea1746	<https://github.com/ceres-solver/ceres-solver/blob/master/include/ceres/rotation.h>`_. In
Packit	ea1746	`other cases, one may need to regularize the expressions to eliminate`
Packit	ea1746	`these points.`

source-git / ceres-solver

Source Code

Blame docs/source/automatic_derivatives.rst