.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-analytical_derivatives:

====================
Analytic Derivatives
====================

Consider the problem of fitting the following curve (`Rat43
<http://www.itl.nist.gov/div898/strd/nls/data/ratkowsky3.shtml>`_) to
data:

.. math::

   y = \frac{b_1}{(1+e^{b_2-b_3x})^{1/b_4}}

That is, given some data :math:`\{x_i, y_i\},\ \forall i=1,... ,n`,
determine parameters :math:`b_1, b_2, b_3` and :math:`b_4` that best
fit this data.

This can be stated as the problem of finding the values of
:math:`b_1, b_2, b_3` and :math:`b_4` that minimize the following
objective function [#f1]_:

.. math::

   \begin{align}
   E(b_1, b_2, b_3, b_4)
   &= \sum_i f^2(b_1, b_2, b_3, b_4 ; x_i, y_i)\\
   &= \sum_i \left(\frac{b_1}{(1+e^{b_2-b_3x_i})^{1/b_4}} - y_i\right)^2\\
   \end{align}

To solve this problem using Ceres Solver, we need to define a
:class:`CostFunction` that computes the residual :math:`f` for a given
:math:`x` and :math:`y` and its derivatives with respect to
:math:`b_1, b_2, b_3` and :math:`b_4`.

Using elementary differential calculus, we can see that:

.. math::

   \begin{align}
   D_1 f(b_1, b_2, b_3, b_4; x,y) &= \frac{1}{(1+e^{b_2-b_3x})^{1/b_4}}\\
   D_2 f(b_1, b_2, b_3, b_4; x,y) &=
   \frac{-b_1e^{b_2-b_3x}}{b_4(1+e^{b_2-b_3x})^{1/b_4 + 1}} \\
   D_3 f(b_1, b_2, b_3, b_4; x,y) &=
   \frac{b_1xe^{b_2-b_3x}}{b_4(1+e^{b_2-b_3x})^{1/b_4 + 1}} \\
   D_4 f(b_1, b_2, b_3, b_4; x,y) &= \frac{b_1 \log\left(1+e^{b_2-b_3x}\right)}{b_4^2(1+e^{b_2-b_3x})^{1/b_4}}
   \end{align}

With these derivatives in hand, we can now implement the
:class:`CostFunction` as:

.. code-block:: c++

  class Rat43Analytic : public SizedCostFunction<1,4> {
    public:
      Rat43Analytic(const double x, const double y) : x_(x), y_(y) {}
      virtual ~Rat43Analytic() {}
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const {
        const double b1 = parameters[0][0];
        const double b2 = parameters[0][1];
        const double b3 = parameters[0][2];
        const double b4 = parameters[0][3];

        residuals[0] = b1 * pow(1 + exp(b2 - b3 * x_), -1.0 / b4) - y_;

        if (!jacobians) return true;
        double* jacobian = jacobians[0];
        if (!jacobian) return true;

        jacobian[0] = pow(1 + exp(b2 - b3 * x_), -1.0 / b4);
        jacobian[1] = -b1 * exp(b2 - b3 * x_) *
                      pow(1 + exp(b2 - b3 * x_), -1.0 / b4 - 1) / b4;
        jacobian[2] = x_ * b1 * exp(b2 - b3 * x_) *
                      pow(1 + exp(b2 - b3 * x_), -1.0 / b4 - 1) / b4;
        jacobian[3] = b1 * log(1 + exp(b2 - b3 * x_)) *
                      pow(1 + exp(b2 - b3 * x_), -1.0 / b4) / (b4 * b4);
        return true;
      }

    private:
      const double x_;
      const double y_;
  };

This is tedious code: it is hard to read and carries a lot of
redundancy. In practice we cache the common sub-expressions to improve
its efficiency, which gives us something like:

.. code-block:: c++

  class Rat43AnalyticOptimized : public SizedCostFunction<1,4> {
    public:
      Rat43AnalyticOptimized(const double x, const double y) : x_(x), y_(y) {}
      virtual ~Rat43AnalyticOptimized() {}
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const {
        const double b1 = parameters[0][0];
        const double b2 = parameters[0][1];
        const double b3 = parameters[0][2];
        const double b4 = parameters[0][3];

        const double t1 = exp(b2 - b3 * x_);
        const double t2 = 1 + t1;
        const double t3 = pow(t2, -1.0 / b4);
        residuals[0] = b1 * t3 - y_;

        if (!jacobians) return true;
        double* jacobian = jacobians[0];
        if (!jacobian) return true;

        const double t4 = pow(t2, -1.0 / b4 - 1);
        jacobian[0] = t3;
        jacobian[1] = -b1 * t1 * t4 / b4;
        jacobian[2] = -x_ * jacobian[1];
        jacobian[3] = b1 * log(t2) * t3 / (b4 * b4);
        return true;
      }

    private:
      const double x_;
      const double y_;
  };

What is the difference in performance of these two implementations?

==========================   =========
CostFunction                 Time (ns)
==========================   =========
Rat43Analytic                      255
Rat43AnalyticOptimized              92
==========================   =========

``Rat43AnalyticOptimized`` is :math:`2.8` times faster than
``Rat43Analytic``. This difference in run-time is not uncommon. To
get the best performance out of analytically computed derivatives, one
usually needs to optimize the code to account for common
sub-expressions.

When should you use analytical derivatives?
===========================================

#. The expressions are simple, e.g. mostly linear.

#. A computer algebra system like `Maple
   <https://www.maplesoft.com/products/maple/>`_, `Mathematica
   <https://www.wolfram.com/mathematica/>`_, or `SymPy
   <http://www.sympy.org/en/index.html>`_ can be used to symbolically
   differentiate the objective function and generate the C++ code to
   evaluate the derivatives.

#. Performance is of utmost concern and there is algebraic structure
   in the terms that you can exploit to get better performance than
   automatic differentiation.

   That said, getting the best performance out of analytical
   derivatives requires a non-trivial amount of work. Before going
   down this path, it is useful to measure the amount of time being
   spent evaluating the Jacobian as a fraction of the total solve time
   and remember `Amdahl's Law
   <https://en.wikipedia.org/wiki/Amdahl's_law>`_ is your friend.

#. There is no other way to compute the derivatives, e.g. you
   wish to compute the derivative of the root of a polynomial:

   .. math::

      a_3(x,y)z^3 + a_2(x,y)z^2 + a_1(x,y)z + a_0(x,y) = 0

   with respect to :math:`x` and :math:`y`. This requires the use of
   the `Inverse Function Theorem
   <https://en.wikipedia.org/wiki/Inverse_function_theorem>`_.

#. You love the chain rule and actually enjoy doing all the algebra by
   hand.
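
For the polynomial-root example above, the derivative can be written
down via implicit differentiation (a standard calculus sketch, not
Ceres-specific): along the root :math:`z(x,y)` the polynomial is
identically zero, so differentiating with respect to :math:`x` gives,
wherever :math:`3a_3z^2 + 2a_2z + a_1 \neq 0`,

.. math::

   \frac{\partial z}{\partial x} =
   -\frac{z^3 \frac{\partial a_3}{\partial x} +
          z^2 \frac{\partial a_2}{\partial x} +
          z \frac{\partial a_1}{\partial x} +
          \frac{\partial a_0}{\partial x}}
         {3a_3z^2 + 2a_2z + a_1},

and similarly for :math:`\partial z/\partial y`.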

.. rubric:: Footnotes

.. [#f1] The notion of best fit depends on the choice of the objective
         function used to measure the quality of fit, which in turn
         depends on the underlying noise process which generated the
         observations. Minimizing the sum of squared differences is
         the right thing to do when the noise is `Gaussian
         <https://en.wikipedia.org/wiki/Normal_distribution>`_. In
         that case the optimal value of the parameters is the `Maximum
         Likelihood Estimate
         <https://en.wikipedia.org/wiki/Maximum_likelihood_estimation>`_.