Tree - source-git/opae - CentOS Git server

source-git / opae

Blame external/pybind11/docs/advanced/pycpp/numpy.rst

Blob History Raw

Packit	534379	`.. _numpy:`
Packit	534379
Packit	534379	`NumPy`
Packit	534379	`#####`
Packit	534379
Packit	534379	`Buffer protocol`
Packit	534379	`===============`
Packit	534379
Packit	534379	`Python supports an extremely general and convenient approach for exchanging`
Packit	534379	`data between plugin libraries. Types can expose a buffer view [#f2]_, which`
Packit	534379	`provides fast direct access to the raw internal data representation. Suppose we`
Packit	534379	`want to bind the following simplistic Matrix class:`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`class Matrix {`
Packit	534379	`public:`
Packit	534379	`Matrix(size_t rows, size_t cols) : m_rows(rows), m_cols(cols) {`
Packit	534379	`m_data = new float[rows*cols];`
Packit	534379	`}`
Packit	534379	`float *data() { return m_data; }`
Packit	534379	`size_t rows() const { return m_rows; }`
Packit	534379	`size_t cols() const { return m_cols; }`
Packit	534379	`private:`
Packit	534379	`size_t m_rows, m_cols;`
Packit	534379	`float *m_data;`
Packit	534379	`};`
Packit	534379
Packit	534379	The following binding code exposes the ``Matrix`` contents as a buffer object,
Packit	534379	`making it possible to cast Matrices into NumPy arrays. It is even possible to`
Packit	534379	`completely avoid copy operations with Python expressions like`
Packit	534379	``np.array(matrix_instance, copy = False)``.
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`py::class_<Matrix>(m, "Matrix", py::buffer_protocol())`
Packit	534379	`.def_buffer([](Matrix &m) -> py::buffer_info {`
Packit	534379	`return py::buffer_info(`
Packit	534379	`m.data(), /* Pointer to buffer */`
Packit	534379	`sizeof(float), /* Size of one scalar */`
Packit	534379	`py::format_descriptor<float>::format(), /* Python struct-style format descriptor */`
Packit	534379	`2, /* Number of dimensions */`
Packit	534379	`{ m.rows(), m.cols() }, /* Buffer dimensions */`
Packit	534379	`{ sizeof(float) * m.cols(), /* Strides (in bytes) for each index */`
Packit	534379	`sizeof(float) }`
Packit	534379	`);`
Packit	534379	`});`
Packit	534379
Packit	534379	`Supporting the buffer protocol in a new type involves specifying the special`
Packit	534379	``py::buffer_protocol()`` tag in the ``py::class_`` constructor and calling the
Packit	534379	``def_buffer()`` method with a lambda function that creates a
Packit	534379	``py::buffer_info`` description record on demand describing a given matrix
Packit	534379	instance. The contents of ``py::buffer_info`` mirror the Python buffer protocol
Packit	534379	`specification.`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`struct buffer_info {`
Packit	534379	`void *ptr;`
Packit	534379	`ssize_t itemsize;`
Packit	534379	`std::string format;`
Packit	534379	`ssize_t ndim;`
Packit	534379	`std::vector<ssize_t> shape;`
Packit	534379	`std::vector<ssize_t> strides;`
Packit	534379	`};`
Packit	534379
Packit	534379	`To create a C++ function that can take a Python buffer object as an argument,`
Packit	534379	simply use the type ``py::buffer`` as one of its arguments. Buffers can exist
Packit	534379	`in a great variety of configurations, hence some safety checks are usually`
Packit	534379	`necessary in the function body. Below, you can see an basic example on how to`
Packit	534379	`define a custom constructor for the Eigen double precision matrix`
Packit	534379	(``Eigen::MatrixXd``) type, which supports initialization from compatible
Packit	534379	`buffer objects (e.g. a NumPy matrix).`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`/* Bind MatrixXd (or some other Eigen type) to Python */`
Packit	534379	`typedef Eigen::MatrixXd Matrix;`
Packit	534379
Packit	534379	`typedef Matrix::Scalar Scalar;`
Packit	534379	`constexpr bool rowMajor = Matrix::Flags & Eigen::RowMajorBit;`
Packit	534379
Packit	534379	`py::class_<Matrix>(m, "Matrix", py::buffer_protocol())`
Packit	534379	`.def("__init__", [](Matrix &m, py::buffer b) {`
Packit	534379	`typedef Eigen::Stride<Eigen::Dynamic, Eigen::Dynamic> Strides;`
Packit	534379
Packit	534379	`/* Request a buffer descriptor from Python */`
Packit	534379	`py::buffer_info info = b.request();`
Packit	534379
Packit	534379	`/* Some sanity checks ... */`
Packit	534379	`if (info.format != py::format_descriptor<Scalar>::format())`
Packit	534379	`throw std::runtime_error("Incompatible format: expected a double array!");`
Packit	534379
Packit	534379	`if (info.ndim != 2)`
Packit	534379	`throw std::runtime_error("Incompatible buffer dimension!");`
Packit	534379
Packit	534379	`auto strides = Strides(`
Packit	534379	`info.strides[rowMajor ? 0 : 1] / (py::ssize_t)sizeof(Scalar),`
Packit	534379	`info.strides[rowMajor ? 1 : 0] / (py::ssize_t)sizeof(Scalar));`
Packit	534379
Packit	534379	`auto map = Eigen::Map<Matrix, 0, Strides>(`
Packit	534379	`static_cast<Scalar *>(info.ptr), info.shape[0], info.shape[1], strides);`
Packit	534379
Packit	534379	`new (&m) Matrix(map);`
Packit	534379	`});`
Packit	534379
Packit	534379	For reference, the ``def_buffer()`` call for this Eigen data type should look
Packit	534379	`as follows:`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`.def_buffer([](Matrix &m) -> py::buffer_info {`
Packit	534379	`return py::buffer_info(`
Packit	534379	`m.data(), /* Pointer to buffer */`
Packit	534379	`sizeof(Scalar), /* Size of one scalar */`
Packit	534379	`py::format_descriptor<Scalar>::format(), /* Python struct-style format descriptor */`
Packit	534379	`2, /* Number of dimensions */`
Packit	534379	`{ m.rows(), m.cols() }, /* Buffer dimensions */`
Packit	534379	`{ sizeof(Scalar) * (rowMajor ? m.cols() : 1),`
Packit	534379	`sizeof(Scalar) * (rowMajor ? 1 : m.rows()) }`
Packit	534379	`/* Strides (in bytes) for each index */`
Packit	534379	`);`
Packit	534379	`})`
Packit	534379
Packit	534379	`For a much easier approach of binding Eigen types (although with some`
Packit	534379	limitations), refer to the section on :doc:`/advanced/cast/eigen`.
Packit	534379
Packit	534379	`.. seealso::`
Packit	534379
Packit	534379	The file :file:`tests/test_buffers.cpp` contains a complete example
Packit	534379	`that demonstrates using the buffer protocol with pybind11 in more detail.`
Packit	534379
Packit	534379	`.. [#f2] http://docs.python.org/3/c-api/buffer.html`
Packit	534379
Packit	534379	`Arrays`
Packit	534379	`======`
Packit	534379
Packit	534379	By exchanging ``py::buffer`` with ``py::array`` in the above snippet, we can
Packit	534379	`restrict the function so that it only accepts NumPy arrays (rather than any`
Packit	534379	`type of Python object satisfying the buffer protocol).`
Packit	534379
Packit	534379	`In many situations, we want to define a function which only accepts a NumPy`
Packit	534379	array of a certain data type. This is possible via the ``py::array_t<T>``
Packit	534379	`template. For instance, the following function requires the argument to be a`
Packit	534379	`NumPy array containing double precision values.`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`void f(py::array_t<double> array);`
Packit	534379
Packit	534379	`When it is invoked with a different type (e.g. an integer or a list of`
Packit	534379	`integers), the binding code will attempt to cast the input into a NumPy array`
Packit	534379	`of the requested type. Note that this feature requires the`
Packit	534379	:file:`pybind11/numpy.h` header to be included.
Packit	534379
Packit	534379	`Data in NumPy arrays is not guaranteed to packed in a dense manner;`
Packit	534379	`furthermore, entries can be separated by arbitrary column and row strides.`
Packit	534379	`Sometimes, it can be useful to require a function to only accept dense arrays`
Packit	534379	`using either the C (row-major) or Fortran (column-major) ordering. This can be`
Packit	534379	accomplished via a second template argument with values ``py::array::c_style``
Packit	534379	or ``py::array::f_style``.
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`void f(py::array_t<double, py::array::c_style \| py::array::forcecast> array);`
Packit	534379
Packit	534379	The ``py::array::forcecast`` argument is the default value of the second
Packit	534379	`template parameter, and it ensures that non-conforming arguments are converted`
Packit	534379	`into an array satisfying the specified requirements instead of trying the next`
Packit	534379	`function overload.`
Packit	534379
Packit	534379	`Structured types`
Packit	534379	`================`
Packit	534379
Packit	534379	In order for ``py::array_t`` to work with structured (record) types, we first
Packit	534379	`need to register the memory layout of the type. This can be done via`
Packit	534379	``PYBIND11_NUMPY_DTYPE`` macro, called in the plugin definition code, which
Packit	534379	`expects the type followed by field names:`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`struct A {`
Packit	534379	`int x;`
Packit	534379	`double y;`
Packit	534379	`};`
Packit	534379
Packit	534379	`struct B {`
Packit	534379	`int z;`
Packit	534379	`A a;`
Packit	534379	`};`
Packit	534379
Packit	534379	`// ...`
Packit	534379	`PYBIND11_MODULE(test, m) {`
Packit	534379	`// ...`
Packit	534379
Packit	534379	`PYBIND11_NUMPY_DTYPE(A, x, y);`
Packit	534379	`PYBIND11_NUMPY_DTYPE(B, z, a);`
Packit	534379	`/* now both A and B can be used as template arguments to py::array_t */`
Packit	534379	`}`
Packit	534379
Packit	534379	The structure should consist of fundamental arithmetic types, ``std::complex``,
Packit	534379	`previously registered substructures, and arrays of any of the above. Both C++`
Packit	534379	arrays and ``std::array`` are supported. While there is a static assertion to
Packit	534379	`prevent many types of unsupported structures, it is still the user's`
Packit	534379	`responsibility to use only "plain" structures that can be safely manipulated as`
Packit	534379	`raw memory without violating invariants.`
Packit	534379
Packit	534379	`Vectorizing functions`
Packit	534379	`=====================`
Packit	534379
Packit	534379	`Suppose we want to bind a function with the following signature to Python so`
Packit	534379	`that it can process arbitrary NumPy array arguments (vectors, matrices, general`
Packit	534379	`N-D arrays) in addition to its normal arguments:`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`double my_func(int x, float y, double z);`
Packit	534379
Packit	534379	After including the ``pybind11/numpy.h`` header, this is extremely simple:
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`m.def("vectorized_func", py::vectorize(my_func));`
Packit	534379
Packit	534379	Invoking the function like below causes 4 calls to be made to ``my_func`` with
Packit	534379	`each of the array elements. The significant advantage of this compared to`
Packit	534379	solutions like ``numpy.vectorize()`` is that the loop over the elements runs
Packit	534379	`entirely on the C++ side and can be crunched down into a tight, optimized loop`
Packit	534379	`by the compiler. The result is returned as a NumPy array of type`
Packit	534379	``numpy.dtype.float64``.
Packit	534379
Packit	534379	`.. code-block:: pycon`
Packit	534379
Packit	534379	`>>> x = np.array([[1, 3],[5, 7]])`
Packit	534379	`>>> y = np.array([[2, 4],[6, 8]])`
Packit	534379	`>>> z = 3`
Packit	534379	`>>> result = vectorized_func(x, y, z)`
Packit	534379
Packit	534379	The scalar argument ``z`` is transparently replicated 4 times. The input
Packit	534379	arrays ``x`` and ``y`` are automatically converted into the right types (they
Packit	534379	are of type ``numpy.dtype.int64`` but need to be ``numpy.dtype.int32`` and
Packit	534379	``numpy.dtype.float32``, respectively).
Packit	534379
Packit	534379	`.. note::`
Packit	534379
Packit	534379	Only arithmetic, complex, and POD types passed by value or by ``const &``
Packit	534379	`reference are vectorized; all other arguments are passed through as-is.`
Packit	534379	`Functions taking rvalue reference arguments cannot be vectorized.`
Packit	534379
Packit	534379	`In cases where the computation is too complicated to be reduced to`
Packit	534379	``vectorize``, it will be necessary to create and access the buffer contents
Packit	534379	`manually. The following snippet contains a complete example that shows how this`
Packit	534379	`works (the code is somewhat contrived, since it could have been done more`
Packit	534379	simply using ``vectorize``).
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`#include <pybind11/pybind11.h>`
Packit	534379	`#include <pybind11/numpy.h>`
Packit	534379
Packit	534379	`namespace py = pybind11;`
Packit	534379
Packit	534379	`py::array_t<double> add_arrays(py::array_t<double> input1, py::array_t<double> input2) {`
Packit	534379	`py::buffer_info buf1 = input1.request(), buf2 = input2.request();`
Packit	534379
Packit	534379	`if (buf1.ndim != 1 \|\| buf2.ndim != 1)`
Packit	534379	`throw std::runtime_error("Number of dimensions must be one");`
Packit	534379
Packit	534379	`if (buf1.size != buf2.size)`
Packit	534379	`throw std::runtime_error("Input shapes must match");`
Packit	534379
Packit	534379	`/* No pointer is passed, so NumPy will allocate the buffer */`
Packit	534379	`auto result = py::array_t<double>(buf1.size);`
Packit	534379
Packit	534379	`py::buffer_info buf3 = result.request();`
Packit	534379
Packit	534379	`double ptr1 = (double ) buf1.ptr,`
Packit	534379	`ptr2 = (double ) buf2.ptr,`
Packit	534379	`ptr3 = (double ) buf3.ptr;`
Packit	534379
Packit	534379	`for (size_t idx = 0; idx < buf1.shape[0]; idx++)`
Packit	534379	`ptr3[idx] = ptr1[idx] + ptr2[idx];`
Packit	534379
Packit	534379	`return result;`
Packit	534379	`}`
Packit	534379
Packit	534379	`PYBIND11_MODULE(test, m) {`
Packit	534379	`m.def("add_arrays", &add_arrays, "Add two NumPy arrays");`
Packit	534379	`}`
Packit	534379
Packit	534379	`.. seealso::`
Packit	534379
Packit	534379	The file :file:`tests/test_numpy_vectorize.cpp` contains a complete
Packit	534379	example that demonstrates using :func:`vectorize` in more detail.
Packit	534379
Packit	534379	`Direct access`
Packit	534379	`=============`
Packit	534379
Packit	534379	`For performance reasons, particularly when dealing with very large arrays, it`
Packit	534379	`is often desirable to directly access array elements without internal checking`
Packit	534379	`of dimensions and bounds on every access when indices are known to be already`
Packit	534379	valid. To avoid such checks, the ``array`` class and ``array_t<T>`` template
Packit	534379	`class offer an unchecked proxy object that can be used for this unchecked`
Packit	534379	access through the ``unchecked<N>`` and ``mutable_unchecked<N>`` methods,
Packit	534379	where ``N`` gives the required dimensionality of the array:
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`m.def("sum_3d", [](py::array_t<double> x) {`
Packit	534379	`auto r = x.unchecked<3>(); // x must have ndim = 3; can be non-writeable`
Packit	534379	`double sum = 0;`
Packit	534379	`for (ssize_t i = 0; i < r.shape(0); i++)`
Packit	534379	`for (ssize_t j = 0; j < r.shape(1); j++)`
Packit	534379	`for (ssize_t k = 0; k < r.shape(2); k++)`
Packit	534379	`sum += r(i, j, k);`
Packit	534379	`return sum;`
Packit	534379	`});`
Packit	534379	`m.def("increment_3d", [](py::array_t<double> x) {`
Packit	534379	`auto r = x.mutable_unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false`
Packit	534379	`for (ssize_t i = 0; i < r.shape(0); i++)`
Packit	534379	`for (ssize_t j = 0; j < r.shape(1); j++)`
Packit	534379	`for (ssize_t k = 0; k < r.shape(2); k++)`
Packit	534379	`r(i, j, k) += 1.0;`
Packit	534379	`}, py::arg().noconvert());`
Packit	534379
Packit	534379	To obtain the proxy from an ``array`` object, you must specify both the data
Packit	534379	type and number of dimensions as template arguments, such as ``auto r =
Packit	534379	myarray.mutable_unchecked<float, 2>()``.
Packit	534379
Packit	534379	`If the number of dimensions is not known at compile time, you can omit the`
Packit	534379	dimensions template parameter (i.e. calling ``arr_t.unchecked()`` or
Packit	534379	``arr.unchecked<T>()``. This will give you a proxy object that works in the
Packit	534379	`same way, but results in less optimizable code and thus a small efficiency`
Packit	534379	`loss in tight loops.`
Packit	534379
Packit	534379	`Note that the returned proxy object directly references the array's data, and`
Packit	534379	`only reads its shape, strides, and writeable flag when constructed. You must`
Packit	534379	`take care to ensure that the referenced array is not destroyed or reshaped for`
Packit	534379	`the duration of the returned object, typically by limiting the scope of the`
Packit	534379	`returned instance.`
Packit	534379
Packit	534379	The returned proxy object supports some of the same methods as ``py::array`` so
Packit	534379	`that it can be used as a drop-in replacement for some existing, index-checked`
Packit	534379	uses of ``py::array``:
Packit	534379
Packit	534379	- ``r.ndim()`` returns the number of dimensions
Packit	534379
Packit	534379	- ``r.data(1, 2, ...)`` and ``r.mutable_data(1, 2, ...)``` returns a pointer to
Packit	534379	the ``const T`` or ``T`` data, respectively, at the given indices. The
Packit	534379	latter is only available to proxies obtained via ``a.mutable_unchecked()``.
Packit	534379
Packit	534379	- ``itemsize()`` returns the size of an item in bytes, i.e. ``sizeof(T)``.
Packit	534379
Packit	534379	- ``ndim()`` returns the number of dimensions.
Packit	534379
Packit	534379	- ``shape(n)`` returns the size of dimension ``n``
Packit	534379
Packit	534379	- ``size()`` returns the total number of elements (i.e. the product of the shapes).
Packit	534379
Packit	534379	- ``nbytes()`` returns the number of bytes used by the referenced elements
Packit	534379	(i.e. ``itemsize()`` times ``size()``).
Packit	534379
Packit	534379	`.. seealso::`
Packit	534379
Packit	534379	The file :file:`tests/test_numpy_array.cpp` contains additional examples
Packit	534379	`demonstrating the use of this feature.`
Packit	534379
Packit	534379	`Ellipsis`
Packit	534379	`========`
Packit	534379
Packit	534379	Python 3 provides a convenient ``...`` ellipsis notation that is often used to
Packit	534379	`slice multidimensional arrays. For instance, the following snippet extracts the`
Packit	534379	`middle dimensions of a tensor with the first and last index set to zero.`
Packit	534379
Packit	534379	`.. code-block:: python`
Packit	534379
Packit	534379	`a = # a NumPy array`
Packit	534379	`b = a[0, ..., 0]`
Packit	534379
Packit	534379	The function ``py::ellipsis()`` function can be used to perform the same
Packit	534379	`operation on the C++ side:`
Packit	534379
Packit	534379	`.. code-block:: cpp`
Packit	534379
Packit	534379	`py::array a = /* A NumPy array */;`
Packit	534379	`py::array b = a[py::make_tuple(0, py::ellipsis(), 0)];`

source-git / opae

Source Code

Blame external/pybind11/docs/advanced/pycpp/numpy.rst