.. _buffer:

Implementing the buffer protocol
================================

Cython objects can expose memory buffers to Python code
by implementing the "buffer protocol".
This chapter shows how to implement the protocol
and make use of the memory managed by an extension type from NumPy.


A matrix class
--------------

The following Cython/C++ code implements a matrix of floats,
where the number of columns is fixed at construction time
but rows can be added dynamically.

::

    # matrix.pyx
    from libcpp.vector cimport vector

    cdef class Matrix:
        cdef unsigned ncols
        cdef vector[float] v

        def __cinit__(self, unsigned ncols):
            self.ncols = ncols

        def add_row(self):
            """Adds a row, initially zero-filled."""
            self.v.resize(self.v.size() + self.ncols)

There are no methods to do anything productive with the matrices' contents.
We could implement custom ``__getitem__``, ``__setitem__``, etc. for this,
but instead we'll use the buffer protocol to expose the matrix's data to Python
so we can use NumPy to do useful work.

Implementing the buffer protocol requires adding two methods,
``__getbuffer__`` and ``__releasebuffer__``,
which Cython handles specially.

::

    from cpython cimport Py_buffer
    from libcpp.vector cimport vector

    cdef class Matrix:
        cdef Py_ssize_t ncols
        cdef Py_ssize_t shape[2]
        cdef Py_ssize_t strides[2]
        cdef vector[float] v

        def __cinit__(self, Py_ssize_t ncols):
            self.ncols = ncols

        def add_row(self):
            """Adds a row, initially zero-filled."""
            self.v.resize(self.v.size() + self.ncols)

        def __getbuffer__(self, Py_buffer *buffer, int flags):
            cdef Py_ssize_t itemsize = sizeof(self.v[0])

            self.shape[0] = self.v.size() // self.ncols
            self.shape[1] = self.ncols

            # Stride 1 is the distance, in bytes, between two items in a row;
            # this is the distance between two adjacent items in the vector.
            # Stride 0 is the distance between the first elements of adjacent rows.
            self.strides[1] = <Py_ssize_t>(  <char *>&(self.v[1])
                                           - <char *>&(self.v[0]))
            self.strides[0] = self.ncols * self.strides[1]

            buffer.buf = <char *>&(self.v[0])
            buffer.format = 'f'                     # float
            buffer.internal = NULL                  # see References
            buffer.itemsize = itemsize
            buffer.len = self.v.size() * itemsize   # product(shape) * itemsize
            buffer.ndim = 2
            buffer.obj = self
            buffer.readonly = 0
            buffer.shape = self.shape
            buffer.strides = self.strides
            buffer.suboffsets = NULL                # for pointer arrays only

        def __releasebuffer__(self, Py_buffer *buffer):
            pass

The method ``Matrix.__getbuffer__`` fills a descriptor structure,
called a ``Py_buffer``, that is defined by the Python C-API.
It contains a pointer to the actual buffer in memory,
as well as metadata about the shape of the array and the strides
(step sizes to get from one element or row to the next).
Its ``shape`` and ``strides`` members are pointers
that must point to arrays of type and size ``Py_ssize_t[ndim]``.
These arrays have to stay alive as long as any buffer views the data,
so we store them on the ``Matrix`` object as members.
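
The same metadata can be inspected from pure Python, which may help to see
what each field means. The sketch below is only an illustration and not part
of ``matrix.pyx``: it assumes a standard-library ``array`` of C floats as a
stand-in for the vector, and the names ``NCOLS``, ``flat`` and ``view`` are
made up for the example.

```python
from array import array
import struct

NCOLS = 10  # stand-in for Matrix.ncols (illustrative value)

# A flat array of 20 C floats plays the role of vector[float] with two rows.
flat = array('f', range(2 * NCOLS))

# memoryview.cast can only reshape via a byte view, hence the two steps.
view = memoryview(flat).cast('B').cast('f', shape=[2, NCOLS])

print(view.ndim)      # number of dimensions
print(view.shape)     # (rows, columns)
print(view.strides)   # bytes to step to the next row / the next item
print(view.itemsize)  # size of one float in bytes
print(view.nbytes)    # product(shape) * itemsize

# Element (i, j) lives at byte offset i * strides[0] + j * strides[1].
i, j = 1, 3
offset = i * view.strides[0] + j * view.strides[1]
(value,) = struct.unpack_from('f', flat.tobytes(), offset)
assert value == view[i, j] == flat[i * NCOLS + j]
```

The row stride is simply the number of columns times the item stride, which is
the relationship ``__getbuffer__`` encodes in ``self.strides``.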

The code is not yet complete, but we can already compile it
and test the basic functionality.

::

    >>> from matrix import Matrix
    >>> import numpy as np
    >>> m = Matrix(10)
    >>> np.asarray(m)
    array([], shape=(0, 10), dtype=float32)
    >>> m.add_row()
    >>> a = np.asarray(m)
    >>> a[:] = 1
    >>> m.add_row()
    >>> a = np.asarray(m)
    >>> a
    array([[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
           [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)

Now we can view the ``Matrix`` as a NumPy ``ndarray``,
and modify its contents using standard NumPy operations.
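
The sharing of memory between exporter and view can also be observed with
objects from the standard library. In this sketch (an analogy only:
``bytearray`` stands in for ``Matrix`` and ``memoryview`` for the NumPy
array), a write through the view shows up in the exporting object,
because no copy is made.

```python
# bytearray exports a writable buffer, much like Matrix does.
owner = bytearray(8)

# The view wraps the owner's memory; it does not copy it.
view = memoryview(owner)
view[:4] = b"\x01\x01\x01\x01"   # comparable to ``a[:] = 1`` on the NumPy side

print(owner)              # the first four bytes of the owner have changed
print(view.obj is owner)  # the view keeps a reference to its exporter
print(view.readonly)      # writable, comparable to buffer.readonly = 0
```

The reference kept in ``view.obj`` corresponds to the ``buffer.obj = self``
assignment in ``__getbuffer__``: it keeps the exporter alive while views exist.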


Memory safety and reference counting
------------------------------------

The ``Matrix`` class as implemented so far is unsafe.
The ``add_row`` operation can move the underlying buffer,
which invalidates any NumPy (or other) view on the data.
If you try to access values through such a view after an ``add_row`` call,
you'll get outdated values or a segfault.

This is where ``__releasebuffer__`` comes in.
We can add a count of active views to each matrix,
and lock it against mutation whenever a view exists.

::

    cdef class Matrix:
        # ...
        cdef int view_count

        def __cinit__(self, Py_ssize_t ncols):
            self.ncols = ncols
            self.view_count = 0

        def add_row(self):
            if self.view_count > 0:
                raise ValueError("can't add row while being viewed")
            self.v.resize(self.v.size() + self.ncols)

        def __getbuffer__(self, Py_buffer *buffer, int flags):
            # ... as before

            self.view_count += 1

        def __releasebuffer__(self, Py_buffer *buffer):
            self.view_count -= 1
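
The standard library's ``bytearray`` follows the same strategy, so the
behaviour is easy to try out without compiling anything. The sketch below is
an analogy, not ``Matrix`` itself: growing the ``bytearray`` is refused while a
``memoryview`` of it is alive, and allowed again once the view is released.
Note that ``bytearray`` reports the problem as a ``BufferError`` rather than
the ``ValueError`` chosen above.

```python
data = bytearray(4)

view = memoryview(data)       # acquires the buffer (like __getbuffer__)
try:
    data.extend(b"\x00" * 4)  # resizing could move the memory
    resized_while_viewed = True
except BufferError:
    resized_while_viewed = False

view.release()                # gives the buffer back (like __releasebuffer__)
data.extend(b"\x00" * 4)      # no exports left, so growing is fine again

print(resized_while_viewed)   # False
print(len(data))              # 8
```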


Flags
-----
We skipped some input validation in the code.
The ``flags`` argument to ``__getbuffer__`` comes from ``np.asarray``
(and other clients) and is a bitwise OR of flags
that describe the kind of array that is requested.
Strictly speaking, if the request is only ``PyBUF_SIMPLE`` or ``PyBUF_ND``
(neither of which asks for strides), or if it demands ``PyBUF_F_CONTIGUOUS``,
``__getbuffer__`` must raise a ``BufferError``.
These macros can be ``cimport``'d from ``cpython.buffer``.

(The matrix-in-vector structure actually conforms to ``PyBUF_ND``,
but that would prohibit ``__getbuffer__`` from filling in the strides.
A single-row matrix is F-contiguous, but a larger matrix is not.)
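
The contiguity claims can be checked from Python with ``memoryview``,
which reports both properties. This is a sketch under the assumption that a
row-major block of C floats from the ``array`` module has the same layout as
the matrix-in-vector structure; ``as_matrix`` is a helper invented for the
example.

```python
from array import array

def as_matrix(nrows, ncols):
    """Return a row-major (C-ordered) 2-D float view, like Matrix's layout."""
    flat = array('f', [0.0] * (nrows * ncols))
    # memoryview.cast can only reshape via a byte view, hence the two steps.
    return memoryview(flat).cast('B').cast('f', shape=[nrows, ncols])

single_row = as_matrix(1, 10)
two_rows = as_matrix(2, 10)

# Each line prints (C-contiguous?, F-contiguous?) for one layout.
print(single_row.c_contiguous, single_row.f_contiguous)
print(two_rows.c_contiguous, two_rows.f_contiguous)
```

Both views are C-contiguous; only the one with a single row also counts as
F-contiguous, because with one row there is no row stride to get wrong.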


References
----------

The buffer interface used here is set out in
:PEP:`3118`, Revising the buffer protocol.

A tutorial for using this API from C is on Jake Vanderplas's blog,
`An Introduction to the Python Buffer Protocol
<https://jakevdp.github.io/blog/2014/05/05/introduction-to-the-python-buffer-protocol/>`_.

Reference documentation is available for
`Python 3 <https://docs.python.org/3/c-api/buffer.html>`_
and `Python 2 <https://docs.python.org/2.7/c-api/buffer.html>`_.
The Py2 documentation also describes an older buffer protocol
that is no longer in use;
since Python 2.6, the :PEP:`3118` protocol has been implemented,
and the older protocol is only relevant for legacy code.