===============================
Porting python-dbus to Python 3
===============================

This is an experimental port to Python 3.x where x >= 2.  There are lots of
great sources for porting C extensions to Python 3, including:

 * http://python3porting.com/toc.html
 * http://docs.python.org/howto/cporting.html
 * http://docs.python.org/py3k/c-api/index.html

I also consulted an early take on this port by John Palmieri and David Malcolm
in the context of Fedora:

 * https://bugs.freedesktop.org/show_bug.cgi?id=26420

although I have made some different choices.  The patches in that tracker
issue also don't cover porting the Python bits (e.g. the test suite), nor the
pygtk -> pygi porting, both of which I've also attempted to do in this branch.

This document outlines my notes and strategies for doing this port.  Please
feel free to contact me with any bugs, issues, disagreements, suggestions,
kudos, and curses.

Barry Warsaw
barry@python.org
2011-11-11


User visible changes
====================

You've got some dbus-python code that works great in Python 2.  This branch
should generally allow your existing Python 2 code to continue to work
unchanged.  There are a few changes you'll notice in Python 2 though:

 - The minimum supported Python 2 version is 2.6.
 - All object reprs are unicodes.  This change was made because it greatly
   simplifies the implementation and cross-compatibility with Python 3.
 - Some exception strings have changed.
 - `MethodCallMessage` and `SignalMessage` objects have better reprs now.

What do you need to do to port that to Python 3?  Here are the user visible
changes you should be aware of, relative to Python 2.  Python 3.2 is the
minimum required version:

 - `ByteArray` objects must be initialized with bytes objects, not unicodes.
   Use `b''` literals in the constructor.  This also works in Python 2, where
   bytes objects are aliases for 8-bit strings.
 - `Byte` objects must be initialized with either a length-1 bytes object
   (again, use `b''` literals to be compatible with either Python 2 or 3)
   or an integer.
 - Byte signatures (i.e. `y` type codes) must be passed either a length-1
   bytes object or an integer.  Unicodes (str in Python 3) are not allowed.
 - `ByteArray` is now a subclass of `bytes`, where in Python 2 it is a
   subclass of `str`.
 - `dbus.UTF8String` is gone; use `dbus.String`.  Also `utf8_string` arguments
   are no longer allowed.
 - All longs are now ints, since Python 3 has only a single int type.  This
   also means that the class hierarchy for the dbus numeric types has changed
   (all derive from int in Python 3).
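The rule for `Byte` objects and `y` signatures (a length-1 bytes object or an
integer, never a unicode) can be sketched in pure Python.  This is only an
illustration: `coerce_byte` is a hypothetical helper, not part of dbus-python,
and the real check is done in the C extension.

```python
def coerce_byte(value):
    """Return an int in range(256) for a length-1 bytes object or an int.

    Hypothetical helper for illustration only; it mirrors the rule stated
    above and is not dbus-python code.
    """
    if isinstance(value, bytes):
        if len(value) != 1:
            raise ValueError('expected a length-1 bytes object')
        # Slice instead of indexing: indexing bytes gives a str in Python 2
        # but an int in Python 3, while ord() of a length-1 slice works in both.
        return ord(value[0:1])
    if isinstance(value, int):
        if not 0 <= value <= 255:
            raise ValueError('integer out of range for a byte')
        return value
    # In Python 3 a unicode str lands here and is rejected.
    raise TypeError('expected bytes or int, got %s' % type(value).__name__)
```

Writing the argument as a `b''` literal keeps the calling code identical in
Python 2 and Python 3, which is the reason the list above recommends it.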


Bytes vs. Strings
=================

All strings in dbus are defined as UTF-8:

http://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-signatures

However, the dbus C API accepts `char*`, which must be a NUL-terminated UTF-8
string containing no other NUL bytes.
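As a rough sketch of what that constraint means for a Python string, the
conversion amounts to encoding as UTF-8 and refusing embedded NUL bytes.  The
helper name `to_c_string_bytes` is an assumption made for this example; the
port performs the equivalent work at the C level.

```python
def to_c_string_bytes(text):
    """Encode text as UTF-8 and reject embedded NUL bytes.

    Hypothetical helper for illustration only, not dbus-python code.
    """
    data = text.encode('utf-8')
    if b'\0' in data:
        # A C `char*` ends at the first NUL, so the rest would be lost.
        raise ValueError('embedded NUL byte is not allowed')
    return data
```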

This page describes the mapping between Python types and dbus types:

    http://dbus.freedesktop.org/doc/dbus-python/doc/tutorial.html#basic-types

Notice that it maps dbus `string` (`'s'`) to `dbus.String` (unicode) or
`dbus.UTF8String` (str).  Also notice that there is no direct dbus equivalent
of Python's bytes type (although dbus does have byte arrays), so I am mapping
dbus strings to unicodes in all cases, and getting rid of `dbus.UTF8String` in
Python 3.  I've also added a `dbus._BytesBase` type which is unused in Python
2, but which forms the base class for `dbus.ByteArray` in Python 3.  This is
an implementation detail and not part of the public API.

In Python 3, object paths (`'o'` or `dbus.ObjectPath`), signatures (`'g'` or
`dbus.Signature`), bus names, interfaces, and methods are all strings.  A
previous aborted effort was made to use bytes for these, which at first blush
may make some sense, but on deeper consideration does not.  This approach
also tended to impose too many changes on user code, and caused lots of
difficult-to-track-down problems.

In Python 3, all such objects are subclasses of `str` (i.e. `unicode`).

(As an example, dbus-python's callback dispatching pretty much assumes all
these things are strings.  When they are bytes, the fact that `'foo' != b'foo'`
causes dispatch matching to fail in difficult to debug ways.  Even bus names
are not immune, since they do things like `bus_name[:1] == ':'` which fails in
multiple ways when `bus_name` is a bytes.  For sanity purposes, these are all
unicode strings now, and we just eat the complexity at the C level.)
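The failure modes mentioned in that aside are easy to reproduce without dbus
at all.  The snippet below assumes it runs under Python 3; the sample value
`:1.42` is just an illustrative unique bus name.

```python
text_name = u':1.42'    # a bus name held as a unicode string
bytes_name = b':1.42'   # the same name held as bytes

# Text and bytes never compare equal in Python 3, so a dispatch table keyed
# on one kind silently misses lookups made with the other.
same = (u'foo' == b'foo')

# The "is this a unique name?" check works for text ...
unique_text = (text_name[:1] == u':')

# ... but quietly gives the wrong answer for bytes.
unique_bytes = (bytes_name[:1] == u':')

# Indexing is a second trap: a single item of a bytes object is an int.
first_item = bytes_name[0]
```

Under Python 3, `same` and `unique_bytes` are both `False`, `unique_text` is
`True`, and `first_item` is the integer 58 rather than a one-character string.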

I am using `#include <bytesobject.h>`, which exposes the PyBytes API to Python
2.6 and 2.7, and I have converted all internal PyString calls to PyBytes
calls.  Where this is inappropriate, we'll use PyUnicode calls explicitly.
E.g. all repr() implementations now return unicodes.  Most of these changes
shouldn't be noticed, even in existing Python 2 code.

Generally, I've left the descriptions and docstrings saying "str" instead of
"unicode" since there's no distinction in Python 3.

APIs which previously returned PyStrings will usually return PyUnicodes, not
PyBytes.


Ints vs. Longs
==============

Python 3 only has PyLong types; PyInts are gone.  For that reason, I've
switched all PyInt calls to use PyLong in both Python 2 and Python 3.  Python
3.0 had a nice `<intobject.h>` header that aliased PyInt to PyLong, but that's
gone as of Python 3.1, and the minimum required Python 3 version is 3.2.

In the above page mapping basic types, you'll notice that the Python int type
is mapped to 32-bit signed integers ('i') and the Python long type is mapped
to 64-bit signed integers ('x').  Python 3 doesn't have this distinction, so
ints map to 'i' even though ints can be larger in Python 3.  Use the
dbus-specific integer types if you must have more exact mappings.
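Since a plain Python 3 int maps to 'i', it can help to check whether a value
fits in a signed 32-bit integer before relying on the default mapping.  The
sketch below is an assumption-laden illustration: `fits_int32` is a
hypothetical helper, not something dbus-python provides.

```python
# Bounds of a signed 32-bit integer, i.e. the dbus 'i' type.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_int32(value):
    """Return True if value is representable as a dbus 'i'.

    Hypothetical helper for illustration only, not dbus-python code.
    """
    return INT32_MIN <= value <= INT32_MAX
```

When a value falls outside that range, wrapping it in one of the dbus-specific
types (for example `dbus.Int64`) states the intended wire type explicitly.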

APIs which accepted ints in Python 2 will still do so, but they'll also now
accept longs.  These APIs obviously only accept longs in Python 3.

Long literals in Python code are an interesting thing to have to port.  Don't
use them if you want your code to work in both Python versions.

`dbus._IntBase` is removed in Python 3; you only have `dbus._LongBase`, which
inherits from a Python 3 int (i.e. a PyLong).  Again, this is an
implementation detail that users should never care about.


Macros
======

In types-internal.h, I define `PY3K` when `PY_MAJOR_VERSION` >= 3, so you'll
see ifdefs on the former symbol within the C code.

Python 3 really could use a PY_REFCNT() wrapper for ob_refcnt access.


PyCapsule vs. PyCObject
=======================

`_dbus_bindings._C_API` is an attribute exposed to Python in the module.  In
Python 2, this is a PyCObject, but these do not exist in Python >= 3.2, so it
is replaced with a PyCapsule for Python 3.  However, since PyCapsules were
only introduced in Python 2.7, and I want to support Python 2.6, PyCObjects
are still used when this module is compiled for Python 2.


Python level compatibility
==========================

`from dbus import _is_py3` gives you a flag to check if you must do something
different in Python 3.  In general I use this flag to support both versions in
one set of sources, which seems better than trying to use 2to3.  It's not part
of the dbus-python public API, so you must not use it in third-party projects.
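Third-party code that needs the same kind of switch can compute its own flag.
A minimal sketch, assuming the names `IS_PY3` and `text_type` (both invented
for this example and not provided by dbus-python):

```python
import sys

# Index the tuple rather than using sys.version_info.major, which is not
# available in Python 2.6.
IS_PY3 = sys.version_info[0] >= 3

if IS_PY3:
    text_type = str
else:
    # Only evaluated under Python 2, where the unicode builtin exists.
    text_type = unicode  # noqa: F821
```

Keeping the check in one module and importing the flag elsewhere means the
version test is written only once per project.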


Miscellaneous
=============

The PyDoc_STRVAR() documentation is probably out of date.  Once the API
choices have been green-lighted upstream, I'll make a pass through the code to
update them.  It might be tricky based on any differences between Python 2 and
Python 3.

There were a few places where I noticed what might be considered bugs,
unchecked exception conditions, or possible reference count leaks.  In these
cases, I've just fixed what I can and hopefully haven't made the situation
worse.

`dbus_py_variant_level_get()` did not check possible error conditions, nor did
its callers.  When `dbus_py_variant_level_get()` encounters an error, it now
returns -1, and callers check this.

As much as possible, I've refrained from general code cleanups (e.g. 80
columns), unless it just bugged me too much or I touched the code for reasons
related to the port.  I've also tried to stick to existing C code style,
e.g. through the use of pervasive `Py_CLEAR()` calls, comparison against NULL
usually with `!foo`, and such.  As Bart Simpson might write on his classroom
blackboard::

    This is not a rewrite
    This is not a rewrite
    This is not a rewrite
    This is not a rewrite
    ...

and so on.  Well, mostly ;).

I think I fixed a reference leak in `DBusPyServer_set_auth_mechanisms()`.
`PySequence_Fast()` returns a new reference, which wasn't getting decref'd in
any return path.

 - Instantiation of metaclasses uses different, incompatible syntax in Python
   2 and 3.  You have to use direct calling of the metaclass to work across
   versions, i.e. `Interface = InterfaceType('Interface', (object,), {})`
 - `iteritems()` and friends are gone.  I dropped the "iter" prefixes.
 - `xrange()` is gone.  I changed the calls to use `range()`.
 - `isSequenceType()` is gone in Python 3, so I use a different idiom there.
 - `__next__()` vs. `next()`
 - `PyUnicode_FromFormat()` `%V` flag is a clever hack!
 - `sys.version_info` is a tuple in Python 2.6, not a namedtuple, i.e. there
   is no `sys.version_info.major`
 - `PyArg_Parse()`: No 'y' code in Python 2; in Python 3, no equivalent of 'z'
   for bytes objects.
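Several of the Python-level items in this list can be shown together in a few
lines that run unchanged on Python 2.6+ and Python 3.  The metaclass here is a
toy stand-in written for this example (the `registered_name` attribute is
invented); it is not the real dbus-python `InterfaceType`.

```python
class InterfaceType(type):
    """Toy metaclass standing in for the real one; illustration only."""

    def __init__(cls, name, bases, namespace):
        super(InterfaceType, cls).__init__(name, bases, namespace)
        cls.registered_name = name


# Calling the metaclass directly avoids both the Python 2 `__metaclass__`
# attribute and the Python 3 `metaclass=` keyword.
Interface = InterfaceType('Interface', (object,), {})


class Child(Interface):
    pass


# items() and range() exist in both versions; only the lazy iteritems()
# and xrange() spellings are Python 2 specific.
table = {'a': 1, 'b': 2}
pairs = sorted(table.items())
squares = [n * n for n in range(3)]

# The next() builtin (available since Python 2.6) hides the rename of the
# iterator method from next() to __next__().
first = next(iter(pairs))
```

Subclasses such as `Child` pick up the metaclass through inheritance, so only
the base class needs the direct-call spelling.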


Open issues
===========

Here are a few things that still need to be done, or for which there may be
open questions:

 - Update all C extension docstrings for accuracy.