Memory management
=================

There are two types of nodes:

* those connected to an existing tree

* those that are unconnected; these may be the top node of a tree

A node consists of a C-level libxml2 node, Node for short, and
optionally a Python-level proxy node, Proxy. Zero, one or more Proxies can
exist for a single Node.

Proxies are garbage collected automatically by Python. Nodes are not
garbage collected at all. Instead, explicit mechanisms exist to free a
Node and the tree it may be the top of.

A Node can be safely freed when:

* no Proxy is connected to this Node

* no Proxy can be created for this Node

A Proxy cannot be created for a Node when:

* no Proxy exists for any node that is connected to that Node

This is the case when:

* the Node is in a tree that has no Proxy connected to any of its nodes.

This means that the whole tree in such a condition can be freed.
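
The condition above can be modelled in a few lines of Python. This is only a minimal sketch and not lxml's implementation: the ``Node`` class, its ``proxy_count`` field and the helper functions are hypothetical names that stand in for the C-level structures.

```python
class Node:
    """Hypothetical stand-in for a C-level libxml2 node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.proxy_count = 0  # assumed bookkeeping: live Proxies for this Node
        if parent is not None:
            parent.children.append(self)


def tree_top(node):
    """Walk up to the top Node of the tree that contains `node`."""
    while node.parent is not None:
        node = node.parent
    return node


def tree_has_proxy(top):
    """Return True if `top` or any Node below it has a Proxy."""
    stack = [top]
    while stack:
        current = stack.pop()
        if current.proxy_count > 0:
            return True
        stack.extend(current.children)
    return False


def can_free_tree(node):
    """A tree can be freed only if no Proxy is connected to any of its Nodes."""
    return not tree_has_proxy(tree_top(node))
```

A single Proxy anywhere in the tree is enough to make ``can_free_tree`` return ``False`` for every Node in that tree.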

Detecting whether a Node is in a tree that has no Proxies connected to
it can be done by relying on Python's garbage collection
algorithm. Each Proxy can hold a reference to the Proxy that points to
the top of the tree. In the case of a document tree, this reference is to
the Document Proxy. When no more references to the top Proxy exist in the
system, no more Proxies exist that point into the Node tree that the top
Proxy is the top of. If this Node tree is unconnected, i.e. it is not a
subtree, the tree can be safely freed.
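
The top-reference idea can be sketched with plain Python objects. The ``Proxy`` class, the ``freed`` list and ``demo`` below are hypothetical names for illustration, and the sketch assumes that the top Proxy is the object responsible for freeing its Node tree; ``weakref.finalize`` stands in for whatever cleanup hook a real implementation would use.

```python
import gc
import weakref

freed = []  # records which Node trees were "freed"; illustration only


class Proxy:
    """Hypothetical Python-level proxy object."""

    def __init__(self, name, top=None):
        self.name = name
        # Strong reference to the Proxy at the top of the tree
        # (None if this Proxy is itself the top).
        self.top = top
        if top is None:
            # Assumption: the top Proxy frees its Node tree. The callback
            # runs once nothing references this Proxy any more.
            weakref.finalize(self, freed.append, name)


def demo():
    """Return the contents of `freed` before and after the last Proxy goes away."""
    freed.clear()
    top = Proxy("a")
    child = Proxy("b", top=top)
    del top              # child.top still keeps the top Proxy alive
    gc.collect()
    before = list(freed)
    del child            # drops the last reference to the top Proxy
    gc.collect()         # not needed on CPython, where refcounting acts at once
    after = list(freed)
    return before, after
```

As long as any Proxy in the tree is alive, its ``top`` attribute keeps the top Proxy alive, so the tree is only released after the last Proxy has gone.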

A special case exists for document references. Each Proxy always
has a reference to the Document Proxy, just as every Node has a
reference to the Document Node. This means that a Document Node can
only be freed when no Proxies at all exist anymore that refer to the
Document. This is a separate system from the top-Node references, even
though the top Node will in many cases be the Document. This is because
there is no way to get from a Document Proxy to a node that is not
connected to the Document tree.

This approach requires a system that can keep track of the top of the
tree in every case. Usually this is simple: when a Proxy gets connected,
its tree top becomes the tree top of whatever node it is connected
to.

Sometimes this is more difficult: a Proxy may exist that points to a node
in a subtree that just got connected. Its top reference cannot be
updated. This is a problem in the following case::

         a
     b   c   h
    d e f g i j
            k

Now imagine we have a Proxy for k, K, and a Proxy for i, I. They both
have a pointer to Proxy H.

Now imagine i gets moved under g through Proxy I. Proxy I will have an
updated pointer to Proxy A. However, Proxy K cannot be updated and still
points to H, from which it is now in fact disconnected.

Proxy H cannot be removed now until Proxy A is removed. In addition,
Proxy A has a refcount that is too low because Proxy K doesn't point
to it but should.
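
The stale top reference can be reproduced in a small model. Everything below is a hypothetical sketch rather than lxml code: ``Node``, ``Proxy``, ``move_under`` and ``real_top`` are made-up names, and the sketch assumes that h starts out as the top of its own tree, which is what the top references of I and K describe.

```python
class Node:
    """Hypothetical stand-in for a C-level node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            parent.append(self)

    def append(self, child):
        """Move `child` (with its whole subtree) under this node."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)

    def real_top(self):
        """Find the actual top of the tree by walking up the parents."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


class Proxy:
    """Hypothetical proxy that caches a reference to the top Proxy."""

    def __init__(self, node, top=None):
        self.node = node
        self.top = top  # None means this Proxy is itself the top

    def move_under(self, new_parent):
        """Move this Proxy's node under the node of Proxy `new_parent`."""
        new_parent.node.append(self.node)
        # Only this Proxy is known here; Proxies of descendants stay untouched.
        self.top = new_parent.top if new_parent.top is not None else new_parent


def demo():
    # Assumption: h is the top of a separate tree before the move.
    a = Node("a")
    b, c = Node("b", a), Node("c", a)
    d, e, f, g = Node("d", b), Node("e", b), Node("f", c), Node("g", c)
    h = Node("h")
    i, j = Node("i", h), Node("j", h)
    k = Node("k", i)

    A, H = Proxy(a), Proxy(h)
    G = Proxy(g, top=A)
    I, K = Proxy(i, top=H), Proxy(k, top=H)

    I.move_under(G)  # i, and k with it, now live in the tree under a
    return A, H, I, K
```

After ``demo()``, ``I.top`` is ``A`` but ``K.top`` is still ``H``, although ``K.node.real_top()`` is the node a. The move had no way to find Proxy K, so its cached top reference went stale.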

Another strategy involves keeping a reference count on the underlying
Nodes, one count per Proxy. A Node can only be freed if there is no
descendant-or-self with a refcount higher than 0. When no more Python
references to a Node exist, it will check these refcounts first.
The drawback of this is potentially heavy tree-walking each time a Proxy
is about to be removed.
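
A rough sketch of this second strategy, under stated assumptions: ``Node.refcount``, ``can_free`` and ``Proxy.release`` are hypothetical names, and ``release`` is an explicit stand-in for the moment Python drops the last reference to a Proxy. The visit counter is only there to make the cost of the walk visible.

```python
class Node:
    """Hypothetical stand-in for a C-level node with a per-node refcount."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.refcount = 0  # incremented once per live Proxy
        if parent is not None:
            parent.children.append(self)


def can_free(node):
    """Check descendant-or-self refcounts; also report how many Nodes were visited."""
    visited = 0
    stack = [node]
    while stack:
        current = stack.pop()
        visited += 1
        if current.refcount > 0:
            return False, visited
        stack.extend(current.children)
    return True, visited


class Proxy:
    """Hypothetical proxy that counts itself on the underlying Node."""

    def __init__(self, node):
        self.node = node
        node.refcount += 1

    def release(self):
        """Drop this Proxy's count, then check whether the Node could be freed."""
        self.node.refcount -= 1
        return can_free(self.node)
```

In the worst case ``can_free`` visits every Node below the one being released, which is the tree-walking cost described above.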