|
Packit |
d28291 |
|
|
Packit |
d28291 |
<HEAD>
|
|
Packit |
d28291 |
<TITLE>Finalization in the Boehm-Demers-Weiser collector </TITLE>
|
|
Packit |
d28291 |
</HEAD>
|
|
Packit |
d28291 |
<BODY>
|
|
Packit |
d28291 |
Finalization
|
|
Packit |
d28291 |
Many garbage collectors provide a facility for executing user code
|
|
Packit |
d28291 |
just before an object is collected. This can be used to reclaim any
|
|
Packit |
d28291 |
system resources or non-garbage-collected memory associated with the
|
|
Packit |
d28291 |
object.
|
|
Packit |
d28291 |
Experience has shown that this can be a useful facility.
|
|
Packit |
d28291 |
It is indispensable in cases in which system resources are embedded
|
|
Packit |
d28291 |
in complex data structures (e.g. file descriptors
|
|
Packit |
d28291 |
in the cord package).
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Our collector provides the necessary functionality through
|
|
Packit |
d28291 |
<TT>GC_register_finalizer</tt> in
|
|
Packit |
d28291 |
gc.h, or by
|
|
Packit |
d28291 |
inheriting from <TT>gc_cleanup</tt>
|
|
Packit |
d28291 |
in gc_cpp.h.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
However, finalization should not be used in the same way as C++
|
|
Packit |
d28291 |
destructors. In well-written programs there will typically be
|
|
Packit |
d28291 |
very few uses of finalization. (Garbage collected programs that
|
|
Packit |
d28291 |
interact with explicitly memory-managed libraries may be an exception.)
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
In general the following guidelines should be followed:
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Actions that must be executed promptly do not belong in finalizers.
|
|
Packit |
d28291 |
They should be handled by explicit calls in the code (or C++
|
|
Packit |
d28291 |
destructors if you prefer). If you expect the action to occur at
|
|
Packit |
d28291 |
a specific point, this is probably not hard.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Finalizers are intended for resource reclamation.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Scarce system resources should be managed explicitly whenever
|
|
Packit |
d28291 |
convenient. Use finalizers only as a backup mechanism for the
|
|
Packit |
d28291 |
cases that would be hard to handle explicitly.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
If scarce resources are managed with finalization, the allocation
|
|
Packit |
d28291 |
routine for that resource (e.g. open for file handles) should force
|
|
Packit |
d28291 |
a garbage collection (two if that doesn't suffice) if it finds itself
|
|
Packit |
d28291 |
short of the resource.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
If extremely scarce resources are managed by finalization (e.g.
|
|
Packit |
d28291 |
file descriptors on systems which have a limit of 20 open files),
|
|
Packit |
d28291 |
it may be necessary to introduce a descriptor caching scheme to
|
|
Packit |
d28291 |
hide the resource limit.
|
|
Packit |
d28291 |
(E.g., the program would keep real file descriptors
|
|
Packit |
d28291 |
for the 20 most recently used logically open files.
|
|
Packit |
d28291 |
Any other needed files would be closed after saving their state.
|
|
Packit |
d28291 |
They would then be reopened on demand.
|
|
Packit |
d28291 |
Finalization would logically close the file, closing the
|
|
Packit |
d28291 |
real descriptor only if it happened to be cached.)
|
|
Packit |
d28291 |
Note that most modern systems (e.g. Irix®) allow hundreds or
|
|
Packit |
d28291 |
thousands of open files, and this is typically not an issue.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Finalization code may
|
|
Packit |
d28291 |
be run anyplace an allocation or other call to the collector
|
|
Packit |
d28291 |
takes place.
|
|
Packit |
d28291 |
In multi-threaded programs, finalizers have to obey the normal
|
|
Packit |
d28291 |
locking conventions to ensure safety.
|
|
Packit |
d28291 |
Code run directly from finalizers should not acquire locks that may
|
|
Packit |
d28291 |
be held during allocation. This restriction can be easily circumvented
|
|
Packit |
d28291 |
by registering a finalizer which enqueues the real action for execution
|
|
Packit |
d28291 |
in a separate thread.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
In single-threaded code, it is also often easiest to have finalizers
|
|
Packit |
d28291 |
queue actions, which are then explicitly run during an
|
|
Packit |
d28291 |
explicit call by the user's program.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Topologically Ordered Finalization
|
|
Packit |
d28291 |
Our conservative garbage collector supports
|
|
Packit |
d28291 |
a form of finalization
|
|
Packit |
d28291 |
(with <TT>GC_register_finalizer</tt>)
|
|
Packit |
d28291 |
in which objects are finalized in topological
|
|
Packit |
d28291 |
order. If A points to B, and both are registered for
|
|
Packit |
d28291 |
finalization, it is guaranteed the A will be finalized first.
|
|
Packit |
d28291 |
This usually guarantees that finalization procedures see only
|
|
Packit |
d28291 |
unfinalized objects.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
This decision is often questioned, particularly since it has an obvious
|
|
Packit |
d28291 |
disadvantage. The current implementation finalizes long chains of
|
|
Packit |
d28291 |
finalizable objects one per collection. This is hard to avoid, since
|
|
Packit |
d28291 |
the first finalizer invoked may store a pointer to the rest of the chain
|
|
Packit |
d28291 |
in a global variable, making it accessible again. Or it may mutate the
|
|
Packit |
d28291 |
rest of the chain.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Cycles involving one or more finalizable objects are never finalized.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Why topological ordering?
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
It is important to keep in mind that the choice of finalization ordering
|
|
Packit |
d28291 |
matters only in relatively rare cases. In spite of the fact that it has
|
|
Packit |
d28291 |
received a lot of discussion, it is not one of the more important
|
|
Packit |
d28291 |
decisions in designing a system. Many, especially smaller, applications
|
|
Packit |
d28291 |
will never notice the difference. Nonetheless, we believe that topologically
|
|
Packit |
d28291 |
ordered finalization is the right choice.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
To understand the justification, observe that if As
|
|
Packit |
d28291 |
finalization procedure does not refer to B, we could fairly easily have
|
|
Packit |
d28291 |
avoided the dependency. We could have split A into A'
|
|
Packit |
d28291 |
and A'' such that any references to A become references to
|
|
Packit |
d28291 |
A', A' points to A'' but not vice-versa, only fields
|
|
Packit |
d28291 |
needed for finalization are stored in A'', and A'' is enabled
|
|
Packit |
d28291 |
for finalization. (<TT>GC_register_disappearing_link</tt> provides an
|
|
Packit |
d28291 |
alternative mechanism that does not require breaking up objects.)
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Thus assume that A actually does need access to B during
|
|
Packit |
d28291 |
finalization. To make things concrete, assume that B is
|
|
Packit |
d28291 |
finalizable because it holds a pointer to a C object, which must be
|
|
Packit |
d28291 |
explicitly deallocated. (This is likely to be one of the most common
|
|
Packit |
d28291 |
uses of finalization.) If B happens to be finalized first,
|
|
Packit |
d28291 |
A will see a dangling pointer during its finalization. But a
|
|
Packit |
d28291 |
principal goal of garbage collection was to avoid dangling pointers.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Note that the client program could enforce topological ordering
|
|
Packit |
d28291 |
even if the system didn't. A pointer to B could be stored in
|
|
Packit |
d28291 |
some globally visible place, where it is cleared only by As
|
|
Packit |
d28291 |
finalizer. But this puts the burden to ensure safety back on the
|
|
Packit |
d28291 |
programmer.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
With topologically ordered finalization, the programmer
|
|
Packit |
d28291 |
can fail to split an object, thus leaving an accidental cycle. This
|
|
Packit |
d28291 |
results in a leak, which is arguably less dangerous than a dangling
|
|
Packit |
d28291 |
pointer. More importantly, it is much easier to diagnose,
|
|
Packit |
d28291 |
since the garbage collector would have to go out of its way not to
|
|
Packit |
d28291 |
notice finalization cycles. It can trivially report them.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Furthermore unordered finalization does not really solve the problem
|
|
Packit |
d28291 |
of cycles. Consider the above case in which As
|
|
Packit |
d28291 |
finalization procedure depends on B, and thus a pointer to B
|
|
Packit |
d28291 |
is stored in a global data structure, to be cleared by As finalizer.
|
|
Packit |
d28291 |
If there is an accidental pointer from B back to A, and
|
|
Packit |
d28291 |
thus a cycle, neither B nor A will become unreachable.
|
|
Packit |
d28291 |
The leak is there, just as in the topologically ordered case, but it is
|
|
Packit |
d28291 |
hidden from easy diagnosis.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
A number of alternative finalization orderings have been proposed, e.g.
|
|
Packit |
d28291 |
based on statically assigned priorities. In our opinion, these are much
|
|
Packit |
d28291 |
more likely to require complex programming discipline to use in a large
|
|
Packit |
d28291 |
modular system. (Some of them, e.g. Guardians proposed by Dybvig,
|
|
Packit |
d28291 |
Bruggeman, and Eby, do avoid some problems which arise in combination
|
|
Packit |
d28291 |
with certain other collection algorithms.)
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Fundamentally, a garbage collector assumes that objects reachable
|
|
Packit |
d28291 |
via pointer chains may be accessed, and thus should be preserved.
|
|
Packit |
d28291 |
Topologically ordered finalization simply extends this to object finalization;
|
|
Packit |
d28291 |
an finalizable object reachable from another finalizer via a pointer chain
|
|
Packit |
d28291 |
is presumed to be accessible by the finalizer, and thus should not be
|
|
Packit |
d28291 |
finalized.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Programming with topological finalization
|
|
Packit |
d28291 |
Experience with Cedar has shown that cycles or long chains of finalizable
|
|
Packit |
d28291 |
objects are typically not a problem.
|
|
Packit |
d28291 |
Finalizable objects are typically rare.
|
|
Packit |
d28291 |
There are several ways to reduce spurious dependencies between finalizable
|
|
Packit |
d28291 |
objects. Splitting objects as discussed above is one technique.
|
|
Packit |
d28291 |
The collector also provides <TT>GC_register_disappearing_link</tt>, which
|
|
Packit |
d28291 |
explicitly nils a pointer before determining finalization ordering.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Some so-called "operating systems" fail to clean up some resources associated
|
|
Packit |
d28291 |
with a process. These resources must be deallocated at all cost before
|
|
Packit |
d28291 |
process exit whether or not they are still referenced. Probably the best
|
|
Packit |
d28291 |
way to deal with those is by not relying exclusively on finalization.
|
|
Packit |
d28291 |
They should be registered in a table of weak pointers (implemented as
|
|
Packit |
d28291 |
disguised pointers cleared by the finalization procedure that deallocates
|
|
Packit |
d28291 |
the resource). If any references are still left at process exit, they
|
|
Packit |
d28291 |
can be explicitly deallocated then.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Getting around topological finalization ordering
|
|
Packit |
d28291 |
There are certain situations in which cycles between finalizable objects are
|
|
Packit |
d28291 |
genuinely unavoidable. Most notably, C++ compilers introduce self-cycles
|
|
Packit |
d28291 |
to represent inheritance. <TT>GC_register_finalizer_ignore_self</tt> tells the
|
|
Packit |
d28291 |
finalization part of the collector to ignore self cycles.
|
|
Packit |
d28291 |
This is used by the C++ interface.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Finalize.c actually contains an intentionally undocumented mechanism
|
|
Packit |
d28291 |
for registering a finalizable object with user-defined dependencies.
|
|
Packit |
d28291 |
The problem is that this dependency information is also used for memory
|
|
Packit |
d28291 |
reclamation, not just finalization ordering. Thus misuse can result in
|
|
Packit |
d28291 |
dangling pointers even if finalization doesn't create any.
|
|
Packit |
d28291 |
The risk of dangling pointers can be eliminated by building the collector
|
|
Packit |
d28291 |
with -DJAVA_FINALIZATION. This forces objects reachable from finalizers
|
|
Packit |
d28291 |
to be marked, even though this dependency is not considered for finalization
|
|
Packit |
d28291 |
ordering.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
</body>
|
|
Packit |
d28291 |
</html>
|