Blame doc/finalization.html

Packit d28291
Packit d28291
<HEAD>
Packit d28291
<TITLE>Finalization in the Boehm-Demers-Weiser collector </TITLE>
Packit d28291
</HEAD>
Packit d28291
<BODY>
Packit d28291

Finalization

Packit d28291
Many garbage collectors provide a facility for executing user code
Packit d28291
just before an object is collected.  This can be used to reclaim any
Packit d28291
system resources or non-garbage-collected memory associated with the
Packit d28291
object.
Packit d28291
Experience has shown that this can be a useful facility.
Packit d28291
It is indispensable in cases in which system resources are embedded
Packit d28291
in complex data structures (e.g. file descriptors
Packit d28291
in the cord package).
Packit d28291

Packit d28291
Our collector provides the necessary functionality through
Packit d28291
<TT>GC_register_finalizer</tt> in
Packit d28291
gc.h, or by
Packit d28291
inheriting from <TT>gc_cleanup</tt>
Packit d28291
in gc_cpp.h.
Packit d28291

Packit d28291
However, finalization should not be used in the same way as C++
Packit d28291
destructors.  In well-written programs there will typically be
Packit d28291
very few uses of finalization.  (Garbage collected programs that
Packit d28291
interact with explicitly memory-managed libraries may be an exception.)
Packit d28291

Packit d28291
In general the following guidelines should be followed:
Packit d28291
    Packit d28291
  • Packit d28291
    Actions that must be executed promptly do not belong in finalizers.
    Packit d28291
    They should be handled by explicit calls in the code (or C++
    Packit d28291
    destructors if you prefer).  If you expect the action to occur at
    Packit d28291
    a specific point, this is probably not hard.
    Packit d28291
  • Packit d28291
    Finalizers are intended for resource reclamation.
    Packit d28291
  • Packit d28291
    Scarce system resources should be managed explicitly whenever
    Packit d28291
    convenient.  Use finalizers only as a backup mechanism for the
    Packit d28291
    cases that would be hard to handle explicitly.
    Packit d28291
  • Packit d28291
    If scarce resources are managed with finalization, the allocation
    Packit d28291
    routine for that resource (e.g. open for file handles) should force
    Packit d28291
    a garbage collection (two if that doesn't suffice) if it finds itself
    Packit d28291
    short of the resource.
    Packit d28291
  • Packit d28291
    If extremely scarce resources are managed by finalization (e.g.
    Packit d28291
    file descriptors on systems which have a limit of 20 open files),
    Packit d28291
    it may be necessary to introduce a descriptor caching scheme to
    Packit d28291
    hide the resource limit.
    Packit d28291
    (E.g., the program would keep real file descriptors
    Packit d28291
    for the 20 most recently used logically open files.
    Packit d28291
    Any other needed files would be closed after saving their state.
    Packit d28291
    They would then be reopened on demand.
    Packit d28291
    Finalization would logically close the file, closing the
    Packit d28291
    real descriptor only if it happened to be cached.)
    Packit d28291
    Note that most modern systems (e.g. Irix®) allow hundreds or
    Packit d28291
    thousands of open files, and this is typically not an issue.
    Packit d28291
  • Packit d28291
    Finalization code may
    Packit d28291
    be run anyplace an allocation or other call to the collector
    Packit d28291
    takes place.
    Packit d28291
    In multi-threaded programs, finalizers have to obey the normal
    Packit d28291
    locking conventions to ensure safety.
    Packit d28291
    Code run directly from finalizers should not acquire locks that may
    Packit d28291
    be held during allocation.  This restriction can be easily circumvented
    Packit d28291
    by registering a finalizer which enqueues the real action for execution
    Packit d28291
    in a separate thread.
    Packit d28291

    Packit d28291
    In single-threaded code, it is also often easiest to have finalizers
    Packit d28291
    queue actions, which are then explicitly run during an
    Packit d28291
    explicit call by the user's program.
    Packit d28291
    Packit d28291

    Topologically Ordered Finalization

    Packit d28291
    Our conservative garbage collector supports
    Packit d28291
    a form of finalization
    Packit d28291
    (with <TT>GC_register_finalizer</tt>)
    Packit d28291
    in which objects are finalized in topological
    Packit d28291
    order.  If A points to B, and both are registered for
    Packit d28291
    finalization, it is guaranteed the A will be finalized first.
    Packit d28291
    This usually guarantees that finalization procedures see only
    Packit d28291
    unfinalized objects.
    Packit d28291

    Packit d28291
    This decision is often questioned, particularly since it has an obvious
    Packit d28291
    disadvantage.  The current implementation finalizes long chains of
    Packit d28291
    finalizable objects one per collection.  This is hard to avoid, since
    Packit d28291
    the first finalizer invoked may store a pointer to the rest of the chain
    Packit d28291
    in a global variable, making it accessible again.  Or it may mutate the
    Packit d28291
    rest of the chain.
    Packit d28291

    Packit d28291
    Cycles involving one or more finalizable objects are never finalized.
    Packit d28291

    Packit d28291
    Why topological ordering?
    Packit d28291
    Packit d28291
    It is important to keep in mind that the choice of finalization ordering
    Packit d28291
    matters only in relatively rare cases.  In spite of the fact that it has
    Packit d28291
    received a lot of discussion, it is not one of the more important
    Packit d28291
    decisions in designing a system.  Many, especially smaller, applications
    Packit d28291
    will never notice the difference.  Nonetheless, we believe that topologically
    Packit d28291
    ordered finalization is the right choice.
    Packit d28291

    Packit d28291
    To understand the justification, observe that if As
    Packit d28291
    finalization procedure does not refer to B, we could fairly easily have
    Packit d28291
    avoided the dependency.  We could have split A into A'
    Packit d28291
    and A'' such that any references to A become references to
    Packit d28291
    A', A' points to A'' but not vice-versa, only fields
    Packit d28291
    needed for finalization are stored in A'', and A'' is enabled
    Packit d28291
    for finalization.  (<TT>GC_register_disappearing_link</tt> provides an
    Packit d28291
    alternative mechanism that does not require breaking up objects.)
    Packit d28291

    Packit d28291
    Thus assume that A actually does need access to B during
    Packit d28291
    finalization.  To make things concrete, assume that B is
    Packit d28291
    finalizable because it holds a pointer to a C object, which must be
    Packit d28291
    explicitly deallocated.  (This is likely to be one of the most common
    Packit d28291
    uses of finalization.)  If B happens to be finalized first,
    Packit d28291
    A will see a dangling pointer during its finalization.  But a
    Packit d28291
    principal goal of garbage collection was to avoid dangling pointers.
    Packit d28291

    Packit d28291
    Note that the client program could enforce topological ordering
    Packit d28291
    even if the system didn't.  A pointer to B could be stored in
    Packit d28291
    some globally visible place, where it is cleared only by As
    Packit d28291
    finalizer.  But this puts the burden to ensure safety back on the
    Packit d28291
    programmer.
    Packit d28291

    Packit d28291
    With topologically ordered finalization, the programmer
    Packit d28291
    can fail to split an object, thus leaving an accidental cycle.  This
    Packit d28291
    results in a leak, which is arguably less dangerous than a dangling
    Packit d28291
    pointer.  More importantly, it is much easier to diagnose,
    Packit d28291
    since the garbage collector would have to go out of its way not to
    Packit d28291
    notice finalization cycles.  It can trivially report them.
    Packit d28291

    Packit d28291
    Furthermore unordered finalization does not really solve the problem
    Packit d28291
    of cycles.  Consider the above case in which As
    Packit d28291
    finalization procedure depends on B, and thus a pointer to B
    Packit d28291
    is stored in a global data structure, to be cleared by As finalizer.
    Packit d28291
    If there is an accidental pointer from B back to A, and
    Packit d28291
    thus a cycle, neither B nor A will become unreachable.
    Packit d28291
    The leak is there, just as in the topologically ordered case, but it is
    Packit d28291
    hidden from easy diagnosis.
    Packit d28291

    Packit d28291
    A number of alternative finalization orderings have been proposed, e.g.
    Packit d28291
    based on statically assigned priorities.  In our opinion, these are much
    Packit d28291
    more likely to require complex programming discipline to use in a large
    Packit d28291
    modular system.  (Some of them, e.g. Guardians proposed by Dybvig,
    Packit d28291
    Bruggeman, and Eby, do avoid some problems which arise in combination
    Packit d28291
    with certain other collection algorithms.)
    Packit d28291

    Packit d28291
    Fundamentally, a garbage collector assumes that objects reachable
    Packit d28291
    via pointer chains may be accessed, and thus should be preserved.
    Packit d28291
    Topologically ordered finalization simply extends this to object finalization;
    Packit d28291
    an finalizable object reachable from another finalizer via a pointer chain
    Packit d28291
    is presumed to be accessible by the finalizer, and thus should not be
    Packit d28291
    finalized.
    Packit d28291
    Packit d28291

    Programming with topological finalization

    Packit d28291
    Experience with Cedar has shown that cycles or long chains of finalizable
    Packit d28291
    objects are typically not a problem.
    Packit d28291
    Finalizable objects are typically rare.
    Packit d28291
    There are several ways to reduce spurious dependencies between finalizable
    Packit d28291
    objects.  Splitting objects as discussed above is one technique.
    Packit d28291
    The collector also provides <TT>GC_register_disappearing_link</tt>, which
    Packit d28291
    explicitly nils a pointer before determining finalization ordering.
    Packit d28291

    Packit d28291
    Some so-called "operating systems" fail to clean up some resources associated
    Packit d28291
    with a process.  These resources must be deallocated at all cost before
    Packit d28291
    process exit whether or not they are still referenced.  Probably the best
    Packit d28291
    way to deal with those is by not relying exclusively on finalization.
    Packit d28291
    They should be registered in a table of weak pointers (implemented as
    Packit d28291
    disguised pointers cleared by the finalization procedure that deallocates
    Packit d28291
    the resource).  If any references are still left at process exit, they
    Packit d28291
    can be explicitly deallocated then.
    Packit d28291
    Packit d28291

    Getting around topological finalization ordering

    Packit d28291
    There are certain situations in which cycles between finalizable objects are
    Packit d28291
    genuinely unavoidable.  Most notably, C++ compilers introduce self-cycles
    Packit d28291
    to represent inheritance.  <TT>GC_register_finalizer_ignore_self</tt> tells the
    Packit d28291
    finalization part of the collector to ignore self cycles.
    Packit d28291
    This is used by the C++ interface.
    Packit d28291

    Packit d28291
    Finalize.c actually contains an intentionally undocumented mechanism
    Packit d28291
    for registering a finalizable object with user-defined dependencies.
    Packit d28291
    The problem is that this dependency information is also used for memory
    Packit d28291
    reclamation, not just finalization ordering.  Thus misuse can result in
    Packit d28291
    dangling pointers even if finalization doesn't create any.
    Packit d28291
    The risk of dangling pointers can be eliminated by building the collector
    Packit d28291
    with -DJAVA_FINALIZATION.  This forces objects reachable from finalizers
    Packit d28291
    to be marked, even though this dependency is not considered for finalization
    Packit d28291
    ordering.
    Packit d28291
    Packit d28291
    </body>
    Packit d28291
    </html>