Blame doc/gcdescr.html

Packit d28291
<HTML>
Packit d28291
<HEAD>
Packit d28291
    <TITLE> Conservative GC Algorithmic Overview </TITLE>
Packit d28291
    <AUTHOR> Hans-J. Boehm, HP Labs (Some of this was written at SGI)</author>
Packit d28291
</HEAD>
Packit d28291
<BODY>
Packit d28291

This is under construction, and may always be.

Packit d28291

Conservative GC Algorithmic Overview

Packit d28291

Packit d28291
This is a description of the algorithms and data structures used in our
Packit d28291
conservative garbage collector.  I expect the level of detail to increase
Packit d28291
with time.  For a survey of GC algorithms, see for example
Packit d28291
 Paul Wilson's
Packit d28291
excellent paper.  For an overview of the collector interface,
Packit d28291
see here.
Packit d28291

Packit d28291
This description is targeted primarily at someone trying to understand the
Packit d28291
source code.  It specifically refers to variable and function names.
Packit d28291
It may also be useful for understanding the algorithms at a higher level.
Packit d28291

Packit d28291
The description here assumes that the collector is used in default mode.
Packit d28291
In particular, we assume that it used as a garbage collector, and not just
Packit d28291
a leak detector.  We initially assume that it is used in stop-the-world,
Packit d28291
non-incremental mode, though the presence of the incremental collector
Packit d28291
will be apparent in the design.
Packit d28291
We assume the default finalization model, but the code affected by that
Packit d28291
is very localized.
Packit d28291

Introduction

Packit d28291
The garbage collector uses a modified mark-sweep algorithm.  Conceptually
Packit d28291
it operates roughly in four phases, which are performed occasionally
Packit d28291
as part of a memory allocation:
Packit d28291
Packit d28291
    Packit d28291
    Packit d28291
  1. Packit d28291
    Preparation Each object has an associated mark bit.
    Packit d28291
    Clear all mark bits, indicating that all objects
    Packit d28291
    are potentially unreachable.
    Packit d28291
    Packit d28291
  2. Packit d28291
    Mark phase Marks all objects that can be reachable via chains of
    Packit d28291
    pointers from variables.  Often the collector has no real information
    Packit d28291
    about the location of pointer variables in the heap, so it
    Packit d28291
    views all static data areas, stacks and registers as potentially containing
    Packit d28291
    pointers.  Any bit patterns that represent addresses inside
    Packit d28291
    heap objects managed by the collector are viewed as pointers.
    Packit d28291
    Unless the client program has made heap object layout information
    Packit d28291
    available to the collector, any heap objects found to be reachable from
    Packit d28291
    variables are again scanned similarly.
    Packit d28291
    Packit d28291
  3. Packit d28291
    Sweep phase Scans the heap for inaccessible, and hence unmarked,
    Packit d28291
    objects, and returns them to an appropriate free list for reuse.  This is
    Packit d28291
    not really a separate phase; even in non incremental mode this is operation
    Packit d28291
    is usually performed on demand during an allocation that discovers an empty
    Packit d28291
    free list.  Thus the sweep phase is very unlikely to touch a page that
    Packit d28291
    would not have been touched shortly thereafter anyway.
    Packit d28291
    Packit d28291
  4. Packit d28291
    Finalization phase Unreachable objects which had been registered
    Packit d28291
    for finalization are enqueued for finalization outside the collector.
    Packit d28291
    Packit d28291
    Packit d28291
    Packit d28291

    Packit d28291
    The remaining sections describe the memory allocation data structures,
    Packit d28291
    and then the last 3 collection phases in more detail. We conclude by
    Packit d28291
    outlining some of the additional features implemented in the collector.
    Packit d28291
    Packit d28291

    Allocation

    Packit d28291
    The collector includes its own memory allocator.  The allocator obtains
    Packit d28291
    memory from the system in a platform-dependent way.  Under UNIX, it
    Packit d28291
    uses either <TT>malloc</tt>, <TT>sbrk</tt>, or <TT>mmap</tt>.
    Packit d28291

    Packit d28291
    Most static data used by the allocator, as well as that needed by the
    Packit d28291
    rest of the garbage collector is stored inside the
    Packit d28291
    <TT>_GC_arrays</tt> structure.
    Packit d28291
    This allows the garbage collector to easily ignore the collectors own
    Packit d28291
    data structures when it searches for root pointers.  Other allocator
    Packit d28291
    and collector internal data structures are allocated dynamically
    Packit d28291
    with <TT>GC_scratch_alloc</tt>. <TT>GC_scratch_alloc</tt> does not
    Packit d28291
    allow for deallocation, and is therefore used only for permanent data
    Packit d28291
    structures.
    Packit d28291

    Packit d28291
    The allocator allocates objects of different kinds.
    Packit d28291
    Different kinds are handled somewhat differently by certain parts
    Packit d28291
    of the garbage collector.  Certain kinds are scanned for pointers,
    Packit d28291
    others are not.  Some may have per-object type descriptors that
    Packit d28291
    determine pointer locations.  Or a specific kind may correspond
    Packit d28291
    to one specific object layout.  Two built-in kinds are uncollectible.
    Packit d28291
    One (<TT>STUBBORN</tt>) is immutable without special precautions.
    Packit d28291
    In spite of that, it is very likely that most C clients of the
    Packit d28291
    collector currently
    Packit d28291
    use at most two kinds: <TT>NORMAL</tt> and <TT>PTRFREE</tt> objects.
    Packit d28291
    The GCJ runtime
    Packit d28291
    also makes heavy use of a kind (allocated with GC_gcj_malloc) that stores
    Packit d28291
    type information at a known offset in method tables.
    Packit d28291

    Packit d28291
    The collector uses a two level allocator.  A large block is defined to
    Packit d28291
    be one larger than half of <TT>HBLKSIZE</tt>, which is a power of 2,
    Packit d28291
    typically on the order of the page size.
    Packit d28291

    Packit d28291
    Large block sizes are rounded up to
    Packit d28291
    the next multiple of <TT>HBLKSIZE</tt> and then allocated by
    Packit d28291
    <TT>GC_allochblk</tt>.  Recent versions of the collector
    Packit d28291
    use an approximate best fit algorithm by keeping free lists for
    Packit d28291
    several large block sizes.
    Packit d28291
    The actual
    Packit d28291
    implementation of <TT>GC_allochblk</tt>
    Packit d28291
    is significantly complicated by black-listing issues
    Packit d28291
    (see below).
    Packit d28291

    Packit d28291
    Small blocks are allocated in chunks of size <TT>HBLKSIZE</tt>.
    Packit d28291
    Each chunk is
    Packit d28291
    dedicated to only one object size and kind.
    Packit d28291

    Packit d28291
    The allocator maintains
    Packit d28291
    separate free lists for each size and kind of object.
    Packit d28291
    Associated with each kind is an array of free list pointers,
    Packit d28291
    with entry <TT>freelist[</tt>i<TT>]</tt> pointing to
    Packit d28291
    a free list of size i objects.
    Packit d28291
    In recent versions of the
    Packit d28291
    collector, index <TT>i</tt> is expressed in granules, which are the
    Packit d28291
    minimum allocatable unit, typically 8 or 16 bytes.
    Packit d28291
    The free lists themselves are
    Packit d28291
    linked through the first word in each object (see <TT>obj_link()</tt>
    Packit d28291
    macro).
    Packit d28291

    Packit d28291
    Once a large block is split for use in smaller objects, it can only
    Packit d28291
    be used for objects of that size, unless the collector discovers a completely
    Packit d28291
    empty chunk.  Completely empty chunks are restored to the appropriate
    Packit d28291
    large block free list.
    Packit d28291

    Packit d28291
    In order to avoid allocating blocks for too many distinct object sizes,
    Packit d28291
    the collector normally does not directly allocate objects of every possible
    Packit d28291
    request size.  Instead, the request is rounded up to one of a smaller number
    Packit d28291
    of allocated sizes, for which free lists are maintained.  The exact
    Packit d28291
    allocated sizes are computed on demand, but subject to the constraint
    Packit d28291
    that they increase roughly in geometric progression.  Thus objects
    Packit d28291
    requested early in the execution are likely to be allocated with exactly
    Packit d28291
    the requested size, subject to alignment constraints.
    Packit d28291
    See <TT>GC_init_size_map</tt> for details.
    Packit d28291

    Packit d28291
    The actual size rounding operation during small object allocation is
    Packit d28291
    implemented as a table lookup in <TT>GC_size_map</tt> which maps
    Packit d28291
    a requested allocation size in bytes to a number of granules.
    Packit d28291

    Packit d28291
    Both collector initialization and computation of allocated sizes are
    Packit d28291
    handled carefully so that they do not slow down the small object fast
    Packit d28291
    allocation path.  An attempt to allocate before the collector is initialized,
    Packit d28291
    or before the appropriate <TT>GC_size_map</tt> entry is computed,
    Packit d28291
    will take the same path as an allocation attempt with an empty free list.
    Packit d28291
    This results in a call to the slow path code (<TT>GC_generic_malloc_inner</tt>)
    Packit d28291
    which performs the appropriate initialization checks.
    Packit d28291

    Packit d28291
    In non-incremental mode, we make a decision about whether to garbage collect
    Packit d28291
    whenever an allocation would otherwise have failed with the current heap size.
    Packit d28291
    If the total amount of allocation since the last collection is less than
    Packit d28291
    the heap size divided by <TT>GC_free_space_divisor</tt>, we try to
    Packit d28291
    expand the heap.  Otherwise, we initiate a garbage collection.  This ensures
    Packit d28291
    that the amount of garbage collection work per allocated byte remains
    Packit d28291
    constant.
    Packit d28291

    Packit d28291
    The above is in fact an oversimplification of the real heap expansion
    Packit d28291
    and GC triggering heuristic, which adjusts slightly for root size
    Packit d28291
    and certain kinds of
    Packit d28291
    fragmentation.  In particular:
    Packit d28291
      Packit d28291
    • Programs with a large root set size and
    • Packit d28291
      little live heap memory will expand the heap to amortize the cost of
      Packit d28291
      scanning the roots.
      Packit d28291
    • Versions 5.x of the collector actually collect more frequently in
    • Packit d28291
      nonincremental mode.  The large block allocator usually refuses to split
      Packit d28291
      large heap blocks once the garbage collection threshold is
      Packit d28291
      reached.  This often has the effect of collecting well before the
      Packit d28291
      heap fills up, thus reducing fragmentation and working set size at the
      Packit d28291
      expense of GC time.  Versions 6.x choose an intermediate strategy depending
      Packit d28291
      on how much large object allocation has taken place in the past.
      Packit d28291
      (If the collector is configured to unmap unused pages, versions 6.x
      Packit d28291
      use the 5.x strategy.)
      Packit d28291
    • In calculating the amount of allocation since the last collection we
    • Packit d28291
      give partial credit for objects we expect to be explicitly deallocated.
      Packit d28291
      Even if all objects are explicitly managed, it is often desirable to collect
      Packit d28291
      on rare occasion, since that is our only mechanism for coalescing completely
      Packit d28291
      empty chunks.
      Packit d28291
      Packit d28291

      Packit d28291
      It has been suggested that this should be adjusted so that we favor
      Packit d28291
      expansion if the resulting heap still fits into physical memory.
      Packit d28291
      In many cases, that would no doubt help.  But it is tricky to do this
      Packit d28291
      in a way that remains robust if multiple application are contending
      Packit d28291
      for a single pool of physical memory.
      Packit d28291
      Packit d28291

      Mark phase

      Packit d28291
      Packit d28291
      At each collection, the collector marks all objects that are
      Packit d28291
      possibly reachable from pointer variables.  Since it cannot generally
      Packit d28291
      tell where pointer variables are located, it scans the following
      Packit d28291
      root segments for pointers:
      Packit d28291
        Packit d28291
      • The registers. Depending on the architecture, this may be done using
      • Packit d28291
        assembly code, or by calling a <TT>setjmp</tt>-like function which saves
        Packit d28291
        register contents on the stack.
        Packit d28291
      • The stack(s). In the case of a single-threaded application,
      • Packit d28291
        on most platforms this
        Packit d28291
        is done by scanning the memory between (an approximation of) the current
        Packit d28291
        stack pointer and <TT>GC_stackbottom</tt>.  (For Itanium, the register stack
        Packit d28291
        scanned separately.)  The <TT>GC_stackbottom</tt> variable is set in
        Packit d28291
        a highly platform-specific way depending on the appropriate configuration
        Packit d28291
        information in <TT>gcconfig.h</tt>.  Note that the currently active
        Packit d28291
        stack needs to be scanned carefully, since callee-save registers of
        Packit d28291
        client code may appear inside collector stack frames, which may
        Packit d28291
        change during the mark process.  This is addressed by scanning
        Packit d28291
        some sections of the stack "eagerly", effectively capturing a snapshot
        Packit d28291
        at one point in time.
        Packit d28291
      • Static data region(s). In the simplest case, this is the region
      • Packit d28291
        between <TT>DATASTART</tt> and <TT>DATAEND</tt>, as defined in
        Packit d28291
        <TT>gcconfig.h</tt>.  However, in most cases, this will also involve
        Packit d28291
        static data regions associated with dynamic libraries.  These are
        Packit d28291
        identified by the mostly platform-specific code in <TT>dyn_load.c</tt>.
        Packit d28291
        Packit d28291
        The marker maintains an explicit stack of memory regions that are known
        Packit d28291
        to be accessible, but that have not yet been searched for contained pointers.
        Packit d28291
        Each stack entry contains the starting address of the block to be scanned,
        Packit d28291
        as well as a descriptor of the block.  If no layout information is
        Packit d28291
        available for the block, then the descriptor is simply a length.
        Packit d28291
        (For other possibilities, see <TT>gc_mark.h</tt>.)
        Packit d28291

        Packit d28291
        At the beginning of the mark phase, all root segments
        Packit d28291
        (as described above) are pushed on the
        Packit d28291
        stack by <TT>GC_push_roots</tt>.  (Registers and eagerly processed
        Packit d28291
        stack sections are processed by pushing the referenced objects instead
        Packit d28291
        of the stack section itself.)  If <TT>ALL_INTERIOR_POINTERS</tt> is not
        Packit d28291
        defined, then stack roots require special treatment.  In this case, the
        Packit d28291
        normal marking code ignores interior pointers, but <TT>GC_push_all_stack</tt>
        Packit d28291
        explicitly checks for interior pointers and pushes descriptors for target
        Packit d28291
        objects.
        Packit d28291

        Packit d28291
        The marker is structured to allow incremental marking.
        Packit d28291
        Each call to <TT>GC_mark_some</tt> performs a small amount of
        Packit d28291
        work towards marking the heap.
        Packit d28291
        It maintains
        Packit d28291
        explicit state in the form of <TT>GC_mark_state</tt>, which
        Packit d28291
        identifies a particular sub-phase.  Some other pieces of state, most
        Packit d28291
        notably the mark stack, identify how much work remains to be done
        Packit d28291
        in each sub-phase.  The normal progression of mark states for
        Packit d28291
        a stop-the-world collection is:
        Packit d28291
          Packit d28291
        1. <TT>MS_INVALID</tt> indicating that there may be accessible unmarked
        2. Packit d28291
          objects.  In this case <TT>GC_objects_are_marked</tt> will simultaneously
          Packit d28291
          be false, so the mark state is advanced to
          Packit d28291
        3. <TT>MS_PUSH_UNCOLLECTABLE</tt> indicating that it suffices to push
        4. Packit d28291
          uncollectible objects, roots, and then mark everything reachable from them.
          Packit d28291
          <TT>Scan_ptr</tt> is advanced through the heap until all uncollectible
          Packit d28291
          objects are pushed, and objects reachable from them are marked.
          Packit d28291
          At that point, the next call to <TT>GC_mark_some</tt> calls
          Packit d28291
          <TT>GC_push_roots</tt> to push the roots.  It the advances the
          Packit d28291
          mark state to
          Packit d28291
        5. <TT>MS_ROOTS_PUSHED</tt> asserting that once the mark stack is
        6. Packit d28291
          empty, all reachable objects are marked.  Once in this state, we work
          Packit d28291
          only on emptying the mark stack.  Once this is completed, the state
          Packit d28291
          changes to
          Packit d28291
        7. <TT>MS_NONE</tt> indicating that reachable objects are marked.
        8. Packit d28291
          Packit d28291
          Packit d28291
          The core mark routine <TT>GC_mark_from</tt>, is called
          Packit d28291
          repeatedly by several of the sub-phases when the mark stack starts to fill
          Packit d28291
          up.  It is also called repeatedly in <TT>MS_ROOTS_PUSHED</tt> state
          Packit d28291
          to empty the mark stack.
          Packit d28291
          The routine is designed to only perform a limited amount of marking at
          Packit d28291
          each call, so that it can also be used by the incremental collector.
          Packit d28291
          It is fairly carefully tuned, since it usually consumes a large majority
          Packit d28291
          of the garbage collection time.
          Packit d28291

          Packit d28291
          The fact that it performs only a small amount of work per call also
          Packit d28291
          allows it to be used as the core routine of the parallel marker.  In that
          Packit d28291
          case it is normally invoked on thread-private mark stacks instead of the
          Packit d28291
          global mark stack.  More details can be found in
          Packit d28291
          scale.html
          Packit d28291

          Packit d28291
          The marker correctly handles mark stack overflows.  Whenever the mark stack
          Packit d28291
          overflows, the mark state is reset to <TT>MS_INVALID</tt>.
          Packit d28291
          Since there are already marked objects in the heap,
          Packit d28291
          this eventually forces a complete
          Packit d28291
          scan of the heap, searching for pointers, during which any unmarked objects
          Packit d28291
          referenced by marked objects are again pushed on the mark stack.  This
          Packit d28291
          process is repeated until the mark phase completes without a stack overflow.
          Packit d28291
          Each time the stack overflows, an attempt is made to grow the mark stack.
          Packit d28291
          All pieces of the collector that push regions onto the mark stack have to be
          Packit d28291
          careful to ensure forward progress, even in case of repeated mark stack
          Packit d28291
          overflows.  Every mark attempt results in additional marked objects.
          Packit d28291

          Packit d28291
          Each mark stack entry is processed by examining all candidate pointers
          Packit d28291
          in the range described by the entry.  If the region has no associated
          Packit d28291
          type information, then this typically requires that each 4-byte aligned
          Packit d28291
          quantity (8-byte aligned with 64-bit pointers) be considered a candidate
          Packit d28291
          pointer.
          Packit d28291

          Packit d28291
          We determine whether a candidate pointer is actually the address of
          Packit d28291
          a heap block.  This is done in the following steps:
          Packit d28291
          <NL>
          Packit d28291
        9. The candidate pointer is checked against rough heap bounds.
        10. Packit d28291
          These heap bounds are maintained such that all actual heap objects
          Packit d28291
          fall between them.  In order to facilitate black-listing (see below)
          Packit d28291
          we also include address regions that the heap is likely to expand into.
          Packit d28291
          Most non-pointers fail this initial test.
          Packit d28291
        11. The candidate pointer is divided into two pieces; the most significant
        12. Packit d28291
          bits identify a <TT>HBLKSIZE</tt>-sized page in the address space, and
          Packit d28291
          the least significant bits specify an offset within that page.
          Packit d28291
          (A hardware page may actually consist of multiple such pages.
          Packit d28291
          HBLKSIZE is usually the page size divided by a small power of two.)
          Packit d28291
        13. Packit d28291
          The page address part of the candidate pointer is looked up in a
          Packit d28291
          table.
          Packit d28291
          Each table entry contains either 0, indicating that the page is not part
          Packit d28291
          of the garbage collected heap, a small integer n, indicating
          Packit d28291
          that the page is part of large object, starting at least n pages
          Packit d28291
          back, or a pointer to a descriptor for the page.  In the first case,
          Packit d28291
          the candidate pointer i not a true pointer and can be safely ignored.
          Packit d28291
          In the last two cases, we can obtain a descriptor for the page containing
          Packit d28291
          the beginning of the object.
          Packit d28291
        14. Packit d28291
          The starting address of the referenced object is computed.
          Packit d28291
          The page descriptor contains the size of the object(s)
          Packit d28291
          in that page, the object kind, and the necessary mark bits for those
          Packit d28291
          objects.  The size information can be used to map the candidate pointer
          Packit d28291
          to the object starting address.  To accelerate this process, the page header
          Packit d28291
          also contains a pointer to a precomputed map of page offsets to displacements
          Packit d28291
          from the beginning of an object.  The use of this map avoids a
          Packit d28291
          potentially slow integer remainder operation in computing the object
          Packit d28291
          start address.
          Packit d28291
        15. Packit d28291
          The mark bit for the target object is checked and set.  If the object
          Packit d28291
          was previously unmarked, the object is pushed on the mark stack.
          Packit d28291
          The descriptor is read from the page descriptor.  (This is computed
          Packit d28291
          from information <TT>GC_obj_kinds</tt> when the page is first allocated.)
          Packit d28291
          </nl>
          Packit d28291

          Packit d28291
          At the end of the mark phase, mark bits for left-over free lists are cleared,
          Packit d28291
          in case a free list was accidentally marked due to a stray pointer.
          Packit d28291
          Packit d28291

          Sweep phase

          Packit d28291
          Packit d28291
          At the end of the mark phase, all blocks in the heap are examined.
          Packit d28291
          Unmarked large objects are immediately returned to the large object free list.
          Packit d28291
          Each small object page is checked to see if all mark bits are clear.
          Packit d28291
          If so, the entire page is returned to the large object free list.
          Packit d28291
          Small object pages containing some reachable object are queued for later
          Packit d28291
          sweeping, unless we determine that the page contains very little free
          Packit d28291
          space, in which case it is not examined further.
          Packit d28291

          Packit d28291
          This initial sweep pass touches only block headers, not
          Packit d28291
          the blocks themselves.  Thus it does not require significant paging, even
          Packit d28291
          if large sections of the heap are not in physical memory.
          Packit d28291

          Packit d28291
          Nonempty small object pages are swept when an allocation attempt
          Packit d28291
          encounters an empty free list for that object size and kind.
          Packit d28291
          Pages for the correct size and kind are repeatedly swept until at
          Packit d28291
          least one empty block is found.  Sweeping such a page involves
          Packit d28291
          scanning the mark bit array in the page header, and building a free
          Packit d28291
          list linked through the first words in the objects themselves.
          Packit d28291
          This does involve touching the appropriate data page, but in most cases
          Packit d28291
          it will be touched only just before it is used for allocation.
          Packit d28291
          Hence any paging is essentially unavoidable.
          Packit d28291

          Packit d28291
          Except in the case of pointer-free objects, we maintain the invariant
          Packit d28291
          that any object in a small object free list is cleared (except possibly
          Packit d28291
          for the link field).  Thus it becomes the burden of the small object
          Packit d28291
          sweep routine to clear objects.  This has the advantage that we can
          Packit d28291
          easily recover from accidentally marking a free list, though that could
          Packit d28291
          also be handled by other means.  The collector currently spends a fair
          Packit d28291
          amount of time clearing objects, and this approach should probably be
          Packit d28291
          revisited.
          Packit d28291

          Packit d28291
          In most configurations, we use specialized sweep routines to handle common
          Packit d28291
          small object sizes.  Since we allocate one mark bit per word, it becomes
          Packit d28291
          easier to examine the relevant mark bits if the object size divides
          Packit d28291
          the word length evenly.  We also suitably unroll the inner sweep loop
          Packit d28291
          in each case.  (It is conceivable that profile-based procedure cloning
          Packit d28291
          in the compiler could make this unnecessary and counterproductive.  I
          Packit d28291
          know of no existing compiler to which this applies.)
          Packit d28291

          Packit d28291
          The sweeping of small object pages could be avoided completely at the expense
          Packit d28291
          of examining mark bits directly in the allocator.  This would probably
          Packit d28291
          be more expensive, since each allocation call would have to reload
          Packit d28291
          a large amount of state (e.g. next object address to be swept, position
          Packit d28291
          in mark bit table) before it could do its work.  The current scheme
          Packit d28291
          keeps the allocator simple and allows useful optimizations in the sweeper.
          Packit d28291
          Packit d28291

          Finalization

          Packit d28291
          Both <TT>GC_register_disappearing_link</tt> and
          Packit d28291
          <TT>GC_register_finalizer</tt> add the request to a corresponding hash
          Packit d28291
          table.  The hash table is allocated out of collected memory, but
          Packit d28291
          the reference to the finalizable object is hidden from the collector.
          Packit d28291
          Currently finalization requests are processed non-incrementally at the
          Packit d28291
          end of a mark cycle.
          Packit d28291

          Packit d28291
          The collector makes an initial pass over the table of finalizable objects,
          Packit d28291
          pushing the contents of unmarked objects onto the mark stack.
          Packit d28291
          After pushing each object, the marker is invoked to mark all objects
          Packit d28291
          reachable from it.  The object itself is not explicitly marked.
          Packit d28291
          This assures that objects on which a finalizer depends are neither
          Packit d28291
          collected nor finalized.
          Packit d28291

          Packit d28291
          If in the process of marking from an object the
          Packit d28291
          object itself becomes marked, we have uncovered
          Packit d28291
          a cycle involving the object.  This usually results in a warning from the
          Packit d28291
          collector.  Such objects are not finalized, since it may be
          Packit d28291
          unsafe to do so.  See the more detailed
          Packit d28291
           discussion of finalization semantics.
          Packit d28291

          Packit d28291
          Any objects remaining unmarked at the end of this process are added to
          Packit d28291
          a queue of objects whose finalizers can be run.  Depending on collector
          Packit d28291
          configuration, finalizers are dequeued and run either implicitly during
          Packit d28291
          allocation calls, or explicitly in response to a user request.
          Packit d28291
          (Note that the former is unfortunately both the default and not generally safe.
          Packit d28291
          If finalizers perform synchronization, it may result in deadlocks.
          Packit d28291
          Nontrivial finalizers generally need to perform synchronization, and
          Packit d28291
          thus require a different collector configuration.)
          Packit d28291

          Packit d28291
          The collector provides a mechanism for replacing the procedure that is
          Packit d28291
          used to mark through objects.  This is used both to provide support for
          Packit d28291
          Java-style unordered finalization, and to ignore certain kinds of cycles,
          Packit d28291
          e.g. those arising from C++ implementations of virtual inheritance.
          Packit d28291
          Packit d28291

          Generational Collection and Dirty Bits

          Packit d28291
          We basically use the concurrent and generational GC algorithm described in
          Packit d28291
          "Mostly Parallel Garbage Collection",
          Packit d28291
          by Boehm, Demers, and Shenker.
          Packit d28291

          Packit d28291
          The most significant modification is that
          Packit d28291
          the collector always starts running in the allocating thread.
          Packit d28291
          There is no separate garbage collector thread.  (If parallel GC is
          Packit d28291
          enabled, helper threads may also be woken up.)
          Packit d28291
          If an allocation attempt either requests a large object, or encounters
          Packit d28291
          an empty small object free list, and notices that there is a collection
          Packit d28291
          in progress, it immediately performs a small amount of marking work
          Packit d28291
          as described above.
          Packit d28291

          Packit d28291
          This change was made both because we wanted to easily accommodate
          Packit d28291
          single-threaded environments, and because a separate GC thread requires
          Packit d28291
          very careful control over the scheduler to prevent the mutator from
          Packit d28291
          out-running the collector, and hence provoking unneeded heap growth.
          Packit d28291

          Packit d28291
          In incremental mode, the heap is always expanded when we encounter
          Packit d28291
          insufficient space for an allocation.  Garbage collection is triggered
          Packit d28291
          whenever we notice that more than
          Packit d28291
          <TT>GC_heap_size</tt>/2 * <TT>GC_free_space_divisor</tt>
          Packit d28291
          bytes of allocation have taken place.
          Packit d28291
          After <TT>GC_full_freq</tt> minor collections a major collection
          Packit d28291
          is started.
          Packit d28291

          Packit d28291
          All collections initially run uninterrupted until a predetermined
          Packit d28291
          amount of time (50 msecs by default) has expired.  If this allows
          Packit d28291
          the collection to complete entirely, we can avoid correcting
          Packit d28291
          for data structure modifications during the collection.  If it does
          Packit d28291
          not complete, we return control to the mutator, and perform small
          Packit d28291
          amounts of additional GC work during those later allocations that
          Packit d28291
          cannot be satisfied from small object free lists. When marking completes,
          Packit d28291
          the set of modified pages is retrieved, and we mark once again from
          Packit d28291
          marked objects on those pages, this time with the mutator stopped.
          Packit d28291

          Packit d28291
          We keep track of modified pages using one of several distinct mechanisms:
          Packit d28291
            Packit d28291
          1. Packit d28291
            Through explicit mutator cooperation.  Currently this requires
            Packit d28291
            the use of <TT>GC_malloc_stubborn</tt>, and is rarely used.
            Packit d28291
          2. Packit d28291
            (<TT>MPROTECT_VDB</tt>) By write-protecting physical pages and
            Packit d28291
            catching write faults.  This is
            Packit d28291
            implemented for many Unix-like systems and for win32.  It is not possible
            Packit d28291
            in a few environments.
            Packit d28291
          3. Packit d28291
            (<TT>GWW_VDB</tt>) By using the Win32 GetWriteWatch function to read dirty
            Packit d28291
            bits.
            Packit d28291
          4. Packit d28291
            (<TT>PROC_VDB</tt>) By retrieving dirty bit information from /proc.
            Packit d28291
            (Currently only Sun's
            Packit d28291
            Solaris supports this.  Though this is considerably cleaner, performance
            Packit d28291
            may actually be better with mprotect and signals.)
            Packit d28291
          5. Packit d28291
            (<TT>PCR_VDB</tt>) By relying on an external dirty bit implementation, in this
            Packit d28291
            case the one in Xerox PCR.
            Packit d28291
          6. Packit d28291
            (<TT>DEFAULT_VDB</tt>) By treating all pages as dirty.  This is the default if
            Packit d28291
            none of the other techniques is known to be usable, and
            Packit d28291
            <TT>GC_malloc_stubborn</tt> is not used.  Practical only for testing, or if
            Packit d28291
            the vast majority of objects use <TT>GC_malloc_stubborn</tt>.
            Packit d28291
            Packit d28291
            Packit d28291

            Black-listing

            Packit d28291
            Packit d28291
            The collector implements black-listing of pages, as described
            Packit d28291
            in
            Packit d28291
            Packit d28291
            Boehm, ``Space Efficient Conservative Collection'', PLDI '93, also available
            Packit d28291
            here.
            Packit d28291

            Packit d28291
            During the mark phase, the collector tracks ``near misses'', i.e. attempts
            Packit d28291
            to follow a ``pointer'' to just outside the garbage-collected heap, or
            Packit d28291
            to a currently unallocated page inside the heap.  Pages that have been
            Packit d28291
            the targets of such near misses are likely to be the targets of
            Packit d28291
            misidentified ``pointers'' in the future.  To minimize the future
            Packit d28291
            damage caused by such misidentification, they will be allocated only to
            Packit d28291
            small pointer-free objects.
            Packit d28291

            Packit d28291
            The collector understands two different kinds of black-listing.  A
            Packit d28291
            page may be black listed for interior pointer references
            Packit d28291
            (<TT>GC_add_to_black_list_stack</tt>), if it was the target of a near
            Packit d28291
            miss from a location that requires interior pointer recognition,
            Packit d28291
            e.g. the stack, or the heap if <TT>GC_all_interior_pointers</tt>
            Packit d28291
            is set.  In this case, we also avoid allocating large blocks that include
            Packit d28291
            this page.
            Packit d28291

            Packit d28291
            If the near miss came from a source that did not require interior
            Packit d28291
            pointer recognition, it is black-listed with
            Packit d28291
            <TT>GC_add_to_black_list_normal</tt>.
            Packit d28291
            A page black-listed in this way may appear inside a large object,
            Packit d28291
            so long as it is not the first page of a large object.
            Packit d28291

            Packit d28291
            The <TT>GC_allochblk</tt> routine respects black-listing when assigning
            Packit d28291
            a block to a particular object kind and size.  It occasionally
            Packit d28291
            drops (i.e. allocates and forgets) blocks that are completely black-listed
            Packit d28291
            in order to avoid excessively long large block free lists containing
            Packit d28291
            only unusable blocks.  This would otherwise become an issue
            Packit d28291
            if there is low demand for small pointer-free objects.
            Packit d28291
            Packit d28291

            Thread support

            Packit d28291
            We support several different threading models.  Unfortunately Pthreads,
            Packit d28291
            the only reasonably well standardized thread model, supports too narrow
            Packit d28291
            an interface for conservative garbage collection.  There appears to be
            Packit d28291
            no completely portable way to allow the collector
            Packit d28291
            to coexist with various Pthreads
            Packit d28291
            implementations.  Hence we currently support only the more
            Packit d28291
            common Pthreads implementations.
            Packit d28291

            Packit d28291
            In particular, it is very difficult for the collector to stop all other
            Packit d28291
            threads in the system and examine the register contents.  This is currently
            Packit d28291
            accomplished with very different mechanisms for some Pthreads
            Packit d28291
            implementations.  For Linux/HPUX/OSF1, Solaris and Irix it sends signals to
            Packit d28291
            individual Pthreads and has them wait in the signal handler.
            Packit d28291

            Packit d28291
            The Linux and Irix implementations use
            Packit d28291
            only documented Pthreads calls, but rely on extensions to their semantics.
            Packit d28291
            The Linux implementation <TT>pthread_stop_world.c</tt> relies on only very
            Packit d28291
            mild extensions to the pthreads semantics, and already supports a large number
            Packit d28291
            of other Unix-like pthreads implementations.  Our goal is to make this the
            Packit d28291
            only pthread support in the collector.
            Packit d28291

            Packit d28291
            All implementations must
            Packit d28291
            intercept thread creation and a few other thread-specific calls to allow
            Packit d28291
            enumeration of threads and location of thread stacks.  This is current
            Packit d28291
            accomplished with <TT># define</tt>'s in <TT>gc.h</tt>
            Packit d28291
            (really <TT>gc_pthread_redirects.h</tt>), or optionally
            Packit d28291
            by using ld's function call wrapping mechanism under Linux.
            Packit d28291

            Packit d28291
            Recent versions of the collector support several facilities to enhance
            Packit d28291
            the processor-scalability and thread performance of the collector.
            Packit d28291
            These are discussed in more detail here.
            Packit d28291
            We briefly outline the data approach to thread-local allocation in the
            Packit d28291
            next section.
            Packit d28291

            Thread-local allocation

            Packit d28291
            If thread-local allocation is enabled, the collector keeps separate
            Packit d28291
            arrays of free lists for each thread.  Thread-local allocation
            Packit d28291
            is currently only supported on a few platforms.
            Packit d28291

            Packit d28291
            The free list arrays associated
            Packit d28291
            with each thread are only used to satisfy requests for objects that
            Packit d28291
            are  both very small, and belong to one of a small number of well-known
            Packit d28291
            kinds.  These currently include "normal" and pointer-free objects.
            Packit d28291
            Depending on the configuration, "gcj" objects may also be included.
            Packit d28291

            Packit d28291
            Thread-local free list entries contain either a pointer to the first
            Packit d28291
            element of a free list, or they contain a counter of the number of
            Packit d28291
            allocation granules, corresponding to objects of this size,
            Packit d28291
            allocated so far.  Initially they contain the
            Packit d28291
            value one, i.e. a small counter value.
            Packit d28291

            Packit d28291
            Thread-local allocation allocates directly through the global
            Packit d28291
            allocator, if the object is of a size or kind not covered by the
            Packit d28291
            local free lists.
            Packit d28291

            Packit d28291
            If there is an appropriate local free list, the allocator checks whether it
            Packit d28291
            contains a sufficiently small counter value.  If so, the counter is simply
            Packit d28291
            incremented by the counter value, and the global allocator is used.
            Packit d28291
            In this way, the initial few allocations of a given size bypass the local
            Packit d28291
            allocator.  A thread that only allocates a handful of objects of a given
            Packit d28291
            size will not build up its own free list for that size.  This avoids
            Packit d28291
            wasting space for unpopular objects sizes or kinds.
            Packit d28291

            Packit d28291
            Once the counter passes a threshold, <TT>GC_malloc_many</tt> is called
            Packit d28291
            to allocate roughly <TT>HBLKSIZE</tt> space and put it on the corresponding
            Packit d28291
            local free list.  Further allocations of that size and kind then use
            Packit d28291
            this free list, and no longer need to acquire the allocation lock.
            Packit d28291
            The allocation procedure is otherwise similar to the global free lists.
            Packit d28291
            The local free lists are also linked using the first word in the object.
            Packit d28291
            In most cases this means they require considerably less time.
            Packit d28291

            Packit d28291
            Local free lists are treated buy most of the rest of the collector
            Packit d28291
            as though they were in-use reachable data.  This requires some care,
            Packit d28291
            since pointer-free objects are not normally traced, and hence a special
            Packit d28291
            tracing procedure is required to mark all objects on pointer-free and
            Packit d28291
            gcj local free lists.
            Packit d28291

            Packit d28291
            On thread exit, any remaining thread-local free list entries are
            Packit d28291
            transferred back to the global free list.
            Packit d28291

            Packit d28291
            Note that if the collector is configured for thread-local allocation,
            Packit d28291
            GC versions before 7 do not invoke the thread-local allocator by default.
            Packit d28291
            <TT>GC_malloc</tt> only uses thread-local allocation in version 7 and later.
            Packit d28291

            Packit d28291
            For some more details see here, and the
            Packit d28291
            technical report entitled
            Packit d28291
            Packit d28291
            "Fast Multiprocessor Memory Allocation and Garbage Collection"
            Packit d28291
            </body>
            Packit d28291
            </html>