|
Packit |
d28291 |
<HTML>
|
|
Packit |
d28291 |
<HEAD>
|
|
Packit |
d28291 |
<TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
|
|
Packit |
d28291 |
<AUTHOR> Hans-J. Boehm, Silicon Graphics (now at HP)</author>
|
|
Packit |
d28291 |
</HEAD>
|
|
Packit |
d28291 |
<BODY>
|
|
Packit |
d28291 |
Two-Level Tree Structure for Fast Pointer Lookup
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The BDWGC conservative garbage collector uses a 2-level tree
|
|
Packit |
d28291 |
data structure to aid in fast pointer identification.
|
|
Packit |
d28291 |
This data structure is described in a bit more detail here, since
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Variations of the data structure are more generally useful.
|
|
Packit |
d28291 |
It appears to be hard to understand by reading the code.
|
|
Packit |
d28291 |
Some other collectors appear to use inferior data structures to
|
|
Packit |
d28291 |
solve the same problem.
|
|
Packit |
d28291 |
It is central to fast collector operation.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
A candidate pointer is divided into three sections, the high,
|
|
Packit |
d28291 |
middle, and low bits. The exact division between these
|
|
Packit |
d28291 |
three groups of bits is dependent on the detailed collector configuration.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The high and middle bits are used to look up an entry in the table described
|
|
Packit |
d28291 |
here. The resulting table entry consists of either a block descriptor
|
|
Packit |
d28291 |
(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
|
|
Packit |
d28291 |
identifying the layout of objects in the block, or an indication that this
|
|
Packit |
d28291 |
address range corresponds to the middle of a large block, together with a
|
|
Packit |
d28291 |
hint for locating the actual block descriptor. Such a hint consist
|
|
Packit |
d28291 |
of a displacement that can be subtracted from the middle bits of the candidate
|
|
Packit |
d28291 |
pointer without leaving the object.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
In either case, the block descriptor (<TT>struct hblkhdr</tt>)
|
|
Packit |
d28291 |
refers to a table of object starting addresses (the <TT>hb_map</tt> field).
|
|
Packit |
d28291 |
The starting address table is indexed by the low bits if the candidate pointer.
|
|
Packit |
d28291 |
The resulting entry contains a displacement to the beginning of the object,
|
|
Packit |
d28291 |
or an indication that this cannot be a valid object pointer.
|
|
Packit |
d28291 |
(If all interior pointer are recognized, pointers into large objects
|
|
Packit |
d28291 |
are handled specially, as appropriate.)
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The Tree
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The rest of this discussion focuses on the two level data structure
|
|
Packit |
d28291 |
used to map the high and middle bits to the block descriptor.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The high bits are used as an index into the <TT>GC_top_index</tt> (really
|
|
Packit |
d28291 |
<TT>GC_arrays._top_index</tt>) array. Each entry points to a
|
|
Packit |
d28291 |
<TT>bottom_index</tt> data structure. This structure in turn consists
|
|
Packit |
d28291 |
mostly of an array <TT>index</tt> indexed by the middle bits of
|
|
Packit |
d28291 |
the candidate pointer. The <TT>index</tt> array contains the actual
|
|
Packit |
d28291 |
<TT>hdr</tt> pointers.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Thus a pointer lookup consists primarily of a handful of memory references,
|
|
Packit |
d28291 |
and can be quite fast:
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The appropriate <TT>bottom_index</tt> pointer is looked up in
|
|
Packit |
d28291 |
<TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
|
|
Packit |
d28291 |
The appropriate <TT>hdr</tt> pointer is looked up in the
|
|
Packit |
d28291 |
<TT>bottom_index</tt> structure, based on the middle bits.
|
|
Packit |
d28291 |
The block layout map pointer is retrieved from the <TT>hdr</tt>
|
|
Packit |
d28291 |
structure. (This memory reference is necessary since we try to share
|
|
Packit |
d28291 |
block layout maps.)
|
|
Packit |
d28291 |
The displacement to the beginning of the object is retrieved from the
|
|
Packit |
d28291 |
above map.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
|
|
Packit |
d28291 |
point to distinct <TT>bottom_index</tt> structures. If no address with
|
|
Packit |
d28291 |
the corresponding high bits is part of the heap, then the entry points
|
|
Packit |
d28291 |
to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
|
|
Packit |
d28291 |
only of NULL <TT>hdr</tt> pointers.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
<TT>Bottom_index</tt> structures contain slightly more information than
|
|
Packit |
d28291 |
just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
|
|
Packit |
d28291 |
all <TT>bottom_index</tt> structures in ascending order for fast traversal.
|
|
Packit |
d28291 |
This list is pointed to be <TT>GC_all_bottom_indices</tt>.
|
|
Packit |
d28291 |
It is maintained with the aid of <TT>key</tt> field that contains the
|
|
Packit |
d28291 |
high bits corresponding to the <TT>bottom_index</tt>.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
64 bit addresses
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
In the case of 64 bit addresses, this picture is complicated slightly
|
|
Packit |
d28291 |
by the fact that one of the index structures would have to be huge to
|
|
Packit |
d28291 |
cover the entire address space with a two level tree. We deal with this
|
|
Packit |
d28291 |
by turning <TT>GC_top_index</tt> into a chained hash table, instead of
|
|
Packit |
d28291 |
a simple array. This adds a <TT>hash_link</tt> field to the
|
|
Packit |
d28291 |
<TT>bottom_index</tt> structure.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The "hash function" consists of dropping the high bits. This is cheap to
|
|
Packit |
d28291 |
compute, and guarantees that there will be no collisions if the heap
|
|
Packit |
d28291 |
is contiguous and not excessively large.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
A picture
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
The following is an ASCII diagram of the data structure.
|
|
Packit |
d28291 |
This was contributed by Dave Barrett several years ago.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Data Structure used by GC_base in gc3.7:
|
|
Packit |
d28291 |
21-Apr-94
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
|
|
Packit |
d28291 |
+------------------+----------------+------------------+------------------+
|
|
Packit |
d28291 |
p:| | TL_HASH(hi) | | HBLKDISPL(p) |
|
|
Packit |
d28291 |
+------------------+----------------+------------------+------------------+
|
|
Packit |
d28291 |
\-----------------------HBLKPTR(p)-------------------/
|
|
Packit |
d28291 |
\------------hi-------------------/
|
|
Packit |
d28291 |
\______ ________/ \________ _______/ \________ _______/
|
|
Packit |
d28291 |
V V V
|
|
Packit |
d28291 |
| | |
|
|
Packit |
d28291 |
GC_top_index[] | | |
|
|
Packit |
d28291 |
--- +--------------+ | | |
|
|
Packit |
d28291 |
^ | | | | |
|
|
Packit |
d28291 |
| | | | | |
|
|
Packit |
d28291 |
TOP +--------------+<--+ | |
|
|
Packit |
d28291 |
_SZ +-<| [] | * | |
|
|
Packit |
d28291 |
(items)| +--------------+ if 0 < bi< HBLKSIZE | |
|
|
Packit |
d28291 |
| | | | then large object | |
|
|
Packit |
d28291 |
| | | | starts at the bi'th | |
|
|
Packit |
d28291 |
v | | | HBLK before p. | i |
|
|
Packit |
d28291 |
--- | +--------------+ | (word- |
|
|
Packit |
d28291 |
v | aligned) |
|
|
Packit |
d28291 |
bi= |GET_BI(p){->hash_link}->key==hi | |
|
|
Packit |
d28291 |
v | |
|
|
Packit |
d28291 |
| (bottom_index) \ scratch_alloc'd | |
|
|
Packit |
d28291 |
| ( struct bi ) / by get_index() | |
|
|
Packit |
d28291 |
--- +->+--------------+ | |
|
|
Packit |
d28291 |
^ | | | |
|
|
Packit |
d28291 |
^ | | | |
|
|
Packit |
d28291 |
BOTTOM | | ha=GET_HDR_ADDR(p) | |
|
|
Packit |
d28291 |
_SZ(items)+--------------+<----------------------+ +-------+
|
|
Packit |
d28291 |
| +--<| index[] | |
|
|
Packit |
d28291 |
| | +--------------+ GC_obj_map: v
|
|
Packit |
d28291 |
| | | | from / +-+-+-----+-+-+-+-+ ---
|
|
Packit |
d28291 |
v | | | GC_add < 0| | | | | | | | ^
|
|
Packit |
d28291 |
--- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
|
|
Packit |
d28291 |
| | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
|
|
Packit |
d28291 |
| +--------------+ +-->| | | j | | | | | +1
|
|
Packit |
d28291 |
| | key | | +-+-+-----+-+-+-+-+ |
|
|
Packit |
d28291 |
| +--------------+ | +-+-+-----+-+-+-+-+ |
|
|
Packit |
d28291 |
| | hash_link | | | | | | | | | | v
|
|
Packit |
d28291 |
| +--------------+ | +-+-+-----+-+-+-+-+ ---
|
|
Packit |
d28291 |
| | |<--MAX_OFFSET--->|
|
|
Packit |
d28291 |
| | (bytes)
|
|
Packit |
d28291 |
HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
|
|
Packit |
d28291 |
| \ from | =HBLKSIZE/WORDSZ
|
|
Packit |
d28291 |
| (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
|
|
Packit |
d28291 |
+-->+----------------------+ | (8/16 bits each)
|
|
Packit |
d28291 |
GET_HDR(p)| word hb_sz (words) | |
|
|
Packit |
d28291 |
+----------------------+ |
|
|
Packit |
d28291 |
| struct hblk *hb_next | |
|
|
Packit |
d28291 |
+----------------------+ |
|
|
Packit |
d28291 |
|mark_proc hb_mark_proc| |
|
|
Packit |
d28291 |
+----------------------+ |
|
|
Packit |
d28291 |
| char * hb_map |>-------------+
|
|
Packit |
d28291 |
+----------------------+
|
|
Packit |
d28291 |
| ushort hb_obj_kind |
|
|
Packit |
d28291 |
+----------------------+
|
|
Packit |
d28291 |
| hb_last_reclaimed |
|
|
Packit |
d28291 |
--- +----------------------+
|
|
Packit |
d28291 |
^ | |
|
|
Packit |
d28291 |
MARK_BITS| hb_marks[] | *if hdr is free, hb_sz
|
|
Packit |
d28291 |
_SZ(words)| | is the size of a heap chunk (struct hblk)
|
|
Packit |
d28291 |
v | | of at least MININCR*HBLKSIZE bytes (below),
|
|
Packit |
d28291 |
--- +----------------------+ otherwise, size of each object in chunk.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
Dynamic data structures above are interleaved throughout the heap in blocks of
|
|
Packit |
d28291 |
size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
|
|
Packit |
d28291 |
freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
(struct hblk) HDR_BYTES
|
|
Packit |
d28291 |
--- +----------------------+ < HBLKSIZE --- (bytes)
|
|
Packit |
d28291 |
^ +-----hb_body----------+ (and WORDSZ) ^ --- ---
|
|
Packit |
d28291 |
| | | aligned | ^ ^
|
|
Packit |
d28291 |
| | | | hb_sz |
|
|
Packit |
d28291 |
| | | | (words) |
|
|
Packit |
d28291 |
| | Object 0 | | | |
|
|
Packit |
d28291 |
| | | i |(word- v |
|
|
Packit |
d28291 |
| + - - - - - - - - - - -+ --- (bytes)|aligned) --- |
|
|
Packit |
d28291 |
| | | ^ | ^ |
|
|
Packit |
d28291 |
| | | j (words) | | |
|
|
Packit |
d28291 |
n * | Object 1 | v v hb_sz BODY_SZ
|
|
Packit |
d28291 |
HBLKSIZE | |--------------- | (words)
|
|
Packit |
d28291 |
(bytes) | | v MAX_OFFSET
|
|
Packit |
d28291 |
| + - - - - - - - - - - -+ --- (bytes)
|
|
Packit |
d28291 |
| | | !ALL_INTERIOR_POINTERS ^ |
|
|
Packit |
d28291 |
| | | sets j only for hb_sz |
|
|
Packit |
d28291 |
| | Object N | valid object offsets. | |
|
|
Packit |
d28291 |
v | | All objects WORDSZ v v
|
|
Packit |
d28291 |
--- +----------------------+ aligned. --- ---
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
|
|
Packit |
d28291 |
</body>
|