<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Memory Management</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000"><h1>The XML C parser and toolkit of Gnome</h1><h2>Memory Management</h2>
<ol>
<li>General overview</li>
<li>Setting libxml2 set of memory routines</li>
<li>Cleaning up after using the library</li>
<li>Debugging routines</li>
<li>General memory requirements</li>
<li>Returning memory to the kernel</li>
</ol>
<h3>General overview</h3>
<p>The module xmlmemory.h provides the interfaces to the libxml2 memory system:</p>
<ul>
<li>libxml2 does not call the libc memory allocator directly; it goes through xmlFree(), xmlMalloc() and xmlRealloc()</li>
<li>those routines can be reassigned to a specific set of routines; by default they are the libc ones, i.e. free(), malloc() and realloc()</li>
<li>the xmlmemory.c module includes a set of debugging routines</li>
</ul>
<h3>Setting libxml2 set of memory routines</h3>
<p>It is sometimes useful not to use the default memory allocator, whether for debugging, for analysis, or to implement a specific memory management behaviour (as on embedded systems). Two function calls are available to do so:</p>
<ul>
<li>xmlMemGet(), which returns the current set of functions in use by the parser</li>
<li>xmlMemSetup(), which sets up a new set of memory allocation functions</li>
</ul>
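<p>As an illustration of what a replacement set of routines could look like, here is a minimal sketch of four counting wrappers around the libc allocator. It does not include any libxml2 header so that it stays self-contained; the function names and the two counters are hypothetical, and only the function shapes are meant to match the free, malloc, realloc and strdup callbacks that xmlMemSetup() takes.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters for this sketch; they are not part of libxml2. */
static size_t alloc_calls = 0;  /* blocks handed out */
static size_t free_calls = 0;   /* blocks given back */

/* Same shape as a malloc callback: void *(*)(size_t). */
static void *counting_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL)
        alloc_calls++;
    return ptr;
}

/* Same shape as a free callback: void (*)(void *). */
static void counting_free(void *ptr) {
    if (ptr != NULL)
        free_calls++;
    free(ptr);
}

/* Same shape as a realloc callback: void *(*)(void *, size_t).
 * A size of 0 is not handled specially in this sketch. */
static void *counting_realloc(void *ptr, size_t size) {
    void *res = realloc(ptr, size);
    if (ptr == NULL && res != NULL)
        alloc_calls++;  /* realloc(NULL, n) acts like malloc(n) */
    return res;
}

/* Same shape as a strdup callback: char *(*)(const char *). */
static char *counting_strdup(const char *str) {
    size_t len;
    char *copy;

    assert(str != NULL);
    len = strlen(str) + 1;
    copy = counting_malloc(len);
    if (copy != NULL)
        memcpy(copy, str, len);
    return copy;
}

/* Number of blocks currently allocated through the wrappers. */
static size_t live_blocks(void) {
    return alloc_calls - free_calls;
}
```

<p>In a program linked against libxml2, such a set would typically be installed with xmlMemSetup(counting_free, counting_malloc, counting_realloc, counting_strdup) before any other libxml2 call, after saving the previous set with xmlMemGet() if it needs to be restored later.</p>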
<p>Of course a call to xmlMemSetup() should probably be made before calling any other libxml2 routine (unless you are sure your allocation routines are compatible with the ones already in use, since a block obtained from one set may later be released by the other).</p>
<h3>Cleaning up after using the library</h3>
<p>Libxml2 is not stateless: a small set of memory structures needs to be allocated before the parser is fully functional (some encoding structures, for example). This also means that once parsing is finished there is a tiny amount of memory (a few hundred bytes) which can be recollected if you don't reuse the library or any document built with it:</p>
<ul>
<li>xmlCleanupParser() is a centralized routine to free the library state and data. Note that it won't deallocate any tree that was produced (use xmlFreeDoc() and related routines for this). It should be called only when the library is not used anymore.</li>
<li>xmlInitParser() is the dual routine, allowing the parsing state to be preallocated, which can be useful, for example, to avoid initialization reentrancy problems when using libxml2 in multithreaded applications</li>
</ul>
<p>Generally xmlCleanupParser() is safe assuming no parsing is ongoing and no document is still being used. If needed, the state will be rebuilt at the next invocation of parser routines (or by xmlInitParser()), but be careful of the consequences in multithreaded applications.</p>
<h3>Debugging routines</h3>
<p>When configured with the --with-mem-debug flag (off by default), libxml2 uses a set of memory allocation debugging routines that keep track of all allocated blocks and of the location in the code where the routine was called. A couple of other debugging routines allow dumping the information about allocated memory to a file, or calling a specific routine when a given block number is allocated:</p>
<ul>
<li>xmlMallocLoc(), xmlReallocLoc() and xmlMemStrdupLoc() are the memory debugging replacement allocation routines</li>
<li>xmlMemoryDump() dumps all the information about the allocated memory blocks left in the .memdump file</li>
</ul>
<p>When developing libxml2, memory debugging is enabled: the test programs call xmlMemoryDump() and the "make test" regression tests check for any memory leak during the full regression test sequence. This helps a lot in ensuring that libxml2 does not leak memory, and in bullet-proofing its use of memory allocations (some libc implementations are known to be far too permissive, resulting in major portability problems!).</p>
<p>If the .memdump reports a leak, it displays the allocation function and also tries to give some information about the content and structure of the allocated blocks left. This is sufficient in most cases to find the culprit, but not always. Assuming the allocation problem is reproducible, it is possible to find it more easily:</p>
<ol>
<li>write down the block number xxxx that was not deallocated</li>
<li>export the environment variable XML_MEM_BREAKPOINT=xxxx; the easiest way when using GDB is simply to give the command <code>set environment XML_MEM_BREAKPOINT xxxx</code> before running the program.</li>
<li>run the program under a debugger and set a breakpoint on xmlMallocBreakpoint(), a specific function called when this precise block is allocated</li>
<li>when the breakpoint is reached you can then do a fine analysis of the allocation and step to see the condition resulting in the missing deallocation.</li>
</ol>
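<p>To make the mechanism behind these steps more concrete, the following is a simplified model of a block-numbering debug allocator, not libxml2's actual implementation. Every name in it (the model_ functions, the MODEL_MALLOC macro and the MODEL_MEM_BREAKPOINT variable) is hypothetical. It numbers each allocation starting from 1, records the call site, calls a dedicated hook when the watched block number is allocated, and can list the blocks that were never freed.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Header stored in front of every block; the union keeps user data aligned. */
typedef union model_hdr model_hdr;
union model_hdr {
    struct {
        unsigned long number;  /* sequence number of the allocation */
        size_t size;           /* requested size in bytes */
        const char *file;      /* call site */
        int line;
        model_hdr *next;       /* next block still allocated */
    } info;
    max_align_t align_;
};

static model_hdr *live_list = NULL;
static unsigned long next_number = 0;
static unsigned long stop_number = 0;  /* 0 means no block is watched */
static unsigned long breakpoint_hits = 0;

/* A debugger breakpoint would be set here, as on xmlMallocBreakpoint(). */
void model_malloc_breakpoint(void) { breakpoint_hits++; }

void model_set_breakpoint(unsigned long number) { stop_number = number; }

/* Read the watched block number from the environment, if it is set. */
void model_init(void) {
    const char *val = getenv("MODEL_MEM_BREAKPOINT");
    model_set_breakpoint(val != NULL ? strtoul(val, NULL, 10) : 0);
}

void *model_malloc_loc(size_t size, const char *file, int line) {
    model_hdr *h = malloc(sizeof(*h) + size);
    if (h == NULL)
        return NULL;
    h->info.number = ++next_number;
    h->info.size = size;
    h->info.file = file;
    h->info.line = line;
    h->info.next = live_list;
    live_list = h;
    if (h->info.number == stop_number)
        model_malloc_breakpoint();
    return h + 1;  /* user data starts right after the header */
}

void model_free(void *ptr) {
    model_hdr *h, **link;
    if (ptr == NULL)
        return;
    h = (model_hdr *)ptr - 1;
    for (link = &live_list; *link != NULL; link = &(*link)->info.next) {
        if (*link == h) {
            *link = h->info.next;  /* unlink, then release header and data */
            free(h);
            return;
        }
    }
    assert(0 && "model_free: pointer was not allocated by this model");
}

/* Print one line per block still allocated and return how many there are. */
unsigned long model_memory_dump(FILE *out) {
    unsigned long count = 0;
    const model_hdr *h;
    for (h = live_list; h != NULL; h = h->info.next) {
        fprintf(out, "block %lu: %zu bytes allocated at %s:%d\n",
                h->info.number, h->info.size, h->info.file, h->info.line);
        count++;
    }
    return count;
}

/* Capture the call site automatically. */
#define MODEL_MALLOC(size) model_malloc_loc((size), __FILE__, __LINE__)
```

<p>In this model the workflow mirrors the steps above: a first run ends with model_memory_dump() to learn the number of the leaked block, and a second run watches that number so that the debugger stops in model_malloc_breakpoint() at the exact allocation.</p>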
<p>I used to use a commercial tool to debug libxml2 memory problems, but after noticing that it was not detecting memory leaks, this simple mechanism was used instead and has proved extremely efficient until now. Lately I have also used valgrind with quite some success. It is tied to the i386 architecture since it works by emulating the processor and instruction set; it is slow but extremely efficient, i.e. it spots memory usage errors in a very precise way.</p>
<h3>General memory requirements</h3>
<p>How much memory does libxml2 require? It is hard to give an average; it depends on a number of things:</p>
<ul>
<li>The parser itself should work in a fixed amount of memory, except for information maintained about the stacks of names and entity locations. The I/O and encoding handlers will probably account for a few KBytes. This is true for both the XML and HTML parsers (though the HTML parser needs more state).</li>
<li>If you are generating the DOM tree, memory requirements will grow nearly linearly with the size of the data. In general, for a balanced textual document the internal memory requirement is about 4 times the size of the UTF-8 serialization of this document (for example, the XML-1.0 recommendation is a bit more than 150 KBytes and takes 650 KBytes of main memory when parsed). Validation adds the memory required to maintain the external DTD state, which should be linear with the complexity of the content model defined by the DTD.</li>
<li>If you need to work with fixed memory requirements or don't need the full DOM tree, then using the xmlReader interface is probably the best way to proceed; it still allows you to validate or operate on a subset of the tree if needed.</li>
<li>If you don't care about the advanced features of libxml2 like validation, DOM, XPath or XPointer, don't use entities, need to work with fixed memory requirements, and want the fastest parsing possible, then the SAX interface should be used, but it has known restrictions.</li>
</ul>
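<p>The rule of thumb for DOM trees can be written down as a small helper. This is only a rough sketch: the function name and the constant are hypothetical, the factor of 4 is the approximate figure quoted above, and real documents will deviate from it depending on their structure.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Approximate expansion factor for a balanced textual document. */
#define TREE_EXPANSION_FACTOR 4

/* Rough estimate of the memory taken by the tree built from a document
 * whose UTF-8 serialization is utf8_bytes long. */
static size_t estimate_tree_bytes(size_t utf8_bytes) {
    /* Guard against overflow of the multiplication. */
    assert(utf8_bytes <= (size_t)-1 / TREE_EXPANSION_FACTOR);
    return utf8_bytes * TREE_EXPANSION_FACTOR;
}
```

<p>For the XML-1.0 recommendation example, 150 KBytes of input gives an estimate of 600 KBytes, in the same range as the 650 KBytes observed (a ratio of roughly 4.3).</p>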
<h3>Returning memory to the kernel</h3>
<p>You may find that the memory usage of your process using libxml2 does not go down although you freed the trees. This is because libxml2 allocates memory in a number of small chunks. When one of those chunks is freed, the C library allocator may decide that giving this little memory back to the kernel would cause too much overhead, and delay the operation. As all chunks are this small, they are actually freed but not returned to the kernel. On systems using glibc, there is a function call "malloc_trim" from malloc.h which performs this missing operation (note that it is allowed to fail). Thus, after freeing your tree you may simply try "malloc_trim(0);" to really get the memory back. If your OS does not provide malloc_trim, try searching for a similar function.</p>
<p>Daniel Veillard</p></body></html>