<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Entities or no entities</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000">
<h1>The XML C parser and toolkit of Gnome</h1>
<h2>Entities or no entities</h2>
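<p>The question in the title is whether a parser should replace an entity reference such as &amp;xml; with its declared text, or keep the reference as written. Both outcomes can be tried with a minimal sketch. It assumes Python's standard-library expat bindings as a stand-in: it does not call libxml2, whose C API differs, and the names <code>DOC</code>, <code>substituted_text</code> and <code>kept_references</code> are made up for illustration.</p>

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Sample document declaring one internal entity named "xml" (illustrative).
DOC = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE EXAMPLE SYSTEM "example.dtd" [\n'
    '<!ENTITY xml "Extensible Markup Language">\n'
    ']>\n'
    '<EXAMPLE>\n'
    '   &xml;\n'
    '</EXAMPLE>\n'
)


def substituted_text(doc):
    """Parse with substitution: the reference is replaced by its text."""
    # ElementTree's expat-based parser expands internal entities itself.
    return ET.fromstring(doc).text.strip()


def kept_references(doc):
    """Parse without substitution and return the references as written."""
    refs = []

    def on_default(data):
        # The default handler receives everything that has no dedicated
        # handler; keep only the chunks that look like entity references.
        if data.startswith("&") and data.endswith(";"):
            refs.append(data)

    parser = expat.ParserCreate()
    # Installing DefaultHandler (rather than DefaultHandlerExpand) tells
    # expat not to expand internal entities and to pass them through.
    parser.DefaultHandler = on_default
    parser.Parse(doc, True)
    return refs


if __name__ == "__main__":
    print(substituted_text(DOC))
    print(kept_references(DOC))
```

<p>With this input, <code>substituted_text(DOC)</code> returns the replacement text "Extensible Markup Language", while <code>kept_references(DOC)</code> returns the single reference &amp;xml; unchanged. An application that needs to write the document back as the author wrote it would want the second behaviour.</p>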
<p>Entities in principle are similar to simple C macros. An entity defines an
abbreviation for a given string that you can reuse many times throughout the
content of your document. Entities are especially useful when a given string
occurs frequently within a document, or when you want to confine a change to
a restricted area in the internal subset of the document (at the
beginning). Example:</p>
<pre>1 &lt;?xml version="1.0"?&gt;
2 &lt;!DOCTYPE EXAMPLE SYSTEM "example.dtd" [
3 &lt;!ENTITY xml "Extensible Markup Language"&gt;
4 ]&gt;
5 &lt;EXAMPLE&gt;
6    &amp;xml;
7 &lt;/EXAMPLE&gt;</pre>
<p>Line 3 declares the xml entity. Line 6 uses the xml entity by prefixing
its name with '&amp;' and following it with ';', without any spaces added. There
are 5 predefined entities in libxml2, the ones defined by the XML
specification, allowing you to escape characters with predefined meaning in
some parts of the XML document content:
&amp;lt; for the character '&lt;', &amp;gt; for the character '&gt;',
&amp;apos; for the character ''', &amp;quot; for the character '"', and
&amp;amp; for the character '&amp;'.</p>
<p>One of the problems related to entities is that you may want the parser to
substitute an entity's content so that you can see the replacement text in
your application. Or you may prefer to keep entity references as such in the
content to be able to save the document back without losing this usually
precious information (if users went through the pain of explicitly
defining entities, they may have a rather negative attitude if you blindly
substitute them at saving time). The xmlSubstituteEntitiesDefault()
function allows you to check and change the behaviour, which is to not
substitute entities by default.</p>
<p>Here is the DOM tree built by libxml2 for the previous document in the
default case:</p>
<pre>/gnome/src/gnome-xml -&gt; ./xmllint --debug test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content=
     ENTITY_REF
       INTERNAL_GENERAL_ENTITY xml
       content=Extensible Markup Language
     TEXT
     content=</pre>
<p>And here is the result when substituting entities:</p>
<pre>/gnome/src/gnome-xml -&gt; ./tester --debug --noent test/ent1
DOCUMENT
version=1.0
   ELEMENT EXAMPLE
     TEXT
     content= Extensible Markup Language</pre>
<p>So, entities or no entities? Basically, it depends on your use case. I
suggest that you keep the non-substituting default behaviour and avoid using
entities in your XML document or data if you are not willing to handle the
entity reference nodes in the DOM tree.</p>
<p>Note that at save time libxml2 converts characters to the predefined
entities where necessary to prevent well-formedness problems. When parsing,
it transparently replaces predefined entity references with the
corresponding characters (i.e. it will not generate entity reference nodes
in the DOM tree or call the reference() SAX callback when finding them in
the input).</p>
<p>WARNING: handling entities on top of the libxml2 SAX interface is
difficult! If you plan to use non-predefined entities in your documents,
then the learning curve to handle them using the SAX API may be long. If you
plan to use complex documents, I strongly suggest you consider using the DOM
interface instead and let libxml2 deal with the complexity rather than trying
to do it yourself.</p>
<p>Daniel Veillard</p>
</body></html>