<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Upgrading 1.x code</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000">
The XML C parser and toolkit of Gnome

Upgrading 1.x code

Incompatible changes:

Version 2 of libxml is the first version to introduce serious
backward-incompatible changes. The main goals were:

  • A general cleanup. A number of mistakes inherited from the very early
    versions couldn't be changed due to compatibility constraints. Example:
    the "childs" field in the nodes.
  • Uniformization of the various nodes, at least for their header and link
    parts (doc, parent, children, prev, next). The goal is a simpler
    programming model and an easier task for DOM implementors.
  • Better conformance to the XML specification. For example, version 1.x
    had a heuristic to try to detect ignorable white space. As a result the
    SAX events generated were ignorableWhitespace() while the spec requires
    characters() in that case. This also means that a number of DOM nodes
    containing blank text, which were not present before, may now populate
    the DOM tree.

How to fix libxml-1.x code:

Client code of libxml designed to run with version 1.x may have to be
changed to compile against version 2.x of libxml. Here is a list of changes
that I have collected. They may not be sufficient, so if you find other
changes which are required, drop me a mail:

1. The package name has changed from libxml to libxml2, and the library name
   is now -lxml2. There is a new xml2-config script which should be used to
   select the right parameters for libxml2.

2. The node childs field has been renamed children, so s/childs/children/g
   should be applied (the probability of having "childs" anywhere else is
   close to 0).

3. The document no longer has a root field; it has been replaced by
   children, and you will usually get a list of nodes there. For example, a
   Dtd node for the internal subset and its declaration may be found in
   that list, as well as processing instructions or comments found before
   or after the document root element. Use xmlDocGetRootElement(doc) to get
   the root element of a document. Alternatively, if you are sure your
   documents do not reference DTDs and have no PIs or comments before or
   after the root element, s/->root/->children/g will probably do it.

4. The white space issue. This one is more complex: except in the special
   case of validating parsing, the line breaks and spaces usually used for
   indenting and formatting the document content become significant. So
   they are reported by SAX, and if you are using the DOM tree,
   corresponding nodes are generated. Two approaches can be taken:

   1. The lazy one: use the compatibility call xmlKeepBlanksDefault(0), but
      be aware that you are relying on a special (and possibly broken) set
      of heuristics of libxml to detect ignorable blanks. Don't complain if
      it breaks or makes your application not 100% clean w.r.t. its input.

   2. The Right Way: change your code to accept possibly insignificant
      blank characters, or have your tree populated with weird blank text
      nodes. You can spot them using the convenience function
      xmlIsBlankNode(node), which returns 1 for such blank nodes.

   Note also that with the new default the output functions don't add any
   extra indentation when saving a tree, in order to be able to round-trip
   (read and save) without inflating the document with extra formatting
   characters.

5. The include path has changed to $prefix/libxml/ and the includes
   themselves use this new prefix in their include instructions. If you
   are using (as expected) the

       xml2-config --cflags

   output to generate your compile commands, this will probably work out of
   the box.

6. xmlDetectCharEncoding takes an extra argument indicating the length in
   bytes of the head of the document available for character detection.
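
As a rough illustration of item 3, the sketch below walks a document's top-level children to find the root element, which is the job xmlDocGetRootElement(doc) does for you. The Node and Doc types and the doc_root_element() function are simplified stand-ins invented for this example; they are not libxml2's own definitions, and real code should call the library function instead.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration only; NOT libxml2's types. */
typedef enum {
    NODE_ELEMENT,
    NODE_TEXT,
    NODE_COMMENT,
    NODE_PI,
    NODE_DTD
} NodeType;

typedef struct Node {
    NodeType type;
    const char *name;
    struct Node *next;      /* next sibling */
    struct Node *children;  /* first child */
} Node;

typedef struct {
    Node *children;  /* top-level list: may hold DTD, comments, PIs, root */
} Doc;

/* Return the first element among the document's top-level nodes,
 * skipping any DTD, comment or PI nodes; NULL if there is none. */
Node *doc_root_element(const Doc *doc)
{
    Node *cur;

    if (doc == NULL)
        return NULL;
    for (cur = doc->children; cur != NULL; cur = cur->next) {
        if (cur->type == NODE_ELEMENT)
            return cur;
    }
    return NULL;
}
```

The point of the sketch is that the first entry of the children list is not necessarily the root element, which is why the blind s/->root/->children/g substitution is only safe for documents without a DTD, comments or PIs at the top level.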

Ensuring both libxml-1.x and libxml-2.x compatibility

Two new versions of libxml (1.8.11) and libxml2 (2.3.4) have been released
to allow a smooth upgrade of existing libxml v1 code while retaining
compatibility. They offer the following:

1. Similar include naming: one should use #include <libxml/...> in both
   cases.
2. Similar identifiers defined via macros for the child and root fields:
   respectively xmlChildrenNode and xmlRootNode.
3. A new macro LIBXML_TEST_VERSION, which should be inserted once in the
   client code.
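
To make the macro approach concrete, here is a minimal sketch of how field-renaming macros let one source file compile against both structure layouts. The MockNode and MockDoc types and the MOCK_LIBXML_V1 switch are hypothetical and exist only for this example, and the macro expansions shown are an assumption about how the real headers behave, not a copy of them.

```c
#include <stddef.h>

/* Hypothetical switch for this sketch: define MOCK_LIBXML_V1 to get the
 * 1.x-style field names, leave it undefined for the 2.x-style ones. */
#ifdef MOCK_LIBXML_V1
typedef struct MockNode {
    struct MockNode *childs;    /* 1.x name for the first child */
    struct MockNode *next;
} MockNode;
typedef struct {
    MockNode *root;             /* 1.x name for the document root */
} MockDoc;
#define xmlChildrenNode childs  /* assumed 1.x expansion */
#define xmlRootNode root        /* assumed 1.x expansion */
#else
typedef struct MockNode {
    struct MockNode *children;  /* 2.x name for the first child */
    struct MockNode *next;
} MockNode;
typedef struct {
    MockNode *children;         /* 2.x: top-level node list */
} MockDoc;
#define xmlChildrenNode children  /* assumed 2.x expansion */
#define xmlRootNode children      /* assumed 2.x expansion */
#endif

/* Client code touches the fields only through the macros, so the same
 * function body compiles against either layout. */
int count_top_children(const MockDoc *doc)
{
    int n = 0;
    const MockNode *cur;

    if (doc == NULL || doc->xmlRootNode == NULL)
        return 0;
    for (cur = doc->xmlRootNode->xmlChildrenNode; cur != NULL; cur = cur->next)
        n++;
    return n;
}
```

Note that a macro of this kind only renames the field. Under the 2.x layout it still yields the first top-level node, which may be a DTD, comment or PI rather than the root element, so xmlDocGetRootElement(doc) remains the safer call once you target version 2 only.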

So the roadmap to upgrade your existing libxml applications is the
following:

1. Install the libxml-1.8.8 (and libxml-devel-1.8.8) packages.
2. Find all occurrences where the xmlDoc root field is used and change it
   to xmlRootNode.
3. Similarly, find all occurrences where the xmlNode childs field is used
   and change it to xmlChildrenNode.
4. Add a LIBXML_TEST_VERSION macro somewhere in your main() or in the
   library init entry point.
5. Recompile and check compatibility; it should still work.
6. Change your configure script to look first for xml2-config and fall
   back to xml-config. Use the --cflags and --libs output of the command
   as the include and linking parameters needed to use libxml.
7. Install libxml2-2.3.x and libxml2-devel-2.3.x (libxml-1.8.y and
   libxml-devel-1.8.y can be kept simultaneously).
8. Remove your config.cache, relaunch your configuration mechanism, and
   recompile. If steps 2 and 3 were done right, it should compile as-is.
9. Test that your application is still running correctly. If not, this may
   be due to extra empty nodes caused by formatting spaces being kept in
   libxml2, contrary to libxml1. In that case insert
   xmlKeepBlanksDefault(0) in your code before calling the parser (next to
   LIBXML_TEST_VERSION is a fine place).

Following those steps should work. It worked for some of my own code.
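
If the final test step turns up extra blank text nodes and you would rather handle them than suppress them, the check involved is small. The sketch below approximates the kind of test xmlIsBlankNode(node) performs, then uses it to skip blank text while counting siblings. The Item type and both function names are made up for this example and are not part of libxml2.

```c
#include <stddef.h>

/* XML white space is exactly: space, tab, line feed, carriage return. */
static int is_xml_blank_char(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Return 1 if text is NULL, empty, or made only of XML white space. */
int is_blank_text(const char *text)
{
    if (text == NULL)
        return 1;
    for (; *text != '\0'; text++) {
        if (!is_xml_blank_char(*text))
            return 0;
    }
    return 1;
}

/* Hypothetical stand-in for a sibling list of tree nodes. */
typedef enum { KIND_ELEMENT, KIND_TEXT } Kind;

typedef struct Item {
    Kind kind;
    const char *content;  /* text content, used for KIND_TEXT only */
    struct Item *next;    /* next sibling */
} Item;

/* Count the siblings that carry information, ignoring blank text
 * nodes that only hold indentation or line breaks. */
int count_significant(const Item *first)
{
    int n = 0;

    for (; first != NULL; first = first->next) {
        if (first->kind == KIND_TEXT && is_blank_text(first->content))
            continue;
        n++;
    }
    return n;
}
```

Code written this way behaves the same whether or not the parser keeps formatting blanks, so it does not depend on the heuristics behind xmlKeepBlanksDefault(0).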

Let me put some emphasis on the fact that there are far more changes from
libxml 1.x to 2.x than the ones you may have to patch for. The overall code
has been considerably cleaned up and the conformance to the XML specification
has been drastically improved too. Don't take those changes as an excuse
not to upgrade; it may cost a lot in the long term...

            Daniel Veillard

            </body></html>