Blame doc/example.html

<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>A real example</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000">

The XML C parser and toolkit of Gnome

A real example


Here is a real-size example, where the actual content of the application
data is not kept in the DOM tree but uses internal structures. It is based on
a proposal to keep a database of jobs related to Gnome, with an XML-based
storage structure. Here is an XML-encoded jobs base:

<?xml version="1.0"?>
<gjob:Helping xmlns:gjob="http://www.gnome.org/some-location">
  <gjob:Jobs>

    <gjob:Job>
      <gjob:Project ID="3"/>
      <gjob:Application>GBackup</gjob:Application>
      <gjob:Category>Development</gjob:Category>

      <gjob:Update>
        <gjob:Status>Open</gjob:Status>
        <gjob:Modified>Mon, 07 Jun 1999 20:27:45 -0400 MET DST</gjob:Modified>
        <gjob:Salary>USD 0.00</gjob:Salary>
      </gjob:Update>

      <gjob:Developers>
        <gjob:Developer>
        </gjob:Developer>
      </gjob:Developers>

      <gjob:Contact>
        <gjob:Person>Nathan Clemons</gjob:Person>
        <gjob:Email>nathan@windsofstorm.net</gjob:Email>
        <gjob:Company>
        </gjob:Company>
        <gjob:Organisation>
        </gjob:Organisation>
        <gjob:Webpage>
        </gjob:Webpage>
        <gjob:Snailmail>
        </gjob:Snailmail>
        <gjob:Phone>
        </gjob:Phone>
      </gjob:Contact>

      <gjob:Requirements>
      The program should be released as free software, under the GPL.
      </gjob:Requirements>

      <gjob:Skills>
      </gjob:Skills>

      <gjob:Details>
      A GNOME based system that will allow a superuser to configure
      compressed and uncompressed files and/or file systems to be backed
      up with a supported media in the system.  This should be able to
      perform via find commands generating a list of files that are passed
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine
      or via operations performed on the filesystem itself. Email
      notification and GUI status display very important.
      </gjob:Details>

    </gjob:Job>

  </gjob:Jobs>
</gjob:Helping>

While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder and more error-prone.

The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of attributes is not significant; the
XML specification is clear about this. It's also usually a good idea not to
depend on the order of the children of a given node, unless it really makes
things harder. Here is some code to parse the information for a person:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/tree.h>

/* DEBUG is not defined in this excerpt; a simple trace macro is assumed. */
#define DEBUG(x) fprintf(stderr, "%s", x)

/*
 * A person record. The strings come from libxml2, so they are xmlChar
 * and must be released with xmlFree().
 */
typedef struct person {
    xmlChar *name;
    xmlChar *email;
    xmlChar *company;
    xmlChar *organisation;
    xmlChar *smail;
    xmlChar *webPage;
    xmlChar *phone;
} person, *personPtr;

/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;

    DEBUG("parsePerson\n");
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(person));

    /* We don't care what the top level element name is */
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        /* match on local name and namespace, in whatever order they appear */
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "Person")) && (cur->ns == ns))
            ret->name = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "Email")) && (cur->ns == ns))
            ret->email = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        cur = cur->next;
    }

    return(ret);
}

Here are a couple of things to notice:

  • Usually a recursive parsing style is the more convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.
  • The function takes two arguments of type xmlDocPtr and xmlNsPtr,
    i.e. the pointer to the global XML document and the namespace reserved to
    the application. Document-wide information is needed, for example, to
    decode entities, and it is good coding practice to define a namespace for
    your application's data and to test that the elements and attributes
    you are analyzing actually pertain to your application space. This is
    done by a simple equality test (cur->ns == ns).
  • To retrieve text and attribute values, you can use the function
    xmlNodeListGetString to gather all the text and entity reference
    nodes generated by the DOM output and produce a single text string.
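
The parsePerson loop only fills in the name and the email, while the sample
document also carries Company, Organisation, Webpage, Snailmail and Phone
elements. One possible way to cover them without depending on child order is
a small lookup table from element name to struct field. The sketch below is
an assumption-laden illustration, not part of the original example: the
fieldMap table and the personField helper are hypothetical, and the person
record is redeclared with plain char * so the block compiles without
libxml2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Person record, redeclared with plain char * so the sketch stands alone. */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;

/* One table row: element local name and the offset of the field it fills. */
struct fieldMap {
    const char *element;
    size_t offset;
};

static const struct fieldMap personFields[] = {
    { "Person",       offsetof(person, name) },
    { "Email",        offsetof(person, email) },
    { "Company",      offsetof(person, company) },
    { "Organisation", offsetof(person, organisation) },
    { "Snailmail",    offsetof(person, smail) },
    { "Webpage",      offsetof(person, webPage) },
    { "Phone",        offsetof(person, phone) },
};

/*
 * Hypothetical helper: return the address of the field that stores the
 * given element, or NULL for an element the application does not know.
 * Unknown elements are simply skipped by the caller, which keeps the
 * parser tolerant of extra or reordered children.
 */
static char **personField(personPtr p, const char *element) {
    size_t i;

    assert(p != NULL && element != NULL);
    for (i = 0; i < sizeof(personFields) / sizeof(personFields[0]); i++) {
        if (strcmp(personFields[i].element, element) == 0)
            return (char **) ((char *) p + personFields[i].offset);
    }
    return NULL;
}
```

Inside the while loop, the namespace test (cur->ns == ns) would still come
first; the table only replaces the chain of name comparisons.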

Here is another piece of code used to parse another level of the
structure:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/tree.h>

/*
 * personPtr and parsePerson() come from the person listing above; DEBUG is
 * a trace macro assumed to be defined alongside them.
 */

/*
 * a Description for a Job
 */
typedef struct job {
    xmlChar *projectID;
    xmlChar *application;
    xmlChar *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;

/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;

    DEBUG("parseJob\n");
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(job));

    /* We don't care what the top level element name is */
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {

        if ((!xmlStrcmp(cur->name, (const xmlChar *) "Project")) && (cur->ns == ns)) {
            /* the project identifier is stored as an attribute */
            ret->projectID = xmlGetProp(cur, (const xmlChar *) "ID");
            if (ret->projectID == NULL) {
                fprintf(stderr, "Project has no ID\n");
            }
        }
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "Application")) && (cur->ns == ns))
            ret->application = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "Category")) && (cur->ns == ns))
            ret->category = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        if ((!xmlStrcmp(cur->name, (const xmlChar *) "Contact")) && (cur->ns == ns))
            /* recurse into the nested structure */
            ret->contact = parsePerson(doc, ns, cur);
        cur = cur->next;
    }

    return(ret);
}
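
The job record uses a fixed array of 100 developers and leaves dynamic
allocation as an exercise. A minimal sketch of one way to do it follows,
assuming a doubling growth policy; the developerList type and the
addDeveloper and freeDeveloperList helpers are hypothetical names, and the
person record is trimmed to a single field so the block stands alone.

```c
#include <assert.h>
#include <stdlib.h>

/* Trimmed person record; only the pointer identity matters here. */
typedef struct person {
    char *name;
} person, *personPtr;

/* Growable replacement for the fixed developers[100] array. */
typedef struct developerList {
    int nbDevelopers;        /* entries in use */
    int capacity;            /* entries allocated */
    personPtr *developers;   /* heap array, NULL while empty */
} developerList;

/*
 * Append dev to the list, doubling the storage when it is full.
 * Returns 0 on success and -1 if memory runs out, in which case the
 * list is left unchanged.
 */
static int addDeveloper(developerList *list, personPtr dev) {
    assert(list != NULL);
    if (list->nbDevelopers == list->capacity) {
        int newCap = (list->capacity == 0) ? 4 : list->capacity * 2;
        personPtr *tmp = realloc(list->developers,
                                 (size_t) newCap * sizeof(personPtr));
        if (tmp == NULL)
            return -1;
        list->developers = tmp;
        list->capacity = newCap;
    }
    list->developers[list->nbDevelopers++] = dev;
    return 0;
}

/* Release the array itself; the person records are owned by the caller. */
static void freeDeveloperList(developerList *list) {
    assert(list != NULL);
    free(list->developers);
    list->developers = NULL;
    list->nbDevelopers = 0;
    list->capacity = 0;
}
```

In parseJob, each Developer child found in the loop could then be passed
through parsePerson and appended with addDeveloper.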

Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it should be possible to write stub generators that take
C data structure definitions, a set of XML examples, or an XML DTD and
produce the code needed to import and export the content between C data and
XML storage. This is left as an exercise to the reader :-)
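
To give an idea of what the export direction might look like, here is a
hand-written sketch that serializes the name and email of a contact into a
caller-supplied buffer, escaping the characters that are special in XML
text. The function names are hypothetical, and the sketch deliberately
avoids libxml2 so that it stands alone; in a real application the library's
tree-building and saving functions (for instance xmlNewChild and
xmlSaveFile) would normally do this work. The gjob prefix is assumed to be
declared by an enclosing element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src at *pos if it fits; return 0 on success, -1 when full. */
static int appendStr(char *buf, size_t size, size_t *pos, const char *src) {
    size_t len = strlen(src);

    if (*pos + len + 1 > size)
        return -1;
    memcpy(buf + *pos, src, len + 1);     /* copies the terminating NUL */
    *pos += len;
    return 0;
}

/* Append text, replacing &, < and > with their predefined entities. */
static int appendEscaped(char *buf, size_t size, size_t *pos,
                         const char *text) {
    for (; *text != '\0'; text++) {
        char one[2];
        const char *rep;

        switch (*text) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  one[0] = *text; one[1] = '\0'; rep = one; break;
        }
        if (appendStr(buf, size, pos, rep) != 0)
            return -1;
    }
    return 0;
}

/*
 * Hypothetical exporter: write a Contact element holding the name and
 * email into buf. Returns the number of characters written, or -1 if
 * the buffer is too small. NULL strings are written as empty elements.
 */
static int exportContact(char *buf, size_t size,
                         const char *name, const char *email) {
    size_t pos = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    if (appendStr(buf, size, &pos, "<gjob:Contact><gjob:Person>") != 0 ||
        appendEscaped(buf, size, &pos, name ? name : "") != 0 ||
        appendStr(buf, size, &pos, "</gjob:Person><gjob:Email>") != 0 ||
        appendEscaped(buf, size, &pos, email ? email : "") != 0 ||
        appendStr(buf, size, &pos, "</gjob:Email></gjob:Contact>") != 0)
        return -1;
    return (int) pos;
}
```

A generated stub would emit one such function per C structure, mirroring
the parse function that reads the same element.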

Feel free to use the code for the full C parsing example as a template; it
is also available, with a Makefile, in the Gnome SVN base under
libxml2/example.

Daniel Veillard

    </body></html>