<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Parsing the file</title><meta name="generator" content="DocBook XSL Stylesheets V1.61.2"><link rel="home" href="index.html" title="Libxml Tutorial"><link rel="up" href="index.html" title="Libxml Tutorial"><link rel="previous" href="ar01s02.html" title="Data Types"><link rel="next" href="ar01s04.html" title="Retrieving Element Content"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">

Parsing the file

Parsing the file requires only the name of the file and a single function call, plus error checking. Full code: Appendix C, Code for Keyword Example

	xmlDocPtr doc;                                                  /* 1 */
	xmlNodePtr cur;                                                 /* 2 */

	doc = xmlParseFile(docname);                                    /* 3 */

	if (doc == NULL) {                                              /* 4 */
		fprintf(stderr, "Document not parsed successfully.\n");
		return;
	}

	cur = xmlDocGetRootElement(doc);                                /* 5 */

	if (cur == NULL) {                                              /* 6 */
		fprintf(stderr, "empty document\n");
		xmlFreeDoc(doc);
		return;
	}

	if (xmlStrcmp(cur->name, (const xmlChar *) "story")) {          /* 7 */
		fprintf(stderr, "document of the wrong type, root node != story\n");
		xmlFreeDoc(doc);
		return;
	}

1

Declare the pointer that will point to your parsed document.

2

Declare a node pointer (you'll need this in order to interact with individual nodes).

4

Check to see that the document was successfully parsed. If it was not, xmlParseFile returns NULL; libxml will by this point have registered an error, and the code prints a message and stops.

Note

One common example of an error at this point is improper handling of encoding. The XML standard requires documents stored with an encoding other than UTF-8 or UTF-16 to contain an explicit declaration of their encoding. If the declaration is there, libxml will automatically perform the necessary conversion to UTF-8 for you. More information on XML's encoding requirements is contained in the standard.

5

Retrieve the document's root element.

6

Check to make sure the document actually contains something.

7

In our case, we need to make sure the document is the right type. "story" is the root type of the documents used in this tutorial.


</body></html>