<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><link rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica}
BODY {font-family: Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right: 0em}
H1 {font-family: Verdana,Arial,Helvetica}
H2 {font-family: Verdana,Arial,Helvetica}
H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
</style><title>Validation & DTDs</title></head><body bgcolor="#8b7765" text="#000000" link="#a06060" vlink="#000000">

The XML C parser and toolkit of Gnome

Validation & DTDs


Table of Contents:

  1. General overview
  2. The definition
  3. Simple rules
    1. How to reference a DTD from a document
    2. Declaring elements
    3. Declaring attributes
  4. Some examples
  5. How to validate
  6. Other resources

      General overview

      Well, what is validation and what is a DTD?

      DTD is the acronym for Document Type Definition. It is a description of
      the content for a family of XML files. It is part of the XML 1.0
      specification, and allows one to describe and verify that a given document
      instance conforms to the set of rules detailing its structure and content.

      Validation is the process of checking a document against a DTD (more
      generally, against a set of construction rules).

      The validation process and building DTDs are the two most difficult parts
      of the XML life cycle. Briefly, a DTD defines all the possible elements to be
      found within your document and the formal shape of your document tree
      (by defining the allowed content of an element: either text, a regular
      expression for the allowed list of children, or mixed content, i.e. both text
      and children). The DTD also defines the valid attributes for all elements and
      the types of those attributes.

      The definition

      The W3C XML Recommendation (Tim Bray's annotated version of Rev1):

      • Declaring elements
      • Declaring attributes

        (Unfortunately) all this is inherited from the SGML world, so the syntax
        is ancient...

        Simple rules

        Writing DTDs can be done in many ways. The rules for building them can
        differ radically depending on whether you need something permanent or
        something that can evolve over time. Really complex DTDs like the DocBook
        ones are flexible but much harder to design. I will just focus on DTDs for
        formats with a fixed, simple structure. What follows is just a set of basic
        rules; it is neither exhaustive nor usable for complex DTD design.

        How to reference a DTD from a document:

        Assuming the top element of the document is spec and the DTD
        is placed in the file mydtd in the subdirectory
        dtds of the directory from which the document was loaded:

        <!DOCTYPE spec SYSTEM "dtds/mydtd">

        Notes:

        • The system string is actually a URI-Reference (as defined in RFC 2396), so
          you can use a full URL string indicating the location of your DTD on the
          Web. This is a really good thing to do if you want others to validate
          your document.
        • It is also possible to associate a PUBLIC identifier (a magic string) so
          that the DTD is looked up in catalogs on the client side without having
          to locate it on the Web.
        • A DTD contains a set of element and attribute declarations, but they
          don't define what the root of the document should be. The root is stated
          explicitly to the parser/validator by the name given first in the
          DOCTYPE declaration.
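
As an illustration of the second note, a DOCTYPE declaration carrying both a PUBLIC identifier and a system URL might look like the sketch below. The identifier string and the example.org URL are made up for this example and do not refer to a real DTD.

```xml
<?xml version="1.0"?>
<!-- Hypothetical public identifier and URL, for illustration only. -->
<!DOCTYPE spec PUBLIC "-//Example//DTD Spec 1.0//EN"
               "http://example.org/dtds/mydtd">
<spec>
  <!-- document content, as allowed by the DTD -->
</spec>
```

A parser set up with a catalog can map the public identifier to a local copy of the DTD; without one, it would typically fall back to the system URL.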

          Declaring elements:

          The following declares an element spec:

          <!ELEMENT spec (front, body, back?)>

          It also expresses that the spec element contains one front,
          one body and one optional back child element, in
          this order. An element and its content are declared together in a
          single declaration. Similarly, the following declares
          div1 elements:

          <!ELEMENT div1 (head, (p | list | note)*, div2?)>

          which means div1 contains one head, then a series of optional
          p, list and note elements, and then an
          optional div2. And last but not least, an element can contain
          text:

          <!ELEMENT b (#PCDATA)>

          b contains text. An element can also have mixed content (text and
          elements in no particular order):

          <!ELEMENT p (#PCDATA|a|ul|b|i|em)*>

          p can contain text or a, ul,
          b, i or em elements in no particular
          order.
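
Putting these declarations together, a minimal sketch of a complete document might look as follows. The DTD is placed in the internal subset so that the example is self-contained; the content models chosen here for front, body and back are assumptions made for this illustration, not part of the declarations above.

```xml
<?xml version="1.0"?>
<!DOCTYPE spec [
  <!-- spec: one front, one body, then an optional back, in this order -->
  <!ELEMENT spec  (front, body, back?)>
  <!-- Assumed content models, for illustration only -->
  <!ELEMENT front (#PCDATA)>
  <!ELEMENT body  (p+)>
  <!ELEMENT back  (#PCDATA)>
  <!-- p: mixed content, text and b elements in any order -->
  <!ELEMENT p     (#PCDATA | b)*>
  <!ELEMENT b     (#PCDATA)>
]>
<spec>
  <front>A title</front>
  <body>
    <p>Some <b>bold</b> text.</p>
  </body>
</spec>
```

The optional back element is simply left out. Moving front after body would leave the document well-formed but no longer valid.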

          Declaring attributes:

          Again, the declaration of attributes includes their content definition:

          <!ATTLIST termdef name CDATA #IMPLIED>

          means that the element termdef can have a name
          attribute which contains text (CDATA) and is optional
          (#IMPLIED). The attribute value can also be defined within a
          set:

          <!ATTLIST list type (bullets|ordered|glossary) "ordered">

          means that the list element has a type attribute with 3
          allowed values, "bullets", "ordered" or "glossary", which defaults to
          "ordered" if the attribute is not explicitly specified.

          The content type of an attribute can be text (CDATA),
          anchor/reference/references (ID/IDREF/IDREFS), entity(ies)
          (ENTITY/ENTITIES) or name(s) (NMTOKEN/NMTOKENS). The following
          defines that a chapter element can have an optional id attribute
          of type ID, usable for reference from attributes of type
          IDREF:

          <!ATTLIST chapter id ID #IMPLIED>

          The last value of an attribute definition can be #REQUIRED,
          meaning that the attribute has to be given, #IMPLIED,
          meaning that it is optional, or the default value (possibly prefixed by
          #FIXED if it is the only value allowed).

          Notes:

          • Usually the attributes pertaining to a given element are declared in a
            single expression, but it is just a convention adopted by a lot of DTD
            writers:

            <!ATTLIST termdef
                      id      ID      #REQUIRED
                      name    CDATA   #IMPLIED>

            The previous construct defines both id and
            name attributes for the element termdef.
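
The attribute types and defaults described above can be combined in one small document. The sketch below is only an illustration: the glossary and see elements and the lang attribute are hypothetical names introduced here, not part of any DTD mentioned in this page.

```xml
<?xml version="1.0"?>
<!DOCTYPE glossary [
  <!-- glossary, see and lang are hypothetical names for this example -->
  <!ELEMENT glossary (termdef+, see*)>
  <!ELEMENT termdef  (#PCDATA)>
  <!ELEMENT see      EMPTY>
  <!-- id must be given, name is optional, lang can only be "en" -->
  <!ATTLIST termdef
            id      ID      #REQUIRED
            name    CDATA   #IMPLIED
            lang    CDATA   #FIXED "en">
  <!-- ref must match the value of some ID attribute in the document -->
  <!ATTLIST see
            ref     IDREF   #REQUIRED>
]>
<glossary>
  <termdef id="t1" name="DTD">Document Type Definition</termdef>
  <see ref="t1"/>
</glossary>
```

Here lang is not written on termdef, so a validating parser supplies the fixed value "en". Changing ref to a value that matches no ID in the document would make the document invalid.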


            Some examples

            The directory test/valid/dtds/ in the libxml2 distribution
            contains some complex DTD examples. The example in the file
            test/valid/dia.xml shows an XML file where the simple DTD is
            directly included within the document.

            How to validate

            The simplest way is to use the xmllint program included with libxml. The
            --valid option turns on validation of the files given as input.
            For example, the following validates a copy of the first revision of the
            XML 1.0 specification:

            xmllint --valid --noout test/valid/REC-xml-19980210.xml

            The --noout option is used to disable output of the resulting tree.

            The --dtdvalid dtd option allows validation of the document(s)
            against a given DTD.
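
To see what validation catches, one can feed xmllint a document that is well-formed but not valid. In the following sketch (the content models and text are assumptions for illustration), front and body appear in the wrong order, so a plain parse should succeed while a run with --valid should report a validity error.

```xml
<?xml version="1.0"?>
<!DOCTYPE spec [
  <!-- The DTD requires front first, then body -->
  <!ELEMENT spec  (front, body)>
  <!ELEMENT front (#PCDATA)>
  <!ELEMENT body  (#PCDATA)>
]>
<spec>
  <!-- Well-formed, but the children are in the wrong order -->
  <body>Body comes first here</body>
  <front>Front comes second</front>
</spec>
```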

            Libxml2 exports an API to handle DTDs and validation; check the
            associated description.

            Other resources

            DTDs are as old as SGML, so there may be a number of examples online. I
            will just list one for now; other pointers are welcome:

            • XML-101 DTD

              I suggest looking at the examples found under test/valid/dtds and at any
              of the large number of books available on XML. The dia example in
              test/valid should be both simple and complete enough to allow you to
              build your own.

              Daniel Veillard

              </body></html>