General Structure of XML Documents - Page 4
December 2, 2002
Additional authors can be added later.
If we use an automated batch method for updates, we might use code similar to the following:
<?selectdoc type="nonfiction" subtype="book"
title="Web Development Methodologies" ?>
<?addauthor ?>
<author>
  <fname>bernard</fname>
  <mname/>
  <lname>rubble</lname>
</author>
The XML processing instruction (PI) <?selectdoc ... ?> would be interpreted as a request to retrieve a
particular document; the three attributes type, subtype, and title would jointly form a unique key.
(Strictly speaking these are pseudo-attributes, since XML leaves the content of a PI to the application.)
Once the document has been selected, the PI <?addauthor ?> would be interpreted as a request to add an
<author>...</author> block to the <authors>...</authors> block of the document.
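The batch update described above can be sketched in Python. This is only an illustration, not the book's implementation: the in-memory store keyed by (type, subtype, title), the <content> root that holds an <authors> block, and the function names read_instructions and apply_batch are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Pseudo-attributes inside a PI are plain text, so we pick them out ourselves.
PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

def read_instructions(batch_xml):
    """Return [(target, {pseudo-attribute: value}), ...] in document order."""
    found = []

    def on_pi(target, data):
        found.append((target, dict(PSEUDO_ATTR.findall(data))))

    parser = expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(batch_xml, True)
    return found

def apply_batch(store, batch_xml):
    """Apply selectdoc/addauthor to a hypothetical in-memory document store."""
    payload = ET.fromstring(batch_xml)  # root element of the batch: <author>
    doc = None
    for target, attrs in read_instructions(batch_xml):
        if target == "selectdoc":
            # type, subtype, and title jointly form the unique key
            doc = store[(attrs["type"], attrs["subtype"], attrs["title"])]
        elif target == "addauthor" and doc is not None:
            doc.find(".//authors").append(payload)
    return doc

# Assumed store layout: one parsed document per key.
store = {
    ("nonfiction", "book", "Web Development Methodologies"):
        ET.fromstring("<content><authors/></content>"),
}
batch = '''<?selectdoc type="nonfiction" subtype="book"
title="Web Development Methodologies" ?>
<?addauthor ?>
<author>
<fname>bernard</fname>
<mname/>
<lname>rubble</lname>
</author>'''
doc = apply_batch(store, batch)
```

The instructions are read with expat because ElementTree's default tree builder discards processing instructions; the <author> payload is then parsed separately as an ordinary element.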
The document's URL is defined next:
<page url="test.doc"/>
<body>
Some documents will have no URL associated with them, or their URL may not yet be defined.
When no URL is defined, use the empty form, <page/>.
To update the URL (either to define it or to change it), we might use the following code in batch mode:
<?selectdoc type="nonfiction" subtype="book"
title="Pro PHP Web Site Solutions" ?>
<?changeurl pageurl="__newurl__" ?>
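What a <?changeurl ?> request might do to the selected document can be sketched as follows. This is a rough illustration under assumptions: the helper names get_url and change_url are hypothetical, and the <page> element is assumed to sit somewhere below a <content> root.

```python
import xml.etree.ElementTree as ET

def get_url(doc):
    """Return the page URL, or None for the empty form <page/>."""
    page = doc.find(".//page")
    return page.get("url") if page is not None else None

def change_url(doc, new_url):
    """Define or replace the URL on the document's <page> element."""
    page = doc.find(".//page")
    if page is None:
        raise ValueError("document has no <page> element")
    page.set("url", new_url)

# A document whose URL is still undefined uses the empty form.
doc = ET.fromstring("<content><page/><body/></content>")
before = get_url(doc)        # None: no url attribute yet
change_url(doc, "test.doc")  # value taken from the pageurl pseudo-attribute
after = get_url(doc)
```

The same call covers both cases mentioned in the text: defining a URL that was absent and changing one that already exists.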
Finally, the <body> element is where the actual document structure is defined. Each unique combination of
content type and subtype will likely have a different XML tagset. Further, each combination will need a
custom XML parser built for it so that its documents can be manipulated.
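One way to organize per-combination processing is a small registry keyed by (type, subtype). The sketch below is an assumption, not the book's design: it shares one generic XML parser and swaps in a handler function per tagset rather than building a separate parser for each, and the names HANDLERS, handles, and process are hypothetical.

```python
import xml.etree.ElementTree as ET

HANDLERS = {}  # (type, subtype) -> function that understands that tagset

def handles(doc_type, subtype):
    """Decorator that registers a handler for one type/subtype combination."""
    def register(func):
        HANDLERS[(doc_type, subtype)] = func
        return func
    return register

@handles("nonfiction", "book")
def chapter_titles(body):
    # Knows the book tagset: <chap> elements carrying a title attribute.
    return [chap.get("title") for chap in body.iter("chap")]

def process(doc_type, subtype, body_xml):
    """Parse the <body> and hand it to the handler for its combination."""
    try:
        handler = HANDLERS[(doc_type, subtype)]
    except KeyError:
        raise LookupError(f"no handler for {doc_type}/{subtype}") from None
    return handler(ET.fromstring(body_xml))

titles = process(
    "nonfiction", "book",
    '<body><chap num="1" title="Introduction" style="chapter title"/></body>',
)
```

Adding support for a new combination then means registering one more function, without touching the dispatch code.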
In this example, we are defining a non-fiction book. In particular, we are defining a subset of the information
in the book that we are reading. For the first chapter, we define only the chapter number, title, and style
attributes (the style attribute names a font). For now, we shall leave out the content of the chapter:
<chap num="1" title="Introduction" style="chapter title"/>
In the listing above, "chapter title" represents the font used for chapter titles, not the actual text that
constitutes the title. For the second chapter, we shall define the chapter number, title, and style. We are also
defining sections in the chapter (and their level, title, and style). However, we are not defining any chapter or
section textual content. Notice that the sections are not numbered. Instead, we only define a level. For
example, a level 1 section is equivalent to the font style Heading 1. A level 2 section is equivalent to a
Heading 2, and is a sub-section of a level 1 section, and so on.
While the level attribute (lev) is not strictly necessary, it makes the markup much easier to parse.
The last of the ten design goals in the XML specification (http://www.w3.org/TR/2000/REC-xml-20001006)
states that "Terseness in XML markup is of minimal importance." Keeping this in mind, the markup below
adds explicit information that can be retrieved easily, to reduce document-parsing time:
<chap num="2" title="PHP Overview" style="chapter title">
  <sec lev="1" title="Introduction" style="Heading 1"/>
  <sec lev="1" title="PHP File Extensions" style="Heading 1"/>
  <sec lev="1" title="Comments" style="Heading 1"/>
  <sec lev="1" title="Identifiers + Variables" style="Heading 1"/>
  <sec lev="1" title="Variable Scope" style="Heading 1"/>
The <sec> elements above use the empty-element form of the tag, since they have no child elements. The next
section contains three level 2 sections, so the parent element needs separate start and end tags instead:
  <sec lev="1" title="Data Types + Casting" style="Heading 1">
    <sec lev="2" title="Miscellaneous Type-Related Functions" style="Heading 2"/>
    <sec lev="2" title="Converting Data Types" style="Heading 2">
      <sec lev="3" title="Casting Data Types" style="Heading 3"/>
    </sec>
    <sec lev="2" title="Arithmetic on Strings" style="Heading 2"/>
  </sec>
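To illustrate why an explicit lev attribute simplifies processing, here is a minimal Python sketch that prints an indented outline of the section just shown. It assumes only the sec, lev, and title names from the listing; the function name outline is hypothetical. Because each element carries its own level, a flat walk in document order is enough and no nesting depth has to be tracked.

```python
import xml.etree.ElementTree as ET

fragment = """
<sec lev="1" title="Data Types + Casting" style="Heading 1">
  <sec lev="2" title="Miscellaneous Type-Related Functions" style="Heading 2"/>
  <sec lev="2" title="Converting Data Types" style="Heading 2">
    <sec lev="3" title="Casting Data Types" style="Heading 3"/>
  </sec>
  <sec lev="2" title="Arithmetic on Strings" style="Heading 2"/>
</sec>
"""

def outline(root):
    """Indent each section title by its lev attribute, in document order."""
    return [
        "  " * (int(sec.get("lev")) - 1) + sec.get("title")
        for sec in root.iter("sec")  # iter() also visits root itself
    ]

lines = outline(ET.fromstring(fragment))
print("\n".join(lines))
```

Without lev, the same outline would need a recursive walk that counts how deeply each <sec> is nested.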
The remainder of the <body> block simply shows a few more examples of section definition:
  <sec lev="1" title="Expressions" style="Heading 1">
    <sec lev="2" title="Constants" style="Heading 2">
      <sec lev="3" title="User-defined Constants" style="Heading 3"/>
    </sec>
    <sec lev="2" title="Operators" style="Heading 2">
      <sec lev="3" title="String Operators" style="Heading 3"/>
      <sec lev="3" title="Arithmetic Operators" style="Heading 3"/>
      <sec lev="3" title="Comparison Operators" style="Heading 3"/>
      <sec lev="3" title="Logical Operators" style="Heading 3"/>
      <sec lev="3" title="Reference Operators" style="Heading 3"/>
      <sec lev="3" title="Bitwise Operators" style="Heading 3"/>
      <sec lev="3" title="Assignment Operators" style="Heading 3"/>
      <sec lev="3" title="Miscellaneous Operators" style="Heading 3"/>
    </sec>
    <sec lev="2" title="Operator Precedences" style="Heading 2"/>
  </sec>
  <sec lev="1" title="Statements" style="Heading 1">
    <sec lev="2" title="Assignments" style="Heading 2"/>
    <sec lev="2" title="Input/Output" style="Heading 2">
      <sec lev="3" title="Output" style="Heading 3">
        <sec lev="4" title="echo" style="Heading 4"/>
        <sec lev="4" title="print()" style="Heading 4"/>
        <sec lev="4" title="printf()" style="Heading 4"/>
        <sec lev="4" title="sprintf()" style="Heading 4"/>
      </sec>
      <sec lev="3" title="Output" style="Heading 3"/>
    </sec>
  </sec>
</chap>
</body>
</content>
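Since lev and style repeat information that the nesting already implies, it can be useful to check that they stay consistent. The following Python sketch is one possible check, assuming the convention described earlier (a level N section uses the style "Heading N" and is nested N levels below the chapter); the function name check_sections is hypothetical.

```python
import xml.etree.ElementTree as ET

def check_sections(parent, depth=1, problems=None):
    """Collect mismatches between lev, style, and actual nesting depth."""
    problems = [] if problems is None else problems
    for sec in parent.findall("sec"):  # direct children only
        lev, title = sec.get("lev"), sec.get("title")
        if lev != str(depth):
            problems.append(f"{title}: lev={lev} but nested at depth {depth}")
        if sec.get("style") != f"Heading {lev}":
            problems.append(
                f"{title}: style {sec.get('style')!r} does not match lev={lev}"
            )
        check_sections(sec, depth + 1, problems)
    return problems

good = ET.fromstring(
    '<chap num="2" title="PHP Overview" style="chapter title">'
    '<sec lev="1" title="Statements" style="Heading 1">'
    '<sec lev="2" title="Assignments" style="Heading 2"/>'
    '</sec></chap>'
)
bad = ET.fromstring(
    '<chap num="2" title="PHP Overview" style="chapter title">'
    '<sec lev="1" title="Statements" style="Heading 1">'
    '<sec lev="3" title="Output" style="Heading 3"/>'
    '</sec></chap>'
)
good_problems = check_sections(good)
bad_problems = check_sections(bad)
```

A check like this could run after each batch update so that redundant attributes never drift away from the structure they describe.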
Professional PHP4 Web Development Solutions