|
However cool the idea of escaping the limitations of
a basic tag set (like HTML) sounds, it isn't even close
to the best thing about XML.
The real power of XML comes from the fact that with XML, not only
can you define your own set of tags, but the rules specified by
those tags need not be limited to formatting rules. XML allows you
to define all sorts of tags with all sorts of rules, such as tags
representing business rules or tags representing data description
or data relationships.
Consider again the case of the contact list in SCLML.
Using standard HTML, a developer might use something like
the following:
<UL>
<LI>Gunther Birznieks
<UL>
<LI>Client ID: 001
<LI>Company: Bob's Fish Store
<LI>Email: gunther@bobsfishstore.com
<LI>Phone: 662-9999
<LI>Street Address: 1234 4th St.
<LI>City: New York
<LI>State: New York
<LI>Zip: 10024
</UL>
<LI>Susan Czigany
<UL>
<LI>Client ID: 002
<LI>Company: Netscape
<LI>Email: susan@eudora.org
<LI>Phone: 555-1234
<LI>Street Address: 9876 Hazen Blvd.
<LI>City: San Jose
<LI>State: California
<LI>Zip: 90034
</UL>
</UL>
While this may be an acceptable way to store and display
your data, it is hardly the most efficient or powerful. As you are
probably aware, there are many potential problems associated with
marking up your data using HTML. Three particularly serious
problems come to mind:
- The GUI is embedded in the data. What happens if
you decide that you like a table-based presentation better than a
list-based presentation? In order to change to a table-based
presentation, you must recode all your HTML! This could mean
editing many pages.
- Searching for information in the data is tough.
How would you get a quick list of only the clients in California?
Certainly, some type of script would be necessary. But how
would that script work? It would probably have to search
through the file word for word looking for the string
"California". And even if it found matches, it would have no
way of knowing that California might have a relationship
to "New York" - that they are both states. The relationships
between pieces of data, which are crucial to powerful
searching, are simply lost.
- The data is tied to the logic and language of HTML.
What happens if you want to present your data in a
Java applet? Unfortunately, your Java applet would have to
parse the HTML document, strip out the tags, and
reformat the data. Non-HTML processing applications should
not be burdened with extraneous work.
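The search problem described above can be seen in a few lines of code. The following is only a minimal sketch in Python, assuming the HTML list has already been split into lines; the names `html_lines` and `naive_search` are made up for this illustration.

```python
# A few lines from the HTML contact list above, one <LI> item per entry.
html_lines = [
    "<LI>Gunther Birznieks",
    "<LI>City: New York",
    "<LI>State: New York",
    "<LI>Susan Czigany",
    "<LI>City: San Jose",
    "<LI>State: California",
]


def naive_search(lines, term):
    """Return every line that contains the term (plain substring match)."""
    return [line for line in lines if term in line]


hits = naive_search(html_lines, "New York")
print(hits)
# prints ['<LI>City: New York', '<LI>State: New York']
```

Both the city and the state match, and nothing in the result says which client either line belongs to. The script could look for the "State:" label, but that label is only display text, so the search breaks as soon as someone rewords it.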
XML solves these and similar problems. In XML, the
same page would look like the following:
<CLIENT>
<NAME>Gunther Birznieks</NAME>
<ID>001</ID>
<COMPANY>Bob's Fish Store</COMPANY>
<EMAIL>gunther@bobsfishstore.com</EMAIL>
<PHONE>662-9999</PHONE>
<STREET>1234 4th St.</STREET>
<CITY>New York</CITY>
<STATE>New York</STATE>
<ZIP>10024</ZIP>
</CLIENT>
<CLIENT>
<NAME>Susan Czigany</NAME>
<ID>002</ID>
<COMPANY>Netscape</COMPANY>
<EMAIL>susan@eudora.org</EMAIL>
<PHONE>555-1234</PHONE>
<STREET>9876 Hazen Blvd.</STREET>
<CITY>San Jose</CITY>
<STATE>California</STATE>
<ZIP>90034</ZIP>
</CLIENT>
As you can see, custom tags are used to bring meaning to
the data being displayed. When stored this way, data
becomes extremely portable because it carries with it its
description rather than its display. Display is "extracted"
from the data and, as we will see later, incorporated into a
"style sheet".
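To give a feel for what "portable" means here, the sketch below reads the client records with Python's standard `xml.etree.ElementTree` module. It is an illustration under one assumption: the two records are wrapped in a `<CLIENTS>` element, a name invented for this example, because a parser expects a single root element.

```python
import xml.etree.ElementTree as ET

# The client records from above, wrapped in a hypothetical <CLIENTS> root.
xml_text = """<CLIENTS>
<CLIENT>
<NAME>Gunther Birznieks</NAME>
<ID>001</ID>
<COMPANY>Bob's Fish Store</COMPANY>
<EMAIL>gunther@bobsfishstore.com</EMAIL>
<PHONE>662-9999</PHONE>
<STREET>1234 4th St.</STREET>
<CITY>New York</CITY>
<STATE>New York</STATE>
<ZIP>10024</ZIP>
</CLIENT>
<CLIENT>
<NAME>Susan Czigany</NAME>
<ID>002</ID>
<COMPANY>Netscape</COMPANY>
<EMAIL>susan@eudora.org</EMAIL>
<PHONE>555-1234</PHONE>
<STREET>9876 Hazen Blvd.</STREET>
<CITY>San Jose</CITY>
<STATE>California</STATE>
<ZIP>90034</ZIP>
</CLIENT>
</CLIENTS>"""

root = ET.fromstring(xml_text)

# Each CLIENT becomes a dictionary keyed by tag name, so the tags
# themselves tell the program what every value means.
clients = [
    {field.tag: field.text for field in client}
    for client in root.findall("CLIENT")
]

for record in clients:
    print(record["NAME"], "-", record["CITY"], "/", record["STATE"])
```

No tag stripping or label guessing is needed: the program asks for `NAME` or `STATE` by name, and the same records could just as easily feed an applet, a report, or a database load.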
Let's consider some of the benefits.
- With XML, the GUI is extracted. Thus, changes to display
do not require futzing with the data. Instead, a separate
style sheet will specify a table display or a list display.
- Searching the data is easy and efficient. Search engines
can simply parse the description-bearing tags rather than
muddling in the data. Tags provide the search engines with
the intelligence they lack.
- Complex relationships like trees and inheritance can be
communicated.
- The code is much more legible to a person coming into the
environment with no prior knowledge. In the above example,
it is obvious that <ID>002</ID> represents
an ID whereas <LI>002 might not. XML is self-describing.
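The first two benefits can be sketched together. The Python example below is hedged in the same way as any illustration: the `<CLIENTS>` wrapper and the function names `clients_in_state`, `as_list`, and `as_table` are assumptions made for this example, the records are cut down to three fields to keep it short, and the two rendering functions merely stand in for what a real style sheet would do.

```python
import xml.etree.ElementTree as ET

# Shortened client records, wrapped in a hypothetical <CLIENTS> root.
xml_text = """<CLIENTS>
<CLIENT><NAME>Gunther Birznieks</NAME><ID>001</ID><STATE>New York</STATE></CLIENT>
<CLIENT><NAME>Susan Czigany</NAME><ID>002</ID><STATE>California</STATE></CLIENT>
</CLIENTS>"""

root = ET.fromstring(xml_text)


def clients_in_state(root, state):
    """Search by tag: only the STATE element is compared, never CITY."""
    return [c for c in root.findall("CLIENT") if c.findtext("STATE") == state]


def as_list(clients):
    """One possible display: an HTML list of names."""
    items = "".join(f"<LI>{c.findtext('NAME')}" for c in clients)
    return f"<UL>{items}</UL>"


def as_table(clients):
    """Another display of the very same data: an HTML table."""
    rows = "".join(
        f"<TR><TD>{c.findtext('ID')}</TD><TD>{c.findtext('NAME')}</TD></TR>"
        for c in clients
    )
    return f"<TABLE>{rows}</TABLE>"


california = clients_in_state(root, "California")
print(as_list(california))
# prints <UL><LI>Susan Czigany</UL>
print(as_table(california))
# prints <TABLE><TR><TD>002</TD><TD>Susan Czigany</TD></TR></TABLE>
```

Searching for "New York" with `clients_in_state` returns only the client whose state is New York, regardless of any city with the same name. Switching from a list to a table means calling a different function; the data is not touched.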
|