The Problem: HTML
May 22, 2000
Developers often complain that current browsers are bloated pieces
of software, and they are. To cope with the jungle of Web sites out
there, a desktop browser needs to be prepared to handle many
things, from sloppy or outdated HTML to Java to SSL to PDFs to VRML
and on and on and on. But the software that runs a cell phone or a
PDA has to cut a much leaner figure, for obvious reasons. Until
someone invents a tiny hard drive and a much more powerful battery,
handheld devices will be designed to do just a few tasks as
efficiently as possible.
For now, this means that many of these devices will basically
only be able to access content that has been formatted specifically
for that device, or at least for the same general type of device.
At this early stage some wireless service providers are building
"walled gardens", allowing subscribers to access only their own
proprietary content. But eventually, as standards like WAP (or
something else) and markup languages like WML become more
mainstream, access and content will naturally go their separate
ways.
However, it will still be necessary to format content for specific
types of devices. As we mentioned above, there is no way that things
like screen size and multimedia capability will be standardized for
all types of devices. And there's certainly no way that we
long-suffering Web developers are going to code a completely new
version of an Internet site every time some new wireless gizmo
comes out! Obviously, what's needed is a markup language that
allows the content of a site to be completely independent of its
formatting.
Of course, that's what SGML is all about, and ironically, not too
far from what HTML was originally meant to be. In theory at least,
HTML tags identify only the type of content (heading, paragraph,
etc.), leaving it up to the browser to decide exactly how the
content should be displayed. But as the Web moved beyond simple
text, HTML proved too limiting. Designing attractive and functional
pages requires more control over layout and formatting, so things
like the FONT tag were introduced, contrary to the original concept.
With HTML 4.0 and the introduction of Cascading Style Sheets (CSS),
the problem was largely solved, and the principle of separating
content from formatting was restored.
Although it is certainly possible to serve HTML wirelessly to a
handheld device, HTML is too complex and bulky to be practical for
many small devices. We need to be able to create custom subsets of
SGML, each of which will be readable by a very lean bit of software
(sometimes called a microbrowser) in a particular type of client
device.
Of course, wireless issues aren't the only reasons for wanting to
transcend HTML. HTML provides no proper way to index documents, as
attested by the masses of irrelevant, out-of-context pages that
show up when you use a search engine. Also, while HTML works well
for presenting documents, it can be a hindrance to running online
applications and other forms of user interactivity.
Ken Sall states the case concisely in his XML overview. The main
drawback to HTML is that it simply changes too slowly. To
accommodate new trends such as wireless, the World Wide Web
Consortium (W3C) has to hash out new specifications, which is an
orderly but slow process. And browser makers' unfortunate decisions
to introduce their own proprietary tags have made things much
worse.
So, the prescription for all our wireless and other woes is a
markup language that:
- allows content and formatting to be completely independent of one
  another.
- is truly extensible, allowing developers to add elements as
  needed, without resorting to proprietary extensions.
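
To make those two requirements concrete, here is a minimal sketch in
Python. It is an illustration only, not something from this article:
the element names (story, headline, para) and the two render functions
are invented, and the 20-character screen width is an arbitrary
assumption. The content is written once, in markup that says only what
each piece is, and each type of device gets its own formatting code.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document. The element names are made up for this
# sketch; they describe what the content is, not how it should look.
CONTENT = """\
<story>
  <headline>Wireless Web arrives</headline>
  <para>Handheld devices need lean markup.</para>
  <para>Desktop browsers can afford more.</para>
</story>
"""


def render_desktop(story):
    """Format the content as HTML for a full desktop browser."""
    lines = ["<h1>%s</h1>" % escape(story.findtext("headline"))]
    lines += ["<p>%s</p>" % escape(p.text) for p in story.findall("para")]
    return "\n".join(lines)


def render_handheld(story, width=20):
    """Format the same content as plain text for a small screen.

    The headline is capitalized and every line is cut to `width`
    characters (an assumed screen width, chosen only for illustration).
    """
    lines = [story.findtext("headline").upper()[:width]]
    lines += [p.text[:width] for p in story.findall("para")]
    return "\n".join(lines)


story = ET.fromstring(CONTENT)
print(render_desktop(story))
print(render_handheld(story))
```

Supporting a new gizmo here would mean writing one more render
function, not a new version of the content. And because the parser
accepts any well-formed element name, a developer could add, say, a
byline element to the content without waiting for anyone to approve a
new tag; renderers that don't know about it simply ignore it.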