Rant on HTML extensions (Was: Ubiquity of Netscape (Mozilla Eats the World))

mwm@contessa.phone.net (Mike Meyer)
Wed, 18 Jan 95 12:22:06 PST
Missionaria Phonibalonica
Posted to: comp.infosystems.www.misc, comp.text.sgml

In <3fh7gd$psj@mindcrime.interstate.net>, rdeaver@interstate.net (Rex Deaver) wrote:
> Actually, I applaud efforts to bring HTML into line with standard DTP
> practices...

Would you also applaud efforts to bring flight simulators into line
with standard DTP practices? Those aren't any further from a DTP
application than HTML.

> like it or not, are going to become the majority of Web users in a
> matter of months.  That market is going to drive development and
> Microsoft would *love* to own the standard...which they will, if the
> HTML specification isn't improved to provide the flexibility that
> market will demand, is in fact demanding now.

Microsoft will also own the standard if we let proprietary extensions
by the dominant web browser dictate what HTML is.

Yes, HTML needs better ways to indicate how a document should be
presented. No, we don't need ways for the author to have absolute
control over how a document should be presented, because that control
is at best an illusion that deceives only the author, and at worst an
illusion that will cause useful information to be lost as the document
is translated into HTML from whatever it exists in before, even if
that is only the author's head.

We're already seeing that. Someone pointed out that with the Mozilla
extensions people are tempted to use <center> & <font> in lieu of the
existing <h?> tags. This means that tools that index headers, or build
tables of contents, or whatever, don't work properly. Another someone
suggested - and it looked like a serious suggestion - that the
solution was to only index on title, where markup questions don't come
into play.
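
A minimal sketch of the indexing problem, assuming a table-of-contents
tool that only looks at <h1> through <h6>. The class HeadingIndexer, the
function index_headings, and both sample pages are hypothetical and made
up for illustration; real indexers will differ.

```python
from html.parser import HTMLParser


class HeadingIndexer(HTMLParser):
    """Collect the text of <h1>..<h6> elements, as a simple TOC builder might."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.entries = []   # list of (level, heading text)
        self._level = None  # level of the heading being read, if any
        self._buf = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> and <h1> both match.
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._buf = []

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            self.entries.append((self._level, "".join(self._buf).strip()))
            self._level = None


def index_headings(html):
    parser = HeadingIndexer()
    parser.feed(html)
    parser.close()
    return parser.entries


# The same page twice: once with structural markup, once with the
# presentational tags standing in for headings.
structural = "<h1>Widgets</h1><p>Intro.</p><h2>Pricing</h2><p>Cheap.</p>"
presentational = (
    "<center><font size=6>Widgets</font></center><p>Intro.</p>"
    "<font size=5>Pricing</font><p>Cheap.</p>"
)

print(index_headings(structural))      # two entries
print(index_headings(presentational))  # nothing to index
```

Both pages may look much the same in a browser that understands
<center> and <font>, but only the first gives the indexer anything to
work with.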

I can only conclude that this person doesn't really care how many
people see their document, as otherwise they wouldn't suggest
disabling useful ways of finding it. If you're going to index on title
only, you might as well use RTF or LaTeX, and put the title in a
separate file (there are web servers that will support this). That
gives you absolute control over the format and the indexing facilities
you're willing to settle for.

What we need are well-thought-out enhancements to HTML that preserve
the ability to display it on a wide variety of platforms, from Unix
boxes with 27" 32-bit-deep monitors to Newtons to your local pay
phone; that are designed to make it easy to write HTML that displays
well on platforms that can't display those enhancements or on browsers
that don't understand them, or better yet, hard to write HTML that
doesn't do that; and, most importantly, that take the desires of
ALL the potential users into account - those that want to translate it
to braille, or read it over a phone, or use it as part of a larger
data system need to be included, not just those who want it as part of
a multimedia system (even though those may be the vast majority).

HTML 1.0 fell down on the third point, because most of the current
users weren't aware that it was being done and probably would have
been annoyed at being bothered about an obscure research project. HTML
2.0 (basically, "fixed" versions of the Mosaic extensions) fell down
on all three. HTML 3.0 is still being developed, and corrects many of
the problems with HTML 2.0, as well as providing new extensions that
generally meet all three criteria. If you expect me to say that the
Netscape extensions do anything except fail on all three criteria,
you're not paying attention.

Clearly, almost nobody was consulted on the Mosaic and Netscape
extensions. They were born in finished form, complete with
deformities, from the forehead of Jove (or whoever). Do they preserve
the ability to be displayed on a wide variety of platforms? Again,
clearly not. They extend HTML in ways that some platforms can't deal
with. They can't be blamed for that. Could they be better done, so
that more platforms could display them? Some yes, some no. <Font>, for
instance, could be replaced by the HTML 3.0 <big> and <small>, which
gets more flexibility for both the browser author and the HTML author.
For instance, VT100s and the like have a chance of displaying them by
using their very limited set of fonts. The same might be done for
<font>, but there's no indication in the <font> documentation about
which of the seven fonts should be used. Finally, and critically, do
they make it hard to write documents that aren't portable between
browsers, or at least easy to write documents that are? Again, the
answer is no. <CENTER> and <HR> lose line break information. So you
have to make sure those tags occur with other tags that generate line
breaks, or your documents will lose information in browsers that don't
understand those tags.
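
The line-break problem can be sketched with a toy renderer. This is an
illustration only: it assumes a browser skips tags it does not know and
keeps the text between them, and it treats every known tag as a line
break. ToyRenderer, render, OLD_BROWSER, NEW_BROWSER, and the sample
markup are all hypothetical names invented for the sketch.

```python
from html.parser import HTMLParser


class ToyRenderer(HTMLParser):
    """Emit text, starting a new line at every tag this 'browser' knows."""

    def __init__(self, known_break_tags):
        super().__init__()
        self.known = set(known_break_tags)
        self.out = []

    def _maybe_break(self, tag):
        # Unknown tags are silently skipped; their text is still shown.
        if tag in self.known:
            self.out.append("\n")

    def handle_starttag(self, tag, attrs):
        self._maybe_break(tag)

    def handle_endtag(self, tag):
        self._maybe_break(tag)

    def handle_data(self, data):
        self.out.append(data)


def render(html, known_break_tags):
    renderer = ToyRenderer(known_break_tags)
    renderer.feed(html)
    renderer.close()
    text = "".join(renderer.out)
    # Drop empty lines so doubled breaks do not matter for the comparison.
    return [line.strip() for line in text.split("\n") if line.strip()]


OLD_BROWSER = {"p", "br", "h1", "h2", "h3"}    # does not know <center>
NEW_BROWSER = OLD_BROWSER | {"center", "hr"}   # understands the extensions

# <center> is the only thing separating the three lines here.
fragile = "Acme Widgets<center>Sale ends Friday</center>Call us today"
# The same text, with <p> carrying the breaks as well.
robust = "Acme Widgets<p><center>Sale ends Friday</center><p>Call us today"

for page in (fragile, robust):
    print(render(page, NEW_BROWSER))
    print(render(page, OLD_BROWSER))
```

In the fragile version the older browser runs all three lines together;
in the robust version both browsers show three separate lines, because
the breaks no longer depend on the extension tag.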

One suggestion was that a standard + lots of proprietary extensions is
a good thing, as this gives you lots of flexibility. The history of
Unix shows that this person is right; you get lots of flexibility.
Almost any Unix system lets you do things that you can't do on a
standard Unix, and you can choose from almost as many ways as there are
Unix vendors to do it. If you want your software to work on all of
them, you have to either stick to the standard (well, the subset of
it that the vendors didn't tweak in adding their extensions), or write
code with conditional compilation for all those features. Already, the
HTML world is seeing all three choices - people writing pages
specifically for one browser, people writing in a subset of the
standard that works on most browsers, and people checking the browser
type to figure out which page to send. It's been nearly 20 years since
the initial split in the Unix market, and it appears to be mostly
recovered; except that vendors are already starting the same game
again with the new standard. Personally, I'd rather not go through the
same thing with the WWW; I'd like to have a web that interoperated for
all documents before the year 2020, without going to a standard
dictated by the largest company writing browsers.
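
The "check the browser type" approach mentioned above is the web's
version of conditional compilation. A rough sketch, assuming the server
can see the browser's User-Agent string: the function choose_page, the
VARIANTS table, the file names, and the sample User-Agent strings are
all hypothetical and chosen only for illustration.

```python
def choose_page(user_agent, variants, fallback):
    """Return the page variant whose token prefixes the User-Agent string."""
    ua = user_agent.lower()
    for token, page in variants:
        if ua.startswith(token.lower()):
            return page
    # Anything unrecognized gets the plain, standard-subset page.
    return fallback


# One entry per browser the author has bothered to write a page for.
# Every new browser or new extension means another entry and another
# copy of the document to maintain.
VARIANTS = [
    ("Mozilla/", "index-netscape.html"),
    ("NCSA Mosaic", "index-mosaic.html"),
]
FALLBACK = "index-plain.html"

for agent in ("Mozilla/1.0N (Windows)", "Lynx/2.3 BETA libwww/2.14"):
    print(agent, "->", choose_page(agent, VARIANTS, FALLBACK))
```

The fallback is the page written in the portable subset, so the
per-browser copies only add upkeep on top of a document that already has
to exist.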

	<mike