Re: fwd:Fonts

Paul Prescod (papresco@calum.csclub.uwaterloo.ca)
Thu, 6 Jul 1995 10:02:13 -0400 (EDT)


I'm sorry about the length of this letter and I'm sorry to be a naysayer.
I think the intentions are right, and the people who are suggesting <TEXT>
and <C> are smart, but I feel that these elements are too open to abuse.

> In other words, <C> is meant to replace style sheets, <TEXT> is meant
> to introduce semantics.

I caught the discussion of <TEXT>, but not of <C>. I don't think
that we should be introducing tags to replace style sheets, however.
If I understand it right, <C> would be used for attaching styles to
text without implying any semantics.

> That it can also be used to add extra styles
> is just a coincidence. In that respect TEXT is redundant, since one
> could just as well use <EM CLASS=CITY> for that.

It only makes sense to define extra styles if they are attached to new
pseudo-elements. Attaching arbitrary styles to a single element is a
regression to Word for Windows style formatting, e.g. <C CLASS="UNDERLINE">,
<C CLASS="LARGE">, <C CLASS="EXTRA LARGE">. These are just Netscape
extensions with a more verbose syntax.

I also oppose <TEXT>, however. Any new element should be based on an old
one.

Think of it this way: what would be the easiest way to turn a Word for Windows
document into a "correct" HTML 3.0 document? Something like this:

<STYLE>
massive amounts of style sheet declarations here to precisely emulate the
Word for Windows environment
</STYLE>
<TEXT class="s23dfe2as">Text<TEXT class="s2ecfe232"> text</TEXT></TEXT>

We must not allow this. If we force them to use real HTML elements then
we can show users the output and explain to them why it is wrong:
"Look, it used an emphasis tag when you didn't really want emphasis. Look,
it used an address tag to enclose something that is not an address."

If we create a tag with no semantics it can be used anywhere without
ever being wrong. We must force authors to properly tag the semantics
of their document. We must force editor vendors to make that choice
explicit in their interfaces.

> I'm interested to see how Paul Prescod can use entities for the same
> function,

You don't need entities. You need more powerful style sheets. The situation
I promised to address is the one where the Economist bolds the first few
words of each sentence. The way to deal with that is with a tag
<P class="Economist-paragraph-style-bold-first-3-words">. It is then up to
the formatting engine to bold the first 3 words.
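
To make "up to the formatting engine" concrete, here is a rough sketch of
that step. It is written in Python purely for illustration; it is not
DSSSL or any W3C style sheet syntax, the function name is made up, and it
assumes the paragraph content is plain text with no nested markup.

```python
def bold_first_words(paragraph_text, count=3):
    """Wrap the first `count` whitespace-separated words in <B>...</B>.

    Hypothetical helper: a stand-in for what a formatting engine might do
    when it sees the Economist paragraph class. Assumes plain text only.
    """
    # Split off at most `count` leading words; any remainder stays intact
    # as the last list item.
    parts = paragraph_text.strip().split(None, count)
    if not parts:
        return ""
    head = " ".join(parts[:count])
    tail = parts[count] if len(parts) > count else ""
    if tail:
        return "<B>" + head + "</B> " + tail
    # Paragraphs of `count` words or fewer are bolded in full.
    return "<B>" + head + "</B>"


print(bold_first_words("The way to deal with that is with a tag"))
```

The point is that the document only carries the class name; the decision
to bold three words lives entirely in the engine.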

Do I really expect W3C style sheets to be that powerful? No. Just as
HTML is not the be-all and end-all of SGML DTDs, W3C style sheets are not
going to be the be-all and end-all of style sheets. Some things just won't
be possible.

If the Economist needs those first three words bolded so desperately, it
should encourage browser writers to support DSSSL. DSSSL has a full
programming language built in. It can easily find the first three words of
a paragraph and bold them. Yes, this is harder. Yes, really complex
layouts will require some form of HTML development engine. But it avoids
the problem I addressed above. <TEXT class="BLINK"> is semantically
identical to <BLINK>. We should exclude the former for all of the same
reasons we exclude the latter.

No, I'm not a dreamer. I know that ordinary tags can be abused. But at
least the abuse is blatant. If I use <TEXT class="Important"> where I
mean <EM>, who can say I am wrong? If I use
<EM class="address">papresco@undergrad.math.uwaterloo.ca</EM>, then I
am obviously wrong because I don't want that text emphasized, I just want
it to be an address.

> but I already have one important objection: the use of
> entities requires the UA to parse the document subset. I like to keep
> UAs simple, so that it is still possible for an individual to write a
> useful Web browser for HTML and (a subset of) SGML.

There are no fewer than three (!) free SGML parsers on the Web. The trend
should be towards encouraging people to use these, not towards moving away
from full SGML compliance.

If parsing is too slow then the _server_ should optimize (and probably cache)
pre-normalized (even BINARY!) versions of the document. That normalization
could include entity replacement. Content negotiation can be used to
determine whether the browser can handle (a) raw HTML (i.e. using SGML
features), (b) optimized HTML, or (c) only normalized "fat" HTML.
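
A minimal sketch of how a server might make that choice from the Accept
header, again in Python for illustration only. The two media type names
are invented for this example (nothing like them is registered), the
server's preference order is my assumption, and q-values are ignored to
keep it short.

```python
# Hypothetical media types, listed in the order the server would rather
# send them. Neither name is a registered type.
SERVER_PREFERENCE = [
    ("application/x-html-optimized", "optimized"),  # (b) pre-parsed, cached form
    ("text/x-html-sgml", "raw"),                    # (a) HTML using SGML features
]


def parse_accept(header):
    """Return the set of media types in an Accept header, dropping parameters."""
    types = set()
    for item in header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type:
            types.add(media_type)
    return types


def choose_variant(accept_header):
    """Pick which stored version of the document to send."""
    accepted = parse_accept(accept_header)
    for media_type, variant in SERVER_PREFERENCE:
        if media_type in accepted:
            return variant
    # (c) Browsers that advertise neither type get normalized "fat" HTML.
    return "normalized"


print(choose_variant("text/x-html-sgml, application/x-html-optimized;q=0.5"))
print(choose_variant("text/html"))
```

A simple browser that knows nothing about any of this still gets a
document it can display, which is the whole idea.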

> See
> <http://www.let.rug.nl/~bert/Stylesheets/SGML-Lite.html>; I'll post
> more about it later.

I support your idea for an SGML-lite. I just don't think authors
should be restricted to it. Computers are good at "normalizing."
If we take your idea to the extreme, we can convert "real HTML" documents
to a highly efficient compressed binary format.

I think you should get in touch with SGML Open. I'm sure they've
thought about "SGML-lite" a great deal.

Paul Prescod