HTML+

lamport@src.dec.com
Fri, 2 Sep 1994 17:20-0400


[This message has been redirected:
Forward to the relevant working group on HTML.
timbl, dlt, jcma have been removed;
Dave Raggett <dsr@hplb.hpl.hp.com>, www-html@www0.cern.ch have been added.]

Bill Weihl gave me your names as people who might be interested in
extending HTML.

I would like to see a universal, open standard for the interchange of
hypertext scientific documents. One route to this is through
enhancement of the existing mechanism for the exchange of hypertext on
the Web. This requires an enhanced version of HTML. This note is a
brief exposition of what I regard as the sensible options for such an
enhancement.

I see three possible basic approaches to enhancing HTML:

1. Make it a markup language, in which the HTML source describes the
logical content of the document, leaving formatting decisions to
the viewer.

2. Make it a formatting/typesetting language that tells the viewer
how to display the document.

3. Make it a language for organizing the pieces of the document,
where the actual content of the document is contained in files
in other formats (e.g., Postscript), possibly being displayed by
external viewers.

I will discuss these options below. But first, I need to explain the
difference between markup and formatting. Some people seem to think
that, if they use SGML syntax, they are doing markup. In fact, from
my knowledge of SGML, it seems impossible to do markup for scientific
documents using SGML. Consider the simple mathematical formula written
in TeX as $x_i^2$ (x sub i superscript 2). The document

http://info.cern.ch/hypertext/WWW/MarkUp/HTMLPlus/htmlplus_1.html

which is a proposal for an extension of HTML called HTML+, recommends
the following SGML representation for this formula

<math>
x <sub> i </sub> <sup> 2 </sup>
</math>

If this is supposed to be a markup language, then the <sub> and <sup>
tags should delimit logical entities. Most any scientist or
mathematician can tell you what those logical entities are: <sub> is
the operation of array indexing, and <sup> is the operation of
exponentiation. So, this formula would be interpreted as the logical
structure

EXPONENTIATE(INDEX(x, i), 2)

On the other hand, if this is a formatting/typesetting language, then
<sub> and <sup> are the typesetting directives: place the item a bit
lower/higher and set it in slightly smaller type (unless that would
make the type too small to be readable).

There is an easy way to tell whether <sub> and <sup> represent markup
tags or typesetting instructions. Fortran has simple conventions for
representing exponentiation and array indexing. If <sub> and <sup>
represent logical tags, then anyone who knows Fortran would have no
problem understanding the formula above if it were displayed

(x[i]) ** 2

Now, is the HTML+ document proposing a markup or a typesetting
language? The answer is clear. The document proposes that the
formula $\int_a^b f$ (the definite integral from a to b of f)
be written something like

<math>
<integral-sign> <sub> a </sub> <sup> b </sup> f
</math>

No one would understand what was meant if this were displayed as

((<integral-sign>[a]) ** b) f

Thus, the HTML+ proposal describes a typesetting language, not a
markup language.
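
The test above can be tried mechanically. What follows is a minimal
Python sketch, not anything from the HTML+ proposal: the names Sub, Sup
and as_logical are hypothetical, and the nesting of the tags into a
tree is an assumption. It reads <sub> as array indexing and <sup> as
exponentiation and prints the Fortran-style form of both formulas.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Sub:
    """<sub> read as a logical tag: array indexing."""
    base: "Node"
    index: "Node"


@dataclass(frozen=True)
class Sup:
    """<sup> read as a logical tag: exponentiation."""
    base: "Node"
    exponent: "Node"


# A formula is either a plain symbol or a Sub/Sup node (hypothetical model).
Node = Union[str, Sub, Sup]


def as_logical(node: Node) -> str:
    """Render a formula using Fortran-like indexing and exponentiation."""
    if isinstance(node, str):
        return node
    if isinstance(node, Sub):
        return f"{as_logical(node.base)}[{as_logical(node.index)}]"
    return f"({as_logical(node.base)}) ** {as_logical(node.exponent)}"


# x <sub> i </sub> <sup> 2 </sup>
x_i_squared = Sup(Sub("x", "i"), "2")
# <integral-sign> <sub> a </sub> <sup> b </sup>, followed by f
integral = Sup(Sub("<integral-sign>", "a"), "b")

print(as_logical(x_i_squared))
print(f"({as_logical(integral)}) f")
```

The first line printed is a sensible reading of the formula; the second
is the meaningless expression shown above, which is what one gets when
typesetting directives are treated as logical structure.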

I will now discuss the three options.

1. HTML as a Markup Language

Markup languages are attractive. They describe the logical structure
of the document, which makes it much easier to use tools to manipulate
the document. For example, it's hard to search for instances of a
particular formula if the representation of the formula includes
typesetting instructions--which might differ depending on whether the
formula appeared in the running text or in a display.

Markup languages have one major problem for use with scientific
documents: there are at least hundreds and probably thousands of
different logical structures that might appear in a document. It is
impractical to provide tags for all of them. Moreover, a document may
introduce a brand new logical structure. (In some fields, virtually
every document introduces new structures.) Thus, a markup language
for scientific documents has to allow logical structures to be defined
for the particular document, which means the definitions must be
part of the document.

I don't think SGML provides such a definition facility. (My knowledge
of SGML is limited--someone please correct me if I'm wrong.)
Therefore, one is forced to write one structure (for example, a
definite integral) by using different logical structures (for example,
array indexing and exponentiation) that the author hopes will cause
the final document to look right. In other words, the author must try
to do typesetting with a very poor selection of typesetting commands,
which may behave unpredictably.

Ironically, TeX, which is explicitly a typesetting language, makes
it possible in principle to do markup. With sufficient ingenuity
and discipline, it is possible to define markup commands
like \definiteintegral in a preamble, and use only those markup
commands and no explicit typesetting commands in the body of the
document. Of course, TeX does not encourage such use.
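
As a rough illustration of that discipline, a LaTeX document might
confine all typesetting decisions to the preamble and use only logical
commands in the body. Only \definiteintegral is named in the text; its
argument order and the commands \indexed and \power are assumptions
made for this sketch.

```latex
\documentclass{article}

% Preamble: every typesetting decision lives here.
% \definiteintegral{lower}{upper}{integrand}
\newcommand{\definiteintegral}[3]{\int_{#1}^{#2} #3}
% \indexed{array}{index}: array indexing (hypothetical name)
\newcommand{\indexed}[2]{#1_{#2}}
% \power{base}{exponent}: exponentiation (hypothetical name)
\newcommand{\power}[2]{{#1}^{#2}}

\begin{document}
% Body: logical commands only, no explicit sub- or superscripts.
The integral $\definiteintegral{a}{b}{f}$ and the square
$\power{\indexed{x}{i}}{2}$ use the same typesetting primitives
but remain distinct logical structures.
\end{document}
```

A tool searching for definite integrals could then look for
\definiteintegral rather than guessing from subscripts and superscripts.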

If enhanced HTML is to be a markup language, it must include some
feature for defining new structures for the particular document. The
definition must include typesetting instructions. So, the markup
language would have to include an associated typesetting language.
Designing such a markup/typesetting language and implementing a viewer
for it would be a nice multi-person, multi-year project. I doubt that
any organization significantly smaller than Microsoft would have the
resources to attempt it.

2. HTML as a Formatting Language

Building a new formatting language is also a formidable task.
However, there is no need to start from scratch. There already is a
widely used, reasonably adequate language: TeX. The easiest approach
is to use standard TeX plus some \special conventions for dealing
with the things that TeX doesn't handle--links, video, etc.

The problem with standard TeX is that it's not incremental. You can't
run TeX just on the page you want to display next. One can place
restrictions on how TeX is used that would make this possible--for
example, all global declarations must appear in a special place. Such
restrictions would be unnatural to a Plain TeX user, but would be
quite natural to a LaTeX user. If this route is chosen, I recommend
that the problem of enhancing HTML be merged with the LaTeX3 project.

A more ambitious plan is to design a new language in such a way that
most of TeX's typesetting engine can be used to display the output,
but in which the input has more of a markup flavor. A major goal of
this plan would be to integrate the viewer and the document editor, so
the user would have something more "WYSIWYG" when creating a document.
This would fit in with what I call LaTeX4, a long-term successor to
conventional TeX/LaTeX.

3. HTML as an Image Organizer

In this approach, one gives up on any thought of a standard
representation of a document and tries to take advantage of the myriad
of existing formats and viewers: gif, Postscript, dvi, Word, and so on.
HTML would provide the sort of primitive facilities it now has for
producing text, but it would greatly expand the ability to include and
interact
with images produced by other programs. There are two complementary
approaches--one or both could be used.

3.1. Inserting active images in the displayed document.

Currently, HTML has a way of specifying an image that is to appear in
the document. Mosaic can deal only with gif images, but there is no
conceptual difficulty in extending this to other formats: Postscript,
dvi, etc. Enhancements are needed in two areas:

* Control of image placement and size. The document has to be able to
specify fairly precisely where the images are to go and at what
magnification--possibly as a function of screen size.

* Specification of active areas. Right now, adding active areas
to gif files is a kludgy business that requires registering
with the server, so mouse clicks wind up having to be communicated
halfway around the world. Enhanced HTML must make it easy to attach
actions to regions of the displayed image.
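
One way to picture the second enhancement is a viewer that resolves
clicks locally from region descriptions carried with the document,
with no round trip to the server. The Python sketch below is only an
illustration under stated assumptions: the ActiveArea record, the
rectangular regions, the file names, and the resolve_click function
are all hypothetical and not part of any HTML proposal.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ActiveArea:
    """A rectangular region of an image, in image coordinates (assumed shape)."""
    left: float
    top: float
    right: float
    bottom: float
    target: str  # what to open when the region is clicked

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def resolve_click(areas: Sequence[ActiveArea], x: float, y: float,
                  magnification: float = 1.0) -> Optional[str]:
    """Map a click in display coordinates to a target, entirely in the viewer."""
    # Undo the display magnification to get back to image coordinates.
    image_x, image_y = x / magnification, y / magnification
    for area in areas:
        if area.contains(image_x, image_y):
            return area.target
    return None


# Hypothetical regions for a 100 x 100 image.
areas = [
    ActiveArea(0, 0, 100, 50, "figure1.ps"),
    ActiveArea(0, 50, 100, 100, "appendix.dvi"),
]

print(resolve_click(areas, 10, 10))
print(resolve_click(areas, 20, 120, magnification=2.0))
print(resolve_click(areas, 200, 10))
```

Because the regions are stored in image coordinates, the same
description keeps working when the image is displayed at a different
magnification.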

3.2. Intelligent interaction with external viewers.

Currently, external viewers have no way of communicating back to the
HTML viewer (Mosaic). There should be a standard method of doing
this, probably through information in the HTML source for the
document. This information would allow the HTML viewer to act as
a manager of different external viewers--seeing that the right
thing happens if the user clicks in the Postscript viewer on a link
to a dvi document.
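
The manager role described above might look something like the
following Python sketch. Everything in it is an assumption made for
illustration: the ViewerManager class, choosing a viewer by file
extension (the text suggests the information would come from the HTML
source instead), and the stand-in viewers that merely record what they
were asked to open.

```python
from typing import Callable, Dict, List


class ViewerManager:
    """Hypothetical HTML viewer acting as a manager of external viewers."""

    def __init__(self) -> None:
        self._viewers: Dict[str, Callable[[str], None]] = {}

    def register(self, fmt: str, open_document: Callable[[str], None]) -> None:
        """Associate a document format with the viewer that can display it."""
        self._viewers[fmt] = open_document

    def follow_link(self, target: str) -> str:
        """Route a link reported by any viewer to the viewer for its format."""
        # Assumption: the format is taken from the file extension.
        fmt = target.rsplit(".", 1)[-1].lower() if "." in target else ""
        viewer = self._viewers.get(fmt)
        if viewer is None:
            raise LookupError(f"no viewer registered for {target!r}")
        viewer(target)
        return fmt


# Stand-in viewers that only record the documents they are asked to open.
opened: List[str] = []
manager = ViewerManager()
manager.register("ps", lambda doc: opened.append(f"postscript viewer: {doc}"))
manager.register("dvi", lambda doc: opened.append(f"dvi viewer: {doc}"))

# The user clicks, inside the Postscript viewer, on a link to a dvi document;
# the Postscript viewer reports the link back and the manager dispatches it.
manager.follow_link("paper.dvi")
print(opened)
```

The point of the design is that no external viewer needs to know about
any other; each reports links back to one place.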

Leslie Lamport