Re: HTML+ and printed books

Christopher J. McRae (mcrae@lib.ucsf.edu)
Wed, 19 May 93 16:23:11 MDT


Dave Raggett writes:
| Perhaps we need some structuring elements which determine how a
| group of related documents can be sequenced into a form suitable
| for printing, along with a way of automatically generating a table
| of contents, and an index.

In response, Daniel Kehoe writes:
| From this point of view, it seems a bad idea to include "structuring
| elements" inside documents that tie them together for printing (if
| that's what you were suggesting),

I agree that "structuring elements" specific to printing are a bad idea.
They seem to be just another way of sneaking presentation information into
the DTD, something which many seem to agree we should avoid.

There are many different possible approaches to searching the web and
extracting subsets of it for various purposes. As architects of the hypertext
infrastructure, our job is to provide an information architecture rich enough
to support any search/pruning method. We should avoid imposing any particular
interpretation upon the information, but rather enable others to apply their
own interpretations as they see fit. The author of a document may choose to
publish one or more default organizations along with the information itself,
and readers may choose to select an alternate organization. So, all we need
in HTML is a *general* way of specifying relationships between documents, which
is what we already have in the REL attribute.

Note that the list of HTML link relationships is not part of the HTML
standard (see http://info.cern.ch/hypertext/WWW/MarkUp/Relationships.html).
This is good and allows for different standard "sets" of relationship values
to be defined and used. I am reminded of the "element sets" referred to as
part of ANSI Z39.50 (ISO 10162/10163). That standard does not require that all
clients or servers support any specific element set. Rather, each registered
element set has a unique name; the client and server negotiate the element
set(s) to be used in a session. Several archetypal element sets are defined for
well-defined domains such as bibliographic references, but the protocol itself
does not limit the definition of element sets in general. In fact, Z39.50
servers can use ASN.1 to describe element sets to clients who would otherwise
not recognize those sets.

Similarly, relationship-sets could be defined for some archetypal forms
which we all use and understand (journals, books, programs, etc.). For
instance, while we certainly would like the capability to break out the
chapters and other components of a book, it doesn't make sense to speak
of a "chapter" of an animation any more than it does to reference a "frame"
of a book. So, we use different sets of relationship attributes for different
types of data. Perhaps we should add a relationship-set specification to the
document header. Something like,

<HEADER> <REL-SET NAME="US Book 153.5"> </HEADER>

If we allowed relationship-set specification within an anchor tag, then we
could support the use of multiple relationship-sets within a single document.
For example,

<A HREF="http://foo/bar" REL-SET="CERN" REL="Made,REV">Dr. Elmer J. Fudd</A>
<A HREF="http://foo/bar" REL-SET="ISO-Journal 101" REL="ABSTRACT">Abstract</A>
ARTICLE TEXT HERE
<A HREF="http://foo/bar" REL-SET="US-MARC" REL="BIBLIO">References</A>

Of course, once we start using different relationship sets, a client may come
across a document containing relationships which it doesn't know about. This
could be solved using ASN.1 just as is done in Z39.50.
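
As a rough sketch of the client side of this, assuming the REL-SET attribute
proposed above (it is not part of HTML), a browser could collect each anchor's
relationship set and flag the sets it does not recognize. The registry
KNOWN_REL_SETS, the class RelSetParser and the function extract_links are
hypothetical names used only for illustration; the sketch does not attempt the
ASN.1 description step.

```python
from html.parser import HTMLParser

# Hypothetical client-side registry: relationship-set name -> values it defines.
# A real client would presumably load these from some registered definition.
KNOWN_REL_SETS = {
    "CERN": {"MADE", "REV"},
    "ISO-Journal 101": {"ABSTRACT"},
}


class RelSetParser(HTMLParser):
    """Collect (href, rel_set, rels, known) for every anchor in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_set = attributes.get("rel-set")
        raw_rels = attributes.get("rel") or ""
        # REL holds a comma-separated list; normalize case for comparison.
        rels = [r.strip().upper() for r in raw_rels.split(",") if r.strip()]
        known = rel_set in KNOWN_REL_SETS
        self.links.append((attributes.get("href"), rel_set, rels, known))


def extract_links(html_text):
    """Return the anchors found in html_text, in document order."""
    parser = RelSetParser()
    parser.feed(html_text)
    return parser.links
```

A link whose last field is False belongs to a relationship set the client has
not seen; that is the point at which it would have to ask the server for a
description of the set or fall back to treating the link as untyped.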

An alternative approach would be to use multiple DTDs. Why not just
incorporate generic SGML parsing into the browsers and have many "variants" of
HTML? We could provide a mechanism in the protocol for requesting the URL of
the DTD corresponding to the structure of some other URL and then use that DTD
to interpret the incoming document. That is, when the user activates a link
to some document, we contact the server specified in the HREF of that link and
rather than requesting the item referenced in the URL, we request the DTD
describing the tags contained in the referenced item. The server then would
either return the DTD, or a URL for that DTD. Once we have parsed and perhaps
cached the DTD, we would then request the referenced item and process it
accordingly. (I hope my ignorance of SGML isn't showing too badly here.)
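
The sequence of requests described above might look something like the
following sketch. Everything here is an assumption made for illustration: the
"server" is just a pair of in-memory dictionaries, the DTD URL and document
contents are made up, and request_dtd_url stands in for a protocol request
that does not exist today. No real SGML parsing is done; the DTD is simply
fetched once and cached.

```python
# Hypothetical server state: document URL -> (URL of its DTD, document body).
DOCUMENTS = {
    "http://foo/bar": ("http://foo/dtds/book.dtd", "<BOOK>Chapter text</BOOK>"),
}
# Hypothetical server state: DTD URL -> DTD text.
DTDS = {
    "http://foo/dtds/book.dtd": "<!ELEMENT BOOK (#PCDATA)>",
}


class Client:
    """Follows links by asking for the DTD first, then the document itself."""

    def __init__(self):
        self.dtd_cache = {}   # DTD URL -> DTD text already retrieved
        self.dtd_fetches = 0  # number of times a DTD was actually transferred

    def request_dtd_url(self, url):
        # Stand-in for the proposed request: "which DTD describes this URL?"
        return DOCUMENTS[url][0]

    def fetch_dtd(self, dtd_url):
        # Retrieve the DTD only if we have not cached it already.
        if dtd_url not in self.dtd_cache:
            self.dtd_fetches += 1
            self.dtd_cache[dtd_url] = DTDS[dtd_url]
        return self.dtd_cache[dtd_url]

    def follow_link(self, url):
        # Step 1: learn which DTD applies; step 2: get (or reuse) that DTD;
        # step 3: request the referenced item and hand both to the parser.
        dtd_url = self.request_dtd_url(url)
        dtd = self.fetch_dtd(dtd_url)
        body = DOCUMENTS[url][1]
        return dtd, body
```

Following the same link twice transfers the DTD only once, so the extra round
trip is paid per DTD rather than per document.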

Chris
------------------------------------------------------------------------------
Christopher McRae                          mail: mcrae@ckm.ucsf.edu
UCSF Center for Knowledge Management       at&t: 415/476-3577
530 Parnassus Avenue, Box 0840             fax:  415/476-4653
San Francisco, California 94143