Re: Frames & WWW

Gavin Nicol (gtn@ebt.com)
Thu, 17 Nov 1994 11:35:27 -0500


>I was thinking specifically of the HYTIME extensions that recently
>were passed 'Unanimously' despite there being no implementation. And
>of course without bothering to look at what was already in the field.

Well HyTime is a different kettle of fish altogether... and serves a
different purpose to SGML.

>As for `most people with sense avoid the worse features of this
>standard'. Let's think about that for a moment eh? Basically you are
>saying that the standard works because people don't follow it. Or in
>other words the standard is not in fact a standard.

No. Not following the standard means that you actually do something
that is illegal. Most implementors of SGML parsers instead restrict
themselves to something very close to the "minimal" SGML defined in
section 15 of the standard. In other words, they choose a *conforming*
subset. Standards work because they define a framework in which to
work. Simple standards get implemented fully, complicated standards
often do not, but rather set the boundaries for implementors.

>That being the case what should be done is to take the standard and
>rewrite it deprecating the more egregious lossage.

True. I am certainly not alone in thinking that SGML is a "good
thing", and I am certainly not alone in wishing it was not defined by
a lawyer! SGML is not perfect, and it carries a lot of excess baggage,
but the core concepts are excellent, and even better, they bring about
real benefits in the real world.

>As for not reading the TEI spec. I follow the ISO standard, 50 quid
>it cost me. Are you saying that I have the wrong standards body and
>that TEI is really the definitive version? If so we should get this
>cleared up as soon as possible.

Now, I must apologise to everyone for using the term "standard" in
relation to TEI. The TEI spec is not a standard, but rather a
well-thought-out and rigorously defined SGML application, targeted
toward academia, and in particular the humanities. Despite this, I
think the TEI work deserves some attention (and I am not active in the
TEI in any way). Let's place it on the "heavily recommended reading"
list, and pray that one day the people using the TEI guidelines will
be able to make their data available to the WWW.

>The problem that I have with this "X is wonderful it can do
>everything" attitude is that X frequently turns out to have two
>operating modes, it can either be simple or it can be doing
>everything and the two modes are disjoint.

Ah. It sounds to me like you *are* wondering how you can implement an
SGML parser :-) Actually, SGML isn't all that bad if you restrict
yourself to minimal SGML as defined in the standard. I have never
pretended there was a "simple" SGML, but there are certainly
"reasonable" subsets (TEI).

>If TEI are willing to send me a copy of their specification without
>cost I will read it.

I'm not sure where you can get a PostScript version from. You can find
some TEI information at ftp://ftp.ifi.uio.no/pub/SGML/TEI but you'll
have to piece it together. The grammars for their subset are there
anyway.

>If anyone is using their standard within the Web it would of course
>have greater weight.

We use part of it in DynaText and DynaWeb (specifically, we make the
TEI path spec available).

>But if it's simply an unconnected commercial implementation then I am
>sure the IETF will give it due weight in their considerations.

TEI is largely academic. It was defined so that humanities groups
could share a common base. There are a substantial number of people
following the TEI guidelines with even more substantial databases that
just scream to be made available....

>The point that I am making is that Gavin's assertion that my suggestion is
>`incorrect' is more than suspect. Unless he can point to an ISO or IETF
>standard that it conflicts with I reject his assertion. At best he can claim
>`incompatibility with existing implementation'.

Let's look at this again:

http://foobar.org/fred.fish.html#H1:3,H2:4,H3:7,P:8

OK. Perhaps "incorrect" is too strong. I take it you *are* using
occurrences within the document as a numeric qualifier, otherwise it
certainly *is* incorrect, because you simply do not have the
structure.

Now, given that we have occurrences as the qualifier, I have two
problems with your scheme.

Problem #1
   It implies a hierarchy that is not there. Because we have no
   containers (or few of them) in HTML, your addressing scheme has no
   need at all for the Hxxx's: the P is *not* contained by them, so
   the 8 in P:8 *must* be an occurrence counter, which means that

http://foobar.org/fred.fish.html#P:8

does just as well! What is even more interesting is that if we
rewrite it slightly

http://foobar.org/fred.fish.html#P=8

or perhaps

http://foobar.org/fred.fish.html/P=8

we have a TEI path! Now I have already pointed out that you
*cannot* fake containers by associating text with the preceding
header because your beloved common usage (font effects) kills you
one way, and the HTML content model kills you the other (you cannot
discern container ends). Again, I will emphasise that these types
of names are *not* SGML specific! LaTeX has more structure than
HTML, and RTF has about the same!
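
To make the occurrence-counting point concrete, here is a minimal
sketch in Python. It is an illustration only: it assumes each number in
the fragment is a document-wide occurrence count, and the names
OccurrenceCounter, parse_steps, resolve and SAMPLE are hypothetical,
not part of the proposal or of the TEI guidelines. It shows that, with
no containers, the leading Hxxx steps do not change which P is found.

```python
from html.parser import HTMLParser


class OccurrenceCounter(HTMLParser):
    """Record start tags in document order, ignoring any nesting."""

    def __init__(self):
        super().__init__()
        self.seen = []    # (TAG, occurrence number) in document order
        self.counts = {}  # running count per tag name

    def handle_starttag(self, tag, attrs):
        tag = tag.upper()
        self.counts[tag] = self.counts.get(tag, 0) + 1
        self.seen.append((tag, self.counts[tag]))


def parse_steps(fragment):
    """Split 'H1:3,P:8' (or 'P=8') into [('H1', 3), ('P', 8)]."""
    steps = []
    for step in fragment.split(","):
        name, _, number = step.replace("=", ":").partition(":")
        steps.append((name.upper(), int(number)))
    return steps


def resolve(html, fragment):
    """Return the document-order index of the element named by the
    last step, or None if there is no such occurrence.

    Assumption: each step is an independent occurrence count, so the
    earlier steps cannot narrow the search and are not consulted.
    """
    counter = OccurrenceCounter()
    counter.feed(html)
    counter.close()
    target = parse_steps(fragment)[-1]
    try:
        return counter.seen.index(target)
    except ValueError:
        return None


# Flat HTML of the kind under discussion: headers do not contain the Ps.
SAMPLE = "<H1>Fish</H1><P>one<P>two<H2>Cod</H2><P>three<P>four"
```

Under these assumptions, resolve(SAMPLE, "H1:1,H2:1,P:3") and
resolve(SAMPLE, "P:3") name the same paragraph, as does the "P=3"
spelling.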

Problem #2
Your proposal tries to accomplish the same thing as a set of
guidelines in use by large data repositories in academia, but with
   trivial syntax differences. Why not use what's already there?

So yes, assuming you do use occurrence counting, I was wrong, and I
apologise. Your proposal is not incorrect, merely meaningless.