Re: In-line HTML files?

Ka-Ping Yee (kryee@novice.uwaterloo.ca)
Thu, 20 Jul 1995 13:03:00 -0400


On Thu, 20 Jul 1995, Paul Prescod wrote:
> You are suggesting a mechanism for implementing full subdocuments. I am
> suggesting a mechanism for implementing document fragments. Both are
> necessary.

I agree -- it's just that document fragments seem to be the trickier of
the two. How do you define the "document type" of a fragment? Only once
you have enumerated and defined every document type (that is, every set
of state-based assumptions) that a fragment can have can you declare
just where and how that fragment is allowed to be used.

> The important thing to recognize is that SGML has already worked out all of
> the semantics for "including" text from a document fragment into another
> document. SGML external entities are well defined and well documented. At
> first glance it seems that the simple answer would be:
>
> <!ENTITY foo SYSTEM "http://www.foo.com/fragment.html" >
>
> ...
>
> &foo;

Okay. So what is "fragment.html"? It can't be HTML; is it some sort of
subset, like %text; ?
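
To make the question concrete, here is a rough Python sketch of what a
purely textual expansion of such an entity amounts to. It is only an
illustration under stated assumptions: the names expand_entities and
fetch are made up for this example, fetch stands in for whatever
retrieves the remote text, and a real SGML parser resolves entities as
part of parsing rather than by pattern matching like this.

```python
import re

# Matches declarations such as:
#   <!ENTITY foo SYSTEM "http://www.foo.com/fragment.html" >
ENTITY_DECL = re.compile(
    r'<!ENTITY\s+([A-Za-z][\w.-]*)\s+SYSTEM\s+"([^"]*)"\s*>'
)
ENTITY_REF = re.compile(r"&([A-Za-z][\w.-]*);")


def expand_entities(document, fetch):
    """Replace each declared &name; with the text fetch(system_id) returns.

    `fetch` is a hypothetical callable mapping a system identifier (here a
    URL) to the fragment's text; no network access happens in this sketch.
    """
    # Collect name -> system identifier from the declarations.
    system_ids = dict(ENTITY_DECL.findall(document))
    # Drop the declarations themselves from the output.
    body = ENTITY_DECL.sub("", document)

    def substitute(match):
        name = match.group(1)
        if name not in system_ids:
            # Leave undeclared references (e.g. &amp;) untouched.
            return match.group(0)
        return fetch(system_ids[name])

    return ENTITY_REF.sub(substitute, body)
```

Nothing in this substitution checks that the fetched text is legal at
the point of reference, which is exactly why the "type" of
fragment.html matters.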

> How is this a problem? How is it any different if I do a server-include or
> CPP or write a Perl script that combines the documents? The only difference
> is that someone else may control the linked data.

It isn't different. But it is still a possible source of great frustration,
because most authors and users will probably not want to take the time to
learn the structure of the HTML DTD, and instead simply go including bits
and pieces of HTML all over the place. Including complete subdocuments is
a little bit "safer" in that the scope of inclusion is fixed.

Would you allow things like including "foo.html" where "foo.html" contains

</ul><h2>Awooga

? There's no telling what might happen.
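
One way to see the hazard is to check a fragment for end tags that close
something the fragment never opened. The following is a minimal sketch
using Python's standard html.parser module; the names FragmentChecker
and check_fragment are invented for this example, the VOID set is a
small illustrative subset, and this is not validation against the HTML
DTD.

```python
from html.parser import HTMLParser

# Elements that take no end tag (illustrative subset, not the full list).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class FragmentChecker(HTMLParser):
    """Track open elements and record end tags with no matching start tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # start tags seen and not yet closed
        self.stray_closes = []   # end tags that close nothing in the fragment

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if tag in self.open_tags:
            # Pop back to the matching start tag; this tolerates omitted
            # end tags such as </li> inside a closed <ul>.
            while self.open_tags.pop() != tag:
                pass
        else:
            self.stray_closes.append(tag)


def check_fragment(text):
    """Return (stray end tags, tags left open) for a fragment of markup."""
    checker = FragmentChecker()
    checker.feed(text)
    checker.close()
    return checker.stray_closes, checker.open_tags
```

For the fragment above, check_fragment("</ul><h2>Awooga") reports a
stray "ul" and an "h2" left open: the fragment reaches out of its own
scope at both ends, so its effect depends entirely on where it is
included.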

I've seen enough garbage all over the Web, with multiple <title>s, <body>s,
and misplaced tags (see <URL:http://www.msn.com/explore.html> for a <head>
within a <body>). Or see <URL:http://www.aist.go.jp/htbin/wclk> for a
rather unusual document that looks like it was produced by inclusion.

> document invalid, I will be at worst embarrassed. A much worse situation can
> occur with a simple IMG. If you link to a picture, the maintainer can
> change it to something illegal or obscene. If you choose to include someone
> else's text or image in your home page, incorrect markup is the _least_ of
> your worries. =)

Perhaps. But complete subdocument inclusion is also safer in that it makes
absolutely clear the attribution of each component. Snagging a document
fragment and inserting it into one's own smacks just a bit of plagiarism
(i *know* the Web is about copying information all over the place, but this
behaviour would further blur the boundaries of possession).

As you said, both mechanisms are probably necessary. But including a
complete subdocument is likely to cause less confusion while accomplishing
what most people seem to want (disclaimers, copyrights, signatures, etc.),
so i think it would be better to make it the first of the two objectives.

> ----------------------------------------------------------
> HTML Myths Page: http://www.incontext.ca/~papresco/htmlmyth

I was looking forward to seeing this, but it wasn't available when i tried
it. Is this document up yet?

Ping (Ka-Ping Yee): 2B Computer Engineering, University of Waterloo, Canada
kryee@csclub.uwaterloo.ca | 62A Churchill St, Waterloo N2L 2X2, 519 886-3947
CWSF 89, 90, 92; LIYSF 90, 91; Shad Valley 92; DOE 93; IMO 91, 93; ACMIPC 94
:: Skuld :: Tendou Akane :: Belldandy :: Ayukawa Madoka :: Hayakawa Moemi ::
New! <http://csclub.uwaterloo.ca/u/kryee/> Yeah, i finally made a home page.