Inferring </p> tags would be easy. Well... at least the SGML standard says
how to do it in a way that's consistent with current practice in HTML.
It's the start tags (<p>) that cause trouble.
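To make the asymmetry concrete, here is a rough sketch in Python. It is
not a real SGML parser and not the standard's full tag-omission rule:
it assumes only two cases, namely that an open P is closed by the next
<p> or by </body>. The names TOKEN and infer_p_end_tags are made up for
this illustration.

```python
import re

# Split markup into tags and runs of text. Illustrative only: this is
# not a conforming SGML tokenizer.
TOKEN = re.compile(r"<[^>]+>|[^<]+")

def infer_p_end_tags(html):
    """Insert an omitted </p> before the next <p> or before </body>.

    Assumption: these are the only two places where a P can end.
    """
    out = []
    p_open = False
    for tok in TOKEN.findall(html):
        tag = tok.lower()
        if tag == "<p>":
            if p_open:
                out.append("</p>")   # the previous paragraph ends here
            p_open = True
        elif tag == "</p>":
            p_open = False           # explicit end tag, nothing to infer
        elif tag == "</body>" and p_open:
            out.append("</p>")       # the last paragraph ends with the body
            p_open = False
        out.append(tok)
    return "".join(out)

doc = "<body>one<p>two<p>three</body>"
result = infer_p_end_tags(doc)
print(result)
```

Every <p> gets its </p>, but the leading text "one" still sits directly
in BODY: nothing in the input says where a start tag for it should go.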
>When we get to a point where we support stylesheets (PLEASE!) it is of
>extreme importance to consider <p></p> a container. Without this, it is
>not possible to assign stylistic attributes to a contained element.
Counter-argument: The MidasWWW browser had a really nifty stylesheet-based
hypertext widget set, and it grokked empty P elements just fine. Something
like:
*HTML*BODY.font: ...
*HTML*BODY*P.breakBefore: True
*HTML*BODY*P.breakAfter: True
>Current practice suggests that <p> is not a container at all, it is a
>logical break -- or it is considered as a container with no contents. This
>is the behavior of available browsers, as I understand them.
Agreed.
>---------------------Example---------------------
><body>
>This is text with no container. (1)
><p>
>Perhaps this is text in a <p> container. (2)
><p>
>Hmmm... no </p> associated with the previous <p>! Do we assume that there
>was to be one, or do we treat <p> as a break? (3)
></body>
>-------------------------------------------------
>
>The principles behind SGML -- and by its lineage, HTML -- are to mark up
>the structure of the document.
>In the previous example, what is the text associated with (1)? Is it
><body> text or <p> text?
It is straightforward to construct DTDs where (1) is content of
the BODY element. The draft-iiir-html-01 version of the html DTD
did this. My recent html version 1.7.2.4 also does this.
I think it is impossible to construct a DTD where (1) is the
content of a P element without doing stuff like "The first
element of a BODY element must be a P."
> And if we build stylesheets which allow logical
>elements within the document to have their own stylistic "hints", which do
>we apply to (1)?
Body.
The declarations
<!ELEMENT BODY O O (#PCDATA|P|OL|UL|DL|H1...)>
<!ELEMENT P - O EMPTY>
are consistent with current practice.
I have considerable evidence to back that claim.
Parsing extant documents relative to declarations like
<!ELEMENT BODY O O (P|UL|OL|...) -- no #PCDATA -->
<!ELEMENT P - O (%htext)+>
results in errors.
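The difference between the two sets of declarations can be sketched in
Python. This is a toy check, not an SGML parser: it only looks at which
things end up as direct children of BODY, it ignores every tag other
than BODY and P, and the allowed sets hold only the names spelled out
in the declarations above (whatever the "..." stands for is left out).
The function and variable names are invented for the sketch.

```python
import re

# Illustrative tokenizer: tags and runs of text, nothing more.
TOKEN = re.compile(r"<[^>]+>|[^<]+")

def body_children(html, p_is_empty):
    """List the direct children of BODY as '#PCDATA' or 'P'.

    p_is_empty=True  models  <!ELEMENT P - O EMPTY>
    p_is_empty=False models  <!ELEMENT P - O (%htext)+>, with the
    assumption that a P runs until the next <p>, </p>, or </body>.
    """
    children = []
    in_p = False
    for tok in TOKEN.findall(html):
        low = tok.lower()
        if low in ("<body>", "</body>"):
            in_p = False
        elif low == "<p>":
            children.append("P")
            in_p = not p_is_empty    # an EMPTY element holds no text
        elif low == "</p>":
            in_p = False
        elif not tok.startswith("<") and tok.strip() and not in_p:
            children.append("#PCDATA")  # text sitting directly in BODY
    return children

def violations(children, allowed):
    """Return the children that the BODY content model does not allow."""
    return [c for c in children if c not in allowed]

# Only the names written out in the two BODY declarations.
EMPTY_P_BODY = {"#PCDATA", "P", "OL", "UL", "DL", "H1"}
CONTAINER_P_BODY = {"P", "UL", "OL"}

doc = "<body>text (1)<p>text (2)<p>text (3)</body>"
empty_errors = violations(body_children(doc, True), EMPTY_P_BODY)
container_errors = violations(body_children(doc, False), CONTAINER_P_BODY)
print(empty_errors, container_errors)
```

With P declared EMPTY, all three runs of text are BODY content and
nothing is flagged. With P as a container and no #PCDATA in BODY, text
(2) and (3) land inside paragraphs, but text (1) has no paragraph to
belong to and is reported.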
If there is sufficient motivation to change all the documents out
there to move #PCDATA out of BODY and into a subordinate paragraph
element (which I agree is a good idea), why not give that
element a new name like PP while we're at it?
Dan