There has been a fair amount of thought and work on related issues already.
The ANSI WG on SGML has looked at proposals for providing conventions
for PIs. The OS/FOSI committees have proposals for doing just what
Dan wants to do here, and we've had a fair amount of experience with
style sheets already. (Not to mention that products such as ArborText's,
which has based both its editing and composition presentation on FOSI
style sheets for years, have tackled various versions of this problem.)
For those who are aware of all this, please excuse the general
background in the following paragraph.
As I see it, the issue can be described as follows: the "location
model"--by which one breaks the various input elements into classes
so as to assign potentially distinct style to each class--defines
a certain granularity below which things are indistinguishable
for the purpose of style attachment. Almost invariably, there
will be a trade-off between the complexity of the location model
and the granularity. Full DSSSL has a query language (one form of a
location model) that would presumably allow one to specify down to a
very fine grain--perhaps a specific element instance. However, most
style sheets are at the other end of the spectrum with either simple
element names (e.g., "title") or perhaps elements-in-context (e.g.,
"title" with a parent of "section" and "title" with a parent of
"chapter"). DSSSL Lite is also considering a relatively simple
element-in-context sort of model ("qualified-gi").
With a level of granularity such as the element-in-context, we run
into the issue Dan raises of "local overrides" to specific element
instances. When your style sheet location model isn't fine grained
enough to point to specific instances (and it rarely is in most
currently practical implementations), it is necessary to give the
location model some "help." IDs on instances basically do this--they
provide a way for the existing location model (assuming it can use
IDs for location) to locate an instance, and then the usual style
sheet attachment process can continue. Note, it's the location
model that needs "help," not the basic process of how one specifies
the style of something once it's been located!
Processing instructions provide a way to put extra information into
the instance that does not require a modification to the DTD, and
this has some advantages (though I would not rule out James Clark's
suggestion at this point either). But I would not want to use PIs
to change the basic paradigm of having style specified in the style
sheet. Instead, I would argue that one should use PIs to augment
the location model, but then continue to use the style sheet to
specify what style is attached to that location. This has lots
of benefits as I see it, not the least of which is giving some
control over the "tag abuse" syndrome while still providing the
format specification flexibility users want.
Such a proposal has been submitted to the OS/FOSI committee. Below,
I will try to translate it for our use.
Attach "element-in-context" (e-i-c) formatting specification to a PI
====================================================================
The suggestion is to allow the "qualified-gi" (in DSSSL Lite terms)
to be also basically a "qualified-pi". (Exactly how one allows PIs
to be qualified is a question, but the OS/FOSI idea was to allow PIs
to be qualified by their element-ancestry [PIs don't create ancestry
in this sense]. For example, a <?DL newline> PI inside a <title>
element might be formatted as a wordspace rather than a line break when
the title is displayed at the top of the page in the running header.)
The OS/FOSI proposal had the PIs using a different name space than
the DTD's GIs--whether we want to do that (I think so, but I'm not
adamant) and with what syntax are both up for discussion.
Then, the idea as far as the style sheet is concerned is straightforward--there
is an entry that looks just like (or almost like if we want a different
name space) an entry for a DTD element, but its "qualified-gi" really
refers to the (possibly qualified) name of a PI.
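
As a rough illustration of such a lookup, here is a minimal Python sketch.
Everything in it is an assumption made for the example: the "gi"/"pi" name
space tags, the entry names, the style properties, and the simplification
that a PI is qualified only by its immediate parent element (the running
header case above would need a richer context than this).

```python
# Hypothetical style sheet. Keys are (name space, name, parent element or
# None); the "gi"/"pi" split and every entry here are illustrative only.
STYLE_SHEET = {
    ("gi", "title", "chapter"): {"font-size": "18pt"},
    ("gi", "title", "section"): {"font-size": "14pt"},
    ("pi", "newline", None): {"break": "line"},
    ("pi", "newline", "title"): {"break": "wordspace"},
}


def find_entry(namespace, name, ancestry):
    """Return the style entry for a name, preferring a qualified entry.

    `ancestry` lists the currently open elements, outermost first. PIs
    never appear in it, since they do not create ancestry.
    """
    parent = ancestry[-1] if ancestry else None
    qualified = STYLE_SHEET.get((namespace, name, parent))
    if qualified is not None:
        return qualified
    # Fall back to the unqualified entry, if there is one.
    return STYLE_SHEET.get((namespace, name, None))
```

Because the name space is part of the key, an element and a PI that happen
to share a name cannot collide, and the same lookup routine serves both.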
As far as how this works in the instance, the OS/FOSI proposal suggested
a PI format that looked a lot like SGML tags (with the "domain identifier"
of "FOSI"). (This general format is also compatible with the ANSI
recommendations for PI structure.) That is, it might look something like
the following (for DSSSL Lite):
<?DL name attribute_specification_list>
where 'name' is a name token identifying the PI (sort of its 'gi') and
attribute_specification_list has the form of an 8879 thing of the same
name, but all attributes are treated as if they had declared value of
CDATA and default value of #IMPLIED. [We proposed that a PI that serves
the purpose of a matching "end tag" would have the form <?DL /name>.]
The OS/FOSI proposal also felt that, to ensure that the definition of
what it means to attach various style characteristics to PIs is
unambiguous and to maintain a uniform interface and implementation, it
was important to make the handling of a PI style sheet entry completely
parallel to that for an element style sheet entry.
In particular, we feel it important to require a concept of "paired"
PIs that function analogously to start and end tags for elements. For
example, to specify a region in which hyphenation is inhibited, we
would require that the instance be tagged with <?DL hyphoff> and
<?DL /hyphoff>. (The use of the "/" in front of the PI's "name" for the
PI's equivalent to the "end tag" makes it easier both to visualize and
to implement in a way analogous to DTD elements.) All "attribute"
matching would be in the DSSSL Lite "constructor" function as with
element style sheet entries.
By definition and by analogy with DTD element style sheet entries, the
effect of any style changes would be scoped by the PI e-i-c's "start"
and "end" PIs. These "paired" PIs are required to be well-nested with
respect to themselves, other paired PIs, and the element structure of
the instance. [If a start PI is encountered such that a matching end
PI is not found within the appropriate scope, then the start PI will be
treated as if an omitted end PI occurred immediately preceding the end
of the surrounding structure; any unmatched end PI would simply be
ignored.] Note that a start PI immediately followed by an end PI can
be a reasonable thing to do: some examples are a PI that forces a page
break or one that generates some text. (The reason there is no such
analogy to an EMPTY element for PIs is that there are no declarations
for PIs, so we must assume they are all of the same structure.
However, the "end tag" for a PI can often be treated as omissible.)
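
The recovery rules above can be sketched in Python as a pass over a
stream of events. The event names and the normalize function are
hypothetical, the element tags themselves are assumed to be properly
nested, and I am assuming that an enclosing PI pair also counts as
"surrounding structure" for any PI opened inside it.

```python
def normalize(events):
    """Insert implied end PIs and drop unmatched end PIs.

    `events` is a list of (kind, name) pairs, with kind one of
    'el-start', 'el-end', 'pi-start', 'pi-end'.
    """
    stack = []  # open elements and PIs, innermost last
    out = []
    for kind, name in events:
        if kind in ("el-start", "pi-start"):
            stack.append((kind[:2], name))
            out.append((kind, name))
        elif kind == "pi-end":
            # Look for the matching start PI, but never look past the
            # innermost open element.
            depth = len(stack)
            while depth > 0 and stack[depth - 1][0] == "pi" and stack[depth - 1][1] != name:
                depth -= 1
            if depth == 0 or stack[depth - 1] != ("pi", name):
                continue  # unmatched end PI: simply ignored
            # PIs opened inside the matched pair get an implied end first.
            while len(stack) > depth:
                out.append(("pi-end", stack.pop()[1]))
            stack.pop()
            out.append(("pi-end", name))
        elif kind == "el-end":
            # A PI still open here gets an implied end just before the
            # end of the surrounding element.
            while stack and stack[-1][0] == "pi":
                out.append(("pi-end", stack.pop()[1]))
            if stack:
                stack.pop()
            out.append((kind, name))
    # PIs still open at the end of the instance are closed there.
    while stack:
        open_kind, open_name = stack.pop()
        if open_kind == "pi":
            out.append(("pi-end", open_name))
    return out
```

After this pass every start PI has exactly one matching end PI, so the
rest of the formatter can treat PI pairs just as it treats elements.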
Note that, while a PI can have a "context" (i.e., it could be qualified
by its element-ancestry), these "paired PIs" do not create or
contribute to the current context for either themselves or for element
style sheet entries as far style sheet entry qualification is concerned.
The effect PIs have on the composition environment is identical to that
of elements. In particular, inheritance of style characteristics
operates equivalently with respect to both. That is, if a PI pair has
changed the font size to 10pt, and an emphasis element that is wholly
contained within the PI pair's scope inherits the font category but
sets the posture to italic, then the character data within the emphasis
element will inherit the size of 10pt as set by the PI pair.
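
The inheritance example can be modeled in a few lines of Python using
nested scopes. The property names, the 12pt default, and the use of
collections.ChainMap are assumptions chosen for the sketch, not part of
the OS/FOSI proposal.

```python
from collections import ChainMap

# Document-level defaults (hypothetical values).
root = ChainMap({"font-size": "12pt", "posture": "upright"})

# Scope opened by the start PI of a pair that sets the size to 10pt.
pi_scope = root.new_child({"font-size": "10pt"})

# An emphasis element wholly inside the PI pair: it sets only the
# posture and inherits everything else from the enclosing scope.
emph = pi_scope.new_child({"posture": "italic"})

size_in_emph = emph["font-size"]    # inherited from the PI pair
posture_in_emph = emph["posture"]   # set by the emphasis element
size_after_pair = root["font-size"] # the PI pair's change is scoped
```

The lookup makes no distinction between a scope opened by a PI pair and
one opened by an element, which is the point of treating the two
identically.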
paul
Paul Grosso
VP Research Chief Technical Officer
ArborText, Inc. SGML Open
Email: paul@arbortext.com
or pbg@texcel.no