Re: Client-side highlighting; tag proposal

Jim Davis (davis@dri.cornell.edu)
Fri, 10 Mar 1995 09:08:09 +0500


I believe in the need for the function of this tag (i.e. it makes
sense for search engines to show users which portions of a document
were relevant), but I am unconvinced that a new tag is the way to
provide it.

1) It's not clear to me what the meaning of this tag is. One
definition might be "in a document returned by running a search, the
elements within the document that caused it to be selected". Another
might be "the passages within a document that matched the search
criteria". Both seem plausible and useful to me, but they are different.

Suppose I'm searching for paragraphs with both "data" and "actor".
Under the first definition, only the words "data" and "actor" would be
highlighted. Under the second, the whole paragraph would. There are
probably other viable definitions. What's more, I am not sure what
happens (under the first definition) if the search includes a NOT.
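
To make the difference concrete, here is a rough sketch in Python. It
assumes a search for paragraphs containing every term as a whole word,
ignoring case. The function names and the [[ ]] markers are made up
for illustration; they stand in for whatever markup is finally chosen.

```python
import re

def matching_paragraphs(paragraphs, terms):
    """Return the indices of paragraphs that contain every search term."""
    hits = []
    for i, para in enumerate(paragraphs):
        if all(re.search(r"\b" + re.escape(t) + r"\b", para, re.IGNORECASE)
               for t in terms):
            hits.append(i)
    return hits

def highlight_terms(para, terms):
    """First definition: mark only the words that caused the match."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(r"[[\1]]", para)

def highlight_passage(para):
    """Second definition: mark the whole passage that matched."""
    return "[[" + para + "]]"

paragraphs = ["The actor reads data.", "Only data here."]
terms = ["data", "actor"]
for i in matching_paragraphs(paragraphs, terms):
    print(highlight_terms(paragraphs[i], terms))
    print(highlight_passage(paragraphs[i]))
```

Only the first paragraph matches. The two functions then mark it up
differently, which is exactly the ambiguity in the proposed tag.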

2) It's not clear how this tag relates to other cases of marking
passages within documents as "special" in some way. One example is
"strike out" text. [The HTML 2.0 spec (of July 94) has a STRIKE tag
for this purpose.] Are there other examples? Before adding another
special purpose semantic tag, it would be well to look for other cases
where such markup is needed, to make sure the design is clean.

3) A tag is not the only way to provide the function.

First of all, why not just use existing tags such as B or EM? Granted,
this makes it hard to highlight something that's already bold, but
then again what's the client going to do in such a case anyway? (see
below). Using B will provide the function immediately and universally
and will work in most cases.

Second, Lee Shombert claimed that there is no simple expression that
could be placed in the header, for the client to interpret, to indicate
which words, phrases, paragraphs or images should be highlighted.
He's right that simple schemes won't work (e.g. just listing "data"
and "actor" would fail because the client would not limit highlighting
to terms in the same paragraph), but more complicated schemes are
possible (e.g. a set of ranges as byte offsets into the marked-up
document).
So a tag isn't *necessary* to provide the function.
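
A minimal sketch of the byte-offset idea, in Python. It assumes the
server sends a list of (start, end) byte ranges alongside the document;
how those ranges would be carried (header syntax and so on) is left
open here. The function name and the [[ ]] markers are hypothetical,
standing in for whatever rendering the client chooses.

```python
def apply_ranges(doc, ranges, open_mark=b"[[", close_mark=b"]]"):
    """Wrap each (start, end) byte range of doc in highlight markers.

    Ranges are half-open, must lie inside the document, and must not
    overlap. The document itself carries no highlighting markup.
    """
    out = []
    pos = 0
    for start, end in sorted(ranges):
        if start < pos or start > end or end > len(doc):
            raise ValueError("bad or overlapping range: %r" % ((start, end),))
        out.append(doc[pos:start])                          # untouched text
        out.append(open_mark + doc[start:end] + close_mark)  # highlighted
        pos = end
    out.append(doc[pos:])                                   # remainder
    return b"".join(out)

doc = b"<P>The actor reads data.</P><P>Only data here.</P>"
a = doc.index(b"actor")
d = doc.index(b"data")   # first occurrence, in the matching paragraph
print(apply_ranges(doc, [(a, a + 5), (d, d + 4)]))
```

Because the ranges name exact positions, the "data" in the second
paragraph is left alone, which a bare list of terms could not express.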

4) If the tag is adopted, then the content of the tag should be
allowed to be anything, including headers and images. How the client
highlights text should not be specified. Note that no single graphical
method will work, since one might wish to highlight boldfaced,
underlined, or even (ughhh) blinking material, and one might also wish
to highlight imported image files. Remember too that there are
visually impaired Web users. We should not be adding more purely
visual tags.