> A couple of times lately, I've brought up the notion that clients should
> handle highlights (the terms that match a search query) better. It's
> rather inefficient to force the search server to proxy documents just so
> that it can add highlights. Worse, it takes the decision about *how* to
> highlight (bold? underline? surround with asterisks?) out of the user's
> hands (barring some sort of ugly protocol for telling the server).
>
> We'd like to suggest a very simple approach -- a highlight tag. [...]
>
> I'd like to hear (1) suggestions on the form of this tag (we're assuming
> something terribly simple such as <hl> and </hl>) and (2) objections or
> concerns.
The current HTML3 draft has a <MARK> element that would
work well for this.

<MARK> is an EMPTY element, used in pairs to mark
contiguous spans of the document that may cross element
boundaries:
    <!ELEMENT MARK - O EMPTY>
    <!ATTLIST MARK  -- requires either start or end attribute --
        class   NAMES   #IMPLIED  -- used to subclass range --
        start   ID      #IMPLIED  -- defines name of range --
        end     IDREF   #IMPLIED  -- paired with start element --
        >
<MARK> elements may appear anywhere that character data
is legal [*]. This would work perfectly for a search application --
it could insert <MARK class=highlight start=xxx> at the start
of each range and <MARK end=xxx> at the end, without having
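to worry about the element hierarchy. <HL> ... </HL> wouldn't
work in all cases: as a container it has to nest properly, so
it can't wrap a match that starts inside one element and ends
inside another.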
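
A minimal sketch of that insertion step, in Python, assuming the
server already has the document markup as a string. The function
name mark_matches, the "hl" prefix for range names, and the naive
tag pattern are all hypothetical; the pattern does not handle
comments, entity references, or ">" inside attribute values, and
the generated names are assumed not to clash with IDs already in
the document.

```python
import re

# Naive tag pattern (an assumption for this sketch, not a real SGML parser).
TAG = re.compile(r"<[^>]*>")


def mark_matches(markup, query):
    """Insert paired <MARK> elements around each occurrence of `query`.

    Matching is done on the character data only, so a match may start
    in one element and end in another.
    """
    if not query:
        return markup

    # Strip the tags, remembering where each remaining character
    # sits in the original markup.
    text_chars = []
    positions = []
    pos = 0
    for tag in TAG.finditer(markup):
        for i in range(pos, tag.start()):
            text_chars.append(markup[i])
            positions.append(i)
        pos = tag.end()
    for i in range(pos, len(markup)):
        text_chars.append(markup[i])
        positions.append(i)
    text = "".join(text_chars)

    # One start/end pair per match, keyed by offset into the markup.
    inserts = []
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    for n, match in enumerate(pattern.finditer(text), 1):
        range_id = f"hl{n}"  # hypothetical naming scheme
        start_src = positions[match.start()]
        end_src = positions[match.end() - 1] + 1
        inserts.append((start_src, f"<MARK class=highlight start={range_id}>"))
        inserts.append((end_src, f"<MARK end={range_id}>"))

    # Apply back to front so earlier offsets stay valid.
    out = markup
    for offset, tag in reversed(inserts):
        out = out[:offset] + tag + out[offset:]
    return out


print(mark_matches("<B>bold text</B> more", "text more"))
# -> <B>bold <MARK class=highlight start=hl1>text</B> more<MARK end=hl1>
```

The match "text more" starts inside <B> and ends outside it, which
is the case a properly nested <HL> ... </HL> pair could not express.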
([*] well, almost anywhere. To Dave Raggett: MARK should be added
to %pre.content; and a few other places where it's not currently legal.
Just making it an inclusion exception might mess up record-end
processing; I'll investigate that.)
--Joe English
jenglish@crl.com