> From: narnett@verity.com (Nick Arnett)
>
> Our engine builds a query-able index of the attributes, as
> well as a full-word index. The question I was asking was how
> we tag attributes in the HTML documents so that they can be
> captured by a spider or other indexing tool.
>
> However the indexable attributes are tagged in HTML documents, I
> believe it is important that there should be some way to closely
> associate index entries with the text they are about. In other words,
> the index entries should be immediately next to (before or after) the
> paragraph or list item.

Hmm ... I'm not sure I understand your terms. May I make this
discussion a little more concrete by giving an example?

I am working on a system concept for technical information
(documents, objects, ...). Briefly: files at specified locations
would be scanned for a set of specified attributes, which would be
pulled into an RDBMS server for querying (via HTML forms). As an
example of a marked-up technical document, I have a sample failure
analysis report with some special tags in it at:
http://epims1.gsfc.nasa.gov/fa/fa.html
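
As a rough illustration of the scan-and-load concept described above
(a sketch only, not the actual EPIMS implementation): the tag names
"partno" and "failmode", the table "doc_attribute", the sample document,
and its location string are all hypothetical, and an in-memory SQLite
database stands in for the RDBMS server.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical attribute tags; the tags in the real report may differ.
ATTRIBUTE_TAGS = {"partno", "failmode"}


class AttributeScanner(HTMLParser):
    """Collect (tag, text) pairs for the tags listed in ATTRIBUTE_TAGS."""

    def __init__(self):
        super().__init__()
        self.current = None   # attribute tag currently open, if any
        self.buffer = []      # text pieces seen inside that tag
        self.found = []       # collected (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in ATTRIBUTE_TAGS:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            self.found.append((tag, "".join(self.buffer).strip()))
            self.current = None


def scan_document(location, text):
    """Return (location, attribute name, value) rows for one document."""
    scanner = AttributeScanner()
    scanner.feed(text)
    scanner.close()
    return [(location, tag, value) for tag, value in scanner.found]


def load(conn, rows):
    """Store scanned rows in a simple attribute table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_attribute "
        "(location TEXT, name TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO doc_attribute VALUES (?, ?, ?)", rows)
    conn.commit()


# Made-up sample document; ordinary HTML with two extra tags mixed in.
SAMPLE = (
    "<html><body><h1>Failure Analysis</h1>"
    "<p>Part <partno>XYZ-123</partno> showed "
    "<failmode>open circuit</failmode>.</p></body></html>"
)

conn = sqlite3.connect(":memory:")
load(conn, scan_document("fa/sample.html", SAMPLE))

# The kind of query an HTML form might issue against the server.
hits = conn.execute(
    "SELECT location, value FROM doc_attribute WHERE name = ?",
    ("failmode",),
).fetchall()
print(hits)
```

Because the extra tags sit inline in otherwise ordinary HTML, each
extracted value stays next to the text it describes, and a scanner that
does not know a given tag can simply skip it.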

Part of the concept is to have a sort of "family" of DTDs for
technical documents, all of which would include the HTML tag set
as a subset. I have not written any DTDs yet. Details of this
particular example may be naive ... comments?

--Steve.
oo _\o
\/\ \
/
____________________________________________ oo _________________
"Sometimes you're the windshield; sometimes you're the bug."
- Knopfler
Stephen C. Waterbury EPIMS: EEE Parts Information
Code 310.A, NASA/GSFC Management System
Greenbelt, MD 20771
Phone/FAX: 301-286-7557/1695 waterbug@epims1.gsfc.nasa.gov
_________________________________________________________________