I think we might be able to agree on the following:
1. We use the ALIWEB template format with the addition
of a creation date and an optional expiration date.
(I know the date will be in the HTTP header -- it should
be in the document too.)
2. The template should be accessed with the URL
http://hostname/site.idx.
3. It is up to the maintainer how to create the template (by hand
or automatically).
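
To make the three points above concrete, here is a minimal sketch in Python. It assumes the template file is a series of records made of "Field: value" lines, separated by blank lines, with indented lines continuing the previous field. The field names Creation-Date and Expiration-Date, the date format, and the sample data are illustrative assumptions only; nothing in this proposal has fixed them yet.

```python
from urllib.parse import urlunsplit


def site_index_url(hostname):
    """Build the proposed fixed URL for a host's template file."""
    return urlunsplit(("http", hostname, "/site.idx", "", ""))


def parse_templates(text):
    """Parse blank-line-separated records of 'Field: value' lines.

    Lines that start with whitespace are treated as continuations of
    the previous field (an assumption about the template syntax).
    """
    records = []
    current = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
            current = {}
            last = None
            continue
        if line[0] in " \t" and last is not None:
            current[last] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if sep:
            last = name.strip()
            current[last] = value.strip()
    if current:
        records.append(current)
    return records


# Made-up sample; "Creation-Date" is a hypothetical field name.
SAMPLE = """Template-Type: DOCUMENT
Title: Dynamical Systems Preprints
URI: /preprints/index.html
Description: Preprints in dynamical systems,
 updated as papers arrive.
Creation-Date: 1994-05-01

Template-Type: SERVICE
Title: Department Gopher
URI: gopher://gopher.example.org/
"""

records = parse_templates(SAMPLE)
```

An indexer would fetch site_index_url(hostname) with an ordinary GET and hand the body to parse_templates. Keeping the creation date in the record itself means the template remains usable after it has been copied or cached away from its HTTP headers.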
Now here are some personal thoughts on which there will likely be
disagreement:
1. Hand made versus Machine made:
Hand made templates are better (in general) than machine made
templates, but the percentage of sites willing or able to produce
them is very small. (About 1% of current sites participate in
ALIWEB. This is what we should expect in general.) Also, as Peter
Deutsch points out, enthusiasm for making these templates declines
exponentially with time, so hand made ones will usually not be up to
date.
For this reason it is vitally important that default automatic support for
template generation be built into servers. It is not even sufficient
to have an external script or program to create the template. It
must be done automatically by default. On this point we have yet
to hear from the two most important players, NCSA and CERN. Unless
the creators of these two servers buy into this proposal we are
probably wasting our time. The reality is that while our discussion
may be valuable, standards are set by NCSA and CERN.
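
As a rough illustration of what default automatic generation might involve, the following Python sketch walks a document root and emits one record per HTML file, taking the title from the file's TITLE element. It is not how any existing server does this. The function name build_site_index, the Creation-Date field name, and the choice to index only .html and .htm files are all assumptions made for the example.

```python
import os
import re

# Matches the contents of the first TITLE element, ignoring case.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def build_site_index(doc_root, creation_date):
    """Return template text with one record per HTML file under doc_root.

    creation_date is supplied by the caller and written into every
    record under the hypothetical field name "Creation-Date".
    """
    records = []
    for dirpath, dirnames, filenames in os.walk(doc_root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if not name.lower().endswith((".html", ".htm")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="latin-1") as handle:
                text = handle.read()
            match = TITLE_RE.search(text)
            # Fall back to the file name when there is no TITLE element.
            title = " ".join(match.group(1).split()) if match else name
            rel = os.path.relpath(path, doc_root).replace(os.sep, "/")
            records.append(
                "Template-Type: DOCUMENT\n"
                f"Title: {title}\n"
                f"URI: /{rel}\n"
                f"Creation-Date: {creation_date}\n"
            )
    # Records are separated by a blank line.
    return "\n".join(records)
```

A server with this built in could regenerate the result whenever site.idx is requested, or on a schedule, so that the maintainer never has to remember to do it.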
2. HTML+ Meta information
I think that, by all means, authors should be encouraged to put
meta information into their html documents. And automatic
template creators should be designed to incorporate this
information if it exists. But I don't think that a robot doing a
HEAD request for all the documents on a server is the correct way
to do indexing. In particular, I don't know how the robot would
know the URLs of documents on the server if it only used the HEAD
method.
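
The point about HEAD can be shown with a small Python sketch. It assumes meta information is written as META elements with NAME and CONTENT attributes, which is only one possible syntax. The scanner below collects both the meta fields and the anchor targets from a document body. Because the links live in the body, and a HEAD response carries no body, a robot that only issues HEAD requests has nothing from which to learn new URLs.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collect META name/content pairs and anchor targets from HTML text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attributes = dict(attrs)
        if tag == "meta" and "name" in attributes:
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""
        elif tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])


def scan(body):
    """Return (meta fields, link targets) found in an HTML body."""
    scanner = PageScanner()
    scanner.feed(body)
    scanner.close()
    return scanner.meta, scanner.links
```

An automatic template creator running on the server can call scan on each file it reads from disk and copy the meta fields into the template. With an empty body, which is all that a HEAD request yields, scan returns no meta fields and no links.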
John Franks Dept of Math. Northwestern University
john@math.nwu.edu