Re: Byte ranges (actually robots)

Kee Hinckley (nazgul@utopia.com)
Thu, 25 May 1995 09:18:05 +0500


At 8:46 AM 5/21/95, Gavin Nicol wrote:
> http://www.ebt.com/collection/book/doc=1/chap=2/sect=3
> http://www.ebt.com/collection/book/1/2/3
> http://www.ebt.com/collection/book/1
..
>PS. I should note that the above naming scheme is very, very useful in
>our case, but it drives spiders wild....

That actually brings up an issue I've been meaning to mention somewhere
(the robots list would be appropriate, but I don't have time to join a
mailing list in order to post one issue).

We use a technology we call Dynamic View(tm) to present large amounts of
information in manageable chunks, without creating a large hierarchy. For
instance a mailing list might be broken into chunks by months, where you
would only see the details for the current month, and the rest would expand
when you click on the month name (see
http://www.utopia.com/mailings/edupage/ for an example).

This is great for people; however, when a robot indexer comes to call, I
don't want to confuse things. At best the structure will slow down the
indexing process. At worst the robot will look at a URL like
http://www.utopia.com/mailings/edupage/?NAME=April+95#April 95
and decide not to follow it at all.

So what to do? If we recognize the user-agent as a known robot, we give it
a flat presentation of the structure. The catch, of course, is that there
is no standard way to recognize a robot. Some convention in the User-Agent
field would be sufficient, but....
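
A minimal sketch of the "flat view for known robots" idea, assuming the
server keeps its own list of robot name tokens. The token names and the
function names here are hypothetical, chosen only for illustration; as
noted above, no standard convention exists, so any such list is a local
guess rather than a reliable test.

```python
# Hypothetical substrings identifying robots this site knows about.
# There is no standard convention; this list is a local assumption.
KNOWN_ROBOT_TOKENS = ("examplebot", "samplespider")


def is_known_robot(user_agent, tokens=KNOWN_ROBOT_TOKENS):
    """Return True if the User-Agent value contains a known robot token."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


def choose_view(user_agent):
    """Pick the flat view for known robots, the expandable view otherwise."""
    return "flat" if is_known_robot(user_agent) else "dynamic"
```

A request with no User-Agent, or with one that matches nothing in the
list, falls through to the normal expandable view, so an unrecognized
robot still sees the same pages a person would.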

Has anyone considered a move in that direction? (Note, this also has a
bearing on URL-based shopping carts - I'd like the robot to be able to
browse the store without acquiring an ID, since you don't want the ID to
end up in some index somewhere. There, of course, the problem is due to a
hack, so finding a solution is less critical if you're in a purist camp
:-).
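
For the shopping-cart case, a rough sketch under the same assumption (a
locally maintained, hypothetical list of robot tokens): links handed to a
recognized robot carry no ID, so nothing session-specific can end up in
an index. The `cart` parameter name and the `new_id` callback are
illustrative, not taken from any real store.

```python
from urllib.parse import urlencode

# Hypothetical robot tokens; a local assumption, not a standard.
ROBOT_TOKENS = ("examplebot",)


def store_link(path, user_agent, new_id):
    """Build a store URL, attaching a cart ID only for non-robot clients.

    new_id is a zero-argument callable that returns a fresh ID string.
    """
    ua = (user_agent or "").lower()
    if any(token in ua for token in ROBOT_TOKENS):
        return path  # robots browse without acquiring an ID
    return path + "?" + urlencode({"cart": new_id()})
```

Because the ID is only generated for non-robot clients, a robot crawl
does not consume IDs either.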

Kee Hinckley Utopia Inc. - Cyberspace Architects 617/721-6100
nazgul@utopia.com http://www.utopia.com/

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.