> The one problem with all these conceptual similarities is that it
> makes writing a web-roaming robot (spider) very difficult.
I'm convinced WWW-wide spiders are a pretty bad idea anyway, for more
reasons than the infinite dynamic page problem.
If you run the spider on your own server, you know which URLs to avoid,
so there is no problem.
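For a locally-run spider, "knowing which URLs to avoid" can be as simple as a configured list of path prefixes checked before each fetch. A minimal sketch in Python; the prefix list and function names here are illustrative assumptions, not anything from this message:

```python
# Hypothetical sketch: skip URLs under locally configured "avoid" prefixes
# (e.g. directories known to hold scripts or dynamically generated pages).
from urllib.parse import urlparse

AVOID_PREFIXES = ["/cgi-bin/", "/scripts/"]  # assumed example prefixes

def should_fetch(url):
    """Return False if the URL's path falls under a configured avoid-prefix."""
    path = urlparse(url).path
    return not any(path.startswith(prefix) for prefix in AVOID_PREFIXES)
```

A spider would call this before queueing each link, e.g. `should_fetch("http://host/cgi-bin/search")` would be skipped while ordinary document paths pass through.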
> A spider
> (or human) that specifically wants to avoid scripts or dynamically
> created documents needs to be able to determine whether or not the
> URL points to a script.
I really don't see why a human would care, and I can imagine that
spiders don't always care either. Say I have a welcome page that is
dynamic and displays the date and local time or whatever; I'd still
want my robot to use it. And most ISINDEX servers can be regarded as
dynamic too.
-- Martijn
__________
Internet: m.koster@nexor.co.uk
X-400: C=GB; A= ; P=Nexor; O=Nexor; S=koster; I=M
X-500: c=GB@o=NEXOR Ltd@cn=Martijn Koster
WWW: http://web.nexor.co.uk/mak/mak.html