Going this deep takes a long time (an hour or so). I use a filter at
each stage to cut out known slow sites (typically Eastern Europe) or
known buggy servers. I also have to clean the links quite a bit, dropping
references that use local hostnames only (not FQDNs) and a small
amount of junk. [Obviously mailing the webmaster at sites
with bad links would be a possibility.]
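
For anyone wanting to try the same thing, here is a rough sketch in
Python of the filtering and cleaning step described above. It is not
the actual scripts: the names SKIP_HOSTS, keep_link and clean_links,
the example hostnames, and the set of accepted URL schemes are all
assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical skip list; the real list of slow or buggy servers is not given.
SKIP_HOSTS = {"slow.example.org", "buggy.example.com"}

# Assumed set of schemes worth following; adjust to taste.
ALLOWED_SCHEMES = {"http", "ftp", "gopher"}


def keep_link(url, skip_hosts=SKIP_HOSTS):
    """Return True if url looks worth following at the next stage."""
    try:
        parts = urlsplit(url.strip())
        host = (parts.hostname or "").lower()
    except ValueError:
        # Unparseable reference: treat as junk.
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        # No scheme, or one we do not follow: treat as junk.
        return False
    if "." not in host:
        # Local hostname only (not an FQDN), e.g. http://www/page.html
        return False
    # Drop known slow or buggy servers.
    return host not in skip_hosts


def clean_links(urls, skip_hosts=SKIP_HOSTS):
    """Filter a list of extracted references, preserving order."""
    return [u.strip() for u in urls if keep_link(u, skip_hosts)]
```

The "no dot in the hostname" test is only a crude stand-in for "not an
FQDN", but it catches the usual case of a page linking to a machine by
its local name.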
This is only for interest. I don't generate an index. The engine is
just a bunch of scripts using www -listrefs.
Tim