Re: Caching Servers Considered Harmful (was: Re: Finger URL)

Chris Lilley, Computer Graphics Unit (lilley@v5.cgu.mcc.ac.uk)
Mon, 22 Aug 1994 21:40:33 GMT


In message <Pine.3.85.9408221338.A9925-0100000@enews> Rob Raisch said:

> I suggest that the only reason that caching servers exist is to improve
> network resource usage, and by doing so, improve the responsiveness of
> retrievals.

Well, the original reason was to allow access to people behind firewalls, but
that just showed the advantages of caching. So yes, it improves network usage
and reduces long distance traffic, which is to the benefit of all of us.
Publishers included.

> This is for the benefit of the caching server's network usage

Sure, among other things.

> and against the best interests of the publisher.

Stating this often does not make it so.

Assuming your hypothetical publisher actually wants people to access some of
their data - which is becoming unclear from your statements - then yes, there is
theoretically the potential for out of date or even modified versions to be
produced. This is nothing new. There is the potential for a bookstore to scan
your books through a Xerox Docutech, mask out your logo and copyright details,
print 'em by the boxload, and sell off cheap pirate copies to the third world.

Or, indeed, OCR your books and post them in installments to rec.slack via the
anonymous mailer in Finland.

In general, however, this does not happen often enough that existing methods of
book distribution and sale are called into question.

Assuming then that some data is to be distributed, then making it accessible not
just to academic sites but also to commercial sites (the ones with the money,
and also the ones using proxies to get across their firewalls) is clearly in the
publisher's interests.

Similarly, having a chain of proxy caches distribute your information to several
million readers without several million readers all doing slow intercontinental
accesses and giving your server the response time of an anaesthetised
hippopotamus is in the publisher's interests.

> There is no incentive on
> the cache manager's part to acceed to the wishes of the publisher.

Could you advance some support for this notion, apart, that is, from what you
have assumed?

Setting up some sort of dichotomy between honest, quality assured publishers and
nasty, rapacious networking folk who are only out to maliciously rip off and
subvert your data is not, I feel, a good way forward.

> How
> many publisher's will state "All of our information is timely, so
> don't cache any of it" simply as an expedient?

Unless they only have your postings to go on ;-) I suspect that the answer is
'not many'.

It is always difficult on these discussion lists to know in what order people
have read messages and how often they check their mail; however I have seen
several recent messages pointing out that caches can readily provide information
of guaranteed timeliness. It may be that you have not read these yet. But if you
have, please stop repeating that caches necessarily serve outdated information,
as that statement is false.
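
To make the timeliness point concrete, here is a minimal sketch of one way a
cache could avoid ever serving a stale copy: it asks the origin on every request
whether its copy is still current, and only transfers the body again when the
document has changed. This is the same idea as a conditional request such as
HTTP's If-Modified-Since. The names Origin, RevalidatingCache, and the integer
timestamps are all hypothetical, invented for illustration; nothing is sent over
a network, and this is not a description of any particular proxy.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Origin:
    """Hypothetical publisher's server: maps a URL to (last_modified, body)."""
    docs: Dict[str, Tuple[int, str]] = field(default_factory=dict)
    full_transfers: int = 0  # counts how often a whole body was sent

    def publish(self, url: str, stamp: int, body: str) -> None:
        self.docs[url] = (stamp, body)

    def get(self, url: str, if_modified_since: Optional[int] = None):
        stamp, body = self.docs[url]
        # Unchanged since the caller's copy: answer "not modified", no body.
        if if_modified_since is not None and stamp <= if_modified_since:
            return 304, stamp, None
        self.full_transfers += 1
        return 200, stamp, body


@dataclass
class RevalidatingCache:
    """Hypothetical cache that checks with the origin before every reply."""
    origin: Origin
    store: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def get(self, url: str) -> str:
        cached = self.store.get(url)
        since = cached[0] if cached else None
        status, stamp, body = self.origin.get(url, since)
        if status == 304:
            return cached[1]  # our copy is confirmed current
        self.store[url] = (stamp, body)  # first fetch, or document changed
        return body
```

With this arrangement the reader always gets the current version, while the
publisher's server sends the full document only once per revision; the cheap
"has it changed?" check is the price of the guarantee. A cache that trusted an
expiry date supplied by the publisher could skip even that check until the date
passed.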

> Caching servers treat the symptom, not the illness.

I remember someone saying that a paradox is a problem incorrectly stated. I
suspect that the illness you are attempting to invoke, supposedly impervious to
any form of technical solution, supposedly at the mercy of solutions arrived at
without any input from publishers, falls under the same category.

--
Chris Lilley
+--------------------------------------------------------------------------+
|Technical Author, ITTI Computer Graphics & Visualisation Training Project |
+--------------------------------------------------------------------------+
| Computer Graphics Unit,        |  Internet: C.C.Lilley@mcc.ac.uk         |
| Manchester Computing Centre,   |     Janet: C.C.Lilley@uk.ac.mcc         |
| Oxford Road,                   |     Voice: +44 61 275 6045              |
| Manchester, UK.  M13 9PL       |       Fax: +44 61 275 6040              |
| X400: /I=c/S=lilley/O=manchester-computing-centre/PRMD=UK.AC/ADMD= /C=GB/|
| <A HREF="http://info.mcc.ac.uk/CGU/staff/lilley/lilley.html">my page</A> | 
+--------------------------------------------------------------------------+