Re: Holding connections open: an immodest proposal

hallam@dxal18.cern.ch
Wed, 14 Sep 94 20:59:15 +0200


>Yes, exactly. I contend this solution will take longer to deploy than
>my proposal for the server to hold the connection open (for awhile).
>I think it's more complex, and, when the MIME solution is deployed,
>there will be compatibility problems until everyone upgrades.

Well, speaking from where libwww comes from, I think the multipart solution
is probably more realistic. The networking chunk of libwww is in the engine
bay being stripped down to go multithreaded; I don't think we can realistically
expect to try changing the spec at the same time.

On the other hand the current MIME parsers are scheduled for demolition
and we have the replacement going through beta test (standalone) at the
moment. Once it passes regression tests etc. we can move on to try the
integration phase.

> 2) I may already have the image cached in my client.
>So only the client actually knows if it NEEDS the image.

This is why we really need two requests; the connection could then be kept
open for the second. The advantage is not quite as great, however, since we
have gone from O(n) connections down to two. The client only requests the
images it actually needs, and proxies in the middle chop out any requests
they can satisfy themselves. There are security issues here and they are
quite tricky: not all proxies may be trusted, and if you use the proxy as
your security agent things are different again.
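
Roughly what I have in mind, sketched in Python against an imaginary server
(the host, paths and cache contents are invented, and it assumes the server
agrees to hold the connection open after the first reply):

    import socket

    cache = {"/icons/logo.gif": b"<already fetched>"}   # images we already hold

    def request(sock, path):
        # Send a GET on the already-open connection and read (part of) the reply.
        sock.sendall(("GET %s HTTP/1.0\r\n\r\n" % path).encode("ascii"))
        return sock.recv(65536)        # toy read; a real client parses the headers

    sock = socket.create_connection(("www.example.org", 80))
    page = request(sock, "/index.html")                  # first request: the document

    # Second round, on the same connection: only the images we actually need.
    for img in ("/icons/logo.gif", "/figs/detector.gif"):
        if img not in cache:           # a proxy in the middle could drop these too
            cache[img] = request(sock, img)

    sock.close()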

> How often do I really want to get a whole directory hierarchy? What
>happens if intermediate points in the hierarchy are mapped by the
>server to different files that are not in the actual file hierarchy?

Coming from the VMS world I use searches of the whole file hierarchy every
day, usually several times an hour. It is the most useful single feature
in the VMS operating system. If you haven't got it you should find out
what life is like with it...

Plus we saw the Hyper-G presentation yesterday and the tree following was
very nice. I want it, Henryk wants it and Håkon wants it.

Server-side configuration could be a problem. But is fred/// necessarily
semantically the same as traversing the tree of fred?

>I shrink from the "continuous connection" label. In any case, how does
>that prevent a browser from taking async messages?

My prototype browser, Bilbo, used continuous connection to do conferencing
and MUDs. That is where continuous connection takes you if you get it right.
It is an area where the functionality improvement is dramatic; it fully
justifies the paradigm shift.

[Note, Bilbo died recently, but we are recycling most of the design into
libwww and other W3O ware]

>Please reexamine my proposal and some facts.
>1) After a TCP/IP connection closes, the host is obliged to tie up the
>resources for about 2 minutes, in case some tardy packets arrive.
>2) I proposed to keep a connection open on the order of tens of seconds
>in case a new GET (or other method, for that matter) arrives.
>3) Therefore my proposal holds resources out of use for no more than,
>say, 25% longer than they would be anyway.
>4) In the event that there was in fact an image URL in what the server
>returned, and the client needs it, the already-open connection saves
>connection setup time, slow-start time, and having a second (or more)
>connection around lingering in TCP/IP's close-wait timeout when it
>eventually gets closed.

Well, it's not a case of optimising resources quite at that level *yet*. First
off, we are thinking beyond TCP/IP. A fundamental change in the philosophy
of HTTP is politically quite a difficult thing; many people here want to
call the protocol something different.

When I raised the idea of keeping the connection open for this purpose at
WWW'94 there was quite a bit of resistance to it. It is a small coding step
to do something that optimises our current situation, BUT we want to be
around in ten years' time. Before we allow continuous connection into the
spec we have to demonstrate that it works through proxies, and can provide
the conferencing, hyperterminal and transaction-processing benefits too.

To support multipart MIME we do not actually need to change the specs at all;
it is already in the current specs (!). The issues raised are on the level
of optimisation (behaviour of proxy chains).
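
As a rough illustration only (the file names and boundary are invented),
ordinary multipart MIME machinery is already enough to bundle a document
and its inline image into the body of a single reply:

    from email.mime.image import MIMEImage
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    bundle = MIMEMultipart("mixed", boundary="www-bundle")
    bundle.attach(MIMEText('<HTML><IMG SRC="logo.gif"></HTML>', "html"))

    with open("logo.gif", "rb") as f:              # assumes a local logo.gif exists
        bundle.attach(MIMEImage(f.read(), "gif"))

    # The server writes this out as the body of one reply, with
    # Content-Type: multipart/mixed; boundary=www-bundle
    print(bundle.as_string())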

>Stuff is coming along that
>will require negotiation between client and server: security, and
>payment for information.

Negotiation is not required for security. In many circumstances a one-shot
connection can be maintained. For more details consult the Shen docs:

http://info.cern.ch/hypertext/WWW/Shen/ref/shen.html

>Consider, even now, the WWW Basic security scheme. The server rejects
>a client's request and demands authentication.

The WWW-Basic scheme was introduced because of all the hassle that the USA
put up concerning export regulations. We now have a DIGEST method that is
equivalent in terms of convenience but does not involve sending passwords
in the clear across the internet.
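
The principle is simply that only a one-way hash bound to a server-supplied
challenge crosses the wire, never the password itself. A sketch of that idea
only, not the exact field layout of the proposal; the user name, nonce and
header formatting below are invented for illustration:

    import base64, hashlib

    password = "s3cret"
    nonce = "dcd98b7102dd2f0e"   # invented challenge value sent back with the rejection

    # Basic: a trivially reversible encoding of user:password
    basic = base64.b64encode(("phb:" + password).encode()).decode()

    # Digest-style: a one-way hash of the credentials bound to the nonce
    digest = hashlib.md5(("phb:" + password + ":" + nonce).encode()).hexdigest()

    print("Authorization: Basic " + basic)     # password recoverable by any listener
    print("Authorization: Digest " + digest)   # only the hash is exposed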

>If the negotiation paradigm -- reject request, query user, re-request --
>becomes common, the cost of opening and closing connections becomes
>more painful.

I agree in principle. I would consider transaction processing to be an even
better example. Imagine you want to synchronise the modules you have checked
out from a server with the master library. This requires two steps:

1) Request a list of the modules that are currently checked out.

2) Replace each one into the library in turn.

The library should not change between 1 and 2. We need to lock it.
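
A sketch of what I mean, over one held-open connection; the LOCK/LIST/UNLOCK
methods and the host are purely hypothetical (nothing in the current spec
defines them), they only show why the library must not change between the
two steps:

    import socket

    def send(sock, request_line, body=b""):
        # Fire one request on the held-open connection and read (part of) the reply.
        sock.sendall(request_line.encode("ascii") + b"\r\n\r\n" + body)
        return sock.recv(65536)                      # toy read

    sock = socket.create_connection(("library.example.org", 80))

    send(sock, "LOCK /modules HTTP/1.0")             # freeze the master library
    listing = send(sock, "LIST /modules?state=out HTTP/1.0")   # step 1

    for name in ("tracker.c", "calorimeter.c"):      # step 2, same connection
        send(sock, "PUT /modules/" + name + " HTTP/1.0", body=name.encode())

    send(sock, "UNLOCK /modules HTTP/1.0")           # nothing changed in between
    sock.close()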

>You feel a sustained connection is a major change to HTTP. I agree it's
>a violation of the purity of the model, but in practical terms I think
>it's a small change with potential benefits.

It is a very major change: it is easy to implement, but the specification
needs to be thought out carefully. The key point is that the security scheme
and conferencing interact very strongly with this question. This is why
the original PGP and PEM proposals were unsatisfactory: they prevented
the use of continuous connection in a conference environment and in any case
did not proxy.

[Out of band comment #1a:
When Ari came to CERN he was meant to be doing security work. Instead
he developed the proxy server. If he had done the security work first,
proxying might not have been possible, because too many other changes
to the spec might have stopped it from working efficiently. We have to
ensure that we do not cut ourselves off from other ideas like proxying
for the sake of an `optimisation' that turns out not to be one.]

Just a small out-of-band comment here. Most WWW people think that it is
a document system. It is not; or rather, that is a tiny fraction of its
potential. Here at CERN we want to run the biggest and most expensive
experiment ever using the networking structures and model of WWW. There
are people looking into building operating systems based on WWW
technology; it can potentially provide the `binding energy' between
many currently separate areas, in particular AI. We are no longer able to
decide on the <IMG> tag in the 9-hour period between Marc thinking about it
and finishing the implementation. People are demanding that we produce
industrial-quality specs and stick to them. In the past the spec could be
altered quickly and the code was the thing that took time; today the exact
opposite is true.

Phill H-B.