CERN httpd 2.16beta released

Ari Luotonen (luotonen@ptsun00.cern.ch)
Mon, 28 Feb 1994 16:26:36 --100


CERN httpd 2.16beta released, source:

ftp://info.cern.ch/pub/www/src/cern_httpd.tar.Z

This package includes EVERYTHING, so don't ftp the libwww. Also, DO
NOT use the libwww that comes with it for anything else, libwww 2.15
is NOT officially released yet.

Precompiled binaries are supplied for:

Sun4: ftp://info.cern.ch/pub/www/bin/sun4/httpd_2.16beta.Z
Solaris: ftp://info.cern.ch/pub/www/bin/solaris/httpd_2.16beta.Z
HP: ftp://info.cern.ch/pub/www/bin/snake/httpd_2.16beta.Z
NeXT: ftp://info.cern.ch/pub/www/bin/next/httpd_2.16beta.Z
Fat NeXT-386: ftp://info.cern.ch/pub/www/bin/next-386/httpd_2.16beta.Z
DecStation Ultrix: ftp://info.cern.ch/pub/www/bin/decstation/httpd_2.16beta.Z
DEC OSF/1: ftp://info.cern.ch/pub/www/bin/dec-osf1/httpd_2.16beta.Z

For other platforms I'm happy to receive diffs and bins. This release
is mainly for proxy caching, but a lot of other new features and fixes
have been included. I call this beta because of caching -- everything
else seems to be stable (I'm not saying caching isn't, but just to be
careful).

CERN HTTPD 2.16BETA RELEASE NOTES

* If you are upgrading from 2.15beta, you need to make no changes.
* If you are upgrading from 2.14, there is one single thing that
needs to be done:


Rename your old /htbin scripts to end in .pp suffix!
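
As an illustration only (not part of the distribution), the rename
could be scripted roughly as below. The directory you pass in is an
assumption; point it at wherever your /htbin scripts actually live.
The helper name add_pp_suffix is made up for this sketch, and it
defaults to a dry run so nothing is renamed until you ask for it.

```python
from pathlib import Path


def add_pp_suffix(script_dir, dry_run=True):
    """Plan (and optionally perform) renaming of scripts to end in .pp.

    Returns a list of (old_name, new_name) pairs. Files are only
    renamed on disk when dry_run is False.
    """
    planned = []
    for path in sorted(Path(script_dir).iterdir()):
        # Skip subdirectories and scripts that already carry the suffix.
        if not path.is_file() or path.name.endswith(".pp"):
            continue
        target = path.with_name(path.name + ".pp")
        if target.exists():
            # Never overwrite an existing file; leave this one alone.
            continue
        planned.append((path.name, target.name))
        if not dry_run:
            path.rename(target)
    return planned
```

Run it once with the default dry_run=True to review the planned
renames, then again with dry_run=False to apply them.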

Firewall Gateway (Proxy) Additions, Fixes

* ftp with binary files works
* x-compress and x-gzip work correctly over proxy
* Firewalling now works through an arbitrary number of proxies;
http_proxy, ftp_proxy, gopher_proxy and wais_proxy configuration
directives cause the proxy to connect to the outside world through
another proxy. Environment variables with the same names have the
same effect, but the config file is user-friendlier for this.
* Now sends all the headers sent by the client.
* Proxy log file now gives byte count.
* Proxy log file now gives correct status code also on error.
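
To make the proxy chaining concrete, here is a minimal sketch of how
a next hop might be picked per protocol. It is not httpd's code: the
function next_hop and the dict-based config are hypothetical, and the
precedence (config file over environment) is an assumption, since the
notes above only say both have the same effect.

```python
import os
from urllib.parse import urlsplit

# Protocols for which a *_proxy setting is described in the notes.
PROXIED_SCHEMES = ("http", "ftp", "gopher", "wais")


def next_hop(url, config=None, environ=None):
    """Return the upstream proxy URL for `url`, or None to connect directly.

    `config` stands in for parsed configuration directives and
    `environ` for the process environment (both hypothetical shapes).
    """
    config = config or {}
    environ = os.environ if environ is None else environ
    scheme = urlsplit(url).scheme.lower()
    if scheme not in PROXIED_SCHEMES:
        return None
    key = scheme + "_proxy"
    # Assumption: a config-file setting takes precedence over the
    # environment variable of the same name.
    return config.get(key) or environ.get(key) or None
```

For example, next_hop("http://info.cern.ch/", {"http_proxy":
"http://outer:8080/"}, {}) returns "http://outer:8080/", while a
scheme with no setting yields None, meaning a direct connection.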

Firewall Gateway (Proxy) Caching

* CacheRoot directive specifies cache root directory, and turns on
proxy caching. Cache root directory must be dedicated to httpd -
all files in there are subject to garbage collection.
* Cache size (in megabytes) is specified by CacheSize directive;
cache size should be several megabytes, 50-100MB should give good
results. Cache may, however, temporarily grow a few megabytes
bigger than specified. Also, space taken up by directories is not
calculated in the current version.
* http, ftp, gopher with GET method get cached.
* However, the following are not cached:
+ HTTP0 responses (you never know if it failed; also, confused
HTTP1 servers sometimes output garbage in front of HTTP1
headers).
+ Protected documents (request had Authorization: field).
+ Queries - they too often have side effects. (POST should
always be used with forms, and all script responses should
have Expires: header when necessary. Until then, we don't
cache them.)
* Expiry date is extracted:
+ From Expires: header.
+ If Expires: is not present, Last-Modified: is used to
approximate the expiry date. If a file hasn't changed in five
months the chances are it won't change during the next week. On
the other hand, if a file has changed yesterday, it will
probably change again pretty soon. I know this is a heuristic,
but until all the servers give Expires: this works much better
than not using it, so no flames about it.
+ If Last-Modified: is not given, the time given by the
CacheDefaultExpiry directive is used, default 7 days.
* Format of cache files and directory structure under cache root is
subject to change if necessary. No application should yet rely on
any certain cache format. Eventually I can see clients accessing
cache files directly, bypassing proxy server.
* Caching system understands both time formats, also the one output
by old NCSA httpds.
* Cache files get locked during transfer. Lock files time out if
something goes wrong. Timeout can be set by CacheLockTimeOut
directive (default 20 minutes). While the lock is in effect,
further requests to the same file get retrieved from the remote
host.
* Garbage collection directives:
+ GcMemoryUsage to advise gc about how radical to be in memory
use (more memory => smarter gc).
+ GcTimeInterval, how often to do gc.
+ GcReqInterval, after how many requests to do gc.
+ (gc is also automatically started if cache size limit is
reached.)
+ CacheLimit_1, size in KB until which files are equally
valuable despite their size (200K).
+ CacheLimit_2, size in KB after which files get discarded
because they are too big (4MB).
+ CacheClean, remove all files older than this (default 21
days).
+ CacheUnused, remove all files that have not been used in this
long time (default 14 days).
* Garbage collector always removes all expired, too long unused, and
too old files.
* If cache size limit is reached some files need to be sacrificed;
the current algorithm takes into account:
+ Time remaining to unconditional removal; if it expires
tomorrow it might as well be removed today.
+ Time last accessed; if it hasn't been accessed in 5 days, it
probably won't be accessed anymore before it expires.
+ Size; huge files get removed more easily.
+ Time it took to load it from the remote host; files that were
time-consuming to transfer have much higher value. This
compensates for the size factor. Load delay is the single most
significant value.
+ Time it has already been in cache; ancient files get removed
more easily than fresh ones.
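
The caching rules in this section can be summarised in a short
sketch. This is an illustration under stated assumptions, not the
server's code: the function names are made up, and the notes do not
say how much of a document's age is used when approximating expiry
from Last-Modified:, so the one-tenth figure (LM_DIVISOR) is purely
hypothetical. The weighted valuation used when the size limit is hit
is not attempted, because no weights are given above.

```python
from datetime import datetime, timedelta

CACHED_SCHEMES = {"http", "ftp", "gopher"}
DEFAULT_EXPIRY = timedelta(days=7)   # CacheDefaultExpiry default
CACHE_CLEAN = timedelta(days=21)     # CacheClean default
CACHE_UNUSED = timedelta(days=14)    # CacheUnused default
LM_DIVISOR = 10  # hypothetical: expire after 1/10 of the document's age


def is_cacheable(method, scheme, is_query, has_authorization,
                 is_http0_response):
    """Only GET over http/ftp/gopher; no queries, no protected
    documents, no HTTP0 responses."""
    return (method == "GET"
            and scheme in CACHED_SCHEMES
            and not is_query
            and not has_authorization
            and not is_http0_response)


def expiry_time(now, expires=None, last_modified=None):
    """Expires: wins; else approximate from Last-Modified:; else default."""
    if expires is not None:
        return expires
    if last_modified is not None:
        # Assumed heuristic: the longer a file has been unchanged,
        # the longer it is kept.
        return now + (now - last_modified) / LM_DIVISOR
    return now + DEFAULT_EXPIRY


def must_remove(now, expires, loaded, last_used):
    """Unconditional gc: expired, too old, or unused for too long."""
    return (expires <= now
            or now - loaded > CACHE_CLEAN
            or now - last_used > CACHE_UNUSED)
```

With these assumed numbers, a document last modified 150 days ago
would be kept for 15 days, whereas one with neither header falls back
to the 7-day default.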

Other New Features

* Error log file.
* Referer: field ends up in error log when a request fails.
* UserId and GroupId to set default uid and gid (used instead of
nobody and nogroup).
* Timeout for input and output; default time to wait for a request
is 2 minutes, and to send response 20 minutes. A timeout writes a
note to the error log and terminates the child (no more hanging
httpds). Note: the one zombie is normal; don't report to me about
it, I may do something about it some day, or maybe I won't. The
zombie doesn't take up any system resources other than the one
process table entry.
* Suffixes are no longer case-sensitive by default; this may be
changed via the SuffixCaseSense configuration directive.
* Lou Montulli's news and proxy diffs added to the library.
* Most command line options now also available as configuration
directives:
+ DirAccess
+ DirReadme
+ AccessLog
+ ErrorLog
+ LogFormat
+ LogTime
* -vv command line option for Very Verbose trace output. Also
outputs request headers as they came in. Otherwise like the -v flag.

Enhancements, Fixes

* NPH-scripts now work from automatically backgrounded standalone
server.
* Fixed the many problems with Content-Transfer-Encoding:
+ Mosaic uses Content-Encoding, although spec says
Content-Transfer-Encoding; I now output both
+ Content-Transfer-Encoding sometimes didn't show up although
it should have, fixed.
+ Content-Transfer-Encoding didn't come up correctly with ftp,
fixed.
* Strange escaping fixed with directory indexing (legal characters
got escaped randomly by a gcc-compiled version).
* Timezone bug around midnight with the new logfile format fixed.
(New logfile format is not yet default, use -newlog command line
option, or LogFormat directive in configuration file.)
* Dashes for non-existent status codes and byte counts now show up
correctly in the log.
* Forking code once again enhanced - fixed a possible hanging
situation.
* Log time fixed to be the time of incoming request, not the time of
request served.
* Zombies now correctly waited away on HP (this was in fact fixed
already in 2.15beta binaries distributed after February 17th;
note that this bug had no effect on any other platforms).
* Directory listings no longer have Content-Length: (because it was
wrong).
* Now also understands the old Accept: syntax, with spaces as
separators between the actual content-type and its parameters. This
will eventually be taken out.
* htadm now uses the same file creation mask as in the original
password file.

Code is Purify'd, and I truly wonder how anybody can live without that
marvellous piece of software. My productivity has doubled after I
started using it. Well done, Pure Software!

-- Cheers, Ari --