which is not completely true. I never use a log analyzer. I look at the
file, and when I want some particular information I use grep. That works
great for me, and I don't need to learn how to use a log analyzer just to
generate the info Ari wants to put in the logfile. But I'm only
a small user. On the other hand, large sites may very well be interested
in saving disk space (except for the very, very large sites :-))
>
> 2) The content of the data is only known at the time the log entry is
> made. Any information that was not written at that time is lost to
> any later analysis. Thus, it is usually preferable to log everything
> and let the log analysis program choose what should be ignored.
>
Why? If I choose to write only a limited set of fields to the log, then I know
I cannot retrieve the other information, but it's my choice and my responsibility.
> 3) Every time the configuration changes, the old log file must be deleted.
> This is because any log analyzer will only be able to understand one
> log format at a time.
>
> 4) Special formatting conventions (like the square brackets surrounding
> the date field in NCSA httpd logs) make it much easier for analyzers
> to parse the data and identify mangled entries -- a condition which
> occurs quite often with NCSA httpd.
>
> 5) It makes it slightly harder for people like me to write and test a
> simple log analyzer program.
>
3), 4) and 5) are irrelevant to a guy like me who simply wants to browse the
logfile and has only a very small local disk.
3) can be solved by writing the configuration change into the logfile,
which is also handy for 5), so you will know at all times what the
format is and which fields are written in the logfile.
I.e. the same string that does the formatting in Ari's program should
be present in the logfile or, if you are willing to forget about 3),
should be made available to the log analyzer. That makes the '-' you
suggest below unnecessary.
- frans
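
The idea of recording the format string in the logfile itself can be
sketched roughly as follows. This is only an illustration under stated
assumptions: the "#Format:" line prefix and the FIELD_NAMES table are
hypothetical and not taken from any server; the field codes %Y and %C
and the sample host come from the quoted message.

```python
# Hypothetical sketch: a log that records its own format string whenever
# the configuration changes. The "#Format:" prefix is an assumption made
# for illustration only.

FIELD_NAMES = {"%Y": "year", "%I": "remote_ident", "%C": "client_host"}

def parse_log(lines):
    """Yield one dict per entry, using the most recent format line."""
    fields = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Format:"):
            # A configuration change: switch to the new field order.
            codes = line[len("#Format:"):].split()
            fields = [FIELD_NAMES[code] for code in codes]
            continue
        if fields is None or not line:
            continue  # no format seen yet, or a blank line
        values = line.split()
        if len(values) != len(fields):
            continue  # entry does not match the current format; skip it
        yield dict(zip(fields, values))

sample = [
    "#Format: %Y %C",
    "1994 simplon.ics.uci.edu",
    "#Format: %C %Y",
    "simplon.ics.uci.edu 1994",
]
entries = list(parse_log(sample))
```

With a format line in place, the old entries need not be deleted when
the configuration changes: both entries in the sample parse to the same
fields even though the field order differs.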
>
> Having said all that, I still think that it may be a good idea providing
> that the above concerns are addressed (i.e. I have faith that the server
> authors will go out of their way to make my life easier, providing that
> I let them know what will make my life easier). Although I personally
> would prefer a fixed format, I am willing to go with the flow.
>
> In that spirit, let me propose that some generic indicator (such as "-")
> be used for any field which is desired by the configuration string (or by
> the fixed format) but is unknown or empty for a particular log entry.
> Thus, if the configurable string indicates REMOTE_IDENT should be logged
> between FULL YEAR and CLIENT HOST ADDRESS (as in "%Y %I %C"), and
> REMOTE_IDENT is empty, then the output should be like:
>
> "1994 - simplon.ics.uci.edu"
>
> rather than
>
> "1994 simplon.ics.uci.edu"
>
> for reasons that should be obvious to most hackers.
>
>
> ....Roy Fielding ICS Grad Student, University of California, Irvine USA
> (fielding@ics.uci.edu)
> <A HREF="http://www.ics.uci.edu/dir/grad/Software/fielding">About Roy</A>
>
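
The reason for the "-" placeholder can be shown with a short sketch. It
assumes an analyzer that splits entries on whitespace and expects the
three fields of "%Y %I %C" in order; the FORMAT list and the parse
function are hypothetical names introduced only for this example.

```python
# Hypothetical analyzer for the "%Y %I %C" format from the message above.
FORMAT = ["year", "remote_ident", "client_host"]

def parse(entry):
    """Split an entry on whitespace and map the values to field names.

    Returns None when the number of values does not match the format,
    because the analyzer cannot tell which field is missing.
    """
    values = entry.split()
    if len(values) != len(FORMAT):
        return None
    # "-" marks a field that was configured but unknown or empty.
    return {name: (None if value == "-" else value)
            for name, value in zip(FORMAT, values)}

with_placeholder = parse("1994 - simplon.ics.uci.edu")
without_placeholder = parse("1994 simplon.ics.uci.edu")
```

With the placeholder, every entry has the same number of fields, so the
host name stays in the third position. Without it, the entry has only
two values and a positional parser must either reject it or guess.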