Our research confirms that techniques do exist for determining
sessions from existing access logs. Without getting into
proprietary techniques or results, we can confirm the statistical
validity of a session boundary of roughly 20 minutes, as measured by
actual client-side monitoring. Our study used 25.5 minutes, which was
1.5 standard deviations (generous) away from the mean, with low
variance. See this really long URL for the online version of some of
our published findings:
<URL:http://www.gatech.edu/lcc/idt/Students/Catledge/browsing/
UserPatterns.Paper4.formatted.html>
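
To make the timeout idea concrete, here is a minimal sketch in Python. It is
not the proprietary technique mentioned above, only the plain "gap longer than
the cutoff starts a new session" rule. The function names, the
(host, timestamp-in-seconds) input shape, and the use of the sample standard
deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def timeout_from_gaps(gaps_minutes, k=1.5):
    """Cutoff = mean gap between requests plus k sample standard deviations.

    k=1.5 mirrors the "1.5 standard deviations" figure in the text; the
    choice of sample (not population) deviation is an assumption.
    """
    return mean(gaps_minutes) + k * stdev(gaps_minutes)


def sessionize(hits, timeout_minutes=25.5):
    """Group (host, timestamp_seconds) hits into per-host sessions.

    A new session starts whenever the gap since the same host's previous
    request is longer than the timeout. Hypothetical helper, not a
    published algorithm.
    """
    by_host = defaultdict(list)
    for host, ts in hits:
        by_host[host].append(ts)

    limit = timeout_minutes * 60  # compare in seconds
    sessions = {}
    for host, stamps in by_host.items():
        stamps.sort()
        groups = [[stamps[0]]]
        for prev, cur in zip(stamps, stamps[1:]):
            if cur - prev > limit:
                groups.append([cur])    # gap too long: open a new session
            else:
                groups[-1].append(cur)  # same session continues
        sessions[host] = groups
    return sessions
```

Note that hits are keyed by host only, so a proxied domain would show up as
one very busy "user"; that is exactly the ambiguous case discussed next.
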
Moreover, the percent error for ambiguous cases can also be
computed, along with confidence intervals, to quantify the loss
involved in these determinations. Thus, errors that stem from
anomalous browsing patterns from proxied domains can be contained.
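
As a rough illustration of the error estimate, the sketch below treats each
ambiguous case as a right-or-wrong trial checked against known (client-side)
sessions and reports the error rate with a confidence interval. The text does
not say which interval is used; the Wilson score interval, the default z=1.96
(about 95%), and the function name are assumptions.

```python
from math import sqrt


def error_rate_with_ci(errors, total, z=1.96):
    """Return (error_rate, lower, upper) using the Wilson score interval.

    errors: ambiguous cases assigned to the wrong session
    total:  all ambiguous cases checked against known sessions
    """
    if total <= 0 or not 0 <= errors <= total:
        raise ValueError("need 0 <= errors <= total and total > 0")
    p = errors / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return p, max(0.0, centre - half), min(1.0, centre + half)
```

Multiply the three returned values by 100 to express them as percent error.
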
An underlying assumption of Terry's research is that humans
can accurately determine paths in a post-hoc manner. Unless this is
proven via a study that takes known sessions and tests the ability
of humans to determine sessions, we will not be able to rely upon this
assumption. Note that we could run such a study with our datasets.