<html>
<!--AD_DND-->
<head>
<title>User Session Tracking</title>
</head>

<body bgcolor=#ffffff text=#000000>
<h2>User Session Tracking</h2>

part of the <a href="index.html">ArsDigita Community System</a>
by <a href="http://photo.net/philg/">Philip Greenspun</a> and <a href=http://teadams.com>Tracy Adams</a>

<hr>

<ul>
<li>User-accessible directory:  none
<li>really important file:  /tcl/ad-last-visit.tcl
<li>Site administrator directory:  linked from 
<a href="/admin/users/">/admin/users/</a>
<li>data model: inside <a href="/doc/sql/display-sql.tcl?url=/doc/sql/community-core.sql">/doc/sql/community-core.sql</a>

</ul>

<h3>What we said in the book</h3>

<blockquote>
<i>
(where "the book" = <a href="http://photo.net/wtr/thebook/">Philip and
Alex's Guide to Web Publishing</a>)
</i>
</blockquote>
<p>

In many areas of a community site, we will want to distinguish "new
since your last visit" content from "the stuff that you've already seen"
content.  The obvious implementation of storing a single
<code>last_visit</code> column is inadequate.  Suppose that a user
arrives at the site and the ACS sets the <code>last_visit</code> column
to the current date and time.  <a href="http://photo.net/wtr/thebook/glossary.html#HTTP">HTTP</a> is
a stateless protocol.  If the user clicks to visit a discussion forum,
the ACS queries the <code>users</code> table and finds that the last
visit was 3 seconds ago.  Consequently, none of the content will be
highlighted as new.

<p>

The ACS stores <code>last_visit</code> and
<code>second_to_last_visit</code> columns.  We take advantage of the
AOLserver filter facility to specify a Tcl program that runs before
every request is served.  The program does the following:

<blockquote>

IF a request comes with a user_id cookie, but the last_visit cookie is
either not present or more than one day old, THEN the filter proc
augments the AOLserver output headers with a persistent (expires in year
2010) set-cookie of last_visit to the current time (HTTP format).  It
also grabs an Oracle connection, and sets

<blockquote>
<pre>
last_visit = sysdate, 
second_to_last_visit = last_visit
</pre>
</blockquote>

We set a persistent second_to_last_visit cookie with the
<code>last_visit</code> time, either from the last_visit cookie or, if
that wasn't present, with the value we just put into the
<code>second_to_last_visit</code> column of the database.

</blockquote>

We do something similar for non-registered users, using pure browser
cookies rather than the database.
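<p>

The rotation described above can be sketched in a few lines. This is an
illustration in Python, not the Tcl filter itself: the function names
<code>rotate_visit</code> and <code>is_new</code> are hypothetical, and the
idea that "new" content is whatever appeared after
<code>second_to_last_visit</code> is our reading of the text rather than
something taken from the ACS source.

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)  # the "more than one day old" threshold from the text


def rotate_visit(last_visit, second_to_last_visit, now):
    """Return the (last_visit, second_to_last_visit) pair after a request at `now`.

    The pair rotates only when no last_visit is recorded or the recorded one
    is more than a day old, so clicks within one session leave it untouched.
    """
    if last_visit is None or now - last_visit > ONE_DAY:
        return now, last_visit
    return last_visit, second_to_last_visit


def is_new(posted_at, second_to_last_visit):
    """Assumption: content is "new" if it was posted after the previous visit."""
    return second_to_last_visit is None or posted_at > second_to_last_visit
```

<p>

A click three seconds after arrival leaves both values alone, so content
posted since the previous visit still compares as new against
<code>second_to_last_visit</code>.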

<h3>Stuff that we've added since</h3>

A lot of <a href="http://arsdigita.com">arsdigita.com</a> customers
wanted to know the total number of user sessions, the number of repeat
sessions, and how this was evolving over time.  So we added:

<ul>
<li>an <code>n_sessions</code> column in the <code>users</code> table.
<li>a table:
<blockquote>
<pre><code>
create table session_statistics (
	session_count	integer default 0 not null,
	repeat_count	integer default 0 not null,
	entry_date	date not null
);
</code></pre>
</blockquote>
<li>new code in <code>/tcl/ad-last-visit.tcl</code> to populate this table
</ul>
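<p>

One way to keep such a table current is to bump the counters on the row for
today and create the row if it does not exist yet. The sketch below assumes
one row per <code>entry_date</code>, which the table definition suggests but
does not state. It uses Python's built-in SQLite in place of Oracle so that
it can run on its own, and the helper <code>record</code> is hypothetical.

```python
import sqlite3


def record(conn, entry_date, sessions, repeats):
    """Add `sessions` and `repeats` to the counters on the row for entry_date."""
    cur = conn.execute(
        "update session_statistics "
        "set session_count = session_count + ?, "
        "    repeat_count = repeat_count + ? "
        "where entry_date = ?",
        (sessions, repeats, entry_date),
    )
    if cur.rowcount == 0:
        # No row for this day yet: create it with the first counts.
        conn.execute(
            "insert into session_statistics "
            "(session_count, repeat_count, entry_date) values (?, ?, ?)",
            (sessions, repeats, entry_date),
        )


# In-memory stand-in for the real table, same columns as above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table session_statistics ("
    " session_count integer default 0 not null,"
    " repeat_count integer default 0 not null,"
    " entry_date date not null)"
)
```

<p>

Passing the two increments separately lets a caller log a repeat session
without logging an extra session, for example
<code>record(conn, "2000-01-01", 0, 1)</code>.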

<h3>Rules</h3>

<table border=1>
<tr>
<td>last_visit cookie present?</td>
<td>log a session</td>
<td>log a repeat session</td>
<td>update last_visit cookie</td>
<td>update second_to_last_visit cookie</td>
</tr>
<tr>
<td>Yes</td>
<td>Yes if date - last_visit &gt; LastVisitExpiration</td>
<td>Yes if date - last_visit &gt; LastVisitExpiration</td>
<td>Yes if date - last_visit &gt; LastVisitUpdateInterval</td>
<td>Yes if date - last_visit &gt; LastVisitExpiration</td>
</tr>
<tr>
<td>No</td>
<td>Yes if the IP address has not been seen in the last LastVisitCacheUpdateInterval seconds</td>
<td>No</td>
<td>Yes if the IP address has not been seen in the last LastVisitCacheUpdateInterval seconds</td>
<td>No</td>
</tr>
</table>
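<p>

The table can be read as a single decision function. The following Python
sketch is only a restatement of the rules for checking one's understanding;
the names <code>Actions</code> and <code>decide</code> are hypothetical, and
the real logic lives in the Tcl filter.

```python
from dataclasses import dataclass


@dataclass
class Actions:
    log_session: bool
    log_repeat_session: bool
    update_last_visit_cookie: bool
    update_second_to_last_visit_cookie: bool


def decide(now, last_visit, ip_seen_recently, update_interval, expiration):
    """Apply the rules table to one request.

    now, last_visit: seconds since the epoch; last_visit is None when the
        last_visit cookie is absent.
    ip_seen_recently: True if this IP address was seen within the last
        LastVisitCacheUpdateInterval seconds (only consulted without a cookie).
    update_interval, expiration: LastVisitUpdateInterval and
        LastVisitExpiration, in the same unit as now.
    """
    if last_visit is not None:
        elapsed = now - last_visit
        expired = elapsed > expiration
        return Actions(expired, expired, elapsed > update_interval, expired)
    fresh_ip = not ip_seen_recently
    return Actions(fresh_ip, False, fresh_ip, False)
```

<p>

For example, with illustrative values of 600 for
<code>update_interval</code> and 86400 for <code>expiration</code> (these
numbers are made up for the example, not documented defaults), a hit 700
seconds after <code>last_visit</code> refreshes only the
<code>last_visit</code> cookie.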
<P>

Upon login, a repeat session (but not an extra session) is logged
if the <code>second_to_last_visit</code> cookie is not present.
The logic: the user must be a repeat user, since they are logging in
rather than registering, so they have either lost their cookies or are
using a different browser.  On the first page load, the
<code>last_visit</code> cookie is set and a session is recorded.  When
the user logs in, we learn that they are a repeat visitor
and log the repeat session.  (If the user had only been missing a
<code>user_id</code> cookie, both the <code>last_visit</code>
and <code>second_to_last_visit</code> cookies would have been updated on
the initial hit.)


<h3>Parameters</h3>

<ul>
<li><code>LastVisitUpdateInterval</code> - The <code>last_visit</code> cookie represents the date of the most recent visit, inclusive of the current visit. If the user remains on the site longer than the <code>LastVisitUpdateInterval</code>, the <code>last_visit</code> cookie is updated.  The database also stores the <code>last_visit</code> date, for user tracking and to display "who's online now".
<li><code>LastVisitExpiration</code> - The minimum time interval separating two sessions.
<li><code>LastVisitCacheUpdateInterval</code> - The period of time, in seconds, during which non-cookied hits from a single IP address are considered to come from the same user for the purpose of session tracking. (IP tracking and caching are necessary to avoid overcounting browsers that do not accept cookies.)
</ul>
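<p>

To make the role of <code>LastVisitCacheUpdateInterval</code> concrete, here
is a minimal Python sketch of an IP cache. The class
<code>RecentIPCache</code> is hypothetical, and it assumes that the window
starts at the first non-cookied hit and is not extended by later hits; the
actual cache in the ACS may behave differently.

```python
class RecentIPCache:
    """Remember when each IP address last started a session."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds  # LastVisitCacheUpdateInterval
        self._last_seen = {}              # ip -> time of the hit that was counted

    def seen_recently(self, ip, now):
        """Return True if ip was counted within the interval; otherwise record it."""
        last = self._last_seen.get(ip)
        if last is not None and now - last <= self.interval:
            return True
        # Assumption: only a hit that falls outside the window resets the clock.
        self._last_seen[ip] = now
        return False
```

<p>

A browser that refuses cookies then produces one logged session per window
rather than one per page load.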
<hr>
<a href="http://photo.net/philg/"><address>philg@mit.edu</address></a>
</body>
</html>