User Session Tracking
part of the ArsDigita Community System
by Philip Greenspun and Tracy Adams
What we said in the book
(where "the book" = Philip and Alex's Guide to Web Publishing)
In many areas of a community site, we will want to distinguish "new
since your last visit" content from "the stuff that you've already seen"
content. The obvious implementation of storing a single last_visit column is inadequate. Suppose that a user
arrives at the site and the ACS sets the last_visit column to the current date and time. HTTP is
a stateless protocol. If the user clicks to visit a discussion forum,
the ACS queries the users table and finds that the last
visit was 3 seconds ago. Consequently, none of the content will be
highlighted as new.
The ACS stores last_visit and second_to_last_visit columns. We take advantage of the
AOLserver filter facility to specify a Tcl program that runs before
every request is served. The program does the following:
IF a request comes with a user_id cookie, but the last_visit cookie is
either not present or more than one day old, THEN the filter proc
augments the AOLserver output headers with a persistent (expires in year
2010) set-cookie of last_visit to the current time (HTTP format). It
also grabs an Oracle connection, and sets
last_visit = sysdate,
second_to_last_visit = last_visit
We set a persistent second_to_last_visit cookie with the last_visit time, either from the last_visit cookie or, if
that wasn't present, with the value we just put into the second_to_last_visit column of the database.
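The filter logic described above can be sketched as follows. This is a minimal illustration in Python, not the ACS's actual Tcl filter: the function name visit_filter, the dictionary representation of the cookies and of the users row, and the use of datetime objects instead of HTTP-format strings are all assumptions made for the example. The one-day threshold is the one quoted from the book.

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)  # threshold quoted in the text above


def visit_filter(cookies, db_row, now):
    """Return (cookies_to_set, db_row_after) for one request.

    cookies: request cookies; visit cookies hold datetimes (an assumption).
    db_row:  the user's last_visit / second_to_last_visit columns as a dict.
    """
    if "user_id" not in cookies:
        # Non-registered users are handled separately (browser cookies only).
        return {}, db_row

    last = cookies.get("last_visit")
    if last is not None and now - last <= ONE_DAY:
        # Cookie is fresh: this request belongs to the current visit.
        return {}, db_row

    # Mirrors: last_visit = sysdate, second_to_last_visit = last_visit
    new_row = {"last_visit": now, "second_to_last_visit": db_row["last_visit"]}

    # Prefer the time from the cookie; fall back to the value just stored.
    previous = last if last is not None else new_row["second_to_last_visit"]
    return {"last_visit": now, "second_to_last_visit": previous}, new_row
```

With this shape, a click made 3 seconds after arrival returns no new cookies and leaves the row alone, so second_to_last_visit keeps pointing at the previous visit and "new since your last visit" content stays highlighted.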
We do something similar for non-registered users, using pure browser
cookies rather than the database.
Stuff that we've added since
A lot of arsdigita.com customers
wanted to know the total number of user sessions, the number of repeat
sessions, and how this was evolving over time. So we added:
- an n_sessions column in the users table.
- a table:
create table session_statistics (
    session_count   integer default 0 not null,
    repeat_count    integer default 0 not null,
    entry_date      date not null
);
- new code in ad-last-visits to stuff this table
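How ad-last-visits fills session_statistics is not spelled out here. The sketch below shows one plausible approach, assuming one row per entry_date that is incremented as sessions arrive. It uses Python's sqlite3 module against an in-memory database purely for illustration (the ACS runs on Oracle), and the helper names make_db and log_session are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the session_statistics table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """create table session_statistics (
               session_count integer default 0 not null,
               repeat_count  integer default 0 not null,
               entry_date    date not null
           )"""
    )
    return conn


def log_session(conn, entry_date, repeat):
    """Count one session for entry_date, and one repeat session if repeat is true."""
    bump = 1 if repeat else 0
    cur = conn.execute(
        "update session_statistics "
        "set session_count = session_count + 1, "
        "    repeat_count = repeat_count + ? "
        "where entry_date = ?",
        (bump, entry_date),
    )
    if cur.rowcount == 0:
        # Assumption: first session of the day creates that day's row.
        conn.execute(
            "insert into session_statistics "
            "(session_count, repeat_count, entry_date) values (1, ?, ?)",
            (bump, entry_date),
        )
```

Calling log_session(conn, "2000-01-10", repeat=True) once per qualifying request yields a per-day series from which total and repeat sessions over time can be read directly.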
Rules
| last_visit cookie present? | log a session | log repeat session | update last_visit cookie | update second_to_last_visit cookie |
| --- | --- | --- | --- | --- |
| Yes | Yes if date - last_visit > LastVisitExpiration | Yes if date - last_visit > LastVisitExpiration | Yes if date - last_visit > LastVisitUpdateInterval | Yes if date - last_visit > LastVisitExpiration |
| No | Yes if the IP address has not been seen in the last LastVisitCacheUpdateInterval seconds | No | Yes if the IP address has not been seen in the last LastVisitCacheUpdateInterval seconds | No |
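The rules in the table can be read as a single decision function. The following Python sketch is only a restatement of the table, not the ACS code: the names Actions and decide are made up for the example, times are plain seconds, and whether the IP address was seen recently is passed in as a boolean rather than looked up.

```python
from dataclasses import dataclass


@dataclass
class Actions:
    log_session: bool
    log_repeat_session: bool
    update_last_visit_cookie: bool
    update_second_to_last_visit_cookie: bool


def decide(last_visit_age, ip_seen_recently, update_interval, expiration):
    """Apply the rules table to one request.

    last_visit_age:   seconds since the last_visit cookie's time, or None
                      when the cookie is absent.
    ip_seen_recently: True if this IP address was seen within
                      LastVisitCacheUpdateInterval seconds.
    """
    if last_visit_age is not None:
        # Row "Yes": everything hinges on the age of the cookie.
        expired = last_visit_age > expiration
        return Actions(
            log_session=expired,
            log_repeat_session=expired,
            update_last_visit_cookie=last_visit_age > update_interval,
            update_second_to_last_visit_cookie=expired,
        )
    # Row "No": fall back to the IP address; never a repeat session.
    new_ip = not ip_seen_recently
    return Actions(
        log_session=new_ip,
        log_repeat_session=False,
        update_last_visit_cookie=new_ip,
        update_second_to_last_visit_cookie=False,
    )
```

For example, with a hypothetical LastVisitUpdateInterval of 600 seconds and LastVisitExpiration of 86400 seconds, a request whose cookie is 700 seconds old only refreshes the last_visit cookie and logs nothing.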
Upon login, a repeat session (but not an extra session) is logged
if the second_to_last_visit cookie is not present.
Logic: The user is a repeat user, since he is logging in
rather than registering.
He has either lost his cookies or is using a different
browser. On the first page load, the last_visit
cookie is set and a session is recorded. When the user logs in,
we learn that he is a repeat visitor
and log the repeat session. (If the user had been missing only the
user_id cookie, both the last_visit and second_to_last_visit
cookies would have been updated on the initial hit.)
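The login rule amounts to a one-line check. In this Python sketch, the function name record_login and the stats dictionary are hypothetical stand-ins for whatever counters the site keeps.

```python
def record_login(stats, cookies):
    """Apply the login rule to a dict with session_count and repeat_count."""
    if "second_to_last_visit" not in cookies:
        # The session itself was already counted on the first page load,
        # so only the repeat counter moves.
        stats["repeat_count"] += 1
    return stats
```

A user who arrives with no cookies therefore contributes one session on the first hit and one repeat session at login, never two sessions.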
Parameters
LastVisitUpdateInterval
- The last_visit cookie represents the date of the most recent visit, inclusive of the current visit. If the user remains on the site longer than the LastVisitUpdateInterval, the last_visit cookie is updated. The database stores the last_visit date as well, for user tracking and to display "who's online now".
LastVisitExpiration
- The minimum time interval separating 2 sessions.
LastVisitCacheUpdateInterval
- The period of time during which non-cookied hits from a single IP address are considered to come from the same user for the purpose of session tracking. (IP tracking and caching are necessary to avoid overcounting browsers that do not accept cookies.)
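To illustrate the role of LastVisitCacheUpdateInterval, here is a minimal Python sketch of an IP cache for non-cookied hits. The class name IpCache and its method are hypothetical, times are plain seconds, and the sketch assumes a sliding window in which every hit refreshes the entry; the ACS may well treat the interval differently.

```python
class IpCache:
    """Remembers when each IP address was last seen (times in seconds)."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds  # LastVisitCacheUpdateInterval
        self._last_seen = {}

    def seen_recently(self, ip, now):
        """Return True if ip was seen within the interval; record this hit."""
        previous = self._last_seen.get(ip)
        self._last_seen[ip] = now  # assumption: each hit refreshes the entry
        return previous is not None and now - previous <= self.interval
```

A hit for which seen_recently returns False would be logged as a new session; subsequent hits from the same address inside the interval would not be counted again.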
philg@mit.edu