ArsDigita Archives
 
 
   
 

User Session Tracking

part of the ArsDigita Community System by Philip Greenspun and Tracy Adams

What we said in the book

(where "the book" = Philip and Alex's Guide to Web Publishing)

In many areas of a community site, we will want to distinguish "new since your last visit" content from "the stuff that you've already seen" content. The obvious implementation of storing a single last_visit column is inadequate. Suppose that a user arrives at the site and the ACS sets the last_visit column to the current date and time. HTTP is a stateless protocol. If the user clicks to visit a discussion forum, the ACS queries the users table and finds that the last visit was 3 seconds ago. Consequently, none of the content will be highlighted as new.

The ACS stores last_visit and second_to_last_visit columns. We take advantage of the AOLserver filter facility to specify a Tcl program that runs before every request is served. The program does the following:

  • IF a request comes with a user_id cookie, but the last_visit cookie is either not present or more than one day old, THEN the filter proc augments the AOLserver output headers with a persistent (expires in year 2010) set-cookie of last_visit to the current time (HTTP format). It also grabs an Oracle connection, and sets

        last_visit = sysdate,
        second_to_last_visit = last_visit

  • We set a persistent second_to_last_visit cookie with the last_visit time, either from the last_visit cookie or, if that wasn't present, with the value we just put into the second_to_last_visit column of the database.
  • We do something similar for non-registered users, using pure browser cookies rather than the database.
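
The rotation described above can be sketched in a few lines. This is a minimal illustration in Python, not the ACS filter itself (which is a Tcl proc running inside AOLserver); the names `rotate_visits` and `VisitUpdate` are hypothetical, and the one-day threshold is taken from the text.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional

ONE_DAY = timedelta(days=1)  # "more than one day old", per the text


class VisitUpdate(NamedTuple):
    """Hypothetical container for the values the filter would write."""
    last_visit_cookie: datetime
    second_to_last_visit_cookie: Optional[datetime]
    db_last_visit: datetime
    db_second_to_last_visit: Optional[datetime]


def rotate_visits(now: datetime,
                  cookie_last_visit: Optional[datetime],
                  db_last_visit: Optional[datetime]) -> Optional[VisitUpdate]:
    """Return the new cookie/database values, or None if nothing changes."""
    if cookie_last_visit is not None and now - cookie_last_visit <= ONE_DAY:
        return None  # still the same visit: leave everything alone

    # Database: second_to_last_visit takes the old last_visit, last_visit takes now.
    new_db_second = db_last_visit

    # Cookie: prefer the old last_visit cookie; fall back to the value
    # just stored in the second_to_last_visit column.
    if cookie_last_visit is not None:
        new_cookie_second = cookie_last_visit
    else:
        new_cookie_second = new_db_second

    return VisitUpdate(now, new_cookie_second, now, new_db_second)
```

A request three seconds after the previous one returns `None`, so "new since your last visit" highlighting keeps working for the rest of the visit.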

Stuff that we've added since

A lot of arsdigita.com customers wanted to know the total number of user sessions, the number of repeat sessions, and how this was evolving over time. So we added:
  • an n_sessions column in the users table.
  • a table:
    
    create table session_statistics (
    	session_count	integer default 0 not null,
    	repeat_count	integer default 0 not null,
    	entry_date	date not null
    );
    
  • new code in ad-last-visits to stuff this table
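
As a rough sketch of how such a table might be maintained, the following Python uses an in-memory SQLite database in place of Oracle. It assumes one row per calendar day, which the table definition suggests but does not state; the helper names `create_table` and `bump_statistics` are hypothetical and are not the ad-last-visits code.

```python
import sqlite3
from datetime import date


def create_table(conn: sqlite3.Connection) -> None:
    """Create the statistics table (same columns as the Oracle definition)."""
    conn.execute(
        "create table session_statistics ("
        " session_count integer default 0 not null,"
        " repeat_count integer default 0 not null,"
        " entry_date date not null)"
    )


def bump_statistics(conn: sqlite3.Connection, day: date,
                    sessions: int = 1, repeats: int = 0) -> None:
    """Add to the counters for `day`, inserting the row if it is missing.

    Assumption: one row per day, keyed by entry_date.
    """
    key = day.isoformat()
    cur = conn.execute(
        "update session_statistics"
        " set session_count = session_count + ?,"
        "     repeat_count = repeat_count + ?"
        " where entry_date = ?",
        (sessions, repeats, key),
    )
    if cur.rowcount == 0:  # no row for this day yet
        conn.execute(
            "insert into session_statistics"
            " (session_count, repeat_count, entry_date) values (?, ?, ?)",
            (sessions, repeats, key),
        )
    conn.commit()
```

Keeping the two increments separate lets a caller log a repeat session without logging an extra session, e.g. `bump_statistics(conn, day, sessions=0, repeats=1)`.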

Rules

If the last_visit cookie is present:
  • log a session: yes, if date - last_visit > LastVisitExpiration
  • log a repeat session: yes, if date - last_visit > LastVisitExpiration
  • update the last_visit cookie: yes, if date - last_visit > LastVisitUpdateInterval
  • update the second_to_last_visit cookie: yes, if date - last_visit > LastVisitExpiration

If the last_visit cookie is not present:
  • log a session: yes, if the IP address has not been seen in the last LastVisitCacheUpdateInterval seconds
  • log a repeat session: no
  • update the last_visit cookie: yes, if the IP address has not been seen in the last LastVisitCacheUpdateInterval seconds
  • update the second_to_last_visit cookie: no
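
The rules above amount to a small decision function. The sketch below is illustrative Python, not the ACS Tcl code; `Actions` and `decide` are hypothetical names, and the caller is assumed to supply the parameter values and the answer to "has this IP address been seen recently?".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Actions:
    log_session: bool
    log_repeat_session: bool
    update_last_visit_cookie: bool
    update_second_to_last_visit_cookie: bool


def decide(now: datetime,
           last_visit: Optional[datetime],
           ip_seen_recently: bool,
           expiration: timedelta,
           update_interval: timedelta) -> Actions:
    """Apply the rules for one request.

    last_visit is the value of the last_visit cookie, or None if absent.
    ip_seen_recently says whether this IP address was seen within
    LastVisitCacheUpdateInterval seconds (only used when the cookie is absent).
    """
    if last_visit is not None:
        elapsed = now - last_visit
        new_session = elapsed > expiration
        return Actions(
            log_session=new_session,
            log_repeat_session=new_session,
            update_last_visit_cookie=elapsed > update_interval,
            update_second_to_last_visit_cookie=new_session,
        )
    # No cookie: count the hit once per IP address per cache interval.
    first_hit = not ip_seen_recently
    return Actions(
        log_session=first_hit,
        log_repeat_session=False,
        update_last_visit_cookie=first_hit,
        update_second_to_last_visit_cookie=False,
    )
```

With, say, a one-day expiration and a ten-minute update interval (illustrative values only), a cookied hit 30 minutes after the last one refreshes the last_visit cookie but logs nothing.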

Upon login, a repeat session (but not an extra session) is logged if the second_to_last_visit cookie is not present. Logic: the user is a repeat user, since he is logging in instead of registering. He has either lost his cookies or is using a different browser. On the first page load, the last_visit cookie is set and a session is recorded. When the user logs in, we learn that he is a repeat visitor and log the repeat session. (If the user had been missing only a user_id cookie, both the last_visit and second_to_last_visit cookies would have been updated on the initial hit.)

Parameters

  • LastVisitUpdateInterval - The last_visit cookie represents the date of the most recent visit, inclusive of the current visit. If the user remains on the site longer than the LastVisitUpdateInterval, the last_visit cookie is updated. The database stores the last_visit date as well, for user tracking and to display "who's online now".
  • LastVisitExpiration - The minimum time interval separating two sessions.
  • LastVisitCacheUpdateInterval - The period during which non-cookied hits from a single IP address are considered to come from the same user for the purpose of session tracking. (IP tracking and caching are necessary to avoid overcounting browsers that do not accept cookies.)
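
To make the role of LastVisitCacheUpdateInterval concrete, here is a minimal sketch of an IP cache in Python. It is an assumption about how such a cache could behave, not the ACS implementation: `RecentIPCache` is a hypothetical name, the window slides with each hit, and old entries are never evicted, which a real server would need to handle.

```python
from typing import Dict


class RecentIPCache:
    """Remember when each IP address was last seen (times in seconds)."""

    def __init__(self, interval_seconds: float) -> None:
        self.interval_seconds = interval_seconds
        self._last_seen: Dict[str, float] = {}

    def seen_recently(self, ip: str, now: float) -> bool:
        """Return True if `ip` was seen within the interval.

        The current hit is recorded either way, so the window is
        measured from the most recent hit (an assumption).
        """
        previous = self._last_seen.get(ip)
        self._last_seen[ip] = now
        return previous is not None and now - previous < self.interval_seconds
```

A non-cookied hit would be counted as a new session only when `seen_recently` returns False for its IP address.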

philg@mit.edu