Server Clustering
- Tcl: /tcl/ad-server-cluster.tcl
The Problem
Many heavily-hit sites sit behind load balancers, which means that requests to a particular
site can be handled by any one of several machines conspiring to appear as a single server.
For instance, requests to www.foobar.com might be routed to any of
www1.foobar.com, www2.foobar.com, or www3.foobar.com,
three physically separate servers which share an Oracle tablespace (and
hence all the data in ACS).
Many database queries are memoized in individual servers' local memory
(using the util_memoize procedures) to minimize fetches from the database.
When a server updates an item in the database, the
old item needs to be removed from the server's local cache (using util_memoize_flush)
to force a database query the next time this item is accessed. But what happens when:
- www1.foobar.com does util_memoize "get_greeble_info 43" (incurring an actual
database lookup, SELECT * FROM greeble WHERE greeble_id = 43, and caching the result)
- www2.foobar.com does util_memoize "get_greeble_info 43" (incurring a
database lookup and caching the result)
- www1.foobar.com UPDATEs the info for greeble #43 and does
util_memoize_flush "get_greeble_info 43"
- www2.foobar.com does util_memoize "get_greeble_info 43" (returning a cached
value). The old info for greeble #43 hasn't been flushed from www2's local cache, so the result
is outdated!
In general, if any of several servers can
update an item, the old version of the item can remain in other servers' local caches.
Doh!
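The sequence above can be reproduced with a small simulation. This is only a sketch in Python, not ACS code: the Server class, its methods, and the dictionary standing in for the shared database are all hypothetical names chosen for illustration.

```python
# Toy model of per-server memoization; every name here is illustrative only.
database = {43: "old greeble info"}  # stands in for the shared Oracle tablespace

class Server:
    def __init__(self, name):
        self.name = name
        self.cache = {}      # local memoization cache, private to this server
        self.db_hits = 0     # number of real database lookups performed

    def memoize_get(self, greeble_id):
        """Return cached info, querying the shared database only on a miss."""
        if greeble_id not in self.cache:
            self.db_hits += 1
            self.cache[greeble_id] = database[greeble_id]
        return self.cache[greeble_id]

    def update(self, greeble_id, info):
        """Write to the shared database and flush only this server's cache."""
        database[greeble_id] = info
        self.cache.pop(greeble_id, None)

www1, www2 = Server("www1"), Server("www2")
www1.memoize_get(43)                  # www1 looks up and caches the old info
www2.memoize_get(43)                  # www2 looks up and caches the old info
www1.update(43, "new greeble info")   # flushes www1's cache only
fresh = www1.memoize_get(43)          # www1 re-queries the database
stale = www2.memoize_get(43)          # www2 answers from its untouched cache
print(fresh, "|", stale)
```

Because the flush only touches the cache of the server that performed the update, www2 keeps serving the old value until something tells it to flush too.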
The Solution
We introduce the concept of a
server cluster, a group of look-alike servers sharing an Oracle tablespace.
To set up a cluster, add the following to the ACS
parameters/yourservername.ini file on each
of the servers in the cluster:
; address information for a cluster of load-balanced servers (to enable
; distributed util_memoize_flushing, for instance).
[ns/server/yourservername/acs/server-cluster]
; is clustering enabled?
ClusterEnabledP=1
; which machines can issue requests (e.g., flushing) to the cluster?
ClusterAuthorizedIP=192.168.16.*
; which servers are in the cluster? This server's IP may be included too
ClusterPeerIP=192.168.16.1
ClusterPeerIP=192.168.16.2
ClusterPeerIP=192.168.16.3
; N.B.: www1 = 192.168.16.1, www2 = 192.168.16.2, www3 = 192.168.16.3
; log clustering events?
EnableLoggingP=1
(Of course, you'll want to replace the IP addresses with the actual IPs
of the hosts in the cluster.)
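To get a feel for what the ClusterAuthorizedIP pattern admits, here is a rough Python sketch. It assumes the pattern is matched glob-style against the requesting machine's address, which the text above does not state; the function name authorized_p is hypothetical and is not the name of any ACS procedure.

```python
from fnmatch import fnmatchcase

# Values copied from the sample configuration above.
CLUSTER_AUTHORIZED_IP = ["192.168.16.*"]
CLUSTER_PEER_IP = ["192.168.16.1", "192.168.16.2", "192.168.16.3"]

def authorized_p(peer_addr, patterns=CLUSTER_AUTHORIZED_IP):
    """Return True if peer_addr matches any authorized pattern (glob-style, an assumption)."""
    return any(fnmatchcase(peer_addr, pattern) for pattern in patterns)

# Every configured peer should be allowed to issue cluster requests.
peers_ok = all(authorized_p(ip) for ip in CLUSTER_PEER_IP)
# Addresses outside the 192.168.16 subnet should be refused.
outsider_ok = authorized_p("10.0.0.5")
lookalike_ok = authorized_p("192.168.160.2")
print(peers_ok, outsider_ok, lookalike_ok)
```

Under that assumption, a sanity check like this is a quick way to confirm that every ClusterPeerIP entry falls inside the authorized range before restarting the servers.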
Now when a server (say, www1.foobar.com) invokes
util_memoize_flush or util_memoize_seed, those routines use
server_cluster_httpget_from_peers
to issue an HTTP GET request to all machines in the cluster (omitting the local server):
- GET http://www2.foobar.com/SYSTEM/flush-memoized-statement.tcl?statement=tcl-statement
- GET http://www3.foobar.com/SYSTEM/flush-memoized-statement.tcl?statement=tcl-statement
causing the other machines (www2.foobar.com and www3.foobar.com) to flush the Tcl statement
from their local caches. This is transparent and works with all existing code.
So don't think about it - just set up the server-cluster block in your yourservername.ini file,
and util_memoize and friends will be happy.
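The fan-out described above can be sketched as follows. This is a Python illustration under stated assumptions, not the implementation of server_cluster_httpget_from_peers: peer_flush_urls, fake_httpget, and the in-memory caches are hypothetical stand-ins, and no real HTTP request is made. Only the URL path and the statement parameter come from the requests shown earlier.

```python
from urllib.parse import urlencode

# Path taken from the GET requests shown above.
FLUSH_PATH = "/SYSTEM/flush-memoized-statement.tcl"

def peer_flush_urls(peers, local_host, statement):
    """Build one flush URL per peer, skipping the server that issued the flush."""
    query = urlencode({"statement": statement})  # URL-escapes the Tcl statement
    return [
        f"http://{peer}{FLUSH_PATH}?{query}"
        for peer in peers
        if peer != local_host
    ]

# Hypothetical in-memory stand-ins for each peer's local memoization cache.
caches = {
    "www1.foobar.com": {},
    "www2.foobar.com": {"get_greeble_info 43": "old info"},
    "www3.foobar.com": {"get_greeble_info 43": "old info"},
}

def fake_httpget(url, statement):
    """Pretend to deliver the GET: the receiving peer drops the statement from its cache."""
    host = url.split("/")[2]
    caches[host].pop(statement, None)

statement = "get_greeble_info 43"
urls = peer_flush_urls(list(caches), "www1.foobar.com", statement)
for url in urls:
    fake_httpget(url, statement)
    print("GET", url)
```

After the loop, neither www2 nor www3 holds the old entry, so the next util_memoize call on either machine would go back to the database.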
jsalz@mit.edu