Building Scalable eBusiness Solutions with ArsDigita
by Bryan Che (bryanche@arsdigita.com), Tracy Adams (teadams@arsdigita.com), Bruce Keilin (brucek@arsdigita.com)
Submitted on: 2001-03-08
Last updated: 2001-04-25
Overview
A successful eBusiness service solution must not only meet the
functional business requirements that define its client's needs but also
scale reliably and economically to meet the demand for that service.
Two important elements in providing a scalable system are:
- building upon base software components that have been proven and
tested to offer solid performance and scalability
- using a hardware architecture that can scale either through
incremental field upgrades to the existing hardware or through the
addition of hardware to the architecture.
By extending a scalable software base, the ArsDigita Community System
(ACS), and deploying a scalable hardware architecture, ArsDigita-built
solutions are able to realize throughput and performance gains without
requiring custom software changes.
ArsDigita-built Web sites, based on the ACS, have sustained up to 49
page views per second in testing, a rate of 4.23 million page views per
day, with download times averaging less than four seconds. ArsDigita
produced these results for a
client Web site using 500 simultaneous users, with each user waiting
five seconds between page requests. The virtual users followed traffic
patterns based on actual server logs from the production Web server. The
tested hardware configuration was one eight-CPU Sun E4500 with eight GB
of RAM as the database server and four one-CPU Sun Netras, each with 256
MB of RAM, as the Web servers. Adding more hardware could further
increase the performance of the tested Web site.
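The arithmetic behind these figures can be checked with a short sketch. The constants come from the paragraph above; the function names are hypothetical, and the ceiling calculation assumes a simple closed-loop model in which each virtual user waits for a page and then pauses for the think time before requesting the next one.

```python
# Figures quoted in the text above.
PAGES_PER_SECOND = 49
SIMULATED_USERS = 500
THINK_TIME_S = 5.0
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_day(pages_per_second: float) -> float:
    """Convert a sustained per-second rate into a per-day rate."""
    return pages_per_second * SECONDS_PER_DAY


def throughput_ceiling(users: int, think_time_s: float) -> float:
    """Upper bound on pages/sec for a closed loop of users.

    Assumption: every page loads instantly, so each user issues one
    request per think-time interval. Real throughput is always lower.
    """
    return users / think_time_s


if __name__ == "__main__":
    daily = pages_per_day(PAGES_PER_SECOND)
    print(f"{daily / 1e6:.2f} million page views per day")
    ceiling = throughput_ceiling(SIMULATED_USERS, THINK_TIME_S)
    print(f"ceiling for this user population: {ceiling:.0f} pages/sec")
```

Under this simplified model, 500 users with a five-second pause could request at most 100 pages per second, so the measured 49 pages per second sits below what the simulated population could demand.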
The ArsDigita scaling lab runs performance tests, and it benchmarks both
the ACS and ACS-based sites against client-specified load
requirements. This facility provides ArsDigita with the ability to
determine hardware requirements for deployment and also to proactively
discover and address performance and scalability bottlenecks well before
they reach production.
Production Examples
Photo.net (http://photo.net), running on
one Sun E450 with four CPUs, four GB of RAM, four mirrored 17-GB RAID
and one mirrored eight-GB RAID, achieves the following performance:
- 633,000 unique visitors/month (21,000 unique visitors/day)
- 11.43-minute average session length
- 8,000,000 impressionable page views/month (270,000 page views/day)
- 1,500 classifieds postings/month
- 14,000 discussion forum postings/month
- 7,500 high-end photo uploads/month
- 7 categories and 4,500 postings in the "User Merchant Ratings" forum
Other example Web sites include the following:
- ILuvCamp.com supported 1.5 million hits/day with one Sun E450 and six
Sun Netras within six months of the start of development.
- Away.com serves 300,000 page views
daily (15% of them text-based searches) to a registered membership of
over 400,000 users with 2 Sun E450s and 3 Sun Netras.
- Scorecard.org, UsLaw.com, GuideStar.org, and our Boston
Marathon results sites received similar loads during peak periods of
publicity.
Scalability Testing
ArsDigita maintains a scaling lab in its Cambridge headquarters. It uses
this lab to help continually build up scaling knowledge and improve the
ACS. ArsDigita offers a scaling service using this lab to identify
hardware requirements for a client's particular needs. By testing Web
sites in this lab, ArsDigita is able to:
- evaluate a Web site to ensure that it meets scalability requirements
- recommend hardware and configuration parameters
- find, understand, and work around bottlenecks
- test long-term stability under load
- iteratively design, develop and test for scalability
Scalability Results
With four Sun Netras (one CPU, 256 MB of RAM each) as Web servers, one
Netra serving images and one Sun E4500 (eight CPUs, eight GB of RAM) database
server, ArsDigita scaled one of its sites to a throughput of 49 pages
per second, or a rate of 4.23 million pages per day. The results are
shown in the two figures below. The full report is available as a
sample scaling document.
ArsDigita simulated 500 simultaneous users cycling through scripts that
represented typical behavior on the tested site. It based the simulated
traffic pattern on actual server logs from the production site.
ArsDigita used a delay of five seconds between page requests to
simulate the behavior of real users, who stop to read pages on the
site.
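The simulation described above can be sketched as a small closed-loop load generator. This is a minimal sketch, not ArsDigita's actual test harness: `run_virtual_user`, `run_load_test`, and the injected `fetch` callable are hypothetical names, and the demo uses a stand-in fetch function instead of real HTTP requests so that it runs without a server.

```python
import threading
import time
from typing import Callable, List, Sequence


def run_virtual_user(
    script: Sequence[str],
    fetch: Callable[[str], None],
    think_time_s: float,
    cycles: int,
    timings: List[float],
    lock: threading.Lock,
) -> None:
    """Replay a script of page requests, pausing after each page."""
    for _ in range(cycles):
        for url in script:
            start = time.perf_counter()
            fetch(url)  # assumed to block until the page has loaded
            elapsed = time.perf_counter() - start
            with lock:
                timings.append(elapsed)
            time.sleep(think_time_s)  # the user "reads" the page


def run_load_test(
    script: Sequence[str],
    fetch: Callable[[str], None],
    users: int,
    think_time_s: float,
    cycles: int = 1,
) -> List[float]:
    """Run `users` virtual users concurrently; return page load times."""
    timings: List[float] = []
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=run_virtual_user,
            args=(script, fetch, think_time_s, cycles, timings, lock),
        )
        for _ in range(users)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return timings


if __name__ == "__main__":
    def fake_fetch(url: str) -> None:
        time.sleep(0.01)  # stand-in for a real page request

    samples = run_load_test(
        ["/index", "/shared/community-member"],
        fake_fetch,
        users=5,
        think_time_s=0.05,
    )
    mean = sum(samples) / len(samples)
    print(f"{len(samples)} requests, mean load time {mean:.3f} s")
```

For a test like the one in the text, `users` would be 500, `think_time_s` would be 5, and the script would be derived from production server logs.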
Figure 1 depicts the average page load time of the site's core pages
with an increasing number of simulated users. The vast majority of the
pages were served in less than four seconds on average (beating
industry standards), even at 500 simulated users. Figure 2
shows the number of page-views per second the system was able to
achieve.
Figure 1 Sample Performance Vs Users Chart
Figure 2 Sample Throughput Chart
Benchmarks for ACS Java 4.0 show that it performs better than the ACS
Tcl/AOLserver architecture. As the data model and queries for both ACS
Java and ACS Tcl are the same, this improvement is due to better
performance in the scripting layer. Using Apache Bench and a single
four-CPU E450, ArsDigita measured the peak throughput of certain ACS
pages:
Page                        | ACS Java/Resin 1.21 (page views/sec) | Tcl/AOLserver 3.2 (page views/sec)
----------------------------|--------------------------------------|-----------------------------------
Empty loop (100 iterations) | 50.14                                | 24.08
/index (default root page)  | 38.02                                | 11.52
/shared/community-member    | 14.40                                | 5.55
Table 1 Java Vs Tcl Results
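The relative difference between the two columns of Table 1 is easier to see as a ratio. The sketch below only divides the published figures; the `speedup` helper is a hypothetical name, and the ratios say nothing about where in the stack the difference arises.

```python
# Peak throughput from Table 1, in page views/sec: (ACS Java, ACS Tcl).
RESULTS = {
    "Empty loop (100 iterations)": (50.14, 24.08),
    "/index (default root page)": (38.02, 11.52),
    "/shared/community-member": (14.40, 5.55),
}


def speedup(java_pps: float, tcl_pps: float) -> float:
    """Ratio of ACS Java throughput to ACS Tcl throughput."""
    return java_pps / tcl_pps


if __name__ == "__main__":
    for page, (java_pps, tcl_pps) in RESULTS.items():
        print(f"{page}: {speedup(java_pps, tcl_pps):.2f}x")
```

The ratios come out at roughly 2.08x, 3.30x, and 2.59x for the three pages.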
Technical Review
Hardware Scaling Strategy
A client site's hardware setup will depend on the customer's
application, traffic and performance requirements. ArsDigita designed
its hardware architecture, though, so that any site can readily support
more traffic by simply adding more inexpensive hardware.
Figure 3 ArsDigita Hardware Setup
Database server(s): ArsDigita typically dedicates an
enterprise-class server such as a Sun E420 to run the RDBMS. These
servers have four CPUs and up to four GB of RAM. ArsDigita recommends a
class of database server for each Web site based upon the level of
information processing the database must perform. With proper
capacity planning, the recommended database server should prove
sufficient for a long time.
If Web site traffic should increase to the point that the site requires
a faster RDBMS server, ArsDigita can swap in a more powerful machine for
the existing database server. If a single powerful server such as a Sun
E4500 or E10000 becomes fully utilized, ArsDigita may employ other, more
complicated scaling options such as clustering using Oracle Parallel .
Web Servers: ArsDigita runs its Web servers and application code
on a number of small, relatively inexpensive machines. These front-end
machines, typically Sun Netras or VA Linux machines, run identical
copies of a site's Web server and application. By distributing a site's
load across multiple machines, ArsDigita achieves inexpensive hardware
scalability as well as reliability. ArsDigita can add Web servers to
grow the Web site's capacity as needed.
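As a rough illustration of sizing the Web tier, the sketch below estimates how many front-end machines a target load would need. It assumes that throughput scales linearly with the number of Web servers and that the database server is not the bottleneck, neither of which the tests above establish. The per-server figure is derived from the tested configuration (49 pages per second across four Netra Web servers); `web_servers_needed` and the `headroom` parameter are hypothetical.

```python
import math

# Tested configuration from the scalability results.
MEASURED_PAGES_PER_SECOND = 49
MEASURED_WEB_SERVERS = 4


def web_servers_needed(
    target_pps: float, per_server_pps: float, headroom: float = 0.0
) -> int:
    """Estimate the Web server count for a target throughput.

    Assumption: capacity grows linearly with server count. `headroom`
    is the fraction of spare capacity to keep (0.25 means 25% extra).
    """
    return math.ceil(target_pps * (1 + headroom) / per_server_pps)


if __name__ == "__main__":
    per_server = MEASURED_PAGES_PER_SECOND / MEASURED_WEB_SERVERS
    print(f"per-server throughput: {per_server:.2f} pages/sec")
    print("servers for 100 pages/sec:", web_servers_needed(100, per_server))
```

An estimate like this is only a starting point; a load test of the enlarged configuration is what confirms it.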
Load balancers: A load balancer receives all of a Web site's
incoming traffic and then distributes this traffic among the Web site's
front-end Web servers. ArsDigita uses its load balancers to help ensure
that no individual Web server becomes overwhelmed and to partition the
work to different Web services. The load balancer also provides a level
of fault tolerance by frequently monitoring the status of the Web
servers and removing any that are slow or non-responsive from the
available server pool.
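The distribution and fault-tolerance behavior described above can be sketched as a round-robin balancer that skips unhealthy servers. This is an illustrative model only, not the configuration of any particular load balancer product: `RoundRobinBalancer` and the `is_healthy` callable are hypothetical, and the health check here runs on every pick, whereas a real load balancer would typically probe servers periodically.

```python
from typing import Callable, List, Sequence


class RoundRobinBalancer:
    """Rotate requests across the Web servers that pass a health check."""

    def __init__(
        self, servers: Sequence[str], is_healthy: Callable[[str], bool]
    ) -> None:
        self._servers = list(servers)
        self._is_healthy = is_healthy  # assumed: True if server responds
        self._next = 0

    def healthy_pool(self) -> List[str]:
        """Return the servers currently eligible to receive traffic."""
        return [s for s in self._servers if self._is_healthy(s)]

    def pick(self) -> str:
        """Choose the next healthy server in rotation."""
        pool = self.healthy_pool()
        if not pool:
            raise RuntimeError("no healthy Web servers available")
        server = pool[self._next % len(pool)]
        self._next += 1
        return server


if __name__ == "__main__":
    down = {"web3"}  # pretend this server has stopped responding
    balancer = RoundRobinBalancer(
        ["web1", "web2", "web3", "web4"],
        is_healthy=lambda server: server not in down,
    )
    print([balancer.pick() for _ in range(6)])
```

In the demo, `web3` never receives traffic while it is marked as down, and the remaining three servers share the requests evenly.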
Software components
ArsDigita builds its custom Web sites using the following software
components:
Figure 4 Software Component Stack
- Customer Specific Solutions are built to the same engineering
standards as the ACS itself using ACS Application Programming Interfaces
(API). The ArsDigita engineering and design methodologies are well
documented, and ArsDigita has extensive training programs for
developers.
- ACS is ArsDigita's open-source platform of pre-integrated
modules built upon one common data model for developing
customer-specific eBusiness solutions. The performance and scalability
of the ACS are tested and extended through live production experience
and extensive review by the worldwide developer community. ArsDigita
also tests the ACS directly in its scaling lab.
- Servlet Engine: ACS Java will operate in any J2EE-compliant
servlet engine.
- WebServer: ACS can use any Web server as a Web listener.
- Database (Oracle): Oracle is the leading database vendor
both in the general database market and in the Web market. Many
high-profile sites, including Amazon.com, e-Trade, and CNN Interactive,
scale to large loads using Oracle's database.
- Operating System (Unix/Linux): Unix, and its open-source
derivative, Linux, are both stable operating systems offering true
multi-threading, memory protection, and high maintainability.
Additional Information
ArsDigita's scaling lab has written a document, ArsDigita Scalability
Testing, detailing its methodologies for evaluating a Web site's
scalability. ArsDigita can also provide sample scalability reports it
has written for some of its clients and WebTrends data from photo.net.
asj-editors@arsdigita.com
Reader's Comments
WRT the tcl / java speed comparison chart: are the tcl numbers with nsd7 or nsd8? I've heard that can make a difference.
-- Jonathan Ellis, May 15, 2001
The Tcl tests used nsd8
-- Bryan Che, May 21, 2001
From the bookmarks you say you get 24.08 pageviews for an empty 100 iteration loop page on a 4-cpu e450 with tcl/aol3.2? Why such poor performance? Is it the overhead due to filters you have registered? On my measly pIII 550 I get over 60 page/sec from ab while running other background processes as well?
-- Carl Garland, May 24, 2001
/index does little but call the RDBMS then return a template. Are you certain that you're not just measuring the difference between the two templating implementations? The fact that Java can iterate faster than Tcl is neither surprising nor terribly relevant. Most of the time consumed in serving ACS 4 Tcl pages seems to be spent in the request processor, the RDBMS, and the template system. Pinning the problem on the scripting language per se without further analysis doesn't really seem warranted.
-- Don Baccus, May 28, 2001