ArsDigita Scalability Testing
by Bryan Che (bryanche@arsdigita.com)
Submitted on: 2001-01-26
Last updated: 2001-04-25
ArsDigita Systems Journal
1 Introduction
Scalability is one of the most important requirements for any Web
service or application. If a system does not scale or perform well,
then its features will be unavailable to people--rendering that system
virtually useless. Thus, determining a system's overall scalability is
an important part of development.
The principal way to determine a system's ability to scale is to
actually test its performance under load. Load testing, if executed and
interpreted properly, can provide valuable insights into a system's
ability to meet scalability requirements and to grow for the future. It
can also help identify bottlenecks which may be hindering a system's
performance. But, load testing is not a trivial task. Load testers
must take care to follow certain practices while assessing applications.
Over the past few months, ArsDigita has worked to develop a
scalability-testing methodology based upon its own experience as well as
through consultation with other scalability experts.
2 Vocabulary
Scalability is a field with its own jargon. This vocabulary includes the
following terms, which provide a well-defined framework in which to
perform scalability testing:
- throughput: the amount of work accomplished in a given time,
i.e. work/time. Typically, transactions/second or bits
(transmitted)/second.
- scalability: the change in throughput corresponding to a change
in resources or load. Scalability answers questions
like, "How many more transactions per second can my server handle if I
double its memory?" or "How many fewer transactions per second can my
server handle if I increase its number of concurrent users by fifty
percent?" Note that scalability improves as throughput increases.
- performance: the speed at which a system performs its work
- transaction time: the amount of time a transaction takes to
acquire and utilize its resources.
- latency: the time between making a request and beginning to
see a result
- virtual user: a software-generated visitor to a Web site.
Load-testing software typically implements a virtual user as a Web
browser that navigates through a Web site. The virtual user's behavior
is usually customizable; it can change its time interval between page
clicks, use HTML forms, and so on. Load-testing programs generate
multiple virtual users to impose load upon a Web site.
- concurrent users: the number of users or virtual users with
overlapping activity that a Web site receives. These concurrent users
might not be requesting Web pages at precisely the same instant. But,
they are all actively using the Web site during the same window of time.
- load test: a test that determines a system's behavior under
various workloads. This test answers the question, "Given a certain
load x, how will the system behave?"
- scalability test: a test that applies increasing workloads to
determine a system's ability to scale. This test answers the question,
"Given an increase from load x to load y, how will the
system behave at load y versus at load x?"
- performance test: a test that measures how quickly a system
performs under various workloads. This test is a specialized form of a
load test. It answers the question, "Given a certain load x, how
fast will the system return a result?"
- stress test: a test that increases the workload on a system
until the system fails. This test answers the question, "Under what
load will the system fail, and how does it fail?"
- user simulation test: a test that measures how a system
performs in "real life." This test answers the question, "If we have
x number of users doing this, y number of users doing
that, and z number of users going there, how will the system
respond?"
One of the key distinctions in terms is that performance and
scalability are not the same [1]. A Web site that
performs well--that runs quickly--for ten simultaneous users may operate
slowly for eleven users or more. Thus, one would say that the site
performs well but scales poorly.
When people say, then, that they wish to load test a Web site, they
generally do not mean they desire just to see how well a system
performs. They mean that they want to determine how a system behaves
under stress, including how quickly a system operates at a given load,
how well a system scales, and under what load a system fails.
In other words, they mean load testing to be a combination of load,
performance, scalability, and stress testing. Furthermore, they
generally employ user simulation tests as a means of performing "load
testing" for how users might actually use their system. Using user
simulation tests provides a context in which to perform testing; user
simulation tests define the kind of load a system will experience.
Executing these tests will help developers ensure that their Web sites
meet operational and performance requirements.
3 Testing Platform
ArsDigita uses a specific set of resources for performing scalability
testing. These include:
- Empirix's (formerly RSW Software) e-Test Suite
- Server monitoring software
- An isolated scaling lab
3.1 E-Test Suite
Empirix's e-Test Suite consists of several automated testing
applications, including e-Test, e-Load, and e-Reporter. E-Test is a
tool for creating automated-testing scripts, e-Load is software that
performs load tests, and e-Reporter is a program that analyzes and
reports results from load tests run in e-Load. The e-Test Suite also
provides support for monitoring Web server data such as CPU utilization
and memory usage through SNMP messaging.
3.2 Scaling Lab
ArsDigita maintains a scaling lab so that a Web site can be load tested
in an isolated environment, configured as it would be deployed in
production.
This lab includes hardware for running test clients as well as for
running Web sites. Among the lab's equipment are:
- Sun E-450, E-4500 Web Servers
- F5 BigIP Load Balancers
- Sun Netras
- A 100 Mbps Ethernet connection
- Client PCs running Windows 2000 and Empirix's e-Load testing
software for performing load tests
4 Preparing for Load Testing
Before a Web site can be load tested, developers and testers must make
preparations.
4.1 Define Requirements
The first thing that a Web site's developers should do before load
testing is to determine the site's performance and scalability
requirements. Otherwise, they will not be able to determine if their
site has passed its load test. Ideally, these requirements should be
defined before development on a site begins as part of the site's
requirements-defining phase. People typically specify scalability
requirements in terms of the number of concurrent users they want their
site to support. For example, a client may want its Web site to support
up to 1,000 concurrent users, with a 90th percentile page
latency of at most eight seconds. In other words, if 1,000 users were
simultaneously visiting that site, 90% of the pages these users
requested would load within eight seconds.
Many scalability testers view a 90th percentile page latency of eight
seconds as the minimum acceptable performance for a Web page.
Furthermore, they do not find any server errors due to load acceptable.
Thus, when defining scalability requirements, developers should keep
this performance standard in mind.
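As a concrete illustration, the 90th-percentile check can be computed directly from a list of measured page-load times. The sketch below uses the nearest-rank percentile method; the sample latencies and function name are hypothetical:

```python
def percentile_latency(latencies_s, pct=90):
    """Return the pct-th percentile of measured page latencies (seconds),
    using the nearest-rank method: the smallest sample such that at least
    pct percent of all samples are <= it."""
    ordered = sorted(latencies_s)
    rank = -(-pct * len(ordered) // 100)  # ceiling division gives the rank
    return ordered[rank - 1]

# Hypothetical sample: twenty measured page-load times in seconds
samples = [1.2, 2.0, 3.1, 1.8, 2.5, 7.9, 4.4, 2.2, 3.3, 1.1,
           2.9, 3.8, 8.5, 2.1, 1.7, 3.0, 2.6, 4.9, 3.5, 2.4]
p90 = percentile_latency(samples)
meets_requirement = p90 <= 8.0  # the eight-second standard discussed above
```

With this sample the 90th percentile is 4.9 seconds, well within the eight-second standard.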
4.2 Write a Test Plan
Once developers have determined the scalability and performance
requirements for their site, they should work with their site's testers
to determine an appropriate plan to test that their site meets its
performance requirements. A typical ArsDigita test plan consists of the
following steps:
- Profile the Web server to find its basic performance and scalability characteristics when serving static content
- Unit test individual functions within the Web site to see how they scale and to identify bottlenecks
- Test individual user scenarios to see how the Web site scales and to identify bottlenecks
- Perform a user simulation test consisting of all user scenarios to see how the Web site scales and performs under realistic load
- Perform a long-term test using all user scenarios to observe system
degradation over time
4.3 Create User Scenarios
A user scenario is an outline of a path or set of paths a certain type
of user might take through a Web site. For example, consider the site,
Amazon.com. One typical scenario
would represent a user who came to the site, searched for a specific
product, and then purchased it. The scenario would consist of the
various Web pages through which the user navigated to complete this
transaction.
Typically, a Web site will have a fairly small number of typical user
scenarios which comprise the vast majority of its traffic. Amazon.com,
for example, might have the following user scenarios:
- User simply browsing the site
- User searching the site for products
- User making a purchase
- User providing feedback
- User creating an account
These five scenarios probably represent the vast majority of traffic
which Amazon.com receives. Thus, load testing Amazon.com with these
user scenarios would yield a good understanding of how well Amazon.com
scales.
If a site is already live, a good way to define user scenarios is to
look at existing traffic patterns on that site using ClickStream,
WebTrends, or some other analysis tool. If a site has not yet gone
live, then the site's developers and client will have to make some
educated guesses about user scenarios.
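One simple way to turn raw traffic data into scenario weights is to tally request paths against the scenario list. The sketch below is a minimal illustration; the URL-prefix-to-scenario mapping and the log excerpt are hypothetical:

```python
from collections import Counter

# Hypothetical mapping from URL prefix to user scenario
SCENARIOS = {
    "/search": "searching",
    "/product": "browsing",
    "/checkout": "purchasing",
    "/feedback": "feedback",
    "/register": "account creation",
}

def scenario_weights(request_paths):
    """Tally requests per scenario and return each scenario's share of traffic."""
    counts = Counter()
    for path in request_paths:
        for prefix, name in SCENARIOS.items():
            if path.startswith(prefix):
                counts[name] += 1
                break
    total = sum(counts.values()) or 1
    return {name: counts[name] / total for name in counts}

# Hypothetical access-log excerpt (request paths only)
paths = ["/search?q=book", "/product?product_id=1563", "/search?q=cd",
         "/checkout/cart", "/product?product_id=99", "/search?q=dvd"]
weights = scenario_weights(paths)
```

The resulting fractions suggest how heavily each scenario should be weighted in the load test.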
4.4 Create Test Scripts
4.4.1 Scripting Approach
Once the developers have outlined the user scenarios for their site,
they need to work with the testers to create the test scripts which
implement their user scenarios. One way to do this would be to create a
script for each path in the user scenarios. For example, Amazon.com
would have a user scenario where a user makes a purchase. But, this
user scenario would consist of multiple paths through the site,
depending on the purchaser's status. If the user was a first-time
buyer, then he would have to fill out a variety of personal information.
If the user was a repeat buyer, he could presumably bypass filling out
this information. Thus, although the number of user scenarios in a site
may be limited, the number of paths comprising those scenarios could be
significantly large.
Figure 1 A single checkout scenario with multiple paths
Because a site could have a large number of paths through it even though
it may have a small number of user scenarios, creating a script for each
path could involve writing a great number of scripts. Using a sizeable
number of scripts can lead to a lengthy testing period; for each script,
testers must write the script, validate the script, generate input data
for the script, run the script (which itself can take many hours), and
then analyze the results from the script's run. Thus, using one test
script per site path is not a practical approach to load testing if the
load test is to be completed in a timely fashion.
An alternative to creating one script per path while still generating
valid load characteristics takes advantage of the fact that, in general,
distinct paths through a Web site will contain overlapping pages. The
simplest example of this is that almost all user scenarios for a Web
site will start at that site's home page. Regardless of what a user
ends up doing on a site, he will probably start by accessing the site's
main page. Thus, recording the home page for every single script is
redundant.
Another common situation for overlapping pages is when developers
program a page so that it behaves differently depending upon the
information passed to it. Going back to the purchasing example, a new
user and a repeat buyer would travel different paths while placing their
orders. But, the actual pages for those paths may be the same. The
developer could have programmed a certain page within the checkout
process to check if the buyer was a new user or not. If the user was
new, then the page would display a form asking the user for his personal
information. If the user was a repeat buyer, then the page would not
display this form. Thus, the new user and the repeat buyer would travel
different paths along different URLs while making their purchase. But,
the actual pages accessed on the Web server through these two paths
could turn out to be the same.
Because paths through a site will quite likely have a significant number
of overlapping pages, the most efficient way to write scripts for load
testing is to write, so far as it is possible, one script per
page.
Writing one script per page for load testing offers several significant
advantages over writing one script per user scenario path. First of all,
it is more efficient both to write and to run tests in this manner, as it
takes advantage of the overlap in user paths. Furthermore, writing one
script per page increases script maintainability because the scripts do
not include any navigation to other pages. Finally, testers can more
easily ensure that individual pages appear in a load test in an
appropriate distribution.
4.4.2 Writing One-Page e-Load Scripts
In order to write a test script which loads a single page, testers will
probably have to rely on data banking the URL or form parameters
which that page takes as input. A data bank is Empirix's term
for e-Load's feature that allows people to map Web page parameters to
data sets. For example, a Web page at an e-commerce site might contain
the URL parameter, product_id. A typical URL for this page might look
like /product?product_id=1563. Thus, a data bank for loading this Web
page would consist of a list of product_id's. e-Load uses the CSV file
format for loading data banks.
If a tester ran e-Load on this page using the product_id databank, then
e-Load would repeatedly (and concurrently) request the /product page
while feeding in various product_id parameters. Thus, e-Load would be
able to navigate directly to and load many URLs rooted at the /product
page in a one-page script.
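Since a data bank is simply a CSV file, it can be generated from a dump of production data. The sketch below writes a small product_id data bank; the variable name, filename, and id values are hypothetical, and it assumes a header row naming the data bank variable followed by one value per row:

```python
import csv

# Hypothetical product_ids pulled from the production database
product_ids = [1563, 1571, 1588, 1600]

# Write a data bank CSV: a header row naming the data bank variable,
# then one value per row
with open("product_databank.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["varProductId"])   # hypothetical data bank variable name
    writer.writerows([[pid] for pid in product_ids])

contents = open("product_databank.csv").read()
```

In practice the id list would come from a query against the copied production database rather than a hard-coded list.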
In addition to providing data bank values for a script's page, a tester
must also add some custom VB code to his scripts for direct navigation
to the page to work. He needs to insert a test scriptlet before that
page which does the following:
- grab a set of data banked parameters for the page
- directly navigate to the page using the data banked parameters
For example, the site www.site59.com
has a page, /getaways/overview.adp, which lists the description of a
"getaway" travel package. This page takes three input parameters:
offering_id, originating_city_id, and show_offering_links_p. So, a test
scriptlet before this page might look like:
'the variables storing the URL parameters
Dim varOffering_DB
Dim varCity_DB
Dim varShowOfferLinks_DB

'varOffering, varCity, varShowOfferLinks are data bank variables
Call rswObject.play.getdatabankvalue("varOffering", varOffering_DB)
Call rswObject.play.getdatabankvalue("varCity", varCity_DB)
Call rswObject.play.getdatabankvalue("varShowOfferLinks", varShowOfferLinks_DB)

'directly navigate to overview.adp using the data banked values for URL parameters
Call ChangeNavigation("", "http://www.site59.com/getaways/overview.adp?", _
    "show_offering_links_p=" & varShowOfferLinks_DB & "&offering_id=" & _
    varOffering_DB & "&originating_city_id=" & varCity_DB)

'do some validation on the page's results
If FindInHTML("any error message") Then
    err.Number = -1
    err.Description = "Error loading Page"
End If
Placing a VB scriptlet like this in an e-Load script allows e-Load to
directly navigate to a Web page requiring input parameters. Thus,
e-Load can test multiple versions of this page with just one script.
Furthermore, testers can also place additional, custom validation code
into their scriptlets to check for any special errors.
Load testers will invariably have to write some multi-page scripts in
addition to one-page scripts. This will usually occur when a script has
to perform some kind of transaction. For example, in an e-commerce
script, a final checkout script might require several pages in order to
place an order completely. But, testers may still wish to insert a VB
test scriptlet before navigation to the first page in this script,
allowing the e-Load script to navigate directly to a certain page
immediately before starting the checkout process.
4.5 Setup For Load Testing
Load testing should take place in an isolated, controlled environment so
that testers may confidently identify performance characteristics and
bottlenecks. ArsDigita maintains a scaling lab in its Cambridge office
for this purpose.
To test a Web site, testers must reproduce the Web site's production
environment, software, and hardware in the scaling lab. This allows
them to be sure that the Web site's behavior they see in the lab will be
the same as the site's behavior on the Internet. Setting up a Web site
in the lab involves:
- configuring Oracle
- setting up server hardware
- installing the Web site's server and files
- installing related services
- making necessary testing-specific changes to the site
- validating the site's performance
4.5.1 Configuring Oracle
If the Web site to be tested is already a live site, the server in the
scaling lab should contain the same data in its database as the
production site. Thus, developers should provide the load testers with
a copy of the production site's database. Although one possible way to
do this is with Oracle imports/exports, this is not the best solution.
Exporting and importing data has a couple of problems. First, it does not
create an exact duplicate of the production database; importing tables
may not necessarily extend those tables in the same way on the test
server as they are extended on the production server.
Furthermore, ArsDigita has found through past experience that not all
Oracle objects and data are always duplicated properly through imports
and exports.
A better way to transfer database information is either to take a binary
copy of the Oracle database on the production server or to use Oracle
hot backups for transferring information.
4.5.2 Setting Up Server Hardware
Load testers should use the same hardware for their test as the
production site uses. Typically, this will involve a large Sun server
(such as an E-450), several Sun Netras for front-ends, and load
balancers.
While configuring their test machines, load testers must allocate enough
virtual IP addresses across their servers and front-ends to run all the
AOLserver instances necessary for testing. They can do this using the
Unix command, ifconfig. For example, if a tester wanted to add the
virtual IP address 10.200.0.41 to a server, he would do something like:
#ifconfig -a
#ifconfig hme0:1 plumb
#ifconfig hme0:1 inet 10.200.0.41 netmask 255.255.255.0 broadcast 10.200.0.255
#ifconfig hme0:1 up
#echo "lt-server-003-001" > /etc/hostname.hme0:1
Once the tester has allocated his virtual IP addresses, he must
configure his load balancers to map the Web site's main address to these
virtual addresses.
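When many virtual IP addresses are needed, the repetitive ifconfig invocations shown above can be generated rather than typed by hand. The sketch below is a dry-run helper that only builds the command strings for review; the interface name, address range, and combined inet/up form of the command are assumptions:

```python
def vip_commands(iface, base="10.200.0.", start=41, count=3,
                 netmask="255.255.255.0"):
    """Build the ifconfig command strings for plumbing `count` virtual IP
    aliases on `iface`. A dry run: nothing is executed; an administrator
    can review the output and then run it (or pipe it to sh)."""
    cmds = []
    for i in range(count):
        alias = f"{iface}:{i + 1}"
        cmds.append(f"ifconfig {alias} plumb")
        cmds.append(f"ifconfig {alias} inet {base}{start + i} "
                    f"netmask {netmask} up")
    return cmds

commands = vip_commands("hme0")
```

Generating the commands this way also makes it easy to keep the allocated address range documented alongside the load balancer configuration.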
4.5.3 Setting Up The Web Server
After setting up the database and server hardware, the load tester needs
to set up the Web server. This is a fairly straightforward process and
involves:
- Copying the Web site's files to the test server
- Editing the server's configurations to run on the new server,
including the IP address parameters, Oracle datasource, log locations,
and server name
- Modifying /etc/inittab to run the servers
4.5.4 Installing Related Services
Once the server is up and running, the load tester should configure any
related services, such as keepalive, if necessary.
4.5.5 Making Changes For Testing
The tester will probably have to make certain changes to the site before
testing. Two common changes are disabling e-mail and disabling
CyberCash connections. Since the test server will likely
contain real data, leaving e-mail or CyberCash enabled could result in
real people being e-mailed or billed as a result of load testing.
An additional change he may have to make is disabling links to outside
sites. This is especially relevant for sites that use Akamai or
DoubleClick. Load testing a server should not involve other Web sites
because testers should not be bogging down external servers. Also,
these external servers should not play a significant role in the test
server's performance.
4.5.6 Validating Setup
Once the test server is set up and running, the load tester should run
some cursory tests upon it to ensure that it is performing in the same
manner as the actual production site. He should check for broken links
and ensure that the site is behaving properly. He should also do a
cursory load test to check if the site is showing scalability
performance similar to that of the production site.
5 Performing The Load Test
As mentioned earlier, there are five basic steps in load testing:
- Profiling the Web server for basic performance and scalability
characteristics
- Unit testing individual functions within the Web site
- Testing individual user scenarios
- Performing a user simulation test
- Performing a long-term stability test
5.1 Profiling the Server
The first step in load testing is to perform some basic analyses to see
how well the Web server itself performs as it is currently configured.
This involves checking:
- How fast the server can serve files
- What the server's maximum throughput is
- How many connections the server can handle
- How the server performs while serving images
These baseline checks can help provide insight about where bottlenecks
may or may not be.
5.1.1 The 1KB File Test
The load tester can test how quickly the server can deliver files by
placing small, 1KB files on the file systems of each of the server's
front ends. Then, he should run a scalability test and measure how
quickly the server is able to serve the 1KB files. This will allow the
tester to find how quickly the server performs when rapidly serving
small amounts of data. If future user-simulation tests plateau at the
same hit rate as this test, then the server has hit a bottleneck.
5.1.2 The 64KB File Test
The load tester should also place sixteen 4KB images and an HTML file
referencing those images on the server. This makes the effective page
about 64KB in size. The load tester should then perform a
scalability test and measure how quickly the server can deliver this
static HTML page. From the test, he should be able to determine the
maximum throughput of the server, and how many concurrent connections
(not users) the server can support. The tester should use netstat on
the servers to track how many connections the server is maintaining
while serving the files. Typically, a server should be able to handle
thousands of simultaneous connections.
In addition to measuring the system's throughput and connections, the
tester should also be able to conclude if serving images is likely to be
a bottleneck for the server. On most Web sites, the server will use up
all its bandwidth in serving images during this test but only use a tiny
fraction of its CPU power. If this is the case, then serving images is
not likely to be a significant factor during the load test. Indeed, if
the load tester wishes, he may test without images in order to conserve
his load testing clients' CPU power.
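The static fixtures for both profiling tests can be generated with a short script. The sketch below is a minimal illustration; the document root path and file names are hypothetical, and the 4KB files are zero-filled stand-ins rather than real GIF images:

```python
import os

def make_fixtures(docroot="/tmp/loadtest"):
    """Create static fixtures for the profiling tests: a 1KB text file,
    sixteen 4KB dummy 'image' files, and an HTML page referencing all
    sixteen images (roughly 64KB of content in total)."""
    os.makedirs(docroot, exist_ok=True)
    # fixture for the 1KB file test
    with open(os.path.join(docroot, "one_kb.txt"), "wb") as f:
        f.write(b"x" * 1024)
    # sixteen 4KB files standing in for images
    img_tags = []
    for i in range(16):
        name = "img%d.gif" % i
        with open(os.path.join(docroot, name), "wb") as f:
            f.write(b"\0" * 4096)
        img_tags.append('<img src="%s">' % name)
    # the HTML page pulling in all sixteen images
    with open(os.path.join(docroot, "sixty_four_kb.html"), "w") as f:
        f.write("<html><body>%s</body></html>" % "".join(img_tags))
    return docroot
```

The resulting files would be copied into the front ends' document roots before running the profiling tests.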
5.2 Unit Testing Individual Functions
Once the load tester has profiled the Web server, he should run a
scalability test using each script in isolation. This will allow him to
gauge the relative performance of each script, or function, on the site.
He will probably find that certain functions perform worse than others.
These functions, then, are probable bottlenecks in the system.
Before the load tester runs a full scalability test on each function, he
should run a baseline test to assess the script's maximum performance on
the site. The tester should use two users so that the baseline test
will reflect concurrency effects. A typical e-Load configuration for
this baseline test would be:
Total Number of VU's | 2
Percent Reporting | 100
Delay Between Iterations | 10
Virtual User Pacing | Recorded (fixed at 10)
Browser Emulation | IE 5
Connection Speed | True Line
Caching Type | First Time User
User Mode | Thin Client
Separate Processes | No
Use Cookies | Yes
Download Images | Yes
Use Databanks | Yes
Collapse Mode | No
VU Display Ready | No
On Error View HTML | Yes
Perform User Defined Tests | No
Ramp-up Specification | 2 users concurrently
Table 1 e-Load Configuration for Baseline Test
While running this baseline test, the load tester should watch the
script's browsers to make sure that the script does not encounter any
errors. Then, he should ramp up the script to see how well it performs.
5.3 Testing User Scenarios
Once the unit tests are complete, the load tester should run the user
scenario tests outlined in the test plan. Each of these user scenarios
should consist of the multiple scripts the tester wrote to implement
them. The load tester should assign the same number of virtual users to
each script during the load test. This will ensure that all the pages
in the scenario are hit approximately the same number of times. A
typical e-Load configuration for a user scenario test might look like:
Total Number of VU's | 1000
Percent Reporting | 100
Delay Between Iterations | 10
Virtual User Pacing | Recorded (fixed at 10)
Browser Emulation | IE 5
Connection Speed | True Line
Caching Type | First Time User
User Mode | Thin Client
Separate Processes | No
Use Cookies | Yes
Download Images | Yes
Use Databanks | Yes
Collapse Mode | No
VU Display Ready | No
On Error View HTML | Yes
Perform User Defined Tests | Yes
Ramp-up Specification | 20 users after 3 minutes
Table 2 e-Load Configuration for User Scenario Test
Unlike for unit tests, load testers do not run baseline tests for user
scenarios. This is because within a user scenario, they cannot specify
that the scripts run serially in some order. A scenario test will
simply run all the scripts concurrently.
5.4 Testing Overall Scalability
Once the load tester has completed the user scenario tests, he should
perform a user simulation test. This test consists of using all the
scripts in a certain mixture. If the site being tested is already a
live site, then the tester should obtain information about how much each
page on the production site is hit. Once he has this data, the load
tester should aim to hit each page in a similar proportion during the
test. For example, the load tester's site may have its homepage hit 30%
of the time. Furthermore, the load tester will presumably have a
one-page script just for the homepage. Now, if the load tester is
trying to see how well the site scales to 1,000 users, then he should
allocate 300 virtual users to the script accessing the homepage. By
allocating scripts in this manner, the load tester will be able to
simulate a realistic load on the server.
If the site is not yet a live site and does not have data about how
often each page is hit, then the tester will have to work with the
developers to estimate a reasonable mixture for the user simulation
test.
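The proportional allocation described above is simple arithmetic, but rounding can leave the total short of the virtual-user budget. The sketch below uses largest-remainder rounding to keep the total exact; the script names and hit shares are hypothetical:

```python
def allocate_vus(total_vus, hit_shares):
    """Split a virtual-user budget across one-page scripts in proportion
    to each page's observed share of production traffic.

    hit_shares maps script name -> fraction of hits (should sum to ~1.0).
    Largest-remainder rounding keeps the total exactly at total_vus."""
    exact = {name: total_vus * share for name, share in hit_shares.items()}
    alloc = {name: int(v) for name, v in exact.items()}
    leftover = total_vus - sum(alloc.values())
    # hand leftover users to the scripts with the largest fractional parts
    for name in sorted(exact, key=lambda n: exact[n] - alloc[n],
                       reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc

# Hypothetical hit distribution taken from production logs
shares = {"homepage": 0.30, "search": 0.25, "product": 0.25, "checkout": 0.20}
allocation = allocate_vus(1000, shares)
```

For the 30% homepage example above, this assigns 300 of the 1,000 virtual users to the homepage script.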
As in the previous user scenario tests, the user simulation test
should employ delays between page clicks. Unlike the previous tests,
however, the user simulation test should not fix all connections at
true line speed. In order to gauge real-world performance, load testers
should throttle many of the connections to modem speeds (56 Kbps) and
other representative rates. A typical e-Load configuration for a user
simulation test might resemble:
Total Number of VU's | 1000
Percent Reporting | 100
Delay Between Iterations | 10
Virtual User Pacing | Recorded (fixed at 10)
Browser Emulation | IE 5
Connection Speed | Mixed between dial-up, broadband rates
Caching Type | First Time User
User Mode | Thin Client
Separate Processes | No
Use Cookies | Yes
Download Images | Yes
Use Databanks | Yes
Collapse Mode | No
VU Display Ready | No
On Error View HTML | Yes
Perform User Defined Tests | Yes
Ramp-up Specification | 20 users after 3 minutes
Table 3 e-Load Configuration for User Simulation Test
5.5 Testing Long-Term Stability
The final test the load tester should run is a long-term stability test.
In this test, the load tester should run a user scenario test against
the site so that it only taxes about 60% of the server's capacity. So,
for example, if a site scaled acceptably to 1,000 concurrent users, then
the stability test should be run with 600 concurrent users.
This test should be run overnight and for an extended period of time.
At the end of the test, the load tester should examine the server's
performance to see if it has declined. He should also examine the
server's resources and any errors which may have occurred. The purpose
of this test is to see how the Web site degrades over time and to
identify any issues which may need to be resolved.
5.6 Settings and Monitors
While load testing, the load tester should be mindful of several
e-Load-specific options and features:
- He should ensure that page timers are enabled for testing. This
will allow him to measure the performance characteristics for individual
pages within a test run.
- He should ensure that data banking, if appropriate, is enabled
- He should verify that images are enabled/disabled as appropriate,
based upon the 64KB image test
- He should enable user-defined tests for scripts that use custom
VBScript validation
In addition to these settings, the load tester should also monitor
several things during the test run:
- He should make sure that no testing client has a CPU utilization
above 90%. If the test clients work harder than this, their results
will not be accurate.
- He should watch for errors during script runs
Finally, the load tester should monitor the Web server's hardware during
the test. E-Load supports SNMP hookup for server monitoring. The load
tester should enable server stats while testing to gather
information about the Web server machines' CPU, memory, I/O activity,
and other key metrics.
6 Interpreting Load Testing Results
Once a load test is complete, its results need to be analyzed.
Typically, a Web site's developers and clients will be interested in
finding out:
- How well does the site currently scale and perform
- How can the site's scalability be improved
The answers to these two questions can provide much insight into other,
business-related questions like, "should I buy more hardware to improve
my site's scalability?" The answer to this question should not be
yes unless the site currently does not meet scalability
requirements, and the site's bottleneck is hardware-related.
While attempting to answer these two principal scalability questions,
the load tester will need to analyze a variety of data from his tests.
In particular, he should pay attention to Performance Versus Users
curves, Statistics Versus Time curves, and error rates. He should also
consider factors such as hardware data and look for specific
bottlenecks.
6.1 What is Failure
Before a load tester can provide any meaningful results for a load test,
he first must be able to express what it means to pass or fail a load
test. The definitions for success or failure should stem directly from
the site's performance and scalability requirements. Thus, if a site is
expected to scale to 1,000 concurrent users and scales to 1,500 users
under testing, then it has passed its load test.
Again, people generally express scalability requirements in terms of
n users, where n is the number of concurrent users to
which the site can scale while
- 90th percentile page latency is at most 8 seconds
- The server does not exhibit any load-induced errors
If a server does not meet both of these conditions while scaling to
n users, then it has failed the load test.
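The pass/fail criterion can be expressed directly in code. The sketch below assumes per-page latency samples and a count of load-induced errors are available from the test run; it reuses the nearest-rank percentile method, and the sample values are hypothetical:

```python
def passed_load_test(latencies_s, load_errors, target_pct=90,
                     max_latency_s=8.0):
    """Pass/fail against the two criteria: the 90th-percentile page
    latency must be at most eight seconds, and the server must show no
    load-induced errors."""
    ordered = sorted(latencies_s)
    rank = -(-target_pct * len(ordered) // 100)  # nearest rank (ceiling div)
    return ordered[rank - 1] <= max_latency_s and load_errors == 0

# Hypothetical per-page latency samples (seconds) at the target user count
latencies = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 7.5, 8.0, 12.0]
result = passed_load_test(latencies, load_errors=0)
```

Note that even a fast site fails the test if any load-induced errors occurred.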
6.2 Assessing Performance and Scalability
e-Load provides a large number of load-testing reports. The two most
important ones for assessing a site's scalability are the Performance
Versus Users and Statistics Versus Time reports. Another pertinent one
is the Errors Versus Users report.
6.2.1 Performance Versus Users
The Performance Versus Users report illustrates how the Web site scales
as the number of concurrent users accessing it increases. This report
can point out two significant pieces of information:
- At what number of concurrent users does the Web site encounter a bottleneck
- At what number of concurrent users does the Web site's performance
become unacceptable
A typical Performance Versus Users graph should look something like
figure 2:
Figure 2 Typical Performance Versus Users Graph
In the initial portion of this graph, the script's performance remains
flat as the number of concurrent users increases. As long as the
performance remains flat, the site has not yet hit any limits in its
scalability; the Web site is showing no loss of performance even though
its load is increasing. This means that the Web server still has
resources to spare and can handle the increased load with no difficulty.
As the number of concurrent users increases, though, the Web server must
invariably hit a bottleneck somewhere. This bottleneck may be due to
lack of memory or a badly written SQL query or some other factor.
Regardless of what is causing the bottleneck, eventually the server will
reach a point where it is no longer able to maintain the same
performance. Rather, as the number of concurrent users increases, the
server's performance decreases (and the script's run time increases).
Thus, the point in the performance versus users graph where the script
run time starts to increase is the point where the server has hit its
bottleneck. Beyond this point, the server is trading off performance
for users: as the number of users increases, the level of performance
decreases.
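That inflection point can also be read off the curve programmatically. The rough sketch below assumes a list of (users, script run time) pairs taken from the report; the 20% tolerance for "still flat" is an arbitrary choice, not anything prescribed by the tool:

```python
# Find where script run time stops being flat and starts rising,
# i.e. where the server has hit its bottleneck.
# The sample pairs are illustrative, not from a real report.

def bottleneck_point(samples, tolerance=1.20):
    """samples: list of (concurrent_users, run_time_seconds), sorted by users.
    Returns the user count at which run time first exceeds the baseline
    by more than the tolerance factor, or None if the curve stays flat."""
    baseline = samples[0][1]
    for users, run_time in samples:
        if run_time > baseline * tolerance:
            return users
    return None

samples = [(100, 2.0), (200, 2.1), (300, 2.0), (400, 2.2), (500, 3.5), (600, 5.0)]
print(bottleneck_point(samples))  # run time jumps at 500 users -> 500
```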
In addition to illustrating at what load level the Web site is hitting a
bottleneck, the Performance Versus Users graph also illustrates how well
the site performs in a certain function or scenario. Thus, load testers
should compare the site's performance with the site's scalability
requirements. It may be the case that even though the Web site hits a
bottleneck at a certain number of concurrent users, it still performs
acceptably enough beyond this bottleneck-point to meet requirements.
For example, a one-page script may have a 90th percentile latency of one
second until the load reaches 500 concurrent users. From 500 users to
1,000 users, the script's 90th percentile latency rises to six seconds.
But, if the site only needs to scale to 1,000 users, then this drop in
performance is still acceptable because the latency has not increased
beyond eight seconds.
6.2.2 Statistics Versus Time
The Statistics Versus Time report illustrates the Web server's
throughput statistics during the load test. Since scalability and
throughput are directly related, this report is important for
demonstrating:
- At what throughput the Web site hits a scalability bottleneck
- How much throughput the Web site can achieve
A Web site's throughput should increase as its load increases as long as
it has not hit a bottleneck. Once the site has hit a scalability
bottleneck, though, it cannot--by definition--further increase its
throughput. Thus, by examining a throughput Statistics Versus Time
graph, a load tester can see when the site experiences a bottleneck by
noting where the site's throughput levels off. Additionally, he can
measure the site's absolute throughput at any given point. Figure 3 is
a typical Statistics Versus Time graph:
Figure 3 Typical Statistics Versus Time Graph
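Levelling-off can likewise be detected numerically. A sketch assuming throughput samples (transactions/second) taken at successive points as the load ramps up; the 5% minimum-growth threshold is an assumption:

```python
# Find where throughput stops growing as load increases: by definition,
# the site has hit a scalability bottleneck once throughput levels off.
# The sample numbers are illustrative.

def plateau_index(throughputs, min_growth=0.05):
    """Return the index at which throughput first fails to grow by at
    least min_growth (fractional) over the previous sample, or None."""
    for i in range(1, len(throughputs)):
        if throughputs[i] < throughputs[i - 1] * (1 + min_growth):
            return i
    return None

tps = [50, 75, 110, 160, 165, 166]  # transactions/second as load ramps up
print(plateau_index(tps))  # growth stalls at index 4
```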
6.2.3 Errors Versus Users
Ideally, load testing should not introduce any errors on a site. The
Errors Versus Users report will note if any errors did occur.
Furthermore, in the event that errors do occur, it will allow the load
tester to examine how errors occur as a function of load. This may be
useful if, for some reason, the load tester determines that some level
of errors is acceptable. For example, a site may have a known
functional error that always appears--regardless of load level. The load
tester would probably want to ignore this type of error and accept error
levels up to a certain threshold.
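Separating known functional errors from load-induced ones can be scripted. A sketch with hypothetical error records; the error labels and the acceptable-rate threshold are assumptions, not part of any tool's format:

```python
# Discard a known, load-independent functional error before judging
# whether the remaining error rate is acceptable.
# Error records here are hypothetical, not an e-Load format.

KNOWN_ERRORS = {"missing-image-404"}  # errors that appear at any load level

def load_error_rate(errors, total_requests):
    """Fraction of requests that failed with an unexplained error."""
    relevant = [e for e in errors if e not in KNOWN_ERRORS]
    return len(relevant) / total_requests

errors = ["missing-image-404", "timeout", "missing-image-404", "500-server"]
rate = load_error_rate(errors, total_requests=1000)
print(rate <= 0.005)  # 2 load-induced errors in 1,000 requests -> True
```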
6.3 Identifying Bottlenecks
The Performance Versus Users and Statistics Versus Time reports can
help a load tester find at what load level a Web site hits bottlenecks.
But, they do not reveal what those bottlenecks might be. The first
thing that the load tester should do in looking for bottlenecks is to
look at page timer reports as well as server monitoring reports. These
two reports can help point out slow-performing pages or hardware. The
load tester should also examine filters as potential bottlenecks.
Beyond these things, the load tester will have to do some investigative
work as each Web site will have its own peculiarities and problems.
6.3.1 Page Timers
e-Load Page Timer reports help load testers identify particularly slow
pages on the Web site. The Performance Versus Users Page Timer report
lists the time a page takes to load as a function of concurrent users.
Thus, load testers can examine how well each individual page in a test
performs and check for exceptionally slow pages as well as any pages
that do not meet performance requirements. Pages that are particularly
slow are probably bottlenecks on the Web site.
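This check, too, reduces to a simple aggregation. A sketch over hypothetical (page, latency) records; a real Page Timer report would be parsed from the tool's export instead:

```python
# Flag the slowest pages in a test script as likely bottlenecks.
# The timing records below are illustrative.

from collections import defaultdict

def slow_pages(timings, threshold=8.0):
    """timings: list of (page_url, seconds). Returns pages whose average
    latency exceeds the threshold, worst first."""
    totals = defaultdict(list)
    for page, seconds in timings:
        totals[page].append(seconds)
    averages = {page: sum(ts) / len(ts) for page, ts in totals.items()}
    return sorted((p for p, avg in averages.items() if avg > threshold),
                  key=lambda p: averages[p], reverse=True)

timings = [("/index", 1.2), ("/search", 9.5), ("/index", 1.4), ("/search", 10.1)]
print(slow_pages(timings))  # ['/search']
```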
6.3.2 Server Monitoring Reports
In addition to the standard e-Load reports, the load tester should
examine server-monitoring reports. In particular, he should pay
attention to the CPU activity and I/O activity information to see if the
front-ends or main server are hitting hardware bottlenecks. These
server reports can help pinpoint bottlenecks. For example, the servers
should ideally have close to zero I/O activity. If they do exhibit
significant I/O activity, then they should be reconfigured--probably with
additional memory.
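One way to automate that check is to scan periodic vmstat samples for swap activity, since sustained swapping is the classic sign that a server needs more memory. The sketch below assumes Linux-style vmstat column positions (si and so in columns 7 and 8); other platforms lay the columns out differently, and the threshold is arbitrary:

```python
# Flag memory pressure on a server from vmstat output: sustained swap
# activity suggests the machine needs more RAM.
# Column positions assume a Linux-style vmstat; other platforms differ.

def swapping_heavily(vmstat_lines, threshold=10):
    """Return True if average swap-in + swap-out exceeds threshold
    (blocks/second) across the sampled lines."""
    rates = []
    for line in vmstat_lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip header lines
        si, so = int(fields[6]), int(fields[7])  # swap-in, swap-out columns
        rates.append(si + so)
    return bool(rates) and sum(rates) / len(rates) > threshold

sample = [
    "procs memory swap io system cpu",
    "1 0 2048 1200 300 5000 120 340 88 102 400 900 20 10 70",
]
print(swapping_heavily(sample))  # si=120, so=340 in the sample -> True
```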
6.3.3 Filters
Filters--especially site-wide filters--can drastically hurt a site's
performance and scalability because they might always execute and add
overhead to each page request. Thus, load testers should always check
filters as potential bottlenecks.
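The cumulative cost of a site-wide filter can be estimated with back-of-the-envelope arithmetic: the filter's per-request overhead comes straight off every page's serving budget. All numbers in the sketch below are hypothetical:

```python
# Estimate how much a site-wide filter that runs on every request costs
# in throughput. All numbers below are hypothetical.

def throughput_with_filter(base_page_ms, filter_ms):
    """Requests/second a single worker can serve, before and after adding
    a filter that executes on each request."""
    before = 1000.0 / base_page_ms
    after = 1000.0 / (base_page_ms + filter_ms)
    return before, after

before, after = throughput_with_filter(base_page_ms=50, filter_ms=25)
print(round(before), round(after))  # 20 requests/s drops to about 13
```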
7 Conclusion
Scalability testing is one part of an overall scalability process.
People do not test scalability for fun; rather, they are interested in
some end goal. Typically, this goal is business-related. People want
to know if their Web sites will support enough users, how much money
they need to spend to improve their site, and how efficiently they are using
their resources. Thus, scalability testing is sort of a middle step in
meeting these goals.
Scalability testing does not in itself improve scalability. Instead, it
provides information for the other steps in the scalability process.
Developers and system architects need to learn how to improve their work
and if they are meeting certain requirements. Clients need to answer
business questions. All of this information comes from testing. Thus,
when load testers perform their scalability tests, they should keep in
mind that in all the work they do, their end result must be usable
information. Simply performing a load test won't do--testers need to
document and report their results in a manner which is meaningful and
helpful to developers and clients.
8 References
- Geiger, Gary and Pulsipher, Jon. "Top Windows DNA Performance
Mistakes and How to Prevent Them."
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndna/html/windnamistakes.asp
- RSW Software, Inc. "RSW e-Load User Guide."
asj-editors@arsdigita.com