ArsDigita Scalability Testing
by Bryan Che (bryanche@arsdigita.com)
Submitted on: 2001-01-26
Last updated: 2001-04-25
ArsDigita Systems Journal
1 Introduction
Scalability is one of the most important requirements for any Web
service or application. If a system does not scale or perform well,
then its features will be unavailable to people--rendering that system
virtually useless. Thus, determining a system's overall scalability is
an important part of development.
The principal way to determine a system's ability to scale is to
actually test its performance under load. Load testing, if executed and
interpreted properly, can provide valuable insights into a system's
ability to meet scalability requirements and to grow for the future. It
can also help identify bottlenecks which may be hindering a system's
performance. But, load testing is not a trivial task. Load testers
must take care to follow certain practices while assessing applications.
Over the past few months, ArsDigita has worked to develop a
scalability-testing methodology based upon its own experience as well as
through consultation with other scalability experts.
2 Vocabulary
Scalability is a field with its own jargon. This vocabulary includes the
following terms, which provide a well-defined framework in which to
perform scalability testing:
- throughput: the amount of work accomplished in a given time,
i.e. work/time. Typically, transactions/second or bits
(transmitted)/second.
- scalability: the change in throughput corresponding to a change
in resources or load. Scalability answers questions
like, "How many more transactions per second can my server handle if I
double its memory?" or "How many fewer transactions per second can my
server handle if I increase its number of concurrent users by fifty
percent?" Note that scalability improves as throughput increases.
- performance: the speed at which a system performs its work
- transaction time: the amount of time a transaction takes to
acquire and utilize its resources.
- latency: the time between making a request and beginning to
see a result
- virtual user: a software-generated visitor to a Web site.
Load-testing software typically implements a virtual user as a Web
browser that navigates through a Web site. The virtual user's behavior
is usually customizable; it can change its time interval between page
clicks, use HTML forms, and so on. Load-testing programs generate
multiple virtual users to impose load upon a Web site.
- concurrent users: the number of users or virtual users with
overlapping activity that a Web site receives. These concurrent users
might not be requesting Web pages at precisely the same instant. But,
they are all actively using the Web site during the same window of time.
- load test: a test that determines a system's behavior under
various workloads. This test answers the question, "Given a certain
load x, how will the system behave?"
- scalability test: a test that applies increasing workloads to
determine a system's ability to scale. This test answers the question,
"Given an increase from load x to load y, how will the
system behave at load y versus at load x?"
- performance test: a test that measures how quickly a system
performs under various workloads. This test is a specialized form of a
load test. It answers the question, "Given a certain load x, how
fast will the system return a result?"
- stress test: a test that increases the workload on a system
until the system fails. This test answers the question, "Under what
load will the system fail, and how does it fail?"
- user simulation test: a test that measures how a system
performs in "real life." This test answers the question, "If we have
x number of users doing this, y number of users doing
that, and z number of users going there, how will the system
respond?"
One of the key distinctions in terms is that performance and
scalability are not the same [1]. A Web site that
performs well--that runs quickly--for ten simultaneous users may operate
slowly for eleven users or more. Thus, one would say that the site
performs well but scales poorly.
When people say, then, that they wish to load test a Web site, they
generally do not mean they desire just to see how well a system
performs. They mean that they want to determine how a system behaves
under stress, including how quickly a system operates at a given load,
how well a system scales, and under what load a system fails.
In other words, they mean load testing to be a combination of load,
performance, scalability, and stress testing. Furthermore, they
generally employ user simulation tests as a means of performing "load
testing" for how users might actually use their system. Using user
simulation tests provides a context in which to perform testing; user
simulation tests define the kind of load a system will experience.
Executing these tests will help developers ensure that their Web sites
meet operational and performance requirements.
3 Testing Platform
ArsDigita uses a specific set of resources for performing scalability
testing. These include:
- Empirix's (formerly RSW Software) e-Test Suite
- Server monitoring software
- An isolated scaling lab
3.1 E-Test Suite
Empirix's e-Test Suite consists of several automated testing
applications, including e-Test, e-Load, and e-Reporter. E-Test is a
tool for creating automated-testing scripts, e-Load is software that
performs load tests, and e-Reporter is a program that analyzes and
reports results from load tests run in e-Load. The e-Test Suite also
provides support for monitoring Web server data such as CPU utilization
and memory usage through SNMP messaging.
3.2 Scaling Lab
ArsDigita maintains a scaling lab so that a Web site can be load tested
in an isolated environment, configured as it would be deployed in
production.
This lab includes hardware for running test clients as well as for
running Web sites. Among the lab's equipment are:
- Sun E-450, E-4500 Web Servers
- F5 BigIP Load Balancers
- Sun Netras
- A 100 Mbps Ethernet connection
- Client PCs running Windows 2000 and Empirix's e-Load testing
software for performing load tests
4 Preparing for Load Testing
Before a Web site can be load tested, developers and testers must make
preparations.
4.1 Define Requirements
The first thing that a Web site's developers should do before load
testing is to determine the site's performance and scalability
requirements. Otherwise, they will not be able to determine if their
site has passed its load test. Ideally, these requirements should be
defined before development on a site begins as part of the site's
requirements-defining phase. People typically specify scalability
requirements in terms of the number of concurrent users they want their
site to support. For example, a client may want its Web site to support
up to 1,000 concurrent users, with a 90th percentile page
latency of at most eight seconds. In other words, if 1,000 users were
simultaneously visiting that site, 90% of the pages these users
requested would load within eight seconds.
Many scalability testers view a 90th percentile page latency of eight
seconds as the minimum acceptable performance for a Web page.
Furthermore, they do not find any server errors due to load acceptable.
Thus, when defining scalability requirements, developers should keep
this performance standard in mind.
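As a concrete illustration, the 90th-percentile check can be computed directly from a list of measured page-load times. The sketch below uses the nearest-rank percentile method; the sample latencies and function name are hypothetical:

```python
def percentile_latency(latencies_s, pct=90):
    """Return the pct-th percentile of measured page latencies (seconds),
    using the nearest-rank method: the smallest sample such that at least
    pct percent of all samples are <= it."""
    ordered = sorted(latencies_s)
    rank = -(-pct * len(ordered) // 100)  # ceiling division gives the rank
    return ordered[rank - 1]

# Hypothetical sample: twenty measured page-load times in seconds
samples = [1.2, 2.0, 3.1, 1.8, 2.5, 7.9, 4.4, 2.2, 3.3, 1.1,
           2.9, 3.8, 8.5, 2.1, 1.7, 3.0, 2.6, 4.9, 3.5, 2.4]
p90 = percentile_latency(samples)
meets_requirement = p90 <= 8.0  # the eight-second standard discussed above
```

With this sample the 90th percentile is 4.9 seconds, well within the eight-second standard.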
4.2 Write a Test Plan
Once developers have determined the scalability and performance
requirements for their site, they should work with their site's testers
to determine an appropriate plan to test that their site meets its
performance requirements. A typical ArsDigita test plan consists of the
following steps:
- Profile the Web server to find its basic performance and scalability characteristics when serving static content
- Unit test individual functions within the Web site to see how they scale and to identify bottlenecks
- Test individual user scenarios to see how the Web site scales and to identify bottlenecks
- Perform a user simulation test consisting of all user scenarios to see how the Web site scales and performs under realistic load
- Perform a long-term test using all user scenarios to observe system
degradation over time
4.3 Create User Scenarios
A user scenario is an outline of a path or set of paths a certain type
of user might take through a Web site. For example, consider the site,
Amazon.com. One typical scenario
would represent a user who came to the site, searched for a specific
product, and then purchased it. The scenario would consist of the
various Web pages through which the user navigated to complete this
transaction.
Typically, a Web site will have a fairly small number of typical user
scenarios which comprise the vast majority of its traffic. Amazon.com,
for example, might have the following user scenarios:
- User simply browsing the site
- User searching the site for products
- User making a purchase
- User providing feedback
- User creating an account
These five scenarios probably represent the vast majority of traffic
which Amazon.com receives. Thus, load testing Amazon.com with these
user scenarios would yield a good understanding of how well Amazon.com
scales.
If a site is already live, a good way to define user scenarios is to
look at existing traffic patterns on that site using ClickStream,
WebTrends, or some other analysis tool. If a site has not yet gone
live, then the site's developers and client will have to make some
educated guesses about user scenarios.
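One simple way to turn raw traffic data into scenario weights is to tally request paths against the scenario list. The sketch below is a minimal illustration; the URL-prefix-to-scenario mapping and the log excerpt are hypothetical:

```python
from collections import Counter

# Hypothetical mapping from URL prefix to user scenario
SCENARIOS = {
    "/search": "searching",
    "/product": "browsing",
    "/checkout": "purchasing",
    "/feedback": "feedback",
    "/register": "account creation",
}

def scenario_weights(request_paths):
    """Tally requests per scenario and return each scenario's share of traffic."""
    counts = Counter()
    for path in request_paths:
        for prefix, name in SCENARIOS.items():
            if path.startswith(prefix):
                counts[name] += 1
                break
    total = sum(counts.values()) or 1
    return {name: counts[name] / total for name in counts}

# Hypothetical access-log excerpt (request paths only)
paths = ["/search?q=book", "/product?product_id=1563", "/search?q=cd",
         "/checkout/cart", "/product?product_id=99", "/search?q=dvd"]
weights = scenario_weights(paths)
```

The resulting fractions suggest how heavily each scenario should be weighted in the load test.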
4.4 Create Test Scripts
4.4.1 Scripting Approach
Once the developers have outlined the user scenarios for their site,
they need to work with the testers to create the test scripts which
implement their user scenarios. One way to do this would be to create a
script for each path in the user scenarios. For example, Amazon.com
would have a user scenario where a user makes a purchase. But, this
user scenario would consist of multiple paths through the site,
depending on the purchaser's status. If the user was a first-time
buyer, then he would have to fill out a variety of personal information.
If the user was a repeat buyer, he could presumably bypass filling out
this information. Thus, although the number of user scenarios in a site
may be limited, the number of paths comprising those scenarios could be
significantly large.
Figure 1 A single checkout scenario with multiple paths
Because a site could have a large number of paths through it even though
it may have a small number of user scenarios, creating a script for each
path could involve writing a great number of scripts. Using a sizeable
number of scripts can lead to a lengthy testing period; for each script,
testers must write the script, validate the script, generate input data
for the script, run the script (which itself can take many hours), and
then analyze the results from the script's run. Thus, using one test
script per site path is not a practical approach to load testing if the
load test is to be completed in a timely fashion.
An alternative to creating one script per path while still generating
valid load characteristics takes advantage of the fact that, in general,
distinct paths through a Web site will contain overlapping pages. The
simplest example of this is that almost all user scenarios for a Web
site will start at that site's home page. Regardless of what a user
ends up doing on a site, he will probably start by accessing the site's
main page. Thus, recording the home page for every single script is
redundant.
Another common situation for overlapping pages is when developers
program a page so that it behaves differently depending upon the
information passed to it. Going back to the purchasing example, a new
user and a repeat buyer would travel different paths while placing their
orders. But, the actual pages for those paths may be the same. The
developer could have programmed a certain page within the checkout
process to check if the buyer was a new user or not. If the user was
new, then the page would display a form asking the user for his personal
information. If the user was a repeat buyer, then the page would not
display this form. Thus, the new user and the repeat buyer would travel
different paths along different URLs while making their purchase. But,
the actual pages accessed on the Web server through these two paths
could turn out to be the same.
Because paths through a site will quite likely have a significant number
of overlapping pages, the most efficient way to write scripts for load
testing is to write, so far as it is possible, one script per
page.
Writing one script per page for load testing offers several significant
advantages over writing one script per user scenario path. First of all,
it is more efficient both to write and to run tests in this manner, as it
takes advantage of the overlap in user paths. Furthermore, writing one
script per page increases script maintainability because the scripts do
not include any navigation to other pages. Finally, testers can more
easily ensure that individual pages appear in a load test in an
appropriate distribution.
4.4.2 Writing One-Page e-Load Scripts
In order to write a test script which loads a single page, testers will
probably have to rely on data banking the URL or form parameters
which that page takes as input. A data bank is Empirix's term
for e-Load's feature that allows people to map Web page parameters to
data sets. For example, a Web page at an e-commerce site might contain
the URL parameter, product_id. A typical URL for this page might look
like /product?product_id=1563. Thus, a data bank for loading this Web
page would consist of a list of product_id's. e-Load uses the CSV file
format for loading data banks.
If a tester ran e-Load on this page using the product_id databank, then
e-Load would repeatedly (and concurrently) request the /product page
while feeding in various product_id parameters. Thus, e-Load would be
able to navigate directly to and load many URLs rooted at the /product
page in a one-page script.
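Since a data bank is simply a CSV file, it can be generated from a dump of production data. The sketch below writes a small product_id data bank; the variable name, filename, and id values are hypothetical, and it assumes a header row naming the data bank variable followed by one value per row:

```python
import csv

# Hypothetical product_ids pulled from the production database
product_ids = [1563, 1571, 1588, 1600]

# Write a data bank CSV: a header row naming the data bank variable,
# then one value per row
with open("product_databank.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["varProductId"])   # hypothetical data bank variable name
    writer.writerows([[pid] for pid in product_ids])

contents = open("product_databank.csv").read()
```

In practice the id list would come from a query against the copied production database rather than a hard-coded list.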
In addition to providing data bank values for a script's page, a tester
must also add some custom VB code to his scripts for direct navigation
to the page to work. He needs to insert a test scriptlet before that
page which does the following:
- grab a set of data banked parameters for the page
- directly navigate to the page using the data banked parameters
For example, the site www.site59.com
has a page, /getaways/overview.adp, which lists the description of a
"getaway" travel package. This page takes three input parameters:
offering_id, originating_city_id, and show_offering_links_p. So, a test
scriptlet before this page might look like:
'the variables storing the URL parameters
Dim varOffering_DB
Dim varCity_DB
Dim varShowOfferLinks_DB

'varOffering, varCity, varShowOfferLinks are data bank variables
Call rswObject.play.getdatabankvalue("varOffering", varOffering_DB)
Call rswObject.play.getdatabankvalue("varCity", varCity_DB)
Call rswObject.play.getdatabankvalue("varShowOfferLinks", varShowOfferLinks_DB)

'directly navigate to overview.adp using the data banked values for URL parameters
Call ChangeNavigation("", "http://www.site59.com/getaways/overview.adp?", _
    "show_offering_links_p=" & varShowOfferLinks_DB & "&offering_id=" & _
    varOffering_DB & "&originating_city_id=" & varCity_DB)

'do some validation on the page's results
If FindInHTML("any error message") Then
    err.Number = -1
    err.Description = "Error loading Page"
End If
Placing a VB scriptlet like this in an e-Load script allows e-Load to
directly navigate to a Web page requiring input parameters. Thus,
e-Load can test multiple versions of this page with just one script.
Furthermore, testers can also place additional, custom validation code
into their scriptlets to check for any special errors.
Load testers will invariably have to write some multi-page scripts in
addition to one-page scripts. This will usually occur when a script has
to perform some kind of transaction. For example, in an e-commerce
script, a final checkout script might require several pages in order to
place an order completely. But, testers may still wish to insert a VB
test scriptlet before navigation to the first page in this script,
allowing the e-Load script to navigate directly to a certain page
immediately before starting the checkout process.
4.5 Setup For Load Testing
Load testing should take place in an isolated, controlled environment so
that testers may confidently identify performance characteristics and
bottlenecks. ArsDigita maintains a scaling lab in its Cambridge office
for this purpose.
To test a Web site, testers must reproduce the Web site's production
environment, software, and hardware in the scaling lab. This allows
them to be sure that the Web site's behavior they see in the lab will be
the same as the site's behavior on the Internet. Setting up a Web site
in the lab involves:
- configuring Oracle
- setting up server hardware
- installing the Web site's server and files
- installing related services
- making necessary testing-specific changes to the site
- validating the site's performance
4.5.1 Configuring Oracle
If the Web site to be tested is already a live site, the server in the
scaling lab should contain the same data in its database as the
production site. Thus, developers should provide the load testers with
a copy of the production site's database. Although one possible way to
do this is with Oracle imports/exports, this is not the best solution.
Exporting and importing data has a couple of problems. First, it does not
create an exact duplicate of the production database; importing tables
may not necessarily extend those tables in the same way on the test
server as they are extended on the production server.
Furthermore, ArsDigita has found through past experience that not all
Oracle objects and data are always duplicated properly through imports
and exports.
A better way to transfer database information is either to take a binary
copy of the Oracle database on the production server or to use Oracle
hot backups for transferring information.
4.5.2 Setting Up Server Hardware
Load testers should use the same hardware for their test as the
production site uses. Typically, this will involve a large Sun server
(such as an E-450), several Sun Netras for front-ends, and load
balancers.
While configuring their test machines, load testers must allocate enough
virtual IP addresses across their servers and front-ends to run all the
AOLserver instances necessary for testing. They can do this using the
Unix command, ifconfig. For example, if a tester wanted to add the
virtual IP address 10.200.0.41 to a server, he would do something like:
#ifconfig -a
#ifconfig hme0:1 plumb
#ifconfig hme0:1 inet 10.200.0.41 netmask 255.255.255.0 broadcast 10.200.0.255
#ifconfig hme0:1 up
#echo "lt-server-003-001" > /etc/hostname.hme0:1
Once the tester has allocated his virtual IP addresses, he must
configure his load balancers to map the Web site's main address to these
virtual addresses.
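When many virtual IP addresses are needed, the repetitive ifconfig invocations shown above can be generated rather than typed by hand. The sketch below is a dry-run helper that only builds the command strings for review; the interface name, address range, and combined inet/up form of the command are assumptions:

```python
def vip_commands(iface, base="10.200.0.", start=41, count=3,
                 netmask="255.255.255.0"):
    """Build the ifconfig command strings for plumbing `count` virtual IP
    aliases on `iface`. A dry run: nothing is executed; an administrator
    can review the output and then run it (or pipe it to sh)."""
    cmds = []
    for i in range(count):
        alias = f"{iface}:{i + 1}"
        cmds.append(f"ifconfig {alias} plumb")
        cmds.append(f"ifconfig {alias} inet {base}{start + i} "
                    f"netmask {netmask} up")
    return cmds

commands = vip_commands("hme0")
```

Generating the commands this way also makes it easy to keep the allocated address range documented alongside the load balancer configuration.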
4.5.3 Setting Up The Web Server
After setting up the database and server hardware, the load tester needs
to set up the Web server. This is a fairly straightforward process and
involves:
- Copying the Web site's files to the test server
- Editing the server's configurations to run on the new server,
including the IP address parameters, Oracle datasource, log locations,
and server name
- Modifying /etc/inittab to run the servers
4.5.4 Installing Related Services
Once the server is up and running, the load tester should configure any
related services, such as keepalive, if necessary.
4.5.5 Making Changes For Testing
The tester will probably have to make certain changes to the site before
testing. Two common changes are disabling e-mail and disabling
CyberCash connections. Since the test server will likely
contain real data, leaving e-mail or CyberCash enabled could result in
real people being e-mailed or billed as a result of load testing.
An additional change he may have to make is disabling links to outside
sites. This is especially relevant for sites that use Akamai or
DoubleClick. Load testing a server should not involve other Web sites
because testers should not be bogging down external servers. Also,
these external servers should not play a significant role in the test
server's performance.
4.5.6 Validating Setup
Once the test server is set up and running, the load tester should run
some cursory tests upon it to ensure that it is performing in the same
manner as the actual production site. He should check for broken links
and ensure that the site is behaving properly. He should also do a
cursory load test to check if the site is showing scalability
performance similar to that of the production site.
5 Performing The Load Test
As mentioned earlier, there are five basic steps in load testing:
- Profiling the Web server for basic performance and scalability
characteristics
- Unit testing individual functions within the Web site
- Testing individual user scenarios
- Performing a user simulation test
- Performing a long-term stability test
5.1 Profiling the Server
The first step in load testing is to perform some basic analyses to see
how well the Web server itself performs as it is currently configured.
This involves checking:
- How fast the server can serve files
- What the server's maximum throughput is
- How many connections the server can handle
- How the server performs while serving images
These baseline checks can help provide insight about where bottlenecks
may or may not be.
5.1.1 The 1KB File Test
The load tester can test how quickly the server can deliver files by
placing small, 1KB files on the file systems of each of the server's
front ends. Then, he should run a scalability test and measure how
quickly the server is able to serve the 1KB files. This will allow the
tester to find how quickly the server performs when rapidly serving
small amounts of data. If future user-simulation tests plateau at the
same hit rate as this test, then the server has hit a bottleneck.
5.1.2 The 64KB File Test
The load tester should also place sixteen 4KB images and an HTML file
referencing those images on the server. This makes the effective page
about 64KB in size. The load tester should then perform a
scalability test and measure how quickly the server can deliver this
static HTML page. From the test, he should be able to determine the
maximum throughput of the server, and how many concurrent connections
(not users) the server can support. The tester should use netstat on
the servers to track how many connections the server is maintaining
while serving the files. Typically, a server should be able to handle
thousands of simultaneous connections.
In addition to measuring the system's throughput and connections, the
tester should also be able to conclude if serving images is likely to be
a bottleneck for the server. On most Web sites, the server will use up
all its bandwidth in serving images during this test but only use a tiny
fraction of its CPU power. If this is the case, then serving images is
not likely to be a significant factor during the load test. Indeed, if
the load tester wishes, he may test without images in order to conserve
his load testing clients' CPU power.
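The static fixtures for both profiling tests can be generated with a short script. The sketch below is a minimal illustration; the document root path and file names are hypothetical, and the 4KB files are zero-filled stand-ins rather than real GIF images:

```python
import os

def make_fixtures(docroot="/tmp/loadtest"):
    """Create static fixtures for the profiling tests: a 1KB text file,
    sixteen 4KB dummy 'image' files, and an HTML page referencing all
    sixteen images (roughly 64KB of content in total)."""
    os.makedirs(docroot, exist_ok=True)
    # fixture for the 1KB file test
    with open(os.path.join(docroot, "one_kb.txt"), "wb") as f:
        f.write(b"x" * 1024)
    # sixteen 4KB files standing in for images
    img_tags = []
    for i in range(16):
        name = "img%d.gif" % i
        with open(os.path.join(docroot, name), "wb") as f:
            f.write(b"\0" * 4096)
        img_tags.append('<img src="%s">' % name)
    # the HTML page pulling in all sixteen images
    with open(os.path.join(docroot, "sixty_four_kb.html"), "w") as f:
        f.write("<html><body>%s</body></html>" % "".join(img_tags))
    return docroot
```

The resulting files would be copied into the front ends' document roots before running the profiling tests.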
5.2 Unit Testing Individual Functions
Once the load tester has profiled the Web server, he should run a
scalability test using each script in isolation. This will allow him to
gauge the relative performance of each script, or function, on the site.
He will probably find that certain functions perform worse than others.
These functions, then, are probable bottlenecks in the system.
Before the load tester runs a full scalability test on each function, he
should run a baseline test to assess the script's maximum performance on
the site. The tester should use two users so that the baseline test
will reflect concurrency effects. A typical e-Load configuration for
this baseline test would be:
Total Number of VU's | 2
Percent Reporting | 100
Delay Between Iterations | 10
Virtual User Pacing | Recorded (fixed at 10)
Browser Emulation | IE 5
Connection Speed | True Line
Caching Type | First Time User
User Mode | Thin Client
Separate Processes | No
Use Cookies | Yes
Download Images | Yes
Use Databanks | Yes
Collapse Mode | No
VU Display Ready | No
On Error View HTML | Yes
Perform User Defined Tests | No
Ramp-up Specification | 2 users concurrently
Table 1 e-Load Configuration for Baseline Test
While running this baseline test, the load tester should watch the
script's browsers to make sure that the script does not encounter any
errors. Then, he should ramp up the script to see how well it performs.
5.3 Testing User Scenarios
Once the unit tests are complete, the load tester should run the user
scenario tests outlined in the test plan. Each of these user scenarios
should consist of the multiple scripts the tester wrote to implement
them. The load tester should assign the same number of virtual users to
each script during the load test. This will ensure that all the pages
in the scenario are hit approximately the same number of times. A
typical e-Load configuration for a user scenario test might look like:
Total Number of VU's | 1000
Percent Reporting | 100
Delay Between Iterations | 10
Virtual User Pacing | Recorded (fixed at 10)
Browser Emulation | IE 5
Connection Speed | True Line
Caching Type | First Time User
User Mode | Thin Client
Separate Processes | No
Use Cookies | Yes
Download Images | Yes
Use Databanks | Yes
Collapse Mode | No
VU Display Ready | No
On Error View HTML | Yes
Perform User Defined Tests | Yes
Ramp-up Specification | 20 users after 3 minutes
Table 2 e-Load Configuration for User Scenario Test
Unlike for unit tests, load testers do not run baseline tests for user
scenarios. This is because within a user scenario, they cannot specify
that the scripts run serially in some order. A scenario test will
simply run all the scripts concurrently.
5.4 Testing Overall Scalability
Once the load tester has completed the user scenario tests, he should
perform a user simulation test. This test consists of using all the
scripts in a certain mixture. If the site being tested is already a
live site, then the tester should obtain information about how much each
page on the production site is hit. Once he has this data, the load
tester should aim to hit each page in a similar proportion during the
test. For example, the load tester's site may have its homepage hit 30%
of the time. Furthermore, the load tester will presumably have a
one-page script just for the homepage. Now, if the load tester is
trying to see how well the site scales to 1,000 users, then he should
allocate 300 virtual users to the script accessing the homepage. By
allocating scripts in this manner, the load tester will be able to
simulate a realistic load on the server.
If the site is not yet a live site and does not have data about how
often each page is hit, then the tester will have to work with the
developers to estimate a reasonable mixture for the user simulation
test.
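The proportional allocation described above is simple arithmetic, but rounding can leave the total short of the virtual-user budget. The sketch below uses largest-remainder rounding to keep the total exact; the script names and hit shares are hypothetical:

```python
def allocate_vus(total_vus, hit_shares):
    """Split a virtual-user budget across one-page scripts in proportion
    to each page's observed share of production traffic.

    hit_shares maps script name -> fraction of hits (should sum to ~1.0).
    Largest-remainder rounding keeps the total exactly at total_vus."""
    exact = {name: total_vus * share for name, share in hit_shares.items()}
    alloc = {name: int(v) for name, v in exact.items()}
    leftover = total_vus - sum(alloc.values())
    # hand leftover users to the scripts with the largest fractional parts
    for name in sorted(exact, key=lambda n: exact[n] - alloc[n],
                       reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc

# Hypothetical hit distribution taken from production logs
shares = {"homepage": 0.30, "search": 0.25, "product": 0.25, "checkout": 0.20}
allocation = allocate_vus(1000, shares)
```

For the 30% homepage example above, this assigns 300 of the 1,000 virtual users to the homepage script.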
As in the previous user scenario tests, the user simulation test
should employ delays between page clicks. Unlike the previous tests,
however, the user simulation test should not fix all connections at
true line speed. In order to gauge real-world performance, load testers
should throttle many of the connections to modem speeds (56 Kbps) and
other representative rates. A typical e-Load configuration for a user
simulation test might resemble:
Total Number of VU's | 1000
Percent Reporting | 100
Delay Between Iterations | 10
Virtual User Pacing | Recorded (fixed at 10)
Browser Emulation | IE 5
Connection Speed | Mixed between dial-up, broadband rates
Caching Type | First Time User
User Mode | Thin Client
Separate Processes | No
Use Cookies | Yes
Download Images | Yes
Use Databanks | Yes
Collapse Mode | No
VU Display Ready | No
On Error View HTML | Yes
Perform User Defined Tests | Yes
Ramp-up Specification | 20 users after 3 minutes
Table 3 e-Load Configuration for User Simulation Test
5.5 Testing Long-Term Stability
The final test the load tester should run is a long-term stability test.
In this test, the load tester should run a user scenario test against
the site so that it only taxes about 60% of the server's capacity. So,
for example, if a site scaled acceptably to 1,000 concurrent users, then
the stability test should be run with 600 concurrent users.
This test should be run overnight and for an extended period of time.
At the end of the test, the load tester should examine the server's
performance to see if it has declined. He should also examine the
server's resources and any errors which may have occurred. The purpose
of this test is to see how the Web site degrades over time and to
identify any issues which may need to be resolved.
5.6 Settings and Monitors
While load testing, the load tester should be mindful of several
e-Load-specific options and features:
- He should ensure that page timers are enabled for testing. This
will allow him to measure the performance characteristics for individual
pages within a test run.
- He should ensure that data banking, if appropriate, is enabled
- He should verify that images are enabled/disabled as appropriate,
based upon the 64KB image test
- He should enable user-defined tests for scripts that use custom
VBScript validation
In addition to these settings, the load tester should also monitor
several things during the test run:
- He should make sure that no testing client has a CPU utilization
above 90%. If the test clients work harder than this, their results
will not be accurate.
- He should watch for errors during script runs
Finally, the load tester should monitor the Web server's hardware during
the test. E-Load supports SNMP hookup for server monitoring. The load
tester should enable server stats while testing to gather
information about the Web server machines' CPU, memory, I/O activity,
and other key metrics.
6 Interpreting Load Testing Results
Once a load test is complete, its results need to be analyzed.
Typically, a Web site's developers and clients will be interested in
finding out:
- How well does the site currently scale and perform
- How can the site's scalability be improved
The answers to these two questions can provide much insight into other,
business-related questions like, "should I buy more hardware to improve
my site's scalability?" The answer to this question should not be
yes unless the site currently does not meet scalability
requirements, and the site's bottleneck is hardware-related.
While attempting to answer these two principal scalability questions,
the load tester will need to analyze a variety of data from his tests.
In particular, he should pay attention to Performance Versus Users
curves, Statistics Versus Time curves, and error rates. He should also
consider factors such as hardware data and look for specific
bottlenecks.
6.1 What is Failure
Before a load tester can provide any meaningful results for a load test,
he first must be able to express what it means to pass or fail a load
test. The definitions for success or failure should stem directly from
the site's performance and scalability requirements. Thus, if a site is
expected to scale to 1,000 concurrent users and scales to 1,500 users
under testing, then it has passed its load test.
Again, people generally express scalability requirements in terms of
n users, where n is the number of concurrent users to
which the site can scale while
- 90th percentile page latency is at most 8 seconds
- The server does not exhibit any load-induced errors
If a server does not meet both of these conditions while scaling to
n users, then it has failed the load test.
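The pass/fail criterion can be expressed directly in code. The sketch below assumes per-page latency samples and a count of load-induced errors are available from the test run; it reuses the nearest-rank percentile method, and the sample values are hypothetical:

```python
def passed_load_test(latencies_s, load_errors, target_pct=90,
                     max_latency_s=8.0):
    """Pass/fail against the two criteria: the 90th-percentile page
    latency must be at most eight seconds, and the server must show no
    load-induced errors."""
    ordered = sorted(latencies_s)
    rank = -(-target_pct * len(ordered) // 100)  # nearest rank (ceiling div)
    return ordered[rank - 1] <= max_latency_s and load_errors == 0

# Hypothetical per-page latency samples (seconds) at the target user count
latencies = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 7.5, 8.0, 12.0]
result = passed_load_test(latencies, load_errors=0)
```

Note that even a fast site fails the test if any load-induced errors occurred.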
6.2 Assessing Performance and Scalability
e-Load provides a large number of load-testing reports. The two most
important ones for assessing a site's scalability are the Performance
Versus Users and Statistics Versus Time reports. Another pertinent one
is the Errors Versus Users report.
6.2.1 Performance Versus Users
The Performance Versus Users report illustrates how the Web site scales
as the number of concurrent users accessing it increases. This report
can point out two significant pieces of information:
- At what number of concurrent users does the Web site encounter a bottleneck
- At what number of concurrent users does the Web site's performance
become unacceptable
A typical Performance Versus Users graph should look something like
figure 2:
Figure 2 Typical Performance Versus Users Graph
In the initial portion of this graph, the script's performance remains
flat as the number of concurrent users increases. As long as the
performance remains flat, the site has not yet hit any limits in its
scalability; the Web site is showing no loss of performance even though
its load is increasing. This means that the Web server still has
resources to spare and can handle the increased load with no difficulty.
As the number of concurrent users increases, though, the Web server must
invariably hit a bottleneck somewhere. This bottleneck may be due to
lack of memory or a badly written SQL query or some other factor.
Regardless of what is causing the bottleneck, eventually the server will
reach a point where it is no longer able to maintain the same
performance. Rather, as the number of concurrent users increases, the
server's performance decreases (and the script's run time increases).
Thus, the point in the performance versus users graph where the script
run time starts to increase is the point where the server has hit its
bottleneck. Beyond this point, the server is trading off performance
for users: as the number of users increases, the level of performance
decreases.
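That inflection point can also be read off the curve programmatically. The rough sketch below assumes a list of (users, script run time) pairs taken from the report; the 20% tolerance for "still flat" is an arbitrary choice, not anything prescribed by the tool:

```python
# Find where script run time stops being flat and starts rising,
# i.e. where the server has hit its bottleneck.
# The sample pairs are illustrative, not from a real report.

def bottleneck_point(samples, tolerance=1.20):
    """samples: list of (concurrent_users, run_time_seconds), sorted by users.
    Returns the user count at which run time first exceeds the baseline
    by more than the tolerance factor, or None if the curve stays flat."""
    baseline = samples[0][1]
    for users, run_time in samples:
        if run_time > baseline * tolerance:
            return users
    return None

samples = [(100, 2.0), (200, 2.1), (300, 2.0), (400, 2.2), (500, 3.5), (600, 5.0)]
print(bottleneck_point(samples))  # run time jumps at 500 users -> 500
```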
In addition to illustrating at what load level the Web site is hitting a
bottleneck, the Performance Versus Users graph also illustrates how well
the site performs in a certain function or scenario. Thus, load testers
should compare the site's performance with the site's scalability
requirements. It may be the case that even though the Web site hits a
bottleneck at a certain number of concurrent users, it still performs
acceptably enough beyond this bottleneck-point to meet requirements.
For example, a one-page script may have a 90th percentile latency of one
second until the load reaches 500 concurrent users. From 500 users to
1,000 users, the script's 90th percentile latency rises to six seconds.
But, if the site only needs to scale to 1,000 users, then this drop in
performance is still acceptable because the latency has not increased
beyond eight seconds.
6.2.2 Statistics Versus Time
The Statistics Versus Time report illustrates the Web server's
throughput statistics during the load test. Since scalability and
throughput are directly related, this report is important for
demonstrating:
- At what throughput the Web site hits a scalability bottleneck
- How much throughput the Web site can achieve
A Web site's throughput should increase as its load increases as long as
it has not hit a bottleneck. Once the site has hit a scalability
bottleneck, though, it cannot--by definition--further increase its
throughput. Thus, by examining a throughput Statistics Versus Time
graph, a load tester can see when the site experiences a bottleneck by
noting where the site's throughput levels off. Additionally, he can
measure the site's absolute throughput at any given point. Figure 3 is
a typical Statistics Versus Time graph:
Figure 3 Typical Statistics Versus Time Graph
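Levelling-off can likewise be detected numerically. A sketch assuming throughput samples (transactions/second) taken at successive points as the load ramps up; the 5% minimum-growth threshold is an assumption:

```python
# Find where throughput stops growing as load increases: by definition,
# the site has hit a scalability bottleneck once throughput levels off.
# The sample numbers are illustrative.

def plateau_index(throughputs, min_growth=0.05):
    """Return the index at which throughput first fails to grow by at
    least min_growth (fractional) over the previous sample, or None."""
    for i in range(1, len(throughputs)):
        if throughputs[i] < throughputs[i - 1] * (1 + min_growth):
            return i
    return None

tps = [50, 75, 110, 160, 165, 166]  # transactions/second as load ramps up
print(plateau_index(tps))  # growth stalls at index 4
```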
6.2.3 Errors Versus Users
Ideally, load testing should not introduce any errors on a site. The
Errors Versus Users report will note if any errors did occur.
Furthermore, in the event that errors do occur, it will allow the load
tester to examine how errors occur as a function of load. This may be
useful if, for some reason, the load tester determines that some level
of errors is acceptable. For example, a site may have a known
functional error that always appears--regardless of load level. The load
tester would probably want to ignore this type of error and accept error
levels up to a certain threshold.
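Separating known functional errors from load-induced ones can be scripted. A sketch with hypothetical error records; the error labels and the acceptable-rate threshold are assumptions, not part of any tool's format:

```python
# Discard a known, load-independent functional error before judging
# whether the remaining error rate is acceptable.
# Error records here are hypothetical, not an e-Load format.

KNOWN_ERRORS = {"missing-image-404"}  # errors that appear at any load level

def load_error_rate(errors, total_requests):
    """Fraction of requests that failed with an unexplained error."""
    relevant = [e for e in errors if e not in KNOWN_ERRORS]
    return len(relevant) / total_requests

errors = ["missing-image-404", "timeout", "missing-image-404", "500-server"]
rate = load_error_rate(errors, total_requests=1000)
print(rate <= 0.005)  # 2 load-induced errors in 1,000 requests -> True
```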
6.3 Identifying Bottlenecks
The Performance Versus Users and Statistics Versus Time reports can
help a load tester find at what load level a Web site hits bottlenecks.
But, they do not reveal what those bottlenecks might be. The first
thing that the load tester should do in looking for bottlenecks is to
look at page timer reports as well as server monitoring reports. These
two reports can help point out slow-performing pages or hardware. The
load tester should also examine filters as potential bottlenecks.
Beyond these things, the load tester will have to do some investigative
work as each Web site will have its own peculiarities and problems.
6.3.1 Page Timers
e-Load Page Timer reports help load testers identify particularly slow
pages on the Web site. The Performance Versus Users Page Timer report
lists the time a page takes to load as a function of concurrent users.
Thus, load testers can examine how well each individual page in a test
performs and check for exceptionally slow pages as well as any pages
that do not meet performance requirements. Pages that are particularly
slow are probably bottlenecks on the Web site.
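This check, too, reduces to a simple aggregation. A sketch over hypothetical (page, latency) records; a real Page Timer report would be parsed from the tool's export instead:

```python
# Flag the slowest pages in a test script as likely bottlenecks.
# The timing records below are illustrative.

from collections import defaultdict

def slow_pages(timings, threshold=8.0):
    """timings: list of (page_url, seconds). Returns pages whose average
    latency exceeds the threshold, worst first."""
    totals = defaultdict(list)
    for page, seconds in timings:
        totals[page].append(seconds)
    averages = {page: sum(ts) / len(ts) for page, ts in totals.items()}
    return sorted((p for p, avg in averages.items() if avg > threshold),
                  key=lambda p: averages[p], reverse=True)

timings = [("/index", 1.2), ("/search", 9.5), ("/index", 1.4), ("/search", 10.1)]
print(slow_pages(timings))  # ['/search']
```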
6.3.2 Server Monitoring Reports
In addition to the standard e-Load reports, the load tester should
examine server-monitoring reports. In particular, he should pay
attention to the CPU activity and I/O activity information to see if the
front-ends or main server are hitting hardware bottlenecks. These
server reports can help pinpoint bottlenecks. For example, the servers
should ideally have close to zero I/O activity. If they do exhibit
significant I/O activity, then they should be reconfigured--probably with
additional memory.
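One way to automate that check is to scan periodic vmstat samples for swap activity, since sustained swapping is the classic sign that a server needs more memory. The sketch below assumes Linux-style vmstat column positions (si and so in columns 7 and 8); other platforms lay the columns out differently, and the threshold is arbitrary:

```python
# Flag memory pressure on a server from vmstat output: sustained swap
# activity suggests the machine needs more RAM.
# Column positions assume a Linux-style vmstat; other platforms differ.

def swapping_heavily(vmstat_lines, threshold=10):
    """Return True if average swap-in + swap-out exceeds threshold
    (blocks/second) across the sampled lines."""
    rates = []
    for line in vmstat_lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip header lines
        si, so = int(fields[6]), int(fields[7])  # swap-in, swap-out columns
        rates.append(si + so)
    return bool(rates) and sum(rates) / len(rates) > threshold

sample = [
    "procs memory swap io system cpu",
    "1 0 2048 1200 300 5000 120 340 88 102 400 900 20 10 70",
]
print(swapping_heavily(sample))  # si=120, so=340 in the sample -> True
```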
6.3.3 Filters
Filters--especially site-wide filters--can drastically hurt a site's
performance and scalability because they might always execute and add
overhead to each page request. Thus, load testers should always check
filters as potential bottlenecks.
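The cumulative cost of a site-wide filter can be estimated with back-of-the-envelope arithmetic: the filter's per-request overhead comes straight off every page's serving budget. All numbers in the sketch below are hypothetical:

```python
# Estimate how much a site-wide filter that runs on every request costs
# in throughput. All numbers below are hypothetical.

def throughput_with_filter(base_page_ms, filter_ms):
    """Requests/second a single worker can serve, before and after adding
    a filter that executes on each request."""
    before = 1000.0 / base_page_ms
    after = 1000.0 / (base_page_ms + filter_ms)
    return before, after

before, after = throughput_with_filter(base_page_ms=50, filter_ms=25)
print(round(before), round(after))  # 20 requests/s drops to about 13
```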
7 Conclusion
Scalability testing is one part of an overall scalability process.
People do not test scalability for fun; rather, they are interested in
some end goal. Typically, this goal is business-related. People want
to know if their Web sites will support enough users, how much money
they need to spend to improve their site, and how efficiently they are using
their resources. Thus, scalability testing is sort of a middle step in
meeting these goals.
Scalability testing does not in itself improve scalability. Instead, it
provides information for the other steps in the scalability process.
Developers and system architects need to learn how to improve their work
and if they are meeting certain requirements. Clients need to answer
business questions. All of this information comes from testing. Thus,
when load testers perform their scalability tests, they should keep in
mind that in all the work they do, their end result must be usable
information. Simply performing a load test won't do--testers need to
document and report their results in a manner which is meaningful and
helpful to developers and clients.
8 References
- Geiger, Gary and Pulsipher, Jon. "Top Windows DNA Performance
Mistakes and How to Prevent Them."
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndna/html/windnamistakes.asp
- RSW Software, Inc. "RSW e-Load User Guide."
asj-editors@arsdigita.com