The following is a sample report that ArsDigita's Scaling Team wrote for
one of its clients. It describes how ArsDigita was able to scale the
application-specific code on one of its more complicated Web sites
through a methodical process. ArsDigita has scrubbed the report of
company and URL names in order to protect the identity of its client.
ArsDigita presents the modified paper here as an example of how to solve
difficult scalability problems. For more information about building
scalable Web sites, see the related articles.
1 Introduction
ArsDigita's Scaling Team evaluated the Site X Web site in its Cambridge
Scaling Lab. This evaluation had two main goals:
- Determine the existing performance and scalability of Site X
- Ensure that Site X would scale well enough to handle expected
traffic increases coming from a COMPANY Y-Site X joint venture
1.1 Requirements
1.1.1 Throughput
Site X required that its Web site scale acceptably up to four times its
existing load levels in order to prepare for the COMPANY Y-generated
traffic. Based on Site X's current peak throughput of about 1.6
pages/second, the overall site would have to support a throughput of at
least 6.4 pages/second. Thus, ArsDigita and Site X set as a requirement
that Site X must support a throughput of at least 6.5 pages/second.
1.1.2 Page Performance
The industry-standard requirement for individual Web-page performance is
that 90th percentile page load times should be within eight seconds.
ArsDigita used this standard for determining whether Site X's Web pages
were performing acceptably during load tests.
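This pass/fail criterion can be expressed as a small check. The sketch below assumes page load times are collected in seconds; the helper names are illustrative and not part of any testing tool.

```python
def percentile(samples, p):
    """Return the p-th percentile of samples (nearest-rank method)."""
    ordered = sorted(samples)
    # Nearest rank: smallest value with at least p% of samples at or below it.
    rank = max(1, -(-len(ordered) * p // 100))  # ceiling of len * p / 100
    return ordered[rank - 1]

def page_passes(latencies_sec, threshold_sec=8.0):
    """A page passes if its 90th-percentile load time is within the threshold."""
    return percentile(latencies_sec, 90) <= threshold_sec
```

Note that a single slow outlier does not fail a page under this criterion; one request in ten may exceed eight seconds.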
1.2 Scalability Findings and Results
ArsDigita's Scaling Team worked iteratively with the Site X development
team in order to test Site X, find its bottlenecks, and tune its
architecture. From this work, the scaling team determined and
accomplished the following:
- Site X's initial AOLserver configuration was not optimal. For each
Netra, doubling the number of AOLserver instances and changing various
AOLserver initialization parameters increased Site X's scalability and
stability
- The page, /www/page3.adp, performed and scaled poorly. ArsDigita
improved this page so that it was no longer a bottleneck in the system
- The page, /www/page4.adp, performed and scaled poorly. ArsDigita
improved this page so that it was no longer a bottleneck in the system
- Many of the Netra front-end servers suffered from memory problems.
ArsDigita traced this memory issue to poor thread-handling and was able
to resolve the problem
- Under load testing in a lab environment, Site X scaled acceptably to
a throughput of about 49 pages/second, assuming the following
configuration of five Netras:
- Two consumer-site Netras
- Two XML Netras
- One Netra serving images, running the cache, and handling XML
bulk downloads
- Site X's AOLservers should be restarted twice a day to ensure
long-term stability of the site
Following is a detailed description of how ArsDigita arrived at these
findings:
2 Testing Approach
ArsDigita performed three rounds of testing while working on Site X:
- An initial assessment of Site X's performance and scalability
- A reassessment of Site X's performance and scalability after
eliminating bottlenecks found in round one. This included
troubleshooting and working alongside the Site X project team.
- A final test of Site X's performance and scalability based upon new
hardware purchased for the COMPANY Y-Site X joint venture
3 Round One Testing
3.1 Test Environment
ArsDigita performed all Round One testing within its Cambridge Scaling
Lab. In this lab, ArsDigita used the following testing setup:
- Sun e450 Server
- 9 Sun Netras with 256 MB of RAM
- BigIP F5 load balancer
- 100 Mbps Ethernet
- 6 PCs running Empirix e-Load for performing load tests
For load testing, ArsDigita exported a copy of Site X's production
database into the scaling lab's e450. It then set up the e450 and Netras
with Site X's code base and produced a mirror copy of Site X's
production environment for load testing.
3.2 Index+3 Script
Site X has a custom Web traffic profiler on its production site.
ArsDigita used this profiler to gather information about Site X's
traffic patterns. From analyzing this data, ArsDigita found that one
path through Site X, consisting of four pages, accounted for about 70% of the
site's traffic. Therefore, ArsDigita focused its Round One testing
efforts on this particular path.
ArsDigita wrote an e-Load script, called Index+3, to test this path.
This path consisted of Site X's index page, and then three additional
pages in the site:
- /index.adp
- /page2.adp
- /page3.adp
- /page4.adp
ArsDigita used a databank to vary values for things such as originating
cities when running Index+3. This allowed testing using a variety of
data.
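e-Load's databank feature is proprietary, but its effect, parameterizing each virtual user's requests from a table of values, can be sketched as follows (the URL template and field name are illustrative, not Site X's actual parameters):

```python
import csv
import io
import itertools

def databank_rows(csv_text):
    """Cycle through databank rows forever, as a load tool does when it
    runs more iterations than it has rows of test data."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return itertools.cycle(rows)

def next_request(rows, template="/page3.adp?city={city}"):
    """Build the next parameterized request path from the databank."""
    return template.format(**next(rows))
```

Each virtual-user iteration draws the next row, so successive page hits exercise different data rather than hammering one cached value.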
3.3 Test Results
ArsDigita ran Index+3 repeatedly, adding Netras for each successive run,
to see how Site X's performance would scale with more front-end servers.
ArsDigita tested in configurations ranging from three to nine Netras.
ArsDigita did not employ any delays between page clicks for these tests.
The initial run used the production setup of three Netras. Next,
ArsDigita tried a different configuration with a separate image server
and reduced main servers:
Netra 1         | Netra 2 | Netra 3
----------------+---------+--------
Cache-1, Main-1 | Main-2  | Image

Table 1 Test Configuration with Separate Image Server
This did not seem to produce any significantly different results from
the previous configuration.
Then, ArsDigita gave each server instance its own Netra:
Netra 1 | Netra 2 | Netra 3 | Netra 4 | Netra 5 | Netra 6
--------+---------+---------+---------+---------+--------
Cache   | Image   | Main    | Main    | Main    | Main

Table 2 Test Configuration with Six Netras
ArsDigita continued to add Netras until it reached a total of nine
Netras:
Netra 1 | Netra 2 | Netra 3 | Netra 4 | Netra 5 | Netra 6 | Netra 7 | Netra 8 | Netra 9
--------+---------+---------+---------+---------+---------+---------+---------+--------
Cache   | Image   | Main    | Main    | Main    | Main    | Main    | Main    | Main

Table 3 Test Configuration with Nine Netras
The following table summarizes the results of these tests:
                                          | 3 Netras   | 6 Netras   | 9 Netras
------------------------------------------+------------+------------+-----------
Max Users w/o Errors                      | 40         | 60         | 55
Max Users before 8 sec. Latency           |            |            |
    Page 1                                | n/a        | n/a        | n/a
    Page 2                                | n/a        | n/a        | n/a
    Page 3                                | 6          | 14         | 26
    Page 4                                | 20         | 32         | n/a
Performance at 40 Users                   | 72.75 sec. | 37.39 sec. | 21.04 sec.
Average Throughput (transactions/second)  | 0.37       | 0.66       | 1.14
Average Throughput (extrapolated pages/s) | 1.48       | 2.64       | 4.56
Throughput Bottleneck                     | 10 users   | 10 users   | 10 users

Table 4 Summary of Index+3 Results
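The "extrapolated pages/s" row follows from the fact that one Index+3 transaction walks four pages, so page throughput is four times transaction throughput. A quick check (a sketch, not part of e-Load):

```python
PAGES_PER_TRANSACTION = 4  # Index+3 visits /index.adp plus three more pages

def pages_per_second(transactions_per_second):
    """Convert transaction throughput to page throughput for Index+3."""
    return transactions_per_second * PAGES_PER_TRANSACTION

# The three Table 4 configurations: 3, 6, and 9 Netras.
for tps, expected in [(0.37, 1.48), (0.66, 2.64), (1.14, 4.56)]:
    assert abs(pages_per_second(tps) - expected) < 1e-9
```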
3.3.1 Page Latency
A common performance requirement for Web sites is that no page should
have 90th percentile load times greater than eight seconds. Therefore,
although a page may eventually load without errors, if it takes a long
time to load, it may still be considered broken.
In these tests, pages one and two (/index.adp and /page2.adp) always
performed within eight seconds before the Web site began to exhibit
errors. Page three, /page3.adp, however, was extremely slow; it
consistently took too long to load early in the testing ramp-ups.
Page four, /page4.adp, was also slow, although it performed within
acceptable limits when tested with nine Netras.
The following graphs illustrate the page latencies from these test runs:
Figure 1 Page Performance Vs Users (3 Netras)
Figure 2 Page Performance Vs Users (6 Netras)
Figure 3 Page Performance Vs Users (9 Netras)
3.3.2 Relative Performance
Adding Netras linearly improved the overall performance of the site. For
example, at 40 concurrent users, the index+3 script took 72.75 seconds
to complete with three Netras, 37.39 seconds with six Netras, and 21.04
seconds with nine Netras.
The following graph illustrates Site X's overall performance versus
users for Index+3 testing:
Figure 4 Round One Index+3 Performance Vs. Users
One striking characteristic of these performance versus users results is
that they are linear. A typical performance versus users graph should
take the following form:
Figure 5 Typical Performance Versus Users Graph
A Web site's performance should remain flat until it hits a bottleneck
in the system. Once the site encounters a bottleneck, it trades
performance for users: the more users the site handles, the slower it
performs.
Site X's performance-versus-users graphs are not flat anywhere, which
means that Index+3 testing encountered a fundamental bottleneck as soon
as testing began. The bottleneck most likely showed up so quickly
because the testing pace was too fast: the virtual users had no delay
between page clicks, so this test pushed the system as hard as possible
even with only one user.
3.3.3 Throughput
Adding Netras increased the overall throughput of the site. However,
even with nine Netras, the site did not scale up to 6.5 pages/second.
This indicated that ArsDigita would have to tune Site X further in order
to meet its scalability requirement.
A typical throughput curve for a Web site will take the following form:
Figure 6 Typical Statistics Vs Time Graph
As long as a Web site's throughput is increasing as the number of
concurrent users increases, it has not hit a major bottleneck. But,
once the site's throughput levels off, the site has hit a bottleneck as
it is no longer able to achieve further throughput despite the increased
number of users.
Site X's throughput graphs took the following form:
Figure 7 Index+3 Transactions/Second - 9 Netras
This graph shows that Site X hit its scalability limit almost
immediately after testing began. Adding Netras did, however, increase
overall throughput. This, like the performance-versus-users graphs,
indicates that the virtual user pace for Round One testing was too fast.
3.4 Round One Findings
From this first round of testing, ArsDigita determined the following:
- ArsDigita needed to tune Site X in order to improve the Web site's
scalability from 4.56 pages/second to beyond 6.5 pages/second
- Site X had two slow pages which needed tuning: /page3.adp and
/page4.adp
In addition, ArsDigita decided to modify its testing strategy for Round
Two. First, ArsDigita would introduce delays between virtual user
clicks in order to slow down the testing process. This would enable
ArsDigita to more easily identify at what points scalability bottlenecks
manifested themselves.
ArsDigita would also add additional pages for testing during Round Two.
This would serve two purposes: it would allow ArsDigita to test the
scalability and performance of new XML pages written for the COMPANY Y
integration, and it would allow ArsDigita to make a more complete
assessment of Site X's overall scalability.
4 Round Two Testing
4.1 Test Environment
ArsDigita performed all Round Two testing within its Cambridge Scaling
Lab.
4.2 AOLserver Configuration
The first thing that ArsDigita did in Round Two was to optimize Site X's
server setup. From previous experience and also through tests for
confirmation, ArsDigita determined that Site X should configure its
front-end servers so that:
- Each Netra runs two instances of AOLserver rather than one instance.
This helps increase throughput to the database and also provides some
measure of redundancy
- Each AOLserver should change its MaxThreads parameter from the
default of 100 to 10
- Each AOLserver should use eight main database handles
ArsDigita ran its major tests in Round Two with these configurations in
place.
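AOLserver reads a Tcl-style configuration file, so the settings above would be expressed along the following lines. This fragment is illustrative only: the server and pool names are placeholders, not Site X's actual configuration.

```tcl
# Illustrative AOLserver config fragment; "sitex" and "main" are placeholders.
ns_section "ns/server/sitex"
ns_param MaxThreads 10        ;# down from the default of 100

ns_section "ns/db/pool/main"
ns_param connections 8        ;# eight main database handles
```

With two AOLserver instances per Netra, each instance would carry its own copy of these settings.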
4.3 Tuning /page3.adp
During Round One Testing, ArsDigita found the page, /page3.adp, to be
slow and a bottleneck in overall scalability. Therefore, ArsDigita
spent some time optimizing this page. One of the principal
optimizations ArsDigita performed was to cache the query results for
this particular page. Once ArsDigita had finished its tuning, it load
tested the page to see how it would now perform.
ArsDigita used the following configuration for this test:
Number of Netras                  | 1
AOLservers/Netra                  | 1
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | 5 users, every 3 iterations
Virtual User Delay Between Clicks | 10 seconds

Table 5 Test Configuration for /page3.adp Test
Note that ArsDigita added a ten-second delay between virtual user
page-clicks to slow down the testing pace.
The results of this load test are illustrated in the following graphs:
Figure 8 /page3.adp Performance Vs Users
Figure 9 page3.adp Pages/Second Throughput
As these two charts illustrate, /page3.adp performed quite well after
optimization: on a single Netra running one AOLserver instance, it
scaled to 250 virtual users and achieved a steady-state throughput of
about 36 pages/second. These results were quite acceptable, so
ArsDigita next turned to tuning the other slow page in the Index+3
script.
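The report does not show the caching mechanism itself, but the idea of caching query results for a hot page can be sketched as a simple time-bounded memo table. The function and cache names here are illustrative, not Site X code:

```python
import time

_cache = {}  # maps a query key to a (timestamp, result) pair

def cached_query(key, run_query, ttl_seconds=300):
    """Return a cached result for `key` if it is still fresh; otherwise run
    the query, store the result, and return it. `run_query` is a
    zero-argument callable that performs the expensive database work."""
    now = time.time()
    hit = _cache.get(key)
    if hit is not None and now - hit[0] < ttl_seconds:
        return hit[1]          # fresh hit: skip the database entirely
    result = run_query()
    _cache[key] = (now, result)
    return result
```

Under load, repeated requests within the TTL are served from memory instead of re-running the query, which is the effect that turned a slow page into a fast one.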
4.4 Tuning /page4.adp
ArsDigita spent several rounds optimizing /page4.adp and then re-testing
it to evaluate the results of the optimizations. At the beginning of
the optimization process, this page scaled to about 30 virtual users
with a throughput of about 1.12 pages/second on one Netra. Furthermore,
even at these user levels, it performed unacceptably beyond eight-second
load times.
After ArsDigita finished tuning /page4.adp, the page performed
acceptably up to around 40 users and scaled up to 90 users. Its average
throughput also increased fivefold to 5.56 pages/second. ArsDigita
deemed these figures acceptable because this page performs several
expensive computations.
ArsDigita used the following setup for evaluating this page:
Number of Netras                  | 1
AOLservers/Netra                  | 1
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | 5 users, every 3 iterations
Virtual User Delay Between Clicks | 10 seconds

Table 6 Configuration for /page4.adp Test
The following graphs illustrate the improvements in this page:
Figure 10 Comparison of /page4.adp Performance Vs. Users
Figure 11 Comparison of /page4.adp Throughput: Pages/Second
4.5 Index+3 Comparison
Once ArsDigita had tuned /page3.adp and /page4.adp, it re-ran the
Index+3 test to see how the overall script performance would improve.
However, for this test, ArsDigita used a different server configuration
from the initial Round One Index+3 test. In Round One, ArsDigita had
used up to nine Netras, of which seven served primary content. For this
round, though, ArsDigita only used four Netras for serving primary
content due to server availability in the lab. Therefore, when
comparing Index+3 results between Round One and Round Two, keep in mind
that the Round Two tests ran on roughly half the front-end hardware of
Round One. Nevertheless, the results in Round Two were still significantly
better.
                                  | Round One | Round Two
----------------------------------+-----------+----------
Number of Main Netras             | Up to 7   | 4
Number of AOLservers/Netra        | 1         | 2
MaxThreads Parameter              | 100       | 10
Virtual User Delay Between Clicks | 0 seconds | 0 seconds

Table 7 Comparison of Index+3 Test Setup for Round One Vs. Round Two
Note that ArsDigita ran its Round Two Index+3 test with no virtual user
delays in order to compare with Round One results.
The following graphs show the Index+3 results following ArsDigita's
optimizations:
Figure 12 Comparison of Index+3 Results: Performance Vs. Users
Figure 13 Comparison of Index+3 Results: Throughput
As these graphs show, ArsDigita's tuning produced significantly improved
results while running on less hardware than in Round One testing.
4.6 XML Testing
A large part of the Site X-COMPANY Y joint venture involved ArsDigita's
development of various XML-generating pages. ArsDigita individually
tested the two particular XML pages expected to receive the bulk of the
new XML-based traffic.
4.6.1 XML-1
The first XML page ArsDigita examined in isolation was xml-1. ArsDigita
used the following setup to test this page:
Number of Netras                  | 1
AOLservers/Netra                  | 1
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | 2 users, every 10 iterations
Virtual User Delay Between Clicks | 10 seconds

Table 8 Configuration for xml-1 Test
From this analysis, ArsDigita found that xml-1 scaled well up to about
45 users and an average throughput of about 37.1 pages/second for one
AOLserver on a single Netra:
Figure 14 xml-1 Performance Vs. Users
Figure 15 xml-1 Throughput: Pages/Second
Based on these results, ArsDigita concluded that xml-1 scaled
adequately.
4.6.2 xml-2
The other XML page which ArsDigita tested was xml-2. ArsDigita set up
this test as follows:
Number of Netras                  | 1
AOLservers/Netra                  | 1
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | 5 users, every 10 iterations
Virtual User Delay Between Clicks | 10 seconds

Table 9 Configuration for xml-2 Test
From this test, ArsDigita found that xml-2 scaled acceptably to about 90
users and a steady-state throughput of about 20 pages/second on one
AOLserver running on a single Netra.
Figure 16 xml-2 Performance Vs. Users
Figure 17 xml-2 Throughput: Pages/Second
Based on these results, ArsDigita concluded that xml-2 scaled
adequately.
4.7 Site-Wide Test
The final test that ArsDigita performed during Round Two was a site-wide
test. ArsDigita used this test to gain an estimate of Site X's overall
scalability.
To mimic projected traffic patterns following the joint Site X-COMPANY Y
launch, ArsDigita wrote additional scripts and classified each script as
either background or foreground. Background scripts would consist mostly
of XML bulk-download and administration activity, whereas foreground
scripts would consist of pages that users would frequently visit.
ArsDigita further divided foreground scripts into two types: XML scripts
and consumer-site scripts. Because Site X was expecting that the
XML-generating pages would receive far more traffic than the regular
Site X pages, ArsDigita decided to deploy foreground scripts at a 5:1
XML:consumer-site ratio.
ArsDigita's e-Load license supports up to 500 virtual users. Therefore,
ArsDigita deployed its users using the background, XML-foreground, and
consumer-site foreground scripts as follows:
Script  | Number of Users | User Frequency
--------+-----------------+-----------------
xml-3   | 1               | every 10 minutes
xml-4   | 1               | every 10 minutes
xml-5   | 1               | every 10 minutes
xml-6   | 1               | every 10 minutes
xml-7   | 1               | every 10 minutes
xml-8   | 1               | every 10 minutes
xml-9   | 1               | every 10 minutes
xml-10  | 1               | every 10 minutes
xml-11  | 1               | every 10 minutes
xml-12  | 1               | every 10 minutes
trace-1 | 2               | every 5 minutes
trace-2 | 2               | every 5 minutes
trace-3 | 2               | every 5 minutes
--------+-----------------+-----------------
Total   | 16 Background Users

Table 10 Site-Wide Test Background Users
Script             | Number of Users | User Frequency
-------------------+-----------------+-----------------
/index.adp         | 20              | every 10 seconds
/page2.adp         | 20              | every 10 seconds
/page3.adp         | 20              | every 10 seconds
/page4.adp         | 11              | every 10 seconds
/page4-special.adp | 10              | every 10 seconds
-------------------+-----------------+-----------------
Total              | 81 Main Foreground Users

Table 11 Site-Wide Test Consumer-Site Foreground Users
Script | Number of Users | User Frequency
-------+-----------------+-----------------
xml-1  | 101             | every 10 seconds
xml-2  | 201             | every 10 seconds
xml-3  | 101             | every 10 seconds
-------+-----------------+-----------------
Total  | 403 XML Foreground Users

Table 12 Site-Wide Test XML Foreground Users
All of these scripts were one-page scripts, except for the trace-*
scripts. These trace scripts performed administration functions and
also involved logging into the Site X site.
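The user allocation above can be sanity-checked against the 500-user e-Load license and the intended 5:1 XML-to-consumer foreground ratio (a quick sketch, not test tooling):

```python
# User counts as deployed in the site-wide test.
background = 10 * 1 + 3 * 2          # ten xml-* users plus three trace-* pairs
consumer   = 20 + 20 + 20 + 11 + 10  # consumer-site foreground scripts
xml_fg     = 101 + 201 + 101         # XML foreground scripts

assert background == 16
assert consumer == 81
assert xml_fg == 403
assert background + consumer + xml_fg == 500   # fits the e-Load license
assert round(xml_fg / consumer) == 5           # roughly the 5:1 target
```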
The configuration ArsDigita used for testing was as follows:
Number of Netras                  | 2 Main, 2 XML, 1 Image, 1 Cache*
AOLservers/Netra                  | 2
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Virtual User Delay Between Clicks | 10 seconds
Ramp Up                           | Background: 16 users, all at once
                                  | Foreground: 5 users, every 3 minutes

Table 13 Site-Wide Test Configuration
*The Cache did not actually run during this test.
During this test, the site achieved an overall throughput of about 20
pages/second, exceeding its scalability requirement. Despite these
figures, though, virtual users started encountering server errors such
as connection resets after a load of about 130 users. Upon analyzing
the test results, ArsDigita traced these errors to Netras running out
of physical memory. During load testing, the AOLserver processes on the
four Main and XML Netras eventually consumed more memory than the 256 MB
of physical RAM available to them. Once this happened, the Netras'
performance suffered, and the site began to experience errors.
ArsDigita noted this memory problem but did not address it until the
next round of testing.
One other problem that ArsDigita found during this test was that at
higher loads, many of the XML bulk-download pages were failing.
ArsDigita suspected this was because many of these pages did not have
access to the database handles they required. To address this problem,
ArsDigita decided to dedicate an AOLserver specifically to serving XML
bulk-downloads in its next test.
Following are graphs highlighting some of the site-wide test results:
Figure 18 Main Foreground: Performance Vs. Users
Figure 19 XML Foreground: Performance Vs. Users
Note that xml-1's performance became unacceptably slow beyond 150 users
(when Netra memory ran out), and the other XML foreground pages' performance became
unacceptable at around 250 users. The main foreground pages, however,
performed adequately throughout testing.
The next three graphs illustrate how errors started occurring once Netra
physical memory ran out:
Figure 20 Main Foreground Errors Vs. Users
Figure 21 XML Foreground Errors Vs. Users
Figure 22 XML Background Error Rate Vs. Users
Figure 23 Netra Memory Use Vs. Users
The point where each Netra crosses over the 25000.00 line is
approximately where it uses up its 256 MB of RAM. The two Netras that
do not cross over the 256 MB threshold are the image and cache servers.
4.8 Round Two Findings
In this round of testing and tuning, ArsDigita was able to successfully
tune two slow pages as well as improve the server configuration.
Furthermore, through an Index+3 comparison test against Round One
findings, ArsDigita found that these changes significantly improved
overall site performance. Finally, in preparation for COMPANY Y,
ArsDigita individually tested two XML pages and performed a site-wide
test. Although the site performed beyond requirements in these tests,
they exposed a memory-growth problem.
5 Round Three Testing
5.1 Test Environment
For its final round of scaling, ArsDigita moved its test equipment from
Cambridge to its space in the Waltham Exodus hosting site. There,
ArsDigita set up its machines to test against Site X's new e4500 server as
well as five Netras. This would help ArsDigita verify that on this new
production environment, Site X would still be able to meet its
scalability requirement of a 6.5 pages/second throughput.
ArsDigita was not able to test against a full complement of the nine
Netras that Site X planned on using for its site because four of the
Netras Site X planned on using with the e4500 were still deployed on the
existing e450-based site. Nevertheless, if Site X could pass
scalability requirements with five Netras, then it would certainly be
able to do so with nine.
5.2 Initial Site-Wide Tests
For this round of testing, ArsDigita added one more script to the XML
foreground, xml-13. It distributed its 500 e-Load virtual users
similarly to the Round Two site-wide test: 16 background users, and five
times as many XML foreground users as main foreground users.
Script  | Number of Users | User Frequency
--------+-----------------+-----------------
xml-3   | 1               | every 10 minutes
xml-4   | 1               | every 10 minutes
xml-5   | 1               | every 10 minutes
xml-6   | 1               | every 10 minutes
xml-7   | 1               | every 10 minutes
xml-8   | 1               | every 10 minutes
xml-9   | 1               | every 10 minutes
xml-10  | 1               | every 10 minutes
xml-11  | 1               | every 10 minutes
xml-12  | 1               | every 10 minutes
trace-1 | 2               | every 5 minutes
trace-2 | 2               | every 5 minutes
trace-3 | 2               | every 5 minutes
--------+-----------------+-----------------
Total   | 16 Background Users

Table 14 Round Three Background Users
Script             | Number of Users | User Frequency
-------------------+-----------------+----------------
/index.adp         | 20              | every 5 seconds
/page2.adp         | 20              | every 5 seconds
/page3.adp         | 20              | every 5 seconds
/page4.adp         | 11              | every 5 seconds
/page4-special.adp | 10              | every 5 seconds
-------------------+-----------------+----------------
Total              | 81 Main Foreground Users

Table 15 Round Three Consumer-Site Foreground Users
Script | Number of Users | User Frequency
-------+-----------------+----------------
xml-1  | 91              | every 5 seconds
xml-2  | 201             | every 5 seconds
xml-3  | 91              | every 5 seconds
xml-13 | 20              | every 5 seconds
-------+-----------------+----------------
Total  | 403 XML Foreground Users

Table 16 Round Three XML Foreground Users
For its first test, ArsDigita set up the site as follows:
Number of Netras                  | 1 Main, 1 XML, 1 XML bulk-download,
                                  | 1 Image, 1 Cache
AOLservers/Netra                  | 1 for Main; 2 for XML
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | Background: 16 users, all at once
                                  | Foreground: 5 users, every 3 minutes
Virtual User Delay Between Clicks | 5 seconds

Table 17 Configuration for First Round-Three, Site-Wide Test
Note that ArsDigita, at Site X's request, reduced the virtual user delay
between clicks from ten seconds to five seconds.
5.2.1 Memory Issue
During this test, the Netras again ran out of memory as in Round Two.
This time, however, they ran out of RAM even earlier in testing than
before, despite several of the Netras in this test having 512 MB of RAM
rather than 256 MB.
One other thing that ArsDigita noticed during this test was that the XML
bulk-download, image, and cache Netras were not working hard at all.
ArsDigita decided that for its remaining tests in Round Three, it would
re-configure its Netras so that these three functions would reside on
one server:
Number of Netras                  | 2 Main, 2 XML,
                                  | 1 XML bulk-download/Image/Cache
AOLservers/Netra                  | 1 for Main; 2 for XML
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | Background: 16 users, all at once
                                  | Foreground: 5 users, every 3 minutes
Virtual User Delay Between Clicks | 5 seconds

Table 18 Configuration for Remaining Round-Three Site-Wide Tests
With this configuration in place, ArsDigita proceeded to work on
isolating the cause for the problematic AOLserver memory growth. After
multiple rounds of testing and analysis, ArsDigita was able to trace the
primary source of memory growth to the page, xml-2.
5.2.2 dqd_threadpool
The page xml-2 spawned multiple threads to perform a reactive query
every time it was loaded. Thread creation within AOLserver is a CPU- and
memory-intensive process. Thus, every time a user requested xml-2,
AOLserver would spawn multiple threads and grow in memory size.
Once ArsDigita had determined that spawning threads was causing
AOLserver to grow in size, it installed a custom AOLserver module,
dqd_threadpool. With this module, ArsDigita created a pool of threads
dedicated to the reactive query launched by xml-2. AOLserver would
automatically create all the threads for this pool upon startup and keep
them for the life of the server.
Now, whenever xml-2 ran its reactive query, it would grab existing
threads from the thread pool rather than create new threads. If all the
threads within the thread pool were already being used by other
requests, then the reactive query would enter a queue to await the next
free thread.
Using this pool put a bound on the number of threads that xml-2 could
use, and it thus capped the memory size to which AOLserver could grow.
After setting up dqd_threadpool, ArsDigita tested xml-2 with a pool size
of ten without seeing any memory growth problems. It therefore set up
one final site-wide test for Site X to see how the site would scale with
the new e4500 in place and memory issues resolved.
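dqd_threadpool is an AOLserver C module, but the queueing behavior described above can be sketched in Python: a fixed number of workers created up front, with requests waiting in a queue when all workers are busy. The class and method names here are illustrative, not the module's API.

```python
import queue
import threading

class BoundedPool:
    """A fixed pool of worker threads created once at startup.

    Work submitted while every worker is busy waits in a queue rather
    than spawning a new thread, so the thread count (and therefore
    memory) stays bounded for the life of the server."""

    def __init__(self, size=10):
        self.tasks = queue.Queue()
        for _ in range(size):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        # Each worker loops forever, pulling queued work as it frees up.
        while True:
            func, args, result = self.tasks.get()
            result.append(func(*args))
            self.tasks.task_done()

    def submit(self, func, *args):
        # Queue the work; the caller gets a list a worker will fill in.
        result = []
        self.tasks.put((func, args, result))
        return result

    def wait(self):
        # Block until every queued task has been processed.
        self.tasks.join()
```

With a pool size of ten, at most ten reactive queries run concurrently; an eleventh request simply waits for the next free thread, exactly the bound that capped AOLserver's memory growth.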
5.3 Final Site-Wide Test
For this final test, ArsDigita used the same script setup as in the
initial Round Three site-wide tests, but increased the foreground
ramp-up pace to five users every 90 seconds:
Number of Netras                  | 2 Main (10.0.1.240-1), 2 XML (10.0.1.242-3),
                                  | 1 XML bulk-download/Image/Cache (10.0.1.244)
AOLservers/Netra                  | 1 for Main; 2 for XML
MaxThreads Parameter              | 10
Number of Main Database Handles   | 8
Ramp Up                           | Background: 16 users, all at once
                                  | Foreground: 5 users, every 90 seconds
Virtual User Delay Between Clicks | 5 seconds

Table 19 Final Site-Wide Test Configuration
During this test, the final site scaled acceptably to 500 users with a
throughput of about 49 pages/second. These results, on less hardware
than the production site, exceeded the scalability requirement of 6.5
pages/second by nearly an order of magnitude.
The Netras did not run out of memory throughout this test, indicating
that using dqd_threadpool had largely solved the AOLserver memory
problem. Furthermore, the database and consumer-site Netras were not
using their CPUs to full capacity. The XML servers were using their
entire CPU capacity, but this did not hurt Site X's ability to meet
scalability or performance requirements.
A couple of potential problems did reveal themselves in this test.
First, xml-13 performed poorly after about 300 concurrent users; by this
point, though, the site itself had nearly reached its throughput
bottleneck. Second, after about 200 users, the consumer-site pages
started experiencing errors, primarily in the form of connection resets
or timeouts. These problems occurred far above the required throughput
but may warrant investigation in the future.
Following are some graphs highlighting the results from this test:
Figure 24 Final Site-Wide Test Overall Throughput: Pages/Second
Figure 25 Main Foreground: Performance Vs. Users
Figure 26 XML Foreground: Performance Vs. Users
Figure 27 Main Foreground: Error Rate Vs. Users
Figure 28 XML Foreground: Error Rate Vs. Users
Figure 29 XML Background: Performance Vs. Users
Figure 30 Trace Background: Performance Vs. Users
Figure 31 Netras Memory Vs. Users
5.4 Long-Term Stability Test
Once ArsDigita had ascertained that Site X would scale and perform
acceptably, it ran a long-term stability test on the site. A long-term
stability test exerts a constant load upon a site for an extended period
of time and measures how that site degrades over time. With Site X,
ArsDigita performed an overnight, 11-hour test under a constant load of
226 users. These 226 users included the 16 background scripts and 210
foreground and XML-foreground scripts. Note that during the final
site-wide test, the servers still had throughput capacity left beyond
226 users; the system throughput at 226 users was about 35 pages/second.
Thus, a load of 226 users did not push the system into any bottlenecks.
The long-term test's scripts ran for a total of 1,469,016 iterations
with 5,152 total errors for a 0.35% error rate. Thus, for all intents
and purposes, the long-term test ran without any problems.
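The quoted error rate follows directly from the iteration and error counts (a quick check, not part of the test tooling):

```python
iterations = 1_469_016   # total script iterations in the 11-hour test
errors = 5_152           # total errors observed

error_rate_pct = 100.0 * errors / iterations
assert round(error_rate_pct, 2) == 0.35
```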
ArsDigita noted the memory on each Netra at the beginning and end of
the stability test. The results are as follows:
Netra       | Physical RAM (MB) | RAM Free at Start (MB) | RAM Free at End (MB) | RAM Consumed (MB)
------------+-------------------+------------------------+----------------------+------------------
100 (main)  | 256               | 80                     | 0                    | 80+
101 (main)  | 512               | 327                    | 202                  | 125
102 (XML)   | 512               | 317                    | 250                  | 67
103 (XML)   | 256               | 111                    | 91                   | 20
104 (mixed) | 512               | 283                    | 243                  | 40

Table 20 Long-Term Stability Test Memory Usage
Based upon these results, Site X proved fairly stable. Under constant
load, however, the AOLservers do slowly grow in memory usage,
indicating, perhaps, a small memory leak. ArsDigita recommends that Site
X do the following with its production setup to deal with this memory
growth:
- Use 512 MB of RAM on each Netra
- Restart the AOLserver instances twice a day
Following these steps should help ensure that Site X runs stably over
time.
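The twice-daily restart recommendation could be scheduled with cron; the entries below are illustrative only, and the restart script path is a placeholder rather than part of Site X's actual setup.

```
# Illustrative crontab entries: restart the AOLserver instances at
# 4:00 and 16:00 each day. The script path is a placeholder.
0 4,16 * * *  /usr/local/aolserver/bin/restart-aolservers.sh
```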
6 Conclusion
ArsDigita's Scaling Team worked iteratively with the Site X development
team to ensure that Site X would be prepared for its COMPANY Y
joint venture. During this effort, ArsDigita greatly improved
Site X's scalability by
- Changing Site X's server configuration
- Improving slow page performance
- Capping thread memory growth
- Determining measures to ensure long-term stability
As a result of this work, Site X currently far exceeds its scalability
requirement of 6.5 pages/second while running on less hardware than the
actual production site.