
Sample Test Report


Downright Software LLC

Downright's WebSledge Testing Services

UNNAMED CLIENT Stress Test Report

Introduction
Configuration Tested

Testing Protocol

Testing Results

Observations
Recommendations/Summary

 

1.0 Introduction

This document captures the observations, results, and recommendations of the UNNAMED CLIENT Stress Testing engagement conducted May 10, 1999 through May 13, 1999. This testing engagement focused on the performance of UNNAMED CLIENT's on-line bill paying application. During a planning meeting held on May 12th, CLIENT REPRESENTATIVE, UNNAMED CLIENT project manager, described the goals of the engagement as:

  1. Identify any bottlenecks in the online bill paying application.
  2. Identify any configuration changes required to optimally support the maximum number of users on the current HW/SW configuration.
  3. Identify the optimal number of Haht processes to support a given number of concurrent users.
  4. Identify the HW/SW growth requirements as the number of concurrent users grows.

UNNAMED CLIENT identified the following performance metrics as defining "acceptable" performance:

  1. Average per-page response time of less than 15 seconds (regardless of statement size.)
  2. Maximum per-page response time of less than 45 seconds (regardless of statement size.)
  3. Average "internal" response time of less than 2 seconds. ("Internal" response time consists of the time required for the "backend HTML generator" to generate a statement, as well as any Oracle processing necessary for a page.)
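
The three criteria above can be expressed as a simple pass/fail check. The following is a minimal sketch only: the function name and the sample values are hypothetical and are not measurements from this engagement.

```python
# Thresholds taken from the acceptance criteria listed above (seconds).
MAX_AVG_PAGE_SECONDS = 15.0
MAX_PEAK_PAGE_SECONDS = 45.0
MAX_AVG_INTERNAL_SECONDS = 2.0


def meets_criteria(avg_page, max_page, avg_internal):
    """Return True when all three response-time limits are satisfied.

    Each limit is "less than", so a value equal to a threshold fails.
    """
    return (
        avg_page < MAX_AVG_PAGE_SECONDS
        and max_page < MAX_PEAK_PAGE_SECONDS
        and avg_internal < MAX_AVG_INTERNAL_SECONDS
    )


# Hypothetical sample values, for illustration only.
print(meets_criteria(avg_page=9.2, max_page=31.0, avg_internal=1.1))
print(meets_criteria(avg_page=16.4, max_page=31.0, avg_internal=1.1))
```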

The testing and data collection processes went quite smoothly due to the planning and support of the UNNAMED CLIENT staff. The effort was supported by a number of UNNAMED CLIENT personnel, notably Russell Niesz and Gary Greg. Russ was very helpful in collecting performance data from the Solaris system(s), and Gary made sure that all required resources (people, hardware, and facilities) were available. This support enabled the on-site HAHT representatives to focus on the testing and work very efficiently. The stability of the application and the servers under test, in combination with the professionalism and competence of the UNNAMED CLIENT staff, directly contributed to an efficient and successful testing effort.

2.0 Configuration Tested

Due to scheduling constraints, the system under test (SUT) was UNNAMED CLIENT's development configuration rather than their production system. The SUT consisted of the following HW/SW configuration:

Hardware
  • Sun Ultra 2 (dual 300MHz UltraSPARC II processors)
  • 896Mbytes memory
  • 2 x 4.2GB SCSI internal disks
  • External RAID 5 Controller with 4 x 4.2GB disks

Software
  • Solaris V2.6
  • Oracle V7.3.4
  • Haht V3.1 Build 104
  • Backend HTML Generator

For comparison, the production system consists of a two machine configuration connected via a 100Mbit backbone:

Machine 1 (Haht server)

Hardware
  • Sun Ultra 2 (dual 300MHz UltraSPARC II processors)
  • 384Mbytes memory
  • 2 x 4.2GB SCSI internal disks
  • External RAID 5 Controller with 4 x 4.2GB disks

Software
  • Solaris V2.6
  • Haht V3.1 Build 104

Machine 2 (Oracle and Backend HTML Generator server)

Hardware
  • Sun Ultra 450 (dual 300MHz processors)
  • 500Mbytes memory
  • 8 x 4.2GB SCSI internal disks
  • External RAID 5 Controller with 7 x 4.2GB disks

Software
  • Solaris V2.6
  • Oracle V7.3.4
  • Backend HTML Generator

 

Examination of these configurations implies that any performance measurements made on the development system should be worse than the performance of the production system. The production system spreads the same workload over two separate systems and the only HW difference on the machine supporting Haht software is memory size. During testing, the combined load of Oracle and the Backend HTML Generator never exceeded 10% of the total CPU demand on the development system.

The load for the test was generated using three PCs running NT V4.0 (SP 3) and WebLoad Version 3.01.321.

3.0 Testing Protocol

Stress/Load testing was conducted by using WebLoad to simulate a number of concurrent users performing a set of tasks. Based on input from UNNAMED CLIENT developers, we developed a set of scripts (called agendas in WebLoad terminology), each representing a single user performing a set of bill-paying operations. To test the impact of statement size, agendas were developed that referenced the CLIENT1 site and the CLIENT2 site. (The CLIENT2 site generated significantly larger statements.) The lists below outline the specific tasks incorporated in each agenda.

Agenda 1
  1. Login to CLIENT1
  2. Export Statement
  3. Review Outstanding Balance
  4. Submit Payment
  5. Review Payment History
  6. Logout

Agenda 4
  1. Login to CLIENT2
  2. Review Current Statement
  3. Submit Payment
  4. Logout

For load testing, the agendas were configured to simulate users by incorporating random "think times" between pages (but not between frames). These "think times" represent the time a user spends reading and responding to a page before submitting data or clicking through to another page. For these agendas, the random think times were evenly distributed over a range of 5 to 30 seconds.

For stress testing, the per-page think times were set to 0. This type of test agenda will generate the maximum amount of traffic and the most stress on a system.
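
The two think-time settings can be sketched as follows. This is an illustration only and is not WebLoad agenda code; the function name and the fixed seed are assumptions. With a uniform distribution over 5 to 30 seconds, the expected pause per page is 17.5 seconds.

```python
import random

THINK_MIN_SECONDS = 5.0
THINK_MAX_SECONDS = 30.0


def sample_think_time(rng, stress=False):
    """Return the pause before the next page request, in seconds.

    Load-test mode draws uniformly from the 5 to 30 second range;
    stress-test mode uses no think time at all.
    """
    if stress:
        return 0.0
    return rng.uniform(THINK_MIN_SECONDS, THINK_MAX_SECONDS)


rng = random.Random(1999)  # fixed seed (an assumption) so the sketch is repeatable
load_pauses = [sample_think_time(rng) for _ in range(1000)]
print(min(load_pauses) >= THINK_MIN_SECONDS, max(load_pauses) <= THINK_MAX_SECONDS)
print(sample_think_time(rng, stress=True))
```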

Stress/Load tests were conducted in two different modes:

  1. WebLoad was used to simulate a given number of users, each performing the tasks in a specific agenda. When an agenda was completed (a user completed their interaction with the application), WebLoad repeated the agenda until the test was concluded. This mode results in an even load on the SUT, giving performance information about the application at a given number of users.
  2. WebLoad was used to gradually increase the number of users performing an agenda. As a simulated user completed an agenda, a new copy was started. For the UNNAMED CLIENT tests, we started with 15 simulated users and added an additional 15 users every 2 minutes. This mode results in an ever-increasing load on the SUT. The testing cycle stops when a given maximum number of simulated users is reached, or when the measured average and/or maximum response times exceed defined limits. This mode reveals the maximum supportable user level given a set of response time criteria. This mode is commonly referred to as "cruise control" mode.
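
The ramp used in the second mode (15 users to start, 15 more every 2 minutes) can be sketched as a step function. This is a hedged illustration, not WebLoad configuration: the function names are hypothetical, the steps are assumed to occur exactly on 2-minute boundaries, and the cap of 220 users in the usage lines is an assumed value.

```python
START_USERS = 15
STEP_USERS = 15
STEP_INTERVAL_SECONDS = 120


def simulated_users(elapsed_seconds, max_users):
    """Number of simulated users active after elapsed_seconds of ramping."""
    steps_completed = int(elapsed_seconds // STEP_INTERVAL_SECONDS)
    return min(START_USERS + STEP_USERS * steps_completed, max_users)


def seconds_to_reach(target_users):
    """Elapsed time at which the ramp first reaches target_users."""
    if target_users <= START_USERS:
        return 0
    # Ceiling division: number of 15-user steps needed beyond the start.
    extra_steps = -(-(target_users - START_USERS) // STEP_USERS)
    return extra_steps * STEP_INTERVAL_SECONDS


print(simulated_users(0, 220))     # 15
print(simulated_users(600, 220))   # 90
print(seconds_to_reach(75) // 60)  # 8 (minutes)
```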

The time to complete a single agenda by a single user is called "round time." Because of the random per-page think times, round times vary. Round times will also vary based on the number of concurrent users and the tasks performed by the users.

During test runs, WebLoad collects the following data on the performance of the SUT every 20 seconds:

  1. Simulated Load (number of simulated users)
  2. Min, max, average and current Round Time
  3. Number of successful rounds
  4. Number of failed rounds (the agenda generated some sort of error)
  5. Min, max, average and current Response time for each frame/page

(Note: WebLoad collects a wide variety of statistics. The statistics listed above are the ones most relevant to the UNNAMED CLIENT testing scenarios and goals.)

A variety of other measurements were collected during each run:

  1. To measure time spent processing HAHTsite dynamic pages, HAHTsite collects session and page statistics on page wait times, run times, and CPU times. These statistics can be used to determine the "internal response time" metric. The "internal response time" metric can be measured using the difference between a Haht page run time and Haht page CPU time.
  2. Memory and CPU utilization rates on the SUT were measured using vmstat on the Solaris system.
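
The "internal response time" calculation described in the first item (page run time minus page CPU time) can be sketched as below. The function names and the sample values are hypothetical; the values are not data from these test runs, and no particular HAHTsite log format is assumed.

```python
INTERNAL_LIMIT_SECONDS = 2.0  # acceptance threshold for average internal response time


def internal_response_time(run_time, cpu_time):
    """Time a page spent outside the CPU: run time minus CPU time."""
    return run_time - cpu_time


def average_internal_response(pages):
    """Average the internal response time over (run_time, cpu_time) pairs."""
    waits = [internal_response_time(run, cpu) for run, cpu in pages]
    return sum(waits) / len(waits)


# Hypothetical (run_time, cpu_time) samples in seconds, not measured data.
samples = [(1.8, 1.2), (2.5, 1.9), (0.9, 0.7)]
avg = average_internal_response(samples)
print(round(avg, 2), avg < INTERNAL_LIMIT_SECONDS)
```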

Finally, at several points in the testing process, WebLoad and/or Oracle parameters were changed to overcome problems encountered while stress testing or to increase the overall throughput of the system.

4.0 Testing Results

The first set of tests was used to generate a set of baseline performance metrics for each agenda. During these tests, the agendas were run in Load test mode (using per-page think times) at a load level of 50 simulated users. These test runs gave a baseline set of performance metrics that were used to identify key agendas to use for further testing.

The application was then stress tested using the agenda(s) that generated the most load. This series of tests revealed the need to modify Oracle parameters and increase the number of configured HAHTsite processes.

Finally, the maximum number of concurrent users was identified using the most "stressful" agenda run in "cruise control" mode. The most "stressful" agenda was identified based on the size of the returned statement, the CPU and memory loads revealed by vmstat, and the WebLoad and HAHTsite response time statistics.

4.1 Average Statement Size

As can be seen from the table below, the CLIENT2 agenda generated significantly larger statement sizes than the CLIENT1 agenda(s). For this reason, test runs used for calculating the maximum number of supportable users were based on the CLIENT2 agenda.

Agenda     Statement Size
CLIENT1    12,900 bytes
CLIENT2    61,775 bytes

4.2 Rounds per Minute

The number of rounds completed per minute is a useful measure in determining the overall throughput and capacity of the application. Rounds-per-minute (rpm) measurements are shown below for the CLIENT1 agenda for both the 5 process and 10 process configurations of HAHTsite.

CLIENT1 Agenda
# of Users    5 Processes    10 Processes
50            10.5 rpm       -
100           18.7 rpm       19.1 rpm
150           -              27.9 rpm


While the rounds-per-minute measure shows only about a 2% increase in capacity at 100 users (18.7 rpm to 19.1 rpm), another critical measure was impacted by increasing the number of HAHTsite processes. A significant component of the round time can be the time dynamic HAHT pages wait for a free process. This measurement, Page Request Time on Queue, averaged 1.5 seconds while supporting 100 users with 5 processes and dropped to 0.6 seconds with 10 processes. The maximum measured Page Request Time on Queue was 18.8 seconds with 5 processes and 12.3 seconds with 10 processes.
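
The relative changes between the 5 process and 10 process configurations at 100 users can be recomputed from the figures quoted above. This is a small arithmetic sketch; the helper name is hypothetical.

```python
def percent_change(before, after):
    """Relative change from before to after, as a percentage."""
    return (after - before) / before * 100.0


# Figures quoted above for 100 users: 5 processes versus 10 processes.
rpm_change = percent_change(18.7, 19.1)        # rounds per minute
avg_queue_change = percent_change(1.5, 0.6)    # average time on queue (s)
max_queue_change = percent_change(18.8, 12.3)  # maximum time on queue (s)

print(f"rounds per minute:   {rpm_change:+.1f}%")
print(f"average queue time:  {avg_queue_change:+.1f}%")
print(f"maximum queue time:  {max_queue_change:+.1f}%")
```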

The rounds-per-minute statistics for the CLIENT2 agenda reveal that the application breaks down somewhere between 100 and 150 users: throughput falls as the load increases.

CLIENT2 Agenda
# of Users    10 Processes
100           20.5 rpm
150           13.3 rpm

4.3 Page Response Times

One of the most CPU intensive pages in the on-line bill paying application is the login page. This page makes a number of calls to Oracle to assemble information about the user and their current status.

The other intensive page is the Request Statement Detail page.

The two charts in this section show the current average, session average, and maximum response times for the Login page and the Request Statement Detail page when tested in "cruise control" mode.

The "Max" measurement reports the recorded maximum response time across the test run. The "Session Average" measurement represents the average response time across the test run. The "Current Average" measurement represents the average response time at the identified user load.

CLIENT2 Agenda Chart: Login Page Response Times

This chart reveals that the average and maximum response time constraints are exceeded when the SUT is supporting between 50 and 70 users. 

Comparing this chart to the Statement Detail Page response times reveals that the login process takes longer than the statement generation process. The Request Statement Detail pages are consistently returned in under 15 seconds until a load of 180 users is reached, and this operation never exceeds the 45-second maximum response time threshold.

Inspection of the vmstat logs shows the CPU idle time dropping to 0% at about 60 concurrent users (with a queue of 3 to 7 processes waiting to execute). Between 60 and 75 users, the CPU idle time "bounces" between 0% and 50%. After exceeding 75 concurrent users, the CPU idle time remains consistently at 0%, with the queue of executable processes growing to 8 to 12 processes.

Further inspection of the vmstat logs reveals that even at a user level of 220 users, no swapping occurred and there remained at least 81Mbytes of available memory. So, the principal constraint on the performance of this application is available CPU cycles.

CLIENT2 Agenda Chart: Request Statement Detail Page Response Times

5.0 Observations

  • The stress testing revealed the need to increase the number of Oracle cursors as the number of supported concurrent users was increased.
  • Overall response of the application was improved by increasing the number of HAHTsite processes from 5 to 10. At user loads that resulted in 15 second average response times, the CPU utilization was measured at 100% with a run queue of 4 to 8 processes. This measurement implies that further increasing the number of HAHTsite processes will have little impact on the overall throughput of the application (unless hardware capacity is increased).
  • Based on vmstat and top reports on the SUT while under load, Oracle and the Backend HTML Generator used about 5% to 8% of the CPU capacity of the SUT.
  • Based on vmstat and top reports, the http server on the SUT consumed approximately 10% of the CPU during the tests using the CLIENT2 agenda. (The CLIENT2 agenda produced statements that were 62Kbytes in size, resulting in a significantly heavier load during SSL processing.)
  • The principal constraint (bottleneck) in the application is available CPU cycles. The HAHTsite and vmstat statistics reveal that the application spent very little time waiting on Oracle or the Backend HTML Generator, and that the server supporting HAHTsite, Oracle, and the Backend HTML Generator did not swap or run out of memory.
  • The agendas used in this testing emulated "graceful" users; i.e., they ended every interaction with the application by logging out. Many (most?) "real" users will not bother to log out. Instead, they will pay their bill and leave the site. To the application they will appear to be simply a quiescent active user. As currently configured, HAHTsite will "time-out" their state after 15 minutes of inactivity. If this inactivity is factored in, 600 concurrent users will have the same CPU load as 81 "graceful" users.
  • At a load level of 75 users, the system was able to complete 24 rounds of the CLIENT2 agenda per minute. This represents a system capacity to support 1440 bill paying operations per hour, or 8640 per 6-hour day.
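
The throughput figures in the last observation follow directly from the measured round rate, counting each completed round of the CLIENT2 agenda as one bill paying operation. A short sketch of the arithmetic (the function names are hypothetical):

```python
ROUNDS_PER_MINUTE = 24  # CLIENT2 agenda at 75 users, from the observation above
HOURS_PER_DAY = 6       # length of the day used in the observation above


def operations_per_hour(rounds_per_minute):
    """Each completed round counts as one bill paying operation."""
    return rounds_per_minute * 60


def operations_per_day(rounds_per_minute, hours):
    """Operations completed over a day of the given length in hours."""
    return operations_per_hour(rounds_per_minute) * hours


print(operations_per_hour(ROUNDS_PER_MINUTE))                # 1440
print(operations_per_day(ROUNDS_PER_MINUTE, HOURS_PER_DAY))  # 8640
```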

6.0 Recommendations/Summary

  • The current development configuration, and by implication the production system, will comfortably support 60 to 75 concurrent users with response times that fall within UNNAMED CLIENT's defined goals. Since the main constraint was CPU cycles, and HAHTsite's performance response to additional HW is typically linear, a quad CPU configuration should comfortably support 120 to 150 users.
  • Additional growth to 200 or more concurrent users will require moving to a distributed implementation of HAHTsite. Again, since HAHTsite scales linearly, adding another HAHTsite server (configured similarly to the HAHT production server) will permit the configuration to support a total of 240 to 300 users.
  • Supporting 60 to 75 concurrent users on the production system will probably require increasing the memory configuration on the system supporting HAHTsite.
  • Overall application performance and number of supportable concurrent users can be positively impacted by investigating and improving the performance of the login process. If the overhead of this process can be reduced, the current configuration(s) should be able to support more than 75 concurrent users with quite respectable response times.
  • These recommendations and observations are based on the assumption that the similarity in configurations between the development platform and the production platform allows a simple comparison. This assumption can be easily validated by running a test using the CLIENT2 agenda in "cruise control" mode against the production system. This validation was not available during the on-site testing due to UNNAMED CLIENT's concerns about the test having a negative impact on the performance of the production system.
  • These tests reflect the performance of the application without the impact of Internet (and/or modem) induced delays. They represent a "best case" performance benchmark and reflect a test of the parameters within the control of UNNAMED CLIENT. (UNNAMED CLIENT cannot control the latency introduced by a very active Internet.) WebLoad can be configured to generate the simulated users from load generators located "outside" of UNNAMED CLIENT. If UNNAMED CLIENT wants to attempt to quantify the impact of Internet latency, this same series of tests should be run using load generators located in the Internet "cloud".
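
The capacity estimates in the first two recommendations rest on the stated assumption that HAHTsite capacity scales linearly with CPU count. The sketch below shows that simple model; the function name is hypothetical. Reading the 240 to 300 user figure as eight CPUs in total (a doubling of the quad-CPU estimate) is an interpretation and is not stated explicitly in the recommendations.

```python
BASE_CPUS = 2          # dual-CPU development configuration
BASE_USERS = (60, 75)  # comfortably supported range measured on that configuration


def scaled_capacity(total_cpus):
    """Estimate the supported user range, assuming linear scaling with CPUs."""
    factor = total_cpus / BASE_CPUS
    low, high = BASE_USERS
    return (round(low * factor), round(high * factor))


print(scaled_capacity(4))  # quad-CPU configuration: (120, 150)
print(scaled_capacity(8))  # eight CPUs in total (assumed reading): (240, 300)
```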
