CLIENT/SERVER TECHNOLOGY:CAPACITY PLANNING AND PERFORMANCE MANAGEMENT

4. CAPACITY PLANNING AND PERFORMANCE MANAGEMENT

Objectives of Capacity Planning

Client / server systems are made up of many hardware and software resources, including client work- stations, server systems, and network elements. User requests share the use of these common re- sources. The shared use of these resources gives rise to contention that degrades the system behavior and worsens users’ perception of performance.

In the example of the Internet banking service described in the Section 7, the response time, the time from when a user clicks the URL of a bank until the home page of the bank is displayed on

the user’s Web page, affects user’s perception of performance and quality of service. In designing a system that treats requirements from multiple users, some service levels may be set; for example, average response time requirements for requests must not exceed 2 sec, or 95% of requests must exhibit a response time of less than 3 sec. For the given service requirements, service providers must design and maintain C / S systems that meet the desired service levels. Therefore, the performance of the C / S system should be evaluated as exactly as possible and kinds and sizes of hardware and software resources included in client systems, server systems, and networks should be determined. This is the goal of capacity planning.

The following are resources to be designed by capacity planning to ensure that the C / S system performance will meet the service levels:

• Types and numbers of processors and disks, the type of operating system, etc.

• Type of database server, access language, database management system, etc.

• Type of transaction-processing monitor

• Kind of LAN technology and bandwidth or transmission speed for clients and servers network- ing

• Kind of WAN and bandwidth when WAN is used between clients and servers

In the design of C / S systems, processing devices, storage devices, and various types of software are the building blocks for the whole system. Because capacity can be added to clients, servers, or network elements, addition of capacity can be local or remote. The ideal approach to capacity plan- ning is to evaluate performance prior to installation. However, gathering the needed information prior to specifying the elements of the system can be a complicated matter.

Steps for Capacity Planning and Design

Capacity planning consists of the steps shown in Figure 14, which are essentially the same as those for capacity planning of general information systems.

Because new technology, altered business climate, increased workload, change in users, or demand for new applications affects the system performance, these steps should be repeated at regular intervals and whenever problems arise.

Performance Objectives and System Environment

Speciﬁc objectives should be quantiﬁed and recorded as service requirements. Performance objectives are essential to enable all aspects of performance management. To manage performance, we must set quantiﬁable, measurable performance objectives, then design with those objectives in mind, project to see whether we can meet them, monitor to see whether we are meeting them, and adjust system parameters to optimize performance. To set performance objectives, we must make a list of system services and expected effects.

We must learn what kind of hardware (clients and servers), software (OS, middleware, applica- tions), network elements, and network protocols are presented in the environment. Environment also involves the identiﬁcation of peak usage periods, management structures, and service-level agree- ments. To gather these information about the environment, we use various information-gathering techniques, including user group meetings, audits, questionnaires, help desk records, planning doc- uments, and interviews.

Performance Criteria

1. Response time is the time required to process a single unit of work. In interactive applications, the response time is the interval between when a request is made and when the computer responds to that request. Figure 15 shows a response time example for a client / server database access application. As the application executes, a sequence of interactions is exchanged among the components of the system, each of which contributes in some way to the delay that the user experiences between initiating the request for service and viewing the DBMS’s response to the query. In this example, the response time consists of several time components, such as (a) interaction with the client application, (b) conversion of the request into a data stream by an API, (c) transfer of the data stream from the client to the server by communication stacks, (d) translation of the request and invocation of a stored procedure by DBMS, (e) execution of SQL (a relational data-manipulation language) calls by the stored procedure, (f) conversion of the results into a data stream by DBMS, (g) transfer of the data stream from the server to the client by the API, (h) passing of the data stream to the application, and (i) display of the ﬁrst result.

2. Throughput is a measure of the amount of work a component or a system performs as a whole or of the rate at which a particular workload is being processed.

3. Resource utilization normally means the level of use of a particular system component. It is deﬁned by the ratio of what is used to what is available. Although unused capacity is a waste of resources, a high utilization value may indicate that bottlenecks in processing will occur in the near future.

4. Availability is deﬁned as the percentage of scheduled system time in which the computer is actually available to perform useful work. The stability of the system has an effect on per- formance. Unstable systems often have speciﬁc effects on the performance in addition to the other consequences of system failures.

5. Cost–Performance ratio: Once a system is designed and capacity is planned, a cost model can be developed. From the performance evaluation and the cost model, we can make an analysis regarding cost–performance trade-offs.

In addition to these criteria, resource queue length and resource waiting time are also used in de- signing some resources.

Response time and availability are both measures of the effectiveness of the system. An effective system is one that satisﬁes the expectations of users. Users are concerned with service effectiveness, which is measured by response time for transaction processing, elapsed time for batch processing, and query response time for query processing. On the other hand, a system manager is concerned with optimizing the efﬁciency of the system for all users. Both throughput and resource utilization are measures for ensuring the efﬁciency of system operations. If the quality of performance is mea- sured in terms of throughput, it depends on the utilization levels of shared resources such as servers, network elements, and application software.

Workload Modeling

Workload modeling is the most crucial step in capacity planning. Misleading performance evaluation is possible if the workload is not properly modeled.

Workload Characterization

Workload refers to the resource demands and arrival intensity characteristics of the load brought to the system by the different types of transactions and requests. A workload consists of several com- ponents, such as C / S transactions, web access, and mail processing. Each workload component is further decomposed into basic components such as personnel transactions, sales transactions, and corporate training.

A real workload is one observed on a system being used for normal operations. It cannot be repeated and therefore is generally not suitable for use as a test workload in the design phase. Instead, a workload model whose characteristics are similar to those of real workload and can be applied repeatedly in a controlled manner, is developed and used for performance evaluations.

The measured quantities, service requests, or resource demands that are used to characterize the workload, are called workload parameters. Examples of workload parameters are transaction types, instruction types, packet sizes, source destinations of a packet, and page-reference patterns. The workload parameters can be divided into workload intensity and service demands. Workload intensity is the load placed on the system, indicated by the number of units of work contending for system resources. Examples include arrival rate or interarrival times of component (e.g., transaction or re- quest), number of clients and think times, and number of processors or threads in execution simul- taneously (e.g., ﬁle reference behavior, which describes the percentage of accesses made to each ﬁle in the disk system) The service demand is the total amount of service time required by each basic component at each resource. Examples include CPU time of transaction at the database server, total transmission time of replies from the database server in LAN, and total I / O time at the Web server for requests of images and video clips used in a Web-based learning system.

Workload Modeling Methodology

If there is a system or service similar to a newly planned system or service, workloads are modeled based on historical data of request statistics measured by data-collecting tools such as monitors.

A monitor is used to observe the performance of systems. Monitors collect performance statistics, analyze the data, and display results. Monitors are widely used for the following objectives:

1. To ﬁnd the frequently used segments of the software and optimize their performance.

2. To measure the resource utilization and ﬁnd the performance bottleneck.

3. To tune the system; the system parameters can be adjusted to improve the performance.

4. To characterize the workload; the results may be used for the capacity planning and for creating test workloads.

5. To ﬁnd model parameters, validate models, and develop inputs for models.

Monitors are classiﬁed as software monitors, hardware monitors, ﬁrmware monitors, and hybrid monitors. Hybrid monitors combine software, hardware, or ﬁrmware.

If there is no system or service similar to a newly planned system or service, workload can be modeled by estimating the arrival process of requests and the distribution of service times or proc- essing times for resources, which may be forecast from analysis of users’ usage patterns and service requirements.

Common steps in a workload modeling process include:

1. Speciﬁcation of a viewpoint from which the workload is analyzed (identiﬁcation of the basic components of the workload of a system)

2. Selecting the set of workload parameters that captures the most relevant characteristics of the workload

3. Observing the system to obtain the raw performance data

4. Analyzing and reducing of the performance data

5. Constructing of a workload model.

The basic components that compose the workload must be identiﬁed. Transactions and requests are the most common.

Performance Evaluation

Performance models are used to evaluate the performance of a C / S system as a function of the system description and workload parameters. A performance model consists of system parameters, resource parameters, and workload parameters. Once workload models and system conﬁgurations have been obtained and a performance model has been built, the model must be examined to see how it can be used to answer the questions of interest about the system it is supposed to represent. This is the performance-evaluation process. Methods used in this process are explained below:

Analytical Models

Because generation of users’ requests and service times vary stochastically, analytical modeling is normally done using queueing theory. Complex systems can be represented as networks of queues in which requests receive services from one or more groups of servers and each group of servers has its own queue. The various queues that represent a distributed C / S system are interconnected, giving rise to a network of queues, called a queueing network. Thus, the performance of C / S systems can be evaluated using queueing network models. Figure 16 shows an example of queueing network model for the Web server, where the client and the server are connected through a client side LAN, a WAN, and a server-side LAN.

Simulation

Simulation models are computer programs that mimic the behavior of a system as transactions ﬂow through the various simulated resources. Simulation involves actually building a software model of

each device, a model of the queue for that device, model processes that use the devices, and a model of the clock in the real world. Simulation can be accomplished by using either a general programming languages such as FORTRAN or C or a special-purpose language such as GPSS, SIMSCRIPT, or SLAM. For network analysis, there are special-purpose simulation packages such as COMNET III and BONeS are available.

Benchmarking

A benchmark is a controlled test to determine exactly how a speciﬁc application performs in a given system environment. While monitoring supplies a proﬁle of performance over time for an application already deployed, either in a testing or a production environment, benchmarking produces a few controlled measurements designed to compare the performance of two or more implementation choices.

Some of the most popular benchmark programs and most widely published benchmark results come from groups of computer hardware and software vendors acting in consort. RISC workstation manufacturers sponsor the Systems Performance Evaluation Cooperative (SPEC), and DBMS vendors operate the transaction Processing Council (TPC). The TPC developed four system-level benchmarks that measure the entire system: TPC-A, B, C, and D. TPC-A and TPC-B are a standardization of the debit / credit benchmark. TPC-C is a standard for moderately complex online transaction-processing systems. TPC-D is used to evaluate price / performance of a given system executing decision support applications. The SPEC developed the standardized benchmark SPECweb, which measures a system’s ability to act as a web server for static pages.

Comparing Analysis and Simulation

Analytic performance modeling, using queueing theory, is very ﬂexible and complements traditional approaches to performance. It can be used early in the application development life cycle. The actual system, or a version of it, does not have to be built as it would be for a benchmark or prototype. This saves tremendously on the resources needed to build and to evaluate a design. On the other hand, a simulation shows the real world in slow motion. Simulation modeling tools allow us to observe the actual behavior of a complex system; if the system is unstable, we can see the transient phenomena of queues building up and going down repeatedly.

Many C / S systems are highly complex, so that valid mathematical models of them are themselves complex, precluding any possibility of an analytical solution. In this case, the model must be studied by means of simulation, numerically exercising the model for the inputs in question to see how they affect the output measures of performance.

Search This Blog

Operation Research course