Performance Test Analysis

sri kiran
Jul 16, 2016
15 min read

Filed under: Performance Test Analysis

March 2, 2011

There could be many reasons for the above problem.

I am listing some of them below ::

>>Look for other performance counters that have been affected. Network bytes, average response time, etc. could be particularly interesting.

>>The increase can be due to several reasons, such as variance in connectivity from the application servers to the DB or other external servers, network bandwidth, etc.

>>If the application server has to wait a little longer for a response from the DB or other external servers due to bandwidth issues, it has to perform more context switches between requests, and this will increase the % CPU on the servers.

>>Is anything else running on the application server(s) that could be eating up CPU cycles?

>>Are you sure that you are applying the same level of load? If you are inadvertently creating more load, then the CPU hit will be higher.

>>Are there more or fewer application servers in the new environment? With fewer servers, each server needs more CPU cycles to process the same load.

>>What sort of load-balancing is in place? Has this changed? Uneven load-balancing may increase CPU cycles.

>>Compare the bare-metal profile of the new and previous application servers, looking for differences (e.g. less memory, a different number and/or type of CPUs).

>>Compare the versions of the application server software. Different versions may be more CPU intensive.

>>Compare the number of threads in memory. If the number has increased, then more CPU cycles will be required to deal with the extra threads.

Compare the following parameters between the old and new app servers. These parameters play a very important role in CPU consumption: 1. Prepared Statement Cache Size 2. Web container Thread Pool Size 3. JDBC Connections (min/max) 4. Garbage collection policy (Xgcpolicy).

>>The parameters that you have mentioned really do play an important role in CPU consumption. However, these parameters are defined within the application that is deployed on the servers. So a change of hardware will not affect these settings, as they will be picked up from the application configuration files.

>>Does the application server layer use encryption? If so, the amount of encryption effort may have changed, requiring additional CPU cycles.

>>There could also be an increase in network errors which may result in a higher CPU hit due to retransmission requirements.

>>If the load that is being generated is not controlled by pacing and left to the mercy of the response time then the number of requests generated by the generators could increase with the network speed.

>>If the application depends on any external services or the DB which are not located in the same subnet then the amount of idle time on the application under consideration may vary based on how fast the external entity can respond back to your application

>>Check your network adapter settings on the application server and make sure they match the switches. If your switches are configured for 1 Gbps and your app servers are set to half duplex or 100 Mbps, then you may see issues.

>>If your app server uses external storage, check disk activity.


LR Analysis : Context Switches/sec

Filed under: Performance Test Analysis

February 18, 2011

Context Switches/sec is the combined rate at which all processors on the computer are switched from one thread to another. Context switches occur when a running thread voluntarily relinquishes the processor, is preempted by a higher priority ready thread, or switches between user-mode and privileged (kernel) mode to use an Executive or subsystem service. It is the sum of Thread\Context Switches/sec for all threads running on all processors in the computer and is measured in numbers of switches. There are context switch counters on the System and Thread objects. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval.
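As a rough illustration of how such a rate counter is derived from two raw samples, here is a minimal Python sketch. The function name and the sample values are made up for the example; they are not taken from any monitoring tool.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second rate from two raw samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Difference between the last two samples, divided by the sample interval.
    return (curr_value - prev_value) / interval_seconds


# Hypothetical raw context-switch totals, sampled 5 seconds apart.
rate = counter_rate(prev_value=1_200_000, curr_value=1_260_000, interval_seconds=5)
print(rate)  # 12000.0
```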

Context Switches/sec

A context switch occurs when the kernel switches the processor from one thread to another—for example, when a thread with a higher priority than the running thread becomes ready. Context switching activity is important for several reasons. A program that monopolizes the processor lowers the rate of context switches because it does not allow much processor time for the other processes’ threads. A high rate of context switching means that the processor is being shared repeatedly—for example, by many threads of equal priority. A high context-switch rate often indicates that there are too many threads competing for the processors on the system.

Context switch: steps

In a switch, the state of the first process must be saved somehow, so that, when the scheduler gets back to the execution of the first process, it can restore this state and continue.

The state of the process includes all the registers that the process may be using, especially the program counter, plus any other operating system specific data that may be necessary. This data is usually stored in a data structure called a process control block (PCB), or switchframe.

In order to switch processes, the PCB for the first process must be created and saved. The PCBs are sometimes stored upon a per-process stack in kernel memory (as opposed to the user-mode stack), or there may be some specific operating system defined data structure for this information.

Since the operating system has effectively suspended the execution of the first process, it can now load the PCB and context of the second process. In doing so, the program counter from the PCB is loaded, and thus execution can continue in the new process. New processes are chosen from a queue or queues. Process and thread priority can influence which process continues execution, with processes of the highest priority checked first for ready threads to execute.
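The save and restore steps above can be sketched with a toy simulation. This is only an illustration in Python, not how a real kernel works: the `PCB` class, its fields, and the round-robin queue are simplifying assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PCB:
    """Toy process control block: just enough state to resume a process."""
    pid: int
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


def run_round_robin(pcbs, steps_per_slice, total_slices):
    """Simulate switching between processes taken from a ready queue."""
    ready = deque(pcbs)
    dispatches = 0
    for _ in range(total_slices):
        current = ready.popleft()        # pick the next process from the queue
        pc = current.program_counter     # restore its saved program counter
        pc += steps_per_slice            # "execute" for one time slice
        current.program_counter = pc     # save the state back into the PCB
        ready.append(current)            # suspended process rejoins the queue
        dispatches += 1
    return dispatches


procs = [PCB(pid=1), PCB(pid=2)]
dispatches = run_round_robin(procs, steps_per_slice=3, total_slices=4)
print(dispatches, [p.program_counter for p in procs])  # 4 [6, 6]
```

Each process resumes exactly where it left off because its program counter was saved in its PCB before the switch.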

Now coming to the basics ::

What is a process?

A process is an instance of a computer program that is being executed. Every program running on a computer, be it a background service or an application, is a process. A single CPU core can run only one process at a time. Process management is the operating system's way of running multiple processes on a limited number of cores: multitasking is done by switching between processes quickly (known as context switching). Process management involves computing and distributing CPU time as well as other resources.


LR Analysis : Threadpool concept

Filed under: Performance Test Analysis

February 17, 2011

Currently I am working on a load testing project that uses IBM WebSphere as the application server. While analysing the load test results, we observed that the system's CPU utilization reached 100% at a very early stage of the execution. With help from the development and analysis teams, we found that the thread pool in WebSphere was configured smaller than required. I wondered what exactly a thread pool is at the WebSphere level and how it affects load testing. Then I realized that the thread pool is also a parameter that has some influence during a load test.

Let's see, one by one, exactly what it is…

Benefits of a thread pool

Thread creation and destruction overhead is avoided, because threads are reused.
Better performance and better system stability.
It is better to use a thread pool in cases where the number of threads would otherwise be very large.
You do not have to create, manage, schedule, and terminate your threads; the thread pool class does all of this for you.
There are no worries about creating too many threads and hence affecting system performance.
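To make the reuse benefit concrete, here is a minimal sketch using Python's standard `concurrent.futures.ThreadPoolExecutor`. It is not WebSphere code; `handle_request` is a hypothetical stand-in for real request processing, and the pool size of 4 is an arbitrary choice for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    """Stand-in for real request processing; returns the worker thread's name."""
    return threading.current_thread().name


# 100 requests are served by at most 4 pooled threads, instead of
# creating and destroying one thread per request.
with ThreadPoolExecutor(max_workers=4) as pool:
    worker_names = list(pool.map(handle_request, range(100)))

distinct_workers = set(worker_names)
print(len(worker_names), "requests handled by", len(distinct_workers), "threads")
```

The pool caps the number of threads at `max_workers`, so a burst of requests queues up rather than spawning an unbounded number of threads.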

Nowadays companies are increasingly using thread pools to enhance WebSphere Application Server performance by providing users with the required information quickly and efficiently.

When users request Web site information, threads are necessary to process these requests. In a WebSphere Application Server (WAS) environment, an application that needs a thread acquires it from the HTTP Listener, the Web container thread pool, the ORB container thread pool, or the data source connection pool. The threads contained in these pools are the actual resources that service user requests. Therefore, tuning thread pools properly will enhance WAS performance and optimize the users' experience.

——:: Tuning threadpool at Websphere level ::——

The WAS administrative console is used to tune thread pools.
Each thread pool is assigned a minimum and a maximum thread pool number.
When WAS is launched, it will create the number of threads identified in the minimum thread pool number parameter.
As application processing occurs, the system will generate additional threads as required, up to the maximum number of threads parameter.
It is critical to set the maximum number of threads for any specific queue or command properly.
If the system is constantly creating and destroying threads from the pool, system thrashing will occur, wasting valuable resources.
When configuring a thread pool, it is important to remember that the “more is better” rule does not apply here. Threads require a memory commitment and system resources.
If the thread pool is configured to produce more threads than the system requires, valuable system resources are being denied to other processes.
This type of configuration will burden the system and slow the application.
Therefore, configuring thread pools accurately and harmoniously with each other is critical to optimal WebSphere performance.

:: There are four levels of thread configuration that must be considered when setting up thread pools ::

1) First level of threadpool configuration :: HTTP Listener
The HTTP Listener is responsible for thread creation at the HTTP server level. Most of the processing that occurs here is static page serving.

2) Second level of threadpool configuration :: Web Container
The Web container is responsible for thread pool creation at the application server level. Most of the processing at this level includes servlet, JSP, EJB, dynamic page creation, and back-end pass-through processing.

3) Third level of threadpool configuration :: ORB Container
The ORB container is responsible for thread pool creation at the object level. Most of the processing that occurs here is the processing of non-Web-based clients.

4) Fourth level of threadpool configuration :: Data Source
The data source level is responsible for creating the connections that are used to access the database or "legacy" systems.

Tags: CPU Utilization, HTTP Listener, Level of threadpool configuration, or the data source connection pool, ORB container thread pool, stability, threadpool, Tuning threadpool, Web container thread pool, Websphere


Capacity Planning : Web Application

Filed under: Performance Test Analysis, strategies, testing

February 7, 2011

The process of determining what type of hardware and software configuration is required to meet application needs adequately is called capacity planning. Capacity planning is not an exact science. Every application is different and every user behavior is different.

Capacity planning is the process of anticipating the growth of your application and acting before demand becomes critical. Capacity planning saves time, money, and reputation.

Capacity Planning Factors:

A number of factors influence how much capacity a given hardware configuration will need in order to support a Web Server instance and a given application. The hardware capacity required to support your application depends on the specifics of the application and configuration.

Have you checked the tuning of the App Server?

Answer :: At this stage in capacity planning, you gather information about the level of activity expected on your server: the anticipated number of users, the number of requests, the acceptable response time, and the preferred hardware configuration. Capacity planning for server hardware should focus on performance requirements and set measurable objectives for capacity.

How well designed is the user application?

Answer ::

Is the database a bottleneck? Are there additional user storage requirements? Often the database server runs out of capacity much sooner than the App Server does. Plan for a database that is sufficiently robust to handle the application. Typically, a good application's database requires hardware three to four times more powerful than the application server hardware. It is good practice to use a separate machine for your database server.

Generally, you can tell if your database is the bottleneck if you are unable to maintain App Server CPU usage in the 85 to 95 percent range. This indicates that App Server is often idle and waiting for the database to return results. With load balancing in a cluster, the CPU utilization across the nodes should be about even.

Some database vendors are beginning to provide capacity planning information for application servers. Frequently this is a response to the three-tier model for applications.

An application might require user storage for operations that do not interact with a database. For example, in a secure system disk and memory are required to store security information for each user. You should calculate the size required to store one user’s information, and multiply by the maximum number of expected users.
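The sizing rule in the previous paragraph is a single multiplication. A minimal Python sketch follows; the per-user size and the user count are hypothetical figures chosen only to show the arithmetic.

```python
def user_storage_bytes(bytes_per_user, max_users):
    """Storage needed for per-user data: one user's size times the user count."""
    if bytes_per_user < 0 or max_users < 0:
        raise ValueError("inputs must be non-negative")
    return bytes_per_user * max_users


# Hypothetical figures: 20 KiB of security data per user, 50,000 expected users.
total = user_storage_bytes(bytes_per_user=20 * 1024, max_users=50_000)
print(total, "bytes =", total / 1024 ** 2, "MiB")
```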

Is the bandwidth sufficient?

The App Server requires enough bandwidth to handle all connections from clients. In the case of programmatic clients, each client JVM will have a single socket to the server. Each socket requires bandwidth. An App Server handling programmatic clients should have 125 to 150 percent of the bandwidth that a server with Web-based clients would need. If you are interested in the bandwidth required to run a web server, you can assume that each 56 kbps (kilobits per second) of bandwidth can handle between seven and ten simultaneous requests, depending upon the size of the content that you are delivering. If you are handling only HTTP clients, expect a bandwidth requirement similar to that of a web server serving static pages.

The primary factor affecting the requirements for a LAN infrastructure is the use of in-memory replication of session information for servlets and stateful session EJBs. In a cluster, in-memory replication of session information is the biggest consumer of LAN bandwidth. Consider whether your application will require the replication of session information for servlets and EJBs.
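The 56 kbps rule of thumb and the 125 to 150 percent allowance for programmatic clients can be turned into a rough range estimate. This is a sketch of that arithmetic only; the function names and the figure of 700 simultaneous requests are assumptions for the example.

```python
def web_bandwidth_kbps(simultaneous_requests):
    """Return (low, high) kbps, assuming 10 or 7 requests per 56 kbps."""
    low = simultaneous_requests / 10 * 56    # small content: 10 requests per 56 kbps
    high = simultaneous_requests / 7 * 56    # large content: 7 requests per 56 kbps
    return low, high


def programmatic_bandwidth_kbps(simultaneous_requests):
    """Apply the 125 to 150 percent allowance for programmatic clients."""
    low, high = web_bandwidth_kbps(simultaneous_requests)
    return low * 1.25, high * 1.50


# Hypothetical load of 700 simultaneous requests.
web_low, web_high = web_bandwidth_kbps(700)
prog_low, prog_high = programmatic_bandwidth_kbps(700)
print(web_low, web_high)    # 3920.0 5600.0
print(prog_low, prog_high)  # 4900.0 8400.0
```

The result is a range rather than a single number because the rule itself depends on the size of the content being delivered.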

To determine whether you have enough bandwidth in a given deployment, look at the network tools provided by your network operating system vendor. In most cases, including Windows NT, Windows 2000, and Solaris, you can inspect the load on the network system. If the load is very high, bandwidth may be a bottleneck for your system.

How many transactions need to run simultaneously?

How many transactions must run concurrently? Determine the maximum number of concurrent sessions App Server will be called upon to handle. For each session, you will need to add more RAM for efficiency. Oracle recommends that you install a minimum of 256 MB of memory for each WebLogic Server installation that will be handling more than minimal capacity.

Next, research the maximum number of clients that will make requests at the same time, and how frequently each client will be making a request. The number of user interactions per second with App Server represents the total number of interactions that should be handled per second by a given App Server deployment. Typically for Web deployments, user interactions access JSP pages or servlets. User interactions in application deployments typically access EJBs.
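As a back-of-the-envelope illustration, the number of user interactions per second follows from the number of concurrent clients and how often each one makes a request. The figures below are hypothetical.

```python
def interactions_per_second(concurrent_clients, seconds_between_requests):
    """Total request rate when each client sends one request per interval."""
    if seconds_between_requests <= 0:
        raise ValueError("seconds_between_requests must be positive")
    return concurrent_clients / seconds_between_requests


# Hypothetical: 500 concurrent clients, each making a request every 10 seconds.
load = interactions_per_second(concurrent_clients=500, seconds_between_requests=10)
print(load)  # 50.0
```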

Consider also the maximum number of transactions in a given period to handle spikes in demand. For example, in a stock report application, plan for a surge after the stock market opens and before it closes. If your company is broadcasting a Web site as part of an advertisement during the World Series or World Cup Soccer playoffs, you should expect spikes in demand.

Capacity planning is also performed using predictive analysis. Predictive analysis applies a mathematical model to historical data to predict the future resource utilization of your application.

INTRODUCTION TO PREDICTIVE ANALYSIS :

One approach, "Predictive Analysis", predicts future capacity requirements by extrapolating from historical and current data. With this approach, you analyze how computer resource usage relates to transaction volumes (or user operations). You can do this by analyzing the IIS log files to understand your application's usage and recording performance data to understand resource utilization.

Therefore, the important aspect of "Predictive Analysis" is collecting the right performance data on the application. The accuracy and integrity of the performance data are the basis for the usefulness of the resulting predictions.

Because the predictions are based on historical data, they are more likely to be accurate. But it's important to note that these predictions are not absolute statements but rather indicative pointers of the trend the application is following.

Summary of Steps

Step 1. Collect Performance Data
Step 2. Query the Collected Performance Data
Step 3. Analyze the Collected Performance Data
Step 4. Predict Future Requirements

Collect Performance Data:

The performance data for the application needs to be collected over a period of time. The greater the time duration, the greater the accuracy with which you can predict a usage pattern and future resource requirements.

Query the Collected Performance Data:

Query the collected performance data based on what you are trying to analyze. If your application is CPU bound, you might want to analyze CPU utilization over a period of time. For example, you can query the data for the percentage of CPU utilization for the last 40 days during peak hours (9:00 A.M.–4:00 P.M.), along with the number of connections established during the same period.
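A query like the one described above could be sketched in Python as follows. The in-memory list of `(timestamp, cpu_percent)` samples is an assumption made for the example; in practice the data would come from wherever your performance logs are stored.

```python
from datetime import datetime, time
from statistics import mean


def daily_peak_cpu(samples, start=time(9, 0), end=time(16, 0)):
    """Average CPU % per day, using only samples inside the peak window."""
    by_day = {}
    for timestamp, cpu_percent in samples:
        if start <= timestamp.time() <= end:
            by_day.setdefault(timestamp.date(), []).append(cpu_percent)
    # One value per day, in date order: one observation on the trend's x axis.
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Hypothetical samples; two of them fall outside the 9:00-16:00 window.
samples = [
    (datetime(2011, 2, 1, 8, 30), 20.0),
    (datetime(2011, 2, 1, 10, 0), 50.0),
    (datetime(2011, 2, 1, 15, 0), 60.0),
    (datetime(2011, 2, 2, 12, 0), 58.0),
    (datetime(2011, 2, 2, 18, 0), 10.0),
]
peak = daily_peak_cpu(samples)
print(list(peak.values()))  # [55.0, 58.0]
```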

It is also possible to query segmented data from comparable points in the past, such as month-end processing over the last 12 months. The same calculation is then useful for predicting the capacity the system will need at future month-end processing dates.

Analyze the Collected Performance Data:

Before you analyze the historical performance data, you must be clear about what you are trying to predict. For example, you may be trying to answer the question, “What is the trend of CPU utilization during peak hours?”

Analyze the data obtained by querying the database. The data obtained for a given time frame results in a pattern that can be defined by a trend line. The pattern can be as simple as a linear growth of the resource utilization over a period of time. This growth can be represented by an equation for a straight line:

y = mx + b

Where b is the y offset (the value of y when x = 0), m is the slope of the line, and x is an input. For the preceding question, you would solve for x given y:

x = (y – b)/m

For the 40-day example above, the trend line is:

y = 0.36x + 53

Where y is the CPU utilization, x is the number of observations, 0.36 is the slope of the line, and 53 is the % CPU utilization at the start of the observations, hence the offset. The following figure shows the trend for this example.

Trend of CPU utilization

Choosing the correct trend line is critical and depends on the nature of the source data. Some common behaviors can be described by polynomial, exponential, or logarithmic trend lines. You can use the trend line functions in Microsoft Excel or other tools for this analysis.
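If you prefer a script to a spreadsheet, a straight-line trend y = mx + b can be fitted with ordinary least squares. The sketch below is one possible way to do it in plain Python. The input data is synthetic, generated to lie exactly on the example line y = 0.36x + 53, so it only demonstrates that the fit recovers the slope and offset.

```python
def fit_line(ys):
    """Least-squares fit of y = m*x + b, where x = 1, 2, ..., len(ys)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # slope
    b = mean_y - m * mean_x    # offset (value of y at x = 0)
    return m, b


# Synthetic daily peak-hour CPU values for 40 observations.
observations = [0.36 * x + 53 for x in range(1, 41)]
m, b = fit_line(observations)
print(round(m, 2), round(b, 2))  # 0.36 53.0
```

Real measurements will scatter around the line, and a straight line is only appropriate when the growth really is roughly linear.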

Predict Future Requirements:

Using the trend lines, you can predict the future requirements. The predicted resource requirements assume that the current trend would continue into the future.

For example, consider the trend line mentioned in Step 3. Assuming you do not want the CPU utilization to increase beyond 75 percent on any of the servers, you would solve for x as follows:

x = (y – 53)/0.36

Therefore:

x = (75 – 53)/0.36 = 61.11

Based on the current trends, your system reaches 75 percent CPU utilization when x = 61.11. Because the x axis shows daily measurements taken from the peak usage hours of 9:00 A.M. to 4:00 P.M., one observation corresponds to one day. As there are already 40 observations in this example, your system will reach 75 percent CPU utilization in:

61.11 – 40 = 21.11 days.
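The same calculation can be written as a small helper, which makes it easy to try other thresholds. This is a sketch using the example's numbers (slope 0.36, offset 53, 40 observations so far); the function name is made up for the illustration.

```python
def observations_until(threshold, slope, offset):
    """Solve y = slope * x + offset for x at the given threshold y."""
    if slope <= 0:
        raise ValueError("a non-increasing trend never reaches the threshold")
    return (threshold - offset) / slope


x = observations_until(threshold=75, slope=0.36, offset=53)
days_left = x - 40  # 40 daily observations have already been collected
print(round(x, 2), round(days_left, 2))  # 61.11 21.11
```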

Tags: app server, capacity planning, Predictive Analysis


LoadRunner Analysis : CPU Utilization

Filed under: Performance Test Analysis

Whenever a hard disk is transferring data over the interface to the rest of the system, it uses some of the system’s resources. One of the more critical of these resources is how much CPU time is required for the transfer. This is called the CPU utilization of the transfer. CPU utilization is important because the higher the percentage of the CPU used by the data transfer, the less power the CPU can devote to other tasks. When multitasking, too high a CPU utilization can cause slowdowns in other tasks when doing large data transfers. Of course, if you are only doing a large file copy or similar disk access, then CPU utilization is less important.

For monitoring purposes, we need to observe two types of objects for CPU utilization when configuring LoadRunner monitors:

1) Processor Object

2) System Object

What is the Processor object?

The Processor performance object consists of counters that measure aspects of processor activity. The processor is the part of the computer that performs arithmetic and logical computations, initiates operations on peripherals, and runs the threads of processes. A computer can have multiple processors. The processor object represents each processor as an instance of the object.

Processor(_Total)\% Processor Time

The Processor(_Total)\% Processor Time counter is useful for measuring the total utilization of your processor by all running processes. Note that if you have a multiprocessor machine, Processor(_Total)\% Processor Time actually measures the average processor utilization of your machine (i.e. utilization averaged over all processors). The Processor\% Processor Time counter determines the percentage of time the processor is busy by measuring the percentage of time the thread of the Idle process is running and then subtracting that from 100 percent. This measurement is the amount of processor utilization. Although you might sometimes see high values for the Processor\% Processor Time counter (70 percent or greater depending on your workload and environment), it might not indicate a problem; you need more data to understand this activity. For example, high processor-time values typically occur when you are starting a new process and should not cause concern.
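The relationship between idle time, per-processor utilization, and the _Total average can be sketched as follows. This only illustrates the arithmetic described above; it does not read real performance counters, and the idle percentages are hypothetical.

```python
def processor_time(idle_percent):
    """% Processor Time for one processor: 100 minus the Idle thread's share."""
    return 100.0 - idle_percent


def total_processor_time(idle_percents):
    """_Total instance: utilization averaged over all processors."""
    busy = [processor_time(idle) for idle in idle_percents]
    return sum(busy) / len(busy)


# Hypothetical idle percentages for a four-processor machine.
total = total_processor_time([10.0, 30.0, 50.0, 70.0])
print(total)  # 60.0
```

Note how one processor at 90 percent is hidden by the 60 percent average, which is why the per-processor instances are worth checking as well.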

Note

The value that characterizes high processor utilization depends greatly on your system and workload. This chapter describes 70 percent as a typical threshold value; however, you may define your target maximum utilization at a higher or lower value. If so, substitute that target value for 70 percent in the examples provided in this section.

To illustrate, consider that Windows 2000 allows an application to consume all available processor time if no other thread is waiting. As a result, System Monitor shows processor-time rates of 100 percent. If the threads have equal or greater priority, as soon as another thread requests processor time, the thread that was consuming 100 percent of CPU time yields control so that the requesting thread can run, causing processor time to lessen.

If you establish that processor-time values are consistently high during certain processes, you need to determine whether a processor bottleneck exists by examining processor queue length data. Unless you already know the characteristics of the applications running on the system, upgrading or adding processors at this point would be a premature response to persistently high processor values, even values of 90 percent or higher. First, you need to know whether processor load is keeping important work from being done. You have several options for addressing processor bottlenecks, but you need to first verify their existence.

If you begin to see values of 70 percent or more for the Processor\% Processor Time counter, investigate your processor’s activity further, as follows:

Examine System\Processor Queue Length.
Identify the processes that are running when Processor\% Processor Time and System\Processor Queue Length values are highest.

What is System object ?

The System performance object consists of counters that apply to more than one instance of a component, such as the processors, on the computer.

System\Processor Queue Length

If this counter is consistently higher than around 5 when processor utilization approaches 100%, this is a good indication that there is more work (active threads ready for execution) than the machine's processors are able to handle. However, note that it is not the best indicator of processor contention caused by too many threads. Other counters, such as ASP\Requests Queued or ASP.NET\Requests Queued, can be used for that purpose.
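One way to apply this rule to collected samples is sketched below. The thresholds are assumptions: 90 percent stands in for "approaches 100%", and `min_fraction=0.8` is an arbitrary reading of "consistently". Adjust both for your own system and workload.

```python
def likely_cpu_bottleneck(samples, cpu_threshold=90.0, queue_threshold=5,
                          min_fraction=0.8):
    """samples: list of (cpu_percent, processor_queue_length) pairs."""
    if not samples:
        return False
    # Count samples where high CPU and a long processor queue occur together.
    hits = sum(1 for cpu, queue in samples
               if cpu >= cpu_threshold and queue > queue_threshold)
    return hits / len(samples) >= min_fraction


# Hypothetical samples: high CPU with a long queue vs. high CPU with a short queue.
busy_and_queued = [(98, 8), (99, 7), (97, 9), (96, 6), (50, 1)]
busy_not_queued = [(98, 2), (99, 1), (97, 3), (96, 2), (95, 2)]
print(likely_cpu_bottleneck(busy_and_queued))   # True
print(likely_cpu_bottleneck(busy_not_queued))   # False
```

The second case shows why high % Processor Time alone is not enough to conclude that the processor is the bottleneck.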

What is Processor Queue length ?

A collection of one or more threads that is ready but not able to run on the processor due to another active thread that is currently running is called the processor queue.

The following graph illustrates the total CPU utilization during the test execution.

CPU and Running Users Graph

In the graph, the green line shows the number of users gradually increasing over time, and the other line shows CPU utilization increasing along with the number of users. Once all users were running concurrently, CPU utilization reached its maximum of 100% and stayed there for the whole duration of the concurrent stage. There is a sudden drop in CPU utilization when the users ramp down from concurrency. I tried to find the cause of this sudden drop by investigating the application manually and found that the A
