Using Simulation Modeling to Predict Scalability of an E-commerce Website




Rebeca Sandino and Ronald Giachetti
Department of Industrial and Systems Engineering, Florida International University, Miami, FL 33174

Abstract

Scalability is the capability of a system to process an increasing workload, or to increase its capacity, at minimum cost and without letting performance fall below the threshold of customer expectation. Scalability is an essential characteristic of e-commerce websites because of the high variance in the rate at which requests arrive and the rapid growth in demand a site can experience. Due to the complexity of the interactions between users, hardware, and software, it is difficult to predict the response of a client-server system to an increase in workload. Currently, load testing and analytical queuing networks are used for this purpose. In this project, simulation modeling was used as a tool to predict the behavior of a system subjected to varying workloads. The simulation model was built from a top-down view of the system. The results showed that it was possible to capture the system's behavior by modeling the interactions between requests, the web server, the application server, and the database server. The output of the simulation model was validated against a load-testing study conducted on the same test bed. Recommendations were made based on the results of the study and the difficulties encountered in the process.

Keywords

Scalability, discrete-event simulation, e-commerce

Introduction

The rapid growth of systems, in particular information systems, demands the capability both to increase a system's capacity and to handle increasing loads with the capacity that is currently available. This must be done without dropping below the threshold of expected service quality.
This presents a dilemma to both system designers and system managers, who must be able to predict the reaction of a system to an increase in load and use this information both to provision adequate resources and to design the system for easy integration of additional resources. Several methods are currently in use to design and manage scalability in information systems, including trial and error and load-testing software. These methods, however, do not provide an adequate means of predicting system scalability and planning for it. There is a need for a method that allows designers and managers to predict system behavior under varying workloads and varying configurations.

Description of Test Bed

NurseIn.net is an online provider of certification services for nurses specializing in the areas of fertility and endocrinology care. States such as Florida and California require nurses to periodically renew their licenses. The traditional approach is via surface mail: nurses order material through the mail, read the course material, and take the test. The test is then mailed back to be graded; if the nurse passes, a certificate is mailed back. NurseIn.net speeds up the process by allowing nurses to download and print course material, take the test online, pay a fee, and print out a certificate.

This website was used as a test bed in a master's thesis that delivered a study of scalability using load-testing software [13]. The same test site is used here to perform a scalability study using discrete-event simulation. The purpose of using the same site is twofold: first, to establish the feasibility of discrete-event simulation modeling for predicting the behavior of a client-server system, and second, to validate the results obtained with the load-testing software.

NurseIn.net consists of a web server, an application server, and a database server, all contained within the same computer. The site's contents consist of files written in HTML and in ColdFusion. There are six HTML files: How it Works, Accreditation, Authors, Publishing, About Us, and Links. The remaining files are the course descriptions and the quizzes. The files written in HTML use the web server; the courses use both the web server and the application server. The quizzes require all three resources. Figure 1 shows the flow of information as it was modeled.

[Figure 1: Information Flow Model. A request seizes an available web-server thread (the web server has five threads that can run simultaneously) and does not release it until the response to the initial request is completed. Web-server threads use the CPU and disk; the flow then branches on whether the application server is required.]

Description of the Model

The test-bed site is composed of ten pages, and three resources carry out its functions. The web server has a capacity of twenty, the application server has a capacity of five, and the database server has a capacity of one. Each server is preceded by an infinite queue with a first-in, first-out discipline.

The entities are defined as information packages. Each entity visits a sequence of pages; this sequence is termed a path. The paths followed by the entities were characterized using visits to the resources instead of visits to the pages. This minimizes the number of possible combinations per iteration and keeps the model valid even if the contents of the website are updated. The top paths followed by the entities can be determined using log-analysis software. To replicate the scripts used with the load-testing software, the entities followed one of three paths: thirty-three percent utilize only the web server, thirty-three percent utilize both the web server and the application server, and the remaining thirty-four percent utilize all three resources.

In addition to the paths followed by the entities, the log-analysis software provides the rate at which the entities arrive at the site. To match the conditions of the scripts, two entities were set to arrive every three seconds. Another important input to the model is the time each entity remains at each resource. Here, two assumptions were made. First, network latency was considered negligible; this is reasonable and mimics the load test performed by Protopapas (2000), in which all network traffic was local on the intranet. Second, user think time was also considered negligible for validation purposes, since the model is being validated against the behavior of virtual users.

To establish the service time at each station, the data gathered with the load-testing software was analyzed using statistical-analysis software, and the equation of the best-fit curve was used as the service time. Although an attempt was made to establish distributions from the output of performance-monitoring software, it was not possible to collect sufficient data because of caching in the proxy server. Finally, no failures were incorporated into the model, since the script for the load tests did not include failures. The model was built using Arena 3.0.

Verification is the process of establishing whether a model's behavior conforms to its intended behavior. Several techniques were used to verify the model. The model's response to system congestion was a significant increase in both time in queue and time in system; similarly, its response to system starvation was a decrease in both. The model's output also remained consistent when the creation rate, flow paths, and service times were altered.

Experiments

Ten experiments were run. The system is idle at the beginning of each experiment, and 100 entities arrive at the web server. Each experiment terminates when the last entity is disposed, which is achieved by flushing. By running each experiment as a separate replication, the system is initialized before each experiment, and Arena controls the random-number streams so that each experiment generates independent observations. These observations were used to establish the confidence interval for the time in system, as shown in the results.

Results

For validation, the system's output was analyzed using the analysis tool provided by Arena. The tool showed that, with 95% confidence, the average time in system under these conditions falls between 192 and 203 seconds, as shown in Figure 2.
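The model just described can be roughed out without Arena as a tandem of FIFO multi-server queues with the stated capacities (20/5/1), the 33/33/34 path mix, and pairs of arrivals every three seconds. This is only a sketch: the mean service times are assumed for illustration, and unlike Figure 1 this simplification releases each resource before seizing the next rather than holding the web-server thread end to end:

```python
import heapq
import random

CAP = {"web": 20, "app": 5, "db": 1}        # capacities from the model
MEAN = {"web": 0.5, "app": 1.0, "db": 2.0}  # assumed mean service times (s)

def station(jobs, mean, c, rng):
    """FIFO multi-server station. jobs: (arrival_time, id) pairs; returns
    (departure_time, id) pairs, serving jobs in order of arrival."""
    free = [0.0] * c                 # times at which each server becomes free
    heapq.heapify(free)
    out = []
    for a, i in sorted(jobs):
        start = max(a, heapq.heappop(free))       # wait for the earliest server
        done = start + rng.expovariate(1.0 / mean)
        heapq.heappush(free, done)
        out.append((done, i))
    return out

rng = random.Random(1)
# 100 entities arriving in pairs every three seconds, as in the experiments.
arrive = {i: 3.0 * (i // 2) for i in range(100)}
# Path mix: web only / web+app / web+app+db, roughly a third each.
path = {i: ("web", "app", "db")[: 1 + i % 3] for i in range(100)}

finish = dict(arrive)
jobs = [(arrive[i], i) for i in range(100)]
for res in ("web", "app", "db"):
    jobs = [(t, i) for t, i in jobs if res in path[i]]   # entities using res
    jobs = station(jobs, MEAN[res], CAP[res], rng)
    finish.update({i: t for t, i in jobs})               # latest departure

tis = [finish[i] - arrive[i] for i in range(100)]
print(f"average time in system: {sum(tis) / len(tis):.2f} s")
```

Because each entity's departure from one station becomes its arrival at the next, the single-capacity database server is where congestion accumulates first, which is the kind of bottleneck question the full Arena model was built to answer.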

Figure 2: Confidence interval for time in system (classical C.I. summary).

IDENTIFIER   AVERAGE   STD. DEV.   0.950 C.I. HALF-WIDTH   MIN VALUE   MAX VALUE   NO. OF OBS.
Tmax(TIS)    198       7.96        5.7                     184         210         10
dmax(8)      101       1.26        0.905                   100         103         10

Figure 3 shows the plot of the output.

[Figure 3: Performance Under Load — average response time (sec) plotted against number of users (0 to 120).]

The graph was evaluated to determine whether it represented the expected behavior of response time under a varying load; its shape corresponded to expected system behavior. It was then compared to the graph generated by the load-testing software: the shape of the graph and the critical values approximated those of the load-testing graph. In addition, the resources with the highest utilization were the web server and the application server, because all entities visited the web server, while two-thirds of the entities visited the application server, whose capacity is one-fourth that of the web server. The flow path with the highest time in system was the one in which the entities visited the web server. This coincides with the results of the load test [13], in which the bottleneck was found to be the Accreditation page.
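The half-width in Figure 2 follows the standard replication formula for n independent replications, mean ± t·s/√n. Plugging in the Figure 2 values (n = 10 replications, mean 198 s, sample standard deviation 7.96 s) reproduces the reported 5.7 s half-width and hence the 192–203 s interval:

```python
import math

def ci_half_width(n, sample_std, t_crit):
    """Half-width of a Student-t confidence interval from n replications."""
    return t_crit * sample_std / math.sqrt(n)

T_95_DF9 = 2.262  # two-sided 95% t critical value at 9 degrees of freedom
hw = ci_half_width(10, 7.96, T_95_DF9)  # Figure 2: n = 10, s = 7.96
print(f"95% CI: 198 +/- {hw:.1f} s")    # half-width matches the reported 5.7
```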

Conclusions

Several conclusions were drawn from the project and its outcome. Discrete-event simulation is a suitable tool for evaluating a system's performance. It enables both system designers and management to predict system performance and scalability, highlighting possible problem areas. By analyzing the system and its requirements, enhanced management and maintenance capabilities can be built into the model.

A trade-off exists between the level of detail incorporated into a model and the time it takes to develop it. In the case of client-server systems, in particular e-commerce websites, it is imperative to develop a model in the shortest time span possible. This project shows that it is viable to construct a model from a top-down view of the system and still capture the system's behavior.

Data collection is an essential factor in building a model, as there are no benchmarks available. Collecting data to establish the service distributions is difficult because of the complexity of the interactions between the different components of a client-server architecture; in the case of NurseIn.net, collecting this data was not feasible because of caching in the proxy server. Given the same hardware and software configuration, the size of the data application being executed had the most significant impact on system performance.

Acknowledgements

This project was supported by NSF Grant # EEC-9619728.

References

1. Alexander, Steve, "Scalability," Computerworld, 2000.
2. Bahrami, Ali, et al., "Enterprise Architecture for Business Process Simulation," Proceedings of the 1998 Winter Simulation Conference, pp. 1409-1413.
3. Duxbury, P. D., "Issues in Simulation Modeling of Client-Server Systems," www.scs.com.
4. Gimarc, Richard L., and Spellmann, Amy, "Modeling Microsoft SQL Server 7.0," CMG, 1998.
5. Giachetti, R. E., Chen, C. C., and Saenz, O., "Toward Measuring the Scalability of Enterprise Information Systems," IEEE Proceedings of the International Conference on Enterprise Information Systems (ICEIS 2000), Setúbal, Portugal, 7-10 July, 2001, pp. 24-29.
6. Huang, Yiqing, et al., "A Speculation-Based Approach for Performance and Dependability Analysis: A Case Study."
7. Joines, Jeffrey A., and Roberts, Stephen D., "Simulation in an Object-Oriented World," Proceedings of the 1999 Winter Simulation Conference, pp. 132-140.
8. Keezer, William S., "Simulation of Computer Systems and Applications," Proceedings of the 1997 Winter Simulation Conference, pp. 103-109.
9. Law, Darren R., "Scalable Means More than More: A Unifying Definition of Simulation Scalability," Proceedings of the 1998 Winter Simulation Conference, pp. 781-788.
10. Martinka, Joseph J., "Functional Requirements for Client/Server Performance Modeling: An Implementation Using Discrete Event Simulation," Hewlett-Packard Laboratories.
11. Mielke, Ronald R., "Applications for Enterprise Simulation," Proceedings of the 1999 Winter Simulation Conference, pp. 1490-1495.
12. Odhabi, Hamad J., et al., "Developing a Graphical User Interface for Discrete Event Simulation," Proceedings of the 1998 Winter Simulation Conference, pp. 429-436.
13. Protopapas, S., Name of his project, Master's Project, Department of Industrial & Systems Engineering, Florida International University, 2001.
14. Schwetman, Herb, "Model-Based Systems Analysis Using CSIM18," Proceedings of the 1998 Winter Simulation Conference, pp. 309-313.
15. Schwetman, Herb, "'Model, Then Build': A Modern Approach to Systems Development Using CSIM18," Proceedings of the 1999 Winter Simulation Conference, pp. 249-254.
16. Whitman, Larry, et al., "Commercial Simulation Over the Web," Proceedings of the 1998 Winter Simulation Conference, pp. 335-339.