
Software reliability analysis of laptop computers

W. Wang* and M. Pecht**
*Salford Business School, University of Salford, UK, w.wang@salford.ac.uk
**PHM Centre of City University of Hong Kong, Hong Kong, m.pecht@cityu.edu.hk

Abstract. A computer freezing during work is a common problem and can cause unexpected damage if the work is important but not yet saved. This kind of problem is often caused by software failures, which are hard to predict. This paper focuses on the reliability analysis of laptop computers experiencing multiple failures in operation. A failure is defined as the event that the computer freezes and has to be restarted, either by itself or by a forced restart. The failure data were collected during three experiments in which one laptop computer ran continuously for a period of time under three different running environments. We fitted a number of candidate distributions to the time-between-failures data. The results showed that the conventional Weibull distribution, often used in practice, is the best choice for all three data sets. One interesting finding, however, is that the mean time between failures decreases substantially when the computer is heavily stressed. This suggests that when a computer is running at full load, work should be saved frequently to avoid damage.

1. Introduction

Reliability is defined as the ability of a product to perform as intended (without failure and within specified performance limits) for a specified time period, in its life cycle application environment, Pecht et al. (2002). A product's health is a description of its state under its specified operating environment. Health monitoring is a method for evaluating a product's health as a means to determine whether and when failure will occur, Ramakrishnan and Pecht (2003). We focus specifically on laptop computers as the product of interest in this paper, since they are now almost household items, widely used by both private individuals and companies.

For laptop computers, hardware degradation is difficult to detect because of the complexity of the hardware, which consists of various sub-systems assembled together. One approach, used by Vichare et al. (2004), was to monitor the internal temperatures of laptop computers for health and usage monitoring. However, laptop computers can fail due to either hardware or software failures. Hardware failures are rare, though they do occur, but software failures as defined in the abstract have been observed by virtually everyone who has used a laptop computer. Because of the complexity associated with the various software installed, the laptop computer is treated as a single system, and we are interested in the reliability characteristics of its software failures.

A primary objective of the paper is to identify whether different intensities of usage influence the time to failure, and which distribution best describes the software failures. To this end, we designed and conducted a number of experiments to verify whether the intensity of use of a laptop computer does influence the software failure characteristics. Another objective is to find out whether the failure rate is increasing, constant or decreasing.

We examined eight field-returned laptop computers which were sent back to the manufacturer by users who reported having problems with them. The manufacturer tested these computers and found no problems.
These computers are of different models and have different system configurations, such as processor speed, hard drive space and RAM. Different experiments were designed in which the computers were stressed by running different applications under different scenarios. Two of the eight computers showed multiple failures (as defined in the abstract and in section 2), but the others did not. The different performance of these eight laptop computers might be due to their different configurations and hardware quality. Since we could not wait to obtain sufficient data from each of the eight computers, the analysis presented in this paper is for one computer that showed repeated failures. This may limit our conclusions to this type of computer only, but the analysis nevertheless revealed some interesting findings.

Our hypothesis was that each computer would fail when stressed by a set of specific applications. Applications that use different levels of memory and processing capability were selected for the experiments. Various applications were run at the same time to make the CPU work at different levels of capacity. The belief was that an intensively used CPU would force the computer to restart or freeze more often.

2. Failure identification and determination

For the purposes of this study we have defined the failure of a laptop as an automatic restart or a forced restart after the system freezes. If the laptop restarted by itself, it was considered an automatic restart. If the computer did not respond to the commands given by the user and the user had to force a restart to resume operation, this was considered a forced restart. A forced restart can be triggered by a system lock-up. Other types of computer failure, such as a faulty screen or faulty keyboard, are not considered in this study.

The times of failure are determined from failure identifier events. Failure identifier events are event messages generated by the event log service, an inbuilt logging service in MS Windows, after a restart. The benchmark event used is EventSystem, which is logged every time the computer restarts. Hence, the occurrence of this event is located in the data, and a gap in time is then searched for manually to find the time of failure. To illustrate the process, a sample data record is shown in Table 1.

Level       | Date and Time | Source
Error       | 4/9/ 5:9:7    | Usbperf
Information | 4/9/ 5::5     | User Profile Service
Information | 4/9/ 5::5     | Security-Licensing-SLC
Information | 4/9/ 5::5     | EventSystem
Information | 4/9/ 5::55    | Desktop Window Manager

Table 1: Sample of event log data

In Table 1, the EventSystem event (or alternatively Microsoft-Windows-EventSystem) is found. The time at which it was logged is not the time of failure, but the time when the computer restarted. Hence, the time when the computer actually froze is found by scanning the data upwards for a gap in time. At 4/9/ 5:9:7, a gap of around one and a half minutes occurred after the event Usbperf. It is assumed that this was the actual time when the computer froze, and it is designated as the time of failure. The time stamp occurring just after the gap (4/9/ 5::5, EventSystem) is chosen as the start time of the next lifetime. This is done so that we do not count the durations for which the computer stayed frozen, since there can be a big gap in time if the user was not present to restart the computer manually. After determining the start and end times of each lifetime of a computer, the times to failure and the times between failures can be found.
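To make the procedure concrete, the following minimal Matlab sketch automates the same gap scan. It is our own illustration rather than the authors' tooling: the file name, column names and timestamp format are assumptions, and the one-minute gap threshold is chosen to be comfortably below the roughly one-and-a-half-minute gap seen in Table 1.

% Minimal sketch of the gap-scanning step described above (our own
% illustration, not the authors' tooling). Assumes the event log has
% been exported to CSV with 'Time' and 'Source' columns.
ev = readtable('eventlog.csv');                             % hypothetical export
t  = datetime(ev.Time, 'InputFormat', 'M/d/yyyy H:mm:ss');  % assumed format

restartIdx = find(strcmp(ev.Source, 'EventSystem'));  % benchmark events
gapMin     = minutes(1);                              % assumed gap threshold

for k = restartIdx'
    % Scan upwards from the restart until a large gap appears; the event
    % just before the gap marks the freeze (failure) time, and the first
    % event after the gap starts the next lifetime.
    i = k;
    while i > 1 && (t(i) - t(i-1)) < gapMin
        i = i - 1;
    end
    if i > 1
        fprintf('failure at %s, next lifetime starts at %s\n', ...
                string(t(i-1)), string(t(i)));
    end
end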

3. Experiments

We designed three experiments corresponding to three different user environments.

3.1 Experiment 1

A specific set of applications was identified for this experiment to simulate user behaviour. The applications included Matlab (a mathematical tool), Excel (an office tool), Real Player (video playback software) and Internet Explorer (a web browser). These applications were initiated manually and, if idle, were re-run at random. For example, when Excel had finished a task, it might be re-run some time later by a code we set up that randomly selected the time at which to restart the application; a sketch of this scheme is given below. The same applies to the other applications, such as playing a video or running a Matlab code. Clearly, in this experiment the applications were not run in any specified order, and they imposed various levels of stress at various times. There could be periods when no application was running, or when all were running at the same time. This is perhaps the closest situation to actual user operation. The test computers were kept at room temperature and humidity. The times of failure were found by scanning the data for failure identifier events. Data were collected between /3/ and 4/6/, which amounted to 7446 minutes for the experiment. During this time period, there were a total of 8 failures in computer 4.
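The random re-run scheme mentioned above can be sketched as follows. This is our own hypothetical illustration, not the authors' code: the launch commands and the delay range are invented for the example.

% Hypothetical sketch of the random re-run scheme in experiment 1: when
% triggered, wait a random delay and relaunch a randomly chosen application.
apps = {'matlab -r run_stress_code', 'excel workload.xlsx'};  % assumed commands
while true
    pause(60 * rand * 30);            % wait up to 30 minutes at random
    j = randi(numel(apps));           % pick one application at random
    system(['start ' apps{j}]);       % relaunch it via the Windows shell
end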
3.2 Experiment 2

Experiment 2 focused on stressing the computers under a more controlled setting. The test computers were stressed by the Matlab application, which ran several specified codes in a loop with various levels of CPU use. Internet access was cut off so that applications running by default in the background could not communicate with the Internet. This experiment simulated a user who ran only one application extensively, using up a lot of memory. The test computers were kept at room temperature and humidity. Data were collected between 4/8/ and 4/3/. During this time, there were a total of 4 failures from computer 4.
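The stress codes themselves are not reproduced in the paper. As a rough indication of what such a loop can look like, the sketch below keeps the CPU busy with dense linear algebra; the matrix size n is our assumption and controls the level of load.

% Illustrative CPU-stress loop (not the authors' actual code). The matrix
% size n controls how heavily the CPU is used; larger n gives a higher,
% more sustained load.
n = 2000;                  % assumed size; vary to change the stress level
while true
    A = rand(n);           % fresh random matrix each pass
    B = A * A';            % dense multiplication keeps the CPU busy
    e = eig(B);            %#ok<NASGU>  eigen-decomposition adds further load
end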

3.3 Experiment 3

Experiment 3 was an extension of experiment 2. In experiment 3, the computer automatically started the Matlab application and ran a specified code in a loop whenever the computer was rebooted. This code was taken from experiment 2 and is the one with the highest use of the CPU capacity. The aim of experiment 3 was to generate failures all associated with the same stress, resulting from running Matlab with a more intensive use of the CPU than in experiment 2. This gave the computer continuous stressing at the same, but higher, level of processing usage. Data were collected between 5/7/ and 5//. During this period, there were a total of 9 failures in computer 4.

4. Failure analysis

In this study, we fitted the following four distributions to the time between failures (TBF) data of the three experiments using the maximum likelihood estimation method: Weibull, gamma, lognormal and exponential, with probability density functions

f(x) = (β/α)(x/α)^(β−1) e^(−(x/α)^β)         (Weibull)
f(x) = x^(α−1) e^(−x/β) / (β^α Γ(α))         (gamma)
f(x) = e^(−(log x − α)²/(2β²)) / (xβ√(2π))   (lognormal)
f(x) = (1/α) e^(−x/α)                        (exponential)

The best-suited distribution may be selected by the AIC measure, Akaike (1980), which is given by

AIC = −2l + 2N,   (1)

where l is the log-likelihood value at the maximum and N is the number of estimated parameters. The AIC balances the log-likelihood value against the number of parameters. The distribution giving the lowest AIC is taken as the best fit to the data.

4.1 Experiment 1

Table 2 shows the results for the data of experiment 1.

Distribution               | Weibull   | Gamma     | Lognormal  | Exponential
Estimated parameter values | α = 67.49 | α = .5    | α = 5.573  | α = 95.59
                           | β = .657  | β = 836.9 | β = .9388  |
Mean                       | 93.63     | 98.45     | 63.7       | 95.58
Log-likelihood             | −65.74    | −54.977   | −58.36     | −64.686
AIC                        | 5.44      | 33.95     | 6.7        | 5.36
χ²                         | .55       | .343      | .3769      | 3.83
χ² test                    | Cannot be rejected | Cannot be rejected | Cannot be rejected | Can be rejected

Table 2: Estimated parameter values, means and goodness-of-fit measures from experiment 1

From Table 2 we can see that the gamma distribution is the best in terms of the AIC. This is expected, since it also has the highest log-likelihood, itself another measure of goodness of fit. The total duration of the experiment was 7446 minutes, during which there were 8 failures. If we assume the TBFs are independent and identically distributed, then by renewal theory, Cox (1962), the expected number of failures over the experiment period can be approximated by 7446 divided by the mean time between failures. From Table 2 we have E(TBF) = 98.45 for the chosen distribution, and therefore the expected number of failures over 7446 minutes is 78.88, which is not far from the observed number of 8. However, if we choose the exponential distribution we get an even better fit in terms of the difference between the fitted and observed mean numbers of failures. This is because the mean of a fitted exponential distribution is always equal to the average time between failures in the data. It can be misleading, however, to look only at the mean numbers.

We therefore use another goodness-of-fit test to decide which distribution should be chosen: the Chi-squared goodness-of-fit test, Corder and Foreman (2009), given by

χ² = Σ_{i=1}^{n} (O_i − E_i)² / E_i,   (2)

where the observed failure data are arranged into bins to form a histogram, O_i and E_i denote the observed and expected numbers of failures within the i-th bin, and n is the total number of bins. We used the statistics toolbox in Matlab to perform this task.
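As an illustration of this model-selection step, the following minimal Matlab sketch fits the four distributions by maximum likelihood and computes the AIC of equation (1) together with the renewal approximation to the expected number of failures. It is our own sketch, not the authors' code: since the raw TBF data are not published, a synthetic Weibull sample with assumed parameters stands in for them.

% Sketch of the fitting-and-AIC step (Eq. (1)) with the Statistics Toolbox.
% The TBF sample below is synthetic; its parameters are ours, not the paper's.
rng(0);
tbf   = wblrnd(200, 0.7, 80, 1);          % stand-in TBF sample, in minutes
names = {'Weibull', 'Gamma', 'Lognormal', 'Exponential'};
nPar  = [2 2 2 1];                        % estimated parameters per model
T     = sum(tbf);                         % total observation time

for j = 1:numel(names)
    pd  = fitdist(tbf, names{j});         % maximum likelihood fit
    l   = -negloglik(pd);                 % log-likelihood at the maximum
    aic = -2*l + 2*nPar(j);               % Eq. (1)
    % Renewal approximation: expected number of failures is T / E(TBF).
    fprintf('%-12s E(TBF) = %7.2f  AIC = %8.2f  E[#failures] = %5.1f\n', ...
            names{j}, mean(pd), aic, T/mean(pd));
end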

The results showed that the exponential distribution can be rejected at the 5% significance level; see Table 2. The other three distributions cannot be rejected, but the Weibull produced the smallest Chi-squared value, so on this statistic we choose the Weibull distribution. Clearly, different criteria produce different results, but it is generally accepted that the Chi-squared statistic is a better measure of goodness of fit than the AIC, since it uses more information from the data. Using the Weibull distribution we have E(TBF) = 93.63, and the mean number of failures over the experiment period is then 79.9, which is very close to the observed 8 failures. Figure 1 shows the probability density functions (PDFs) of the four fitted distributions.

Figure 1. PDFs of the TBF of the four fitted distributions for experiment 1

The gamma, Weibull and lognormal PDFs have similar shapes, but the exponential PDF is nowhere near them.

4.2 Experiment 2

Table 3 shows the results of the parameter estimation based on the data of experiment 2.

Distribution               | Weibull  | Gamma   | Lognormal  | Exponential
Estimated parameter values | α = .893 | α = .3  | α = 3.4459 | α = 356.337
                           | β = .47  | β = .5  | β = .7846  |
Mean                       | 54.4     | 333.75  | 8.5        | 356.33
Log-likelihood             | −33.36   | −54.78  | −85.63     | −69.9
AIC                        | 47.6     | 33.564  | 375.64     | 538.8
χ²                         | .        | .97     | 4.888      | 3.63
χ² test                    | Cannot be rejected | Cannot be rejected | Cannot be rejected | Cannot be rejected

Table 3: Estimated parameter values, means and goodness-of-fit measures from experiment 2

From Table 3 we can see that the gamma distribution again produced the best fit in terms of the AIC, but, as in experiment 1, the Weibull is again the best choice under the Chi-squared measure. The experiment lasted 453 minutes and produced 4 failures. Following the same approach as for experiment 1 and using the mean of 54.4 from the Weibull distribution, the expected number of failures over the experiment period is 56., which is not very close to the 4 failures observed in the data. We note, however, that the sample size in this experiment is small, and comparing means alone can be misleading, as stated before, since an exponential distribution will always produce the best fit in terms of the means. The PDFs of the four distributions are shown in Figure 2, which again shows the exponential distribution singled out from the competition.

Figure 2. PDFs of the TBF of the four fitted distributions for experiment 2
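The binned Chi-squared comparison of equation (2) can be reproduced with chi2gof from the same toolbox. Again this is a sketch under assumed data, not the authors' script; NParams reduces the degrees of freedom by the number of parameters estimated for each fit.

% Sketch of the binned Chi-squared test (Eq. (2)), with synthetic data
% standing in for the recorded TBFs.
rng(1);
tbf = wblrnd(200, 0.7, 80, 1);            % stand-in TBF sample, in minutes

pdExp = fitdist(tbf, 'Exponential');
pdWbl = fitdist(tbf, 'Weibull');

% chi2gof bins the data, compares observed and expected counts, and returns
% h = 1 when the fitted distribution is rejected at the 5% level.
[hExp, pExp] = chi2gof(tbf, 'CDF', pdExp, 'NParams', 1);
[hWbl, pWbl] = chi2gof(tbf, 'CDF', pdWbl, 'NParams', 2);
fprintf('exponential: h = %d (p = %.3f); Weibull: h = %d (p = %.3f)\n', ...
        hExp, pExp, hWbl, pWbl);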

4.3 Experiment 3

In a similar way, Table 4 shows the fitted results based on the data of experiment 3. In this case the lognormal would be chosen on the AIC, since it has the smallest value, but in terms of the Chi-squared statistic the Weibull again produced the smallest χ², so we chose the Weibull as our distribution. Using the same method as before, the expected number of failures under the Weibull distribution is 6.8.

Distribution               | Weibull   | Gamma      | Lognormal  | Exponential
Estimated parameter values | α = 76.63 | α = .667   | α = 3.7433 | α = 5.
                           | β = .787  | β = 57.544 | β = .84    |
Mean                       | 93.4      | 5.75       | 79.83      | 5.3
Log-likelihood             | −4.5      | −944.63    | −89.       | −7.
AIC                        | 87        | 893.4      | 64.4       | 4.
χ²                         | .884      | .938       | .5574      | .7
χ² test                    | Cannot be rejected | Cannot be rejected | Cannot be rejected | Cannot be rejected

Table 4: Estimated parameter values, means and goodness-of-fit measures from experiment 3

The PDFs of the four distributions are plotted in Figure 3.

Figure 3. PDFs of the TBF of the four fitted distributions for experiment 3

In this case all four PDFs are similar, but the means are considerably shorter than those in Figures 1 and 2. We then examined the hazard function of the chosen PDF from each experiment; these are shown in Figure 4.

Figure 4. Hazard plots of the chosen PDF for each experiment

We can see clearly that the hazards from all three experiments decrease. This is not surprising, since the hazard of the Weibull distribution, h(x) = (β/α)(x/α)^(β−1), is always decreasing when the shape parameter β < 1, as is the case for all three fitted Weibull distributions. This implies that the probability of a chance failure, given that the laptop has already survived for a while, becomes smaller and smaller as time goes on. However, the hazard of experiment 3 is larger than that of experiment 2, which in turn is larger than that of experiment 1. This shows that intensive use of the CPU produces more failures.

5. Conclusions

Experiments were designed to study the reliability of laptop computers. It was hypothesized that stressing a computer with various applications would cause it to restart or freeze, resulting in a failure. These failures were then analyzed. The model parameters were estimated using the maximum likelihood method, and two measures of model fit were used to select the best model. The AIC turned out not to be an informative measure in this analysis, since none of the models involved a large number of unknown parameters. We therefore used the Chi-squared statistic to select the best model. From the analysis, the Weibull distribution was chosen for all three experiments, since it produced the smallest Chi-squared values in each of them. The analysis also shows that applying stress at different levels produced notably different results: the expected times to failure from the fitted models and from the observations show a clear trend, with the means getting shorter as the stress increases.

References

Pecht, M., Das, D., and Ramakrishnan, A. (2002), The IEEE standards on reliability program and reliability prediction methods for electronic equipment, Microelectronics Reliability, 42, 1259-1266.
Ramakrishnan, A., and Pecht, M. (2003), A life consumption monitoring methodology for electronic systems, IEEE Trans. Components and Packaging Technologies, 26, 625-634.
Vichare, N., Rodgers, P., Eveloy, V., and Pecht, M. (2004), In situ temperature measurement of a notebook computer: a case study in health and usage monitoring of electronics, IEEE Trans. Device and Materials Reliability, 4, 658-663.
Akaike, H. (1980), Likelihood and the Bayes procedure, in Bayesian Statistics, ed. Bernardo et al., University Press.
Cox, D.R. (1962), Renewal Theory, Methuen.
Corder, G.W., and Foreman, D.I. (2009), Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach, Wiley.

Figure 3. PDFs of the TBF of four distributions for experiment 3 In this case all PDFs are similar but the means are considerably shorter than the ones in Figures and. We then examined the hazard for the chosen PDF from each experiment. They are shown in Figure 4..3.5 Expt Expt Expt3..5..5 4 6 8 4 6 8 Figure 9. Hazard plot for the chosen PDF of each experiment We can see clearly that the hazards from all experiments decrease. This is not surprising since the hazard for the Weibull distribution when the shape parameter α< is always decreasing. This implies that the probability of the chance failure when the laptop already survived a while is getting smaller and smaller as the time goes. However, the hazard of experiment 3 is larger than experiment which is larger than experiment. This shows that intensive use of CPU will produce more failures. 5. Conclusions Experiments were designed to study the reliability of laptop computers. It was hypothesized that stressing the computers using various applications would cause it to restart or freeze, resulting in a failure. These failures were analyzed. The model parameters were estimated using the maximum likelihood method and two measures of the model fit were used to select the best model. It turned out that AIC is not a informative measure in this analysis since all models were not involved with a large number of unknown parameters. Eventually we used the Chi-Squared statistic to select the best model. From the analysis, Weibull was chosen for all three experiments since this distribution produced the smallest Chi-Squared values in all three experiments. However, the analysis shows that applying different stressing at different levels did produce different results with notable differences. The expected time to failures from the fitted model and observations show a clear trend in that the means are getting shorter when the stress increases. References Pecht, M., Das, D., and Ramakrishnan, A., (), The IEEE standards on reliability program and reliability prediction methods for electronic equipment, Microelectronics Reliability, 4, 59 66. Ramakrishnan, A., and Pecht, M., (3), A life consumption monitoring methodology for electronic systems, IEEE Trans. Components and Packaging Technologies, 6, 65 634. Vichare, N., Rodgers, P., Eveloy, V., and Pecht, M., (4), In situ temperature measurement of a notebook computer A case study in health and usage monitoring of electronics, IEEE Trans. Device and Materials Reliability, 4, 658 663. Akaike, H., (98), Likelihood and the Bayes procedure, Bayesian Statistics, Ed. Bernardo et. al., University Press. Cox, D.R., 96, Renewal theory, Methuen. Corder, G.W., and Foreman, D.I. (9), Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach, Wiley