Exploratory Failure Analysis of Open Source Software 1


Cobra Rahmani, Satish M. Srinivasan, Azad Azadmanesh
College of Information Science & Technology, University of Nebraska-Omaha, Omaha, U.S.
{crahmani, smsrinivasan,

Abstract- Reliability growth modeling in software systems plays an important role in measuring and controlling software quality during software development. One main approach to reliability growth modeling is based on the statistical correlation between observed failure intensities and those estimated by statistical models. Although there are a number of statistical models in the literature, this research concentrates on the following seven: Weibull, Gamma, S-curve, Exponential, Lognormal, Cubic, and Schneidewind. The failure data are collected from five popular open source software (OSS) products. The objective is to determine which of the seven models best fits the failure data of the selected OSS products, and which best predicts the future failure pattern based on partial failure history. The outcome reveals that the model that best fits the failure data is not necessarily the best predictor.

Keywords- empirical software engineering; goodness-of-fit; open source software; software reliability growth modeling

I. INTRODUCTION

Over the past years, open source software (OSS) has drawn increasing attention from both the business and academic worlds. The leading concept of open source presented by Raymond [17] differentiates the collaborative open source approach from traditional in-house and proprietary software development. The success of OSS can be attributed to collaboration with volunteers across organizations and geographical boundaries, faster development due to the large number of developers, and platform independence due to its development environment [6].
In this research, five OSS products, namely Eclipse V.2, Apache HTTP Server 2, Firefox, MPlayer OS X, and ClamWin Free Antivirus, are considered for estimating their reliability and for predicting the failure process. These projects are chosen because of the popularity of the products in terms of number of users and downloads, the time span in which the products have been in operation, and the availability of the bug reports. The failure data collection concentrated on two popular archive sources, i.e. Bugzilla [5] and Sourceforge.net [19]. In this study, failure intensities are collected in biweekly (two-week) intervals for each product. The failure information for MPlayer OS X and ClamWin Free Antivirus is obtained from Sourceforge.net. For Firefox, Eclipse V.2, and Apache 2, the failure information is gathered from Bugzilla. Table I gives more information about the OSS products used in this study. The table highlights the official release year and the duration of the collected failure data for each product.

Table I. INFORMATION OF COLLECTED FAILURE FOR FIVE OSS

| Product | Official release year | Start date of collected failures | End date of collected failures |
|---|---|---|---|
| Firefox | 999 | 3/999 (footnote 2) | /26 |
| Eclipse V.2 | 2 | /2 (footnote 3) | 2/27 |
| Apache | | /22 | 2/28 |
| ClamWin Free Antivirus | 24 | 3/24 | 8/28 |
| MPlayer | 22 | 9/22 | 6/26 |

The study compares seven distribution models in order to determine whether any of these models is consistent with respect to both goodness-of-fit and reliability prediction for the selected OSS products. The study also attempts to shed light on the probable reasons when this consistency cannot be observed. The distribution models are Cubic, Exponential, Gamma, Lognormal, Schneidewind, S-curve, and Weibull [3,8,9,3,4,8,2, 22].

1 This research is funded in part by Department of Defense (DoD)/Air Force Office of Scientific Research (AFOSR), NSF Award Number FA , under the title High Assurance Software.
These models are chosen for a combination of reasons, such as their capability to provide various distribution shapes or their potential to create distributions that follow the failure patterns of software systems.

The rest of the paper is organized as follows. Section II provides some definitions and background information about the aforementioned distribution models. Section III focuses on failure data analysis and the reliability modeling process. Section IV concludes the paper with a summary.

II. BACKGROUND

Software reliability growth models (SRGM) have been in existence for approximately 40 years, with the intent of creating models that can accurately quantify software quality. Other distribution models, developed over the years for purposes other than software reliability, are at times used in the software quality field. The hope is that, once an appropriate model is chosen, its parameters can reflect the software behavior at one or more software phases such as development, testing, and operation [8,,4]. In [8], a number of models are discussed with information that can help users and practitioners decide on a model and assess the reliability of software.

2 The failure data collected prior to the official release date of Firefox are obtained from Mozilla bug reports.
3 This date is prior to the official release date.

Lakey [10] provides a number of SRGMs and a flowchart approach on how to decide on a model. However, because of the many interacting factors, no single model can be trusted to perform well universally in estimating reliability or predicting the expected number of remaining defects [,4,,2]. To deal with the inaccurate predictions made by SRGMs, some authors have proposed recalibrating the models. That is, the errors in earlier predictions are used to transform the model into a more accurate prediction model [,4].

In [7], the authors use non-linear regression to analyze defect data obtained from testing three releases of a commercial system. They apply four traditional reliability models in order to select the one that best fits the customer defect data as testing progresses. Their study uses the method presented in [2], but their case study is limited to just one commercial software product. In [23], the authors compare several SRGMs on one set of OSS failure data and conclude that the logarithmic Poisson execution time model fits the actual data set better than the other SRGMs. Their work concentrates mainly on goodness-of-fit, without any assessment of the prediction capability of these models. In [24], the author compares six different SRGMs on four different data sets taken from previous research. Eighty percent of the failure data is used to estimate the goodness-of-fit of those models and the other twenty percent is used to validate their prediction capability. The outcomes show differences between the best fitting and the best predicting models. A similar approach is used in [], in that some observations from the end of the failure data are removed for the purpose of comparing prediction performance between two SRGMs.
In this study, the Probability Density Functions (PDFs) of the chosen distribution models are used to model the failure patterns of the selected products. The PDF, denoted as f(t), shows the relative concentration of failures at different points of time t. The following gives a brief introduction to the seven distribution models considered in this study.

Weibull - The PDF of the Weibull function is:

$$f(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1} e^{-(t/\alpha)^{\beta}}$$

where α and β represent the scale and shape of the distribution model. The shape value determines the shape of the graph, and the effect of the scale parameter is to squeeze or stretch the graph.

S-curve - There are different s-shaped distribution models. The one adopted in this study is used by SPSS [20] and has the following PDF:

$$f(t) = e^{\,b_0 + b_1/t}$$

where $b_0$ is a constant and $b_1$ is the regression coefficient. The sign of $b_1$ determines whether the slope of the graph is upward or downward.

Lognormal - The lognormal model assumes that the natural logarithm of time to failure is normally distributed. Here µ and σ are the mean and standard deviation of the natural logarithm of time to failure, respectively. The PDF of the lognormal distribution is given by:

$$f(t) = \frac{1}{t\,\sigma\sqrt{2\pi}}\; e^{-\frac{(\ln t - \mu)^2}{2\sigma^2}}$$

Furthermore, σ and µ determine the shape and scale of the distribution model.

Schneidewind - This model assumes that the cumulative number of failures is a Non-Homogeneous Poisson Process (NHPP) [2,4], which was originally studied in hardware reliability. NHPP models assume that the failure process varies with time and that the cumulative number of failures up to time t is Poisson distributed with the parameter m(t), the mean value of failures. Specifically,

$$P(M(t) = n) = \frac{[m(t)]^{n}}{n!}\, e^{-m(t)}$$

where M(t) and m(t) are the total and the expected number of failures in the interval [0, t], and n is an integer. The mean value of the distribution model is:

$$m(t) = \frac{\alpha}{\beta}\left(1 - e^{-\beta t}\right)$$

where α and β are the initial failure rate and the negative derivative of the failure rate, respectively. Therefore, the expected number of failures during a period from $t_1$ to $t_2$ is $m(t_2) - m(t_1)$.
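The NHPP relations above can be made concrete with a small numerical sketch. The function names and the parameter values used in any call are illustrative assumptions, not taken from the paper; the mean value function is the standard Schneidewind form with initial failure rate `alpha` and decay `beta`.

```python
import math

def mean_failures(t, alpha, beta):
    """Expected cumulative failures m(t) = (alpha / beta) * (1 - exp(-beta * t))."""
    return (alpha / beta) * (1.0 - math.exp(-beta * t))

def prob_n_failures(n, t, alpha, beta):
    """P(M(t) = n): Poisson probability with mean m(t)."""
    m = mean_failures(t, alpha, beta)
    if m == 0.0:
        # With no expected failures, only n = 0 has non-zero probability.
        return 1.0 if n == 0 else 0.0
    # Work in log space so that large n does not overflow.
    return math.exp(n * math.log(m) - math.lgamma(n + 1) - m)

def expected_in_interval(t1, t2, alpha, beta):
    """Expected number of failures between t1 and t2: m(t2) - m(t1)."""
    return mean_failures(t2, alpha, beta) - mean_failures(t1, alpha, beta)
```

Under this form, m(t) rises toward the ceiling alpha / beta as t grows, which is one way to read the model's claim that the failure rate decays over time.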
Schneidewind's model is built on the belief that the failure frequency changes over time and that recent failures, rather than past failures, are more beneficial in predicting the future behavior of the system [8].

Gamma - This model has properties similar to those of the Weibull distribution, with scale and shape parameters α and β, respectively. The PDF of the Gamma distribution is given by:

$$f(t) = \frac{t^{\beta-1}\, e^{-t/\alpha}}{\alpha^{\beta}\,\Gamma(\beta)}$$

where Γ is the gamma function:

$$\Gamma(x) = \int_{0}^{\infty} y^{x-1} e^{-y}\, dy$$

It is known [22] that for positive integer values x > 0, $\Gamma(x) = (x-1)!$.
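As a quick numerical check of the Weibull and Gamma definitions, the following sketch evaluates both PDFs with the scale parameter `alpha` and shape parameter `beta`. The parameterization is an assumption reconstructed from the surrounding definitions, and the function names are illustrative only.

```python
import math

def weibull_pdf(t, alpha, beta):
    """Weibull PDF with scale alpha and shape beta, for t > 0."""
    return (beta / alpha) * (t / alpha) ** (beta - 1) * math.exp(-((t / alpha) ** beta))

def gamma_pdf(t, alpha, beta):
    """Gamma PDF with scale alpha and shape beta, for t > 0."""
    return t ** (beta - 1) * math.exp(-t / alpha) / (alpha ** beta * math.gamma(beta))

def gamma_matches_factorial(x):
    """Check Gamma(x) = (x - 1)! for a positive integer x."""
    return math.isclose(math.gamma(x), math.factorial(x - 1))
```

With `beta = 1`, both functions reduce to the same expression, `exp(-t / alpha) / alpha`, which is the exponential special case.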

Exponential - The exponential distribution is a special case of the Gamma and Weibull distributions with β = 1. Its PDF is given by:

$$f(t) = \frac{1}{\alpha}\, e^{-t/\alpha}$$

Cubic - The PDF of the Cubic model is given as:

$$f(t) = b_0 + b_1 t + b_2 t^2 + b_3 t^3$$

where $b_0$ is a constant and $b_1$, $b_2$, $b_3$ are the regression coefficient values.

III. EXPERIMENTAL ANALYSIS

Prior to analyzing the performance estimates of the reliability growth models, the failure data for the five selected OSS products must first be collected and filtered. Therefore, the reliability estimation process is partitioned into three steps: bug-gathering, bug-filtering, and bug-analysis.

For the bug-gathering step, a Java program has been developed to extract the raw failure data from the bug repository systems for each product. Although the breadth and depth of the bug reports vary from one repository system to another, each bug report normally contains a unique identification value for the report, the actual time and date the bug is reported, some information about the user reporting the bug, the product name, and the status of the bug report as filled in by the organization in charge of the product development, such as whether the bug is fixed, valid, or deleted. The quality of reliability estimation depends highly on having sufficient error reports and on the accuracy of the reports provided by the users.

During the second step, i.e. bug-filtering, the reports extracted in the first step are filtered in order to remove unwanted reports such as duplicates. The reason for filtering is that some reports may not represent a real defect, or the information provided may not be complete. Among the bug-reports for MPlayer and ClamWin, which are gathered from Sourceforge.net, those reports with a status other than Deleted (not a valid bug-report) are collected.
For the other three software products, the bug reports are gathered from Bugzilla. Those bug-reports with the following status values are accepted and the rest are discarded: FIXED (bug is fixed), WONTFIX (bug will not be fixed), LATER (bug won't be fixed in the current product version), and REMIND (bug probably won't be fixed in the current product version).

[Figure 1. Filtered failure intensities for the selected OSS products. Panels: Firefox, Eclipse, Eclipse V2., ClamWin, MPlayer, Apache 2. X-axis: biweekly time; y-axis: failures.]

[Table II. R² VALUES FOR THE SEVEN DISTRIBUTION FUNCTIONS. Columns: Cubic, Exponential, Gamma, Lognormal, Schneidewind, S-curve, Weibull. Rows: Apache, ClamWin Free Antivirus, Eclipse V, Firefox, MPlayer.]
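The filtering rules described above can be sketched as follows. The paper's extraction tool was written in Java and is not shown in the text; this Python version, the record layout (a dict with `source` and `status` keys), and the function names are all assumptions made for illustration.

```python
# Status values accepted for Bugzilla reports, as listed in the text.
ACCEPTED_BUGZILLA_STATUSES = {"FIXED", "WONTFIX", "LATER", "REMIND"}

def keep_report(report):
    """Return True if a bug report passes the bug-filtering step.

    `report` is a hypothetical dict with 'source' and 'status' keys.
    """
    status = report["status"].strip()
    if report["source"] == "bugzilla":
        # Only the four listed status values are accepted.
        return status.upper() in ACCEPTED_BUGZILLA_STATUSES
    if report["source"] == "sourceforge":
        # Every report except those marked Deleted is collected.
        return status.lower() != "deleted"
    raise ValueError(f"unknown source: {report['source']!r}")

def filter_reports(reports):
    """Keep only the reports that pass keep_report, preserving order."""
    return [r for r in reports if keep_report(r)]
```

Because only four Bugzilla status values are accepted, reports carrying any other value (for example one marking a duplicate) are dropped without needing a separate rule.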

Finally, in the last step, i.e. bug-analysis, the dates of the filtered bug-reports are used to organize the reports into biweekly intervals for further analysis. Figure 1 exhibits the failure intensities for the five OSS products (see footnote 4). The x-axis and y-axis represent each biweekly period and its corresponding failure intensity, respectively. Also, each graph in the figure shows the interval for which the failure reports are collected. At a quick glance at the figure, Eclipse does not seem to follow a pattern similar to those of the other software products. Further investigation reveals that the bug reports include failures of multiple Eclipse versions. When the reports for each version are separated, the failure intensities for each version generally follow the same pattern as the other products. Therefore, rather than dealing with multiple versions of Eclipse with similar patterns, one single version, i.e. Eclipse V.2, is analyzed for reliability estimation. The last graph in Figure 1 shows the failure intensities for Eclipse V.2. This version is selected for reliability analysis because of its high volume of bug reports in comparison to other versions.

A. Goodness-of-fit Performance

In this study, SPSS is used for conducting the statistical tests of goodness-of-fit. Specifically, Non-Linear Regression (NLR) is employed to measure the goodness-of-fit of the seven distribution models with respect to the selected OSS products. NLR is used because the failure intensities of the selected OSS products follow a curved pattern rather than a linear trend, which is evident from Figure 1. Table II shows the calculated R² values resulting from NLR for the seven distribution functions. R² is a measure of how well the regression estimate fits the failure data [2]. The R² value is between 0 and 1, inclusive. The closer R² is to 1, the stronger the match between the estimated regression and the observed failure data. In Table II, the highest value of R² among the distribution models for each product is bold-faced.
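For readers who want to reproduce the goodness-of-fit measure outside SPSS, a minimal sketch of R² is given below. It assumes the common definition of one minus the ratio of the residual sum of squares to the total (mean-corrected) sum of squares; the exact computation used in the study is not spelled out in the text, and the function name is illustrative.

```python
def r_squared(observed, fitted):
    """R^2 = 1 - SS_res / SS_tot for observed and fitted failure intensities."""
    if len(observed) != len(fitted) or not observed:
        raise ValueError("observed and fitted must be non-empty and equally long")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: distance between the data and the fitted curve.
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    # Total sum of squares: spread of the data around its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot
```

A fitted curve that reproduces every observation scores 1, while a curve that does no better than the mean of the observations scores 0.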
Looking at the R² values, the Cubic model exhibits the overall best fit to the observed failure data. It is followed by the Weibull distribution, and little discrepancy in R² values is noticed between the Cubic and Weibull distributions. One may also observe that the performance of the Gamma distribution is close to that of Weibull. Recall that the Gamma distribution has properties similar to those of Weibull. Among the seven models, S-curve shows the overall worst performance. Table III provides the best fitting models for each of the five OSS products.

Table III. BEST MODELS FOR FITTING THE FAILURE INTENSITY OF THE SELECTED OSS PRODUCTS

| Product | Best fitting model |
|---|---|
| Apache 2 | Cubic, Lognormal |
| ClamWin Free Antivirus | Cubic, Weibull |
| Eclipse V.2 | Weibull |
| Firefox | Cubic |
| MPlayer | Exponential |

The next objective is to determine whether the model showing the best goodness-of-fit is also the best predictor of future failures. To investigate this, the time interval of the collected failure data for each product is halved. The failure data in the first half is used to estimate the parameters for each distribution model. Then, the same parameter estimates are used to forecast the failures during the second half.

B. Prediction Performance

As indicated, the time interval of the failure sample is divided in half: the first half is used for estimating the model parameters, and the second half is used for evaluating the predicted future failures. Since the failure data for the software products under study is gathered for at least four years, there seems to be sufficient data in the first half to produce a decent estimate of the future failures. Except for Firefox, the failure data for the products seem to be in a stabilized phase, so there should be a decent fit for the first half interval.

4 The intensities of bug reports are connected to form smoother plots. The purpose is to better visualize the pattern of failure reports.
Indeed, performing the estimate for the first half supports this observation. For Firefox, even though the failure data is collected for over six years, failure detection and removal do not seem to be in a stable state. This study attempts the prediction process for Firefox as well, to obtain better insight into situations where sufficient failure data is not available or the reliability growth of a product may not be stable. As shown in different studies [3,4,5,6], no single reliability model is always superior to the others. But the failure pattern can be used as a simple way to decide on some models believed to provide a decent prediction.

The prediction performance of the chosen distribution models is compared by determining the least average difference between the observed and predicted number of failures in the second half interval. This is measured by the Average Predicted Error (APE), an average taken over the n biweekly periods in the second half interval of a product. After calculating the estimated parameters for the first half interval and stretching the resulting graphs over the second half, Figure 2 exhibits a graphical view of the prediction of ClamWin failures for the seven distribution models. Due to lack of space, the prediction graphs for the other products are not shown. However, Table IV shows the APE values for all selected products. As APE shows the average difference between actual and predicted failure intensities, a smaller APE value represents a better prediction. As shown in the table, Gamma and Lognormal are good predictors, whereas the Cubic model, identified earlier as a good fitter, has the worst prediction performance. Comparing the table with Figure 2, the APE values support the visual patterns in the figure. Table V provides the best predictor models for each product.

[Figure 2. ClamWin actual failure intensity and prediction by the seven distribution functions. Legend: Exponential, Gamma, Lognormal, S-curve, Weibull, Schneidewind, Cubic, ClamWin FI.]

[Table IV. APE VALUES FOR THE SEVEN DISTRIBUTION FUNCTIONS. Columns: Cubic, Exponential, Gamma, Lognormal, Schneidewind, S-curve, Weibull. Rows: Apache, ClamWin Free Antivirus, Eclipse V, Firefox, MPlayer.]

Table V. BEST MODELS FOR PREDICTING THE FUTURE FAILURE INTENSITY OF THE SELECTED OSS PRODUCTS

| Product | Best prediction model |
|---|---|
| Apache 2 | Lognormal |
| ClamWin Free Antivirus | Gamma |
| Eclipse V.2 | S-curve |
| Firefox | Lognormal |
| MPlayer | Gamma |

Based on these observations, it is concluded that the model with the best goodness-of-fit is not necessarily a good predictor. Comparing Tables III and V, the best models for goodness-of-fit and for prediction disagree for the majority of the products.

To better understand why the models are not consistent in terms of goodness-of-fit and prediction of future failures, the Firefox product is further scrutinized. The observations are shown in Figure 3. The graph titled filtered bug pattern is the same as the failure intensity as shown in Figure 2. Fitted FI is the fitted graph estimated by Weibull based on the entire failure data. The other two graphs in the figure show the predictions when the first one year and the first two years of failure data are used for estimating the parameters of Weibull. The early portions of the two graphs are thus the fitted estimates based on one and two years of failure data, and the latter parts of the graphs are the prediction estimates. As anticipated, the prediction based on one year of failure data is very poor. This is because there is less failure data to depend on while the prediction interval is longer, which makes it more difficult to predict future failures.
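The halving procedure and the error measure can be sketched as follows. Since the text describes APE only as the average difference between actual and predicted failure intensities, the mean absolute difference used here is an assumption, as are the function names; the study's exact formula may differ, for instance by normalizing the differences.

```python
def split_in_half(intensities):
    """Split a biweekly failure-intensity series into estimation and prediction halves."""
    mid = len(intensities) // 2
    return intensities[:mid], intensities[mid:]

def average_prediction_error(actual, predicted):
    """Mean absolute difference between actual and predicted intensities.

    Assumed form of APE: (1 / n) * sum(|actual_i - predicted_i|), where n is
    the number of biweekly periods in the second half interval.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

In use, a model would be fitted on the first half only, evaluated at the time indices of the second half, and those predictions passed to `average_prediction_error` together with the observed second-half intensities; the model with the smallest value is taken as the best predictor.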
For the same reason, the prediction using two years of failure data shows better accuracy. This observation could be the reason that the authors in [,24] chose to predict only a small percentage of failures relative to the total failure data. Additionally, the graph of the filtered bug pattern in Figure 3 shows a dip at about the 25th biweekly period, which causes Weibull to adjust its estimate accordingly. This forces the one-year fitted graph to continue the decreasing trend of failures as time increases. This dip can also be wrongly interpreted as a sign that the cumulative failure data is becoming stable. A similar dip takes place around the 5th biweekly period, although it is not as severe as the dip affecting the one-year failure data used for prediction.

In general, there are many factors that affect the accuracy of prediction. One obvious factor is the model type used. A survey done in the late 1990s by the American Society for Quality reported that only 4% of the responders could apply a SRGM [4]. Additionally, applying a SRGM correctly requires a good understanding of the product profile at different stages. For example, whether the failures are independent of each other, whether the defect removals are imperfect, and whether there has been any shift in the operation profile of the product can all affect the prediction estimate. From our own experience, the Eclipse product in Figure 1 shows a failure pattern that can be modeled by multiple distributions such as Weibull. But most likely, the fitted graph would not provide a good estimate of the actual graph. Investigation revealed that the operation profile of Eclipse changes during each release of the product, which happens around January of each year.

[Figure 3. Firefox actual failure intensity and prediction by Weibull based on 1-year, 2-years, and the entire failure data. Legend: Filtered Bug Pattern, Fitted FI, Fitted 1-Year FI, Fitted 2-Year FI. Y-axis: failures.]

IV. CONCLUSION

This study has attempted to compare seven reliability models with respect to estimates of failure intensities and failure forecasts against the actual failure data. The bug reports of five different OSS products are collected and used as input to the seven models. The study has used nonlinear regression analysis to measure the goodness-of-fit. As the second metric, APE is used to determine which model is the best predictor. For the selected products, Weibull and Cubic are promising models for goodness-of-fit, but the Cubic model is shown to be the worst predictor. In general, the Gamma and Lognormal models provided the best predictions of future failures, followed by the S-curve model. Therefore, the results show that a model able to provide a good fit may not be a good predictor of future failures, because of the many interacting factors. It is reasonable to believe that some failure intensities, called outliers [6], out of a larger sample may have a tangible effect on the parameters of the regression estimates.
Therefore, as an avenue of future research, it is worth investigating whether the forecast of failures improves when these outliers are removed from the estimation process based on the available failure data. Another avenue is to determine the effect on prediction of recalibrating the models used in this study.

REFERENCES

[1] A.A. Abdel-Ghaly, P.Y. Chan, B. Littlewood, Evaluation of computing software reliability, IEEE Transactions on Software Engineering, vol. SE-2, no. 9, 1986.
[2] A.D. Aczel, J. Sounderpandian, Complete Business Statistics, 6th Ed., McGraw-Hill, 2005.
[3] P. Asthana, Jumping the technology S-curve, IEEE Spectrum, vol. 32, no. 6, 1995.
[4] S. Brocklehurst, B. Littlewood, New ways to get accurate reliability measures, IEEE Software, July 1992.
[5] Bugzilla.
[6] W.J. Conover, Practical Nonparametric Statistics, 3rd Ed., John Wiley, 1999.
[7] R. Hewett et al., On Effective Use of Reliability Models and Defect Data in Software Development.
[8] IEEE Reliability Society, IEEE recommended practice on software reliability, IEEE Std, June 2008.
[9] H.S. Kan, Metrics and Models in Software Quality Engineering, 2nd Ed., Addison-Wesley, 2003.
[10] P. Lakey, A. Neufelder, System and software reliability assurance notebook, Rome Laboratory, 1997.
[11] J.S. Lawson, C.W. Wesselman, D.T. Scott, Simple plots improve software reliability prediction models, Quality Engineering, vol. 5, no. 3, pp. 4-47, 2003.
[12] M.R. Lyu, Handbook of Software Reliability Engineering, McGraw-Hill, 1996.
[13] R. Mullen, S.S. Gokhale, The Lognormal distribution of software failure rates: Applications to software reliability growth modeling, 9th Int'l Symposium on Software Reliability Engineering, 1998.
[14] H. Pham, System Software Reliability, Springer, 2006.
[15] H. Pham, L. Nordmann, A generalized NHPP software reliability model, 3rd Int'l Conference on Reliability and Quality in Design, 1997.
[16] C. Rahmani, H. Siy, A. Azadmanesh, An experimental analysis of open source software reliability, 28th IEEE Symposium on Reliable Distributed Systems, Sep 2009.
[17] E.S. Raymond, The cathedral and the bazaar: musings on Linux and open source by an accidental revolutionary, 2nd Ed., O'Reilly, 2001.
[18] N.F. Schneidewind, Analysis of error processes in computer software, Sigplan Note, vol., no. 6, 1975.
[19] SourceForge.
[20] SPSS.
[21] C. Stringfellow, A.A. Andrews, An empirical method for selecting software reliability growth models, Empirical Software Engineering, vol. 7, no. 4, Dec 2002.
[22] K.S. Trivedi, Probability and Statistics with Reliability and Computer Science Applications, 2nd Ed., John Wiley, 2002.
[23] Y. Tamura, S. Yamada, Comparison of software reliability assessment methods for open source software and reliability assessment tool, Journal of Computer Science, vol. 2, no. 6, 2006.
[24] D.R.P. Williams, Prediction capability analysis of two and three parameters software reliability growth models, Information Technology Journal, vol. 5, no. 6, 2006.


More information

A logistic approximation to the cumulative normal distribution

A logistic approximation to the cumulative normal distribution A logistic approximation to the cumulative normal distribution Shannon R. Bowling 1 ; Mohammad T. Khasawneh 2 ; Sittichai Kaewkuekool 3 ; Byung Rae Cho 4 1 Old Dominion University (USA); 2 State University

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Mining Peer Code Review System for Computing Effort and Contribution Metrics for Patch Reviewers

Mining Peer Code Review System for Computing Effort and Contribution Metrics for Patch Reviewers Mining Peer Code Review System for Computing Effort and Contribution Metrics for Patch Reviewers Rahul Mishra, Ashish Sureka Indraprastha Institute of Information Technology, Delhi (IIITD) New Delhi {rahulm,

More information

Time series Forecasting using Holt-Winters Exponential Smoothing

Time series Forecasting using Holt-Winters Exponential Smoothing Time series Forecasting using Holt-Winters Exponential Smoothing Prajakta S. Kalekar(04329008) Kanwal Rekhi School of Information Technology Under the guidance of Prof. Bernard December 6, 2004 Abstract

More information

LOGISTIC REGRESSION ANALYSIS

LOGISTIC REGRESSION ANALYSIS LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic

More information

Fuzzy Logic Based Revised Defect Rating for Software Lifecycle Performance. Prediction Using GMR

Fuzzy Logic Based Revised Defect Rating for Software Lifecycle Performance. Prediction Using GMR BIJIT - BVICAM s International Journal of Information Technology Bharati Vidyapeeth s Institute of Computer Applications and Management (BVICAM), New Delhi Fuzzy Logic Based Revised Defect Rating for Software

More information

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

More information

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of slides,

More information

A Robustness Simulation Method of Project Schedule based on the Monte Carlo Method

A Robustness Simulation Method of Project Schedule based on the Monte Carlo Method Send Orders for Reprints to reprints@benthamscience.ae 254 The Open Cybernetics & Systemics Journal, 2014, 8, 254-258 Open Access A Robustness Simulation Method of Project Schedule based on the Monte Carlo

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

EST.03. An Introduction to Parametric Estimating

EST.03. An Introduction to Parametric Estimating EST.03 An Introduction to Parametric Estimating Mr. Larry R. Dysert, CCC A ACE International describes cost estimating as the predictive process used to quantify, cost, and price the resources required

More information

A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic

A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic A Study to Predict No Show Probability for a Scheduled Appointment at Free Health Clinic Report prepared for Brandon Slama Department of Health Management and Informatics University of Missouri, Columbia

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

Improving Developer Activity Metrics with Issue Tracking Annotations

Improving Developer Activity Metrics with Issue Tracking Annotations Improving Developer Activity s with Issue Tracking Annotations Andrew Meneely, Mackenzie Corcoran, Laurie Williams North Carolina State University {apmeneel, mhcorcor, lawilli3}@ncsu.edu ABSTRACT Understanding

More information

SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

More information

The Impact of Release Management and Quality Improvement in Open Source Software Project Management

The Impact of Release Management and Quality Improvement in Open Source Software Project Management Applied Mathematical Sciences, Vol. 6, 2012, no. 62, 3051-3056 The Impact of Release Management and Quality Improvement in Open Source Software Project Management N. Arulkumar 1 and S. Chandra Kumramangalam

More information

Lean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY

Lean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY Before we begin: Turn on the sound on your computer. There is audio to accompany this presentation. Audio will accompany most of the online

More information

Statistical Analysis of Process Monitoring Data for Software Process Improvement and Its Application

Statistical Analysis of Process Monitoring Data for Software Process Improvement and Its Application American Journal of Operations Research, 2012, 2, 43-50 http://dx.doi.org/10.4236/ajor.2012.21005 Published Online March 2012 (http://www.scirp.org/journal/ajor) Statistical Analysis of Process Monitoring

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

An Application of Weibull Analysis to Determine Failure Rates in Automotive Components

An Application of Weibull Analysis to Determine Failure Rates in Automotive Components An Application of Weibull Analysis to Determine Failure Rates in Automotive Components Jingshu Wu, PhD, PE, Stephen McHenry, Jeffrey Quandt National Highway Traffic Safety Administration (NHTSA) U.S. Department

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

VISUALIZATION OF DENSITY FUNCTIONS WITH GEOGEBRA

VISUALIZATION OF DENSITY FUNCTIONS WITH GEOGEBRA VISUALIZATION OF DENSITY FUNCTIONS WITH GEOGEBRA Csilla Csendes University of Miskolc, Hungary Department of Applied Mathematics ICAM 2010 Probability density functions A random variable X has density

More information

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

More information

Empirical Analysis and Statistical Modeling of Attack Processes based on Honeypots

Empirical Analysis and Statistical Modeling of Attack Processes based on Honeypots Empirical Analysis and Statistical Modeling of Attack Processes based on Honeypots M. Kaâniche 1, E. Alata 1, V. Nicomette 1, Y. Deswarte 1, M. Dacier 2 1 LAAS-CNRS, Université de Toulouse 7 Avenue du

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

The KaleidaGraph Guide to Curve Fitting

The KaleidaGraph Guide to Curve Fitting The KaleidaGraph Guide to Curve Fitting Contents Chapter 1 Curve Fitting Overview 1.1 Purpose of Curve Fitting... 5 1.2 Types of Curve Fits... 5 Least Squares Curve Fits... 5 Nonlinear Curve Fits... 6

More information

Do Onboarding Programs Work?

Do Onboarding Programs Work? Do Onboarding Programs Work? Adriaan Labuschagne and Reid Holmes School of Computer Science University of Waterloo Waterloo, ON, Canada alabusch,rtholmes@cs.uwaterloo.ca Abstract Open source software systems

More information

https://williamshartunionca.springboardonline.org/ebook/book/27e8f1b87a1c4555a1212b...

https://williamshartunionca.springboardonline.org/ebook/book/27e8f1b87a1c4555a1212b... of 19 9/2/2014 12:09 PM Answers Teacher Copy Plan Pacing: 1 class period Chunking the Lesson Example A #1 Example B Example C #2 Check Your Understanding Lesson Practice Teach Bell-Ringer Activity Students

More information

Multiple Regression: What Is It?

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

More information

Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications

Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications Rouven Kreb 1 and Manuel Loesch 2 1 SAP AG, Walldorf, Germany 2 FZI Research Center for Information

More information

Predict the Popularity of YouTube Videos Using Early View Data

Predict the Popularity of YouTube Videos Using Early View Data 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

C. Wohlin, M. Höst, P. Runeson and A. Wesslén, "Software Reliability", in Encyclopedia of Physical Sciences and Technology (third edition), Vol.

C. Wohlin, M. Höst, P. Runeson and A. Wesslén, Software Reliability, in Encyclopedia of Physical Sciences and Technology (third edition), Vol. C. Wohlin, M. Höst, P. Runeson and A. Wesslén, "Software Reliability", in Encyclopedia of Physical Sciences and Technology (third edition), Vol. 15, Academic Press, 2001. Software Reliability Claes Wohlin

More information

Theory at a Glance (For IES, GATE, PSU)

Theory at a Glance (For IES, GATE, PSU) 1. Forecasting Theory at a Glance (For IES, GATE, PSU) Forecasting means estimation of type, quantity and quality of future works e.g. sales etc. It is a calculated economic analysis. 1. Basic elements

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

File Size Distribution Model in Enterprise File Server toward Efficient Operational Management

File Size Distribution Model in Enterprise File Server toward Efficient Operational Management Proceedings of the World Congress on Engineering and Computer Science 212 Vol II WCECS 212, October 24-26, 212, San Francisco, USA File Size Distribution Model in Enterprise File Server toward Efficient

More information

RELIABILITY IMPROVEMENT WITH PSP OF WEB-BASED SOFTWARE APPLICATIONS

RELIABILITY IMPROVEMENT WITH PSP OF WEB-BASED SOFTWARE APPLICATIONS RELIABILITY IMPROVEMENT WITH PSP OF WEB-BASED SOFTWARE APPLICATIONS Leticia Dávila-Nicanor, Pedro Mejía-Alvarez CINVESTAV-IPN. Sección de Computación ldavila@yahoo.com.mx, pmejia@cs.cinvestav.mx ABSTRACT

More information

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

More information

Fusion Pro QbD-aligned DOE Software

Fusion Pro QbD-aligned DOE Software Fusion Pro QbD-aligned DOE Software Statistical Experimental Design Analysis & Modeling Robustness Simulation Numerical & Graphical Optimization 2D, 3D, & 4D Visualization Graphics 100% aligned with Quality

More information

5 Correlation and Data Exploration

5 Correlation and Data Exploration 5 Correlation and Data Exploration Correlation In Unit 3, we did some correlation analyses of data from studies related to the acquisition order and acquisition difficulty of English morphemes by both

More information

A Review of Statistical Outlier Methods

A Review of Statistical Outlier Methods Page 1 of 5 A Review of Statistical Outlier Methods Nov 2, 2006 By: Steven Walfish Pharmaceutical Technology Statistical outlier detection has become a popular topic as a result of the US Food and Drug

More information

Beating the MLB Moneyline

Beating the MLB Moneyline Beating the MLB Moneyline Leland Chen llxchen@stanford.edu Andrew He andu@stanford.edu 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series

More information

Comparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function

Comparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function The Empirical Econometrics and Quantitative Economics Letters ISSN 2286 7147 EEQEL all rights reserved Volume 1, Number 4 (December 2012), pp. 89 106. Comparison of sales forecasting models for an innovative

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

Analysis of Open Source Software Development Iterations by Means of Burst Detection Techniques

Analysis of Open Source Software Development Iterations by Means of Burst Detection Techniques Analysis of Open Source Software Development Iterations by Means of Burst Detection Techniques Bruno Rossi, Barbara Russo, and Giancarlo Succi CASE Center for Applied Software Engineering Free University

More information

A Review of Methods. for Dealing with Missing Data. Angela L. Cool. Texas A&M University 77843-4225

A Review of Methods. for Dealing with Missing Data. Angela L. Cool. Texas A&M University 77843-4225 Missing Data 1 Running head: DEALING WITH MISSING DATA A Review of Methods for Dealing with Missing Data Angela L. Cool Texas A&M University 77843-4225 Paper presented at the annual meeting of the Southwest

More information

Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization

Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization 2.1. Introduction Suppose that an economic relationship can be described by a real-valued

More information

8. Time Series and Prediction

8. Time Series and Prediction 8. Time Series and Prediction Definition: A time series is given by a sequence of the values of a variable observed at sequential points in time. e.g. daily maximum temperature, end of day share prices,

More information

Section 1.4. Difference Equations

Section 1.4. Difference Equations Difference Equations to Differential Equations Section 1.4 Difference Equations At this point almost all of our sequences have had explicit formulas for their terms. That is, we have looked mainly at sequences

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

16 Learning Curve Theory

16 Learning Curve Theory 16 Learning Curve Theory LEARNING OBJECTIVES : After studying this unit, you will be able to : Understanding, of learning curve phenomenon. Understand how the percentage learning rate applies to the doubling

More information

1 Error in Euler s Method

1 Error in Euler s Method 1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of

More information

Using Four-Quadrant Charts for Two Technology Forecasting Indicators: Technology Readiness Levels and R&D Momentum

Using Four-Quadrant Charts for Two Technology Forecasting Indicators: Technology Readiness Levels and R&D Momentum Using Four-Quadrant Charts for Two Technology Forecasting Indicators: Technology Readiness Levels and R&D Momentum Tang, D. L., Wiseman, E., & Archambeault, J. Canadian Institute for Scientific and Technical

More information

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION Chin-Diew Lai, Department of Statistics, Massey University, New Zealand John C W Rayner, School of Mathematics and Applied Statistics,

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information

More information

ModelingandSimulationofthe OpenSourceSoftware Community

ModelingandSimulationofthe OpenSourceSoftware Community ModelingandSimulationofthe OpenSourceSoftware Community Yongqin Gao, GregMadey Departmentof ComputerScience and Engineering University ofnotre Dame ygao,gmadey@nd.edu Vince Freeh Department of ComputerScience

More information

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

More information

OHJ-1860 Software Systems Seminar: Global Software Development. Open-source software development. 11.12.2007 By Antti Rasmus

OHJ-1860 Software Systems Seminar: Global Software Development. Open-source software development. 11.12.2007 By Antti Rasmus 1 OHJ-1860 Software Systems Seminar: Global Software Development Open-source software development 11.12.2007 By Antti Rasmus Outline 2 Open-source software (OSS) development Motivation: IDC study on open

More information

Customer Life Time Value

Customer Life Time Value Customer Life Time Value Tomer Kalimi, Jacob Zahavi and Ronen Meiri Contents Introduction... 2 So what is the LTV?... 2 LTV in the Gaming Industry... 3 The Modeling Process... 4 Data Modeling... 5 The

More information

International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS) www.iasir.net

International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS) www.iasir.net International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

Descriptive Analysis

Descriptive Analysis Research Methods William G. Zikmund Basic Data Analysis: Descriptive Statistics Descriptive Analysis The transformation of raw data into a form that will make them easy to understand and interpret; rearranging,

More information

Alessandro Birolini. ineerin. Theory and Practice. Fifth edition. With 140 Figures, 60 Tables, 120 Examples, and 50 Problems.

Alessandro Birolini. ineerin. Theory and Practice. Fifth edition. With 140 Figures, 60 Tables, 120 Examples, and 50 Problems. Alessandro Birolini Re ia i it En ineerin Theory and Practice Fifth edition With 140 Figures, 60 Tables, 120 Examples, and 50 Problems ~ Springer Contents 1 Basic Concepts, Quality and Reliability Assurance

More information

PTC Thermistor: Time Interval to Trip Study

PTC Thermistor: Time Interval to Trip Study PTC Thermistor: Time Interval to Trip Study by by David C. C. Wilson Owner Owner // Principal Principal Consultant Consultant Wilson Consulting Services, LLC April 5, 5, 5 Page 1-19 Table of Contents Description

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

Data Collection from Open Source Software Repositories

Data Collection from Open Source Software Repositories Data Collection from Open Source Software Repositories GORAN MAUŠA, TIHANA GALINAC GRBAC SEIP LABORATORY FACULTY OF ENGINEERING UNIVERSITY OF RIJEKA, CROATIA Software Defect Prediction (SDP) Aim: Focus

More information

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions. Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

More information

Bandwidth Modeling in Large Distributed Systems for Big Data Applications

Bandwidth Modeling in Large Distributed Systems for Big Data Applications Bandwidth Modeling in Large Distributed Systems for Big Data Applications Bahman Javadi School of Computing, Engineering and Mathematics University of Western Sydney, Australia Email: b.javadi@uws.edu.au

More information

SED As a Homogenous Virus

SED As a Homogenous Virus 1 A Critical Review of Software Engineering Research on Open Source Software Development Thomas Østerlie and Letizia Jaccheri NTNU Presented by Jingyue Li 2 Problem formulation A growing concern with the

More information

Learning and Researching with Open Source Software

Learning and Researching with Open Source Software Learning and Researching with Open Source Software Minghui Zhou zhmh@pku.edu.cn Associate Professor Peking University Outline A snapshot of Open Source Software (OSS) Learning with OSS Research on OSS

More information

White Paper April 2006

White Paper April 2006 White Paper April 2006 Table of Contents 1. Executive Summary...4 1.1 Scorecards...4 1.2 Alerts...4 1.3 Data Collection Agents...4 1.4 Self Tuning Caching System...4 2. Business Intelligence Model...5

More information

Mario Guarracino. Regression

Mario Guarracino. Regression Regression Introduction In the last lesson, we saw how to aggregate data from different sources, identify measures and dimensions, to build data marts for business analysis. Some techniques were introduced

More information

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

More information

Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model

Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity

More information

The Evolution of Mobile Apps: An Exploratory Study

The Evolution of Mobile Apps: An Exploratory Study The Evolution of Mobile Apps: An Exploratory Study Jack Zhang, Shikhar Sagar, and Emad Shihab Rochester Institute of Technology Department of Software Engineering Rochester, New York, USA, 14623 {jxz8072,

More information