Mean-Variance Combination (MVC): A New Method for Evaluating Effort Estimation Models

Lang Xie (ISCAS), Ye Yang (ISCAS), Da Yang (ISCAS), Qing Wang (ISCAS), Mingshu Li (ISCAS)
April 27, 2011

Agenda
1. Introduction to cost estimation research at ISCAS
2. MVC method for evaluating cost estimation models
3. Ongoing work
2011-5-3

Cost Estimation Research Framework: ISCAS Perspective
[Diagram. Recoverable labels: Local Government (cost estimation for contract pricing); Industry (cost estimation tool); Software Cost Estimation Basic Research; literature review; COCOMO family models; COCOMO-U; budgeting under uncertainty; coping with the cone of uncertainty; combining estimations; estimation based on use cases; cost estimation and process management integration; cost drivers auto-rating; defects prediction; simulation; Software Quality; Software Process; Software Measurement; USC; JSP; WikiWinWin; SoftPM; COCOMO.]

Uncertainty ranges of cost estimations decrease as the software development lifecycle proceeds.
[Figure: cone of uncertainty. Vertical axis: relative size range, from 4x at the top to 0.25x at the bottom. Horizontal axis: phases (Feasibility; Plans and Rqts.; Product Design; Detail Design; Devel. and Test) and milestones (Concept of Operation; Rqts. Spec.; Product Design Spec.; Detail Design Spec.; Accepted Software). Annotated models: Applications Composition (3 parameters), Early Design (13 parameters), Post-Architecture (23 parameters).]

COCOMO-U
Input: COCOMO-U takes the probability distributions of the estimated project size and the other 22 cost factors as input.
Output: the probability distribution of software development effort.
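
As an illustration only: the slide does not say how COCOMO-U derives the output distribution, so the sketch below substitutes plain Monte Carlo sampling for whatever COCOMO-U actually does. It assumes a COCOMO-style form, Effort = A * Size^E * product of effort multipliers. The constants, the triangular distributions, and their ranges are all hypothetical placeholders, and only three multipliers are sampled rather than all 22 cost factors.

```python
import random
import statistics

def simulate_effort(n_samples=10_000, seed=0):
    """Propagate input uncertainty to effort by Monte Carlo sampling."""
    rng = random.Random(seed)
    a, e = 2.94, 1.10  # hypothetical calibration constants
    # Hypothetical (low, high, mode) ranges for a few effort multipliers.
    multiplier_ranges = [(0.9, 1.3, 1.0), (0.8, 1.2, 1.0), (0.85, 1.4, 1.1)]
    efforts = []
    for _ in range(n_samples):
        # Uncertain project size in KSLOC (hypothetical range).
        size_ksloc = rng.triangular(20.0, 60.0, 35.0)
        effort = a * size_ksloc ** e
        for low, high, mode in multiplier_ranges:
            effort *= rng.triangular(low, high, mode)
        efforts.append(effort)
    return efforts

efforts = simulate_effort()
deciles = statistics.quantiles(efforts, n=10)  # 9 cut points
print(f"median effort: {statistics.median(efforts):.1f} person-months")
print(f"10th to 90th percentile: {deciles[0]:.1f} to {deciles[-1]:.1f}")
```

The output is a sample of effort values rather than a single number, which is the point the slide makes: the estimate is reported as a distribution.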

The InCoME Process
[Flowchart. Steps: Cost Drivers Analysis & Data Collection; Build Cost Models; Evaluate Cost Models; decision "Require Further Improvement?" (Yes / No); Risk Assessment; Cost Estimation; Decision Support.]

Estimation Process Based on COGOMO (constructive government contract pricing model)
[Diagram. Recoverable labels: government history projects; industry history projects; human capital in China; industry benchmark; establishment of government knowledge base; calibrated parameters; project size; customized model input; effort estimation; effort distribution; estimated effort; wage rate in China; cost analysis; total cost.]

Localization of COCOMO
Data: 7 versions of Qone
Result: A: 1.32, B: 0.94
Qone: a commercial software process management tool released by a Chinese software enterprise.

Cost Estimation Based on Use Cases
Data: 7 versions of Qone
Estimation model:
  Effort = A * (UCadjusted)^B
  UCadjusted = newuc + Wmod * moduc + Wreu * reuuc + Wdel * deluc
Qone case:
  UCadjusted = newuc + 0.2 * moduc + 0.05 * adouc

version  adduc  moduc  reuuc  adjusted UC  effort
v1       3      10     216    15.8         2284.5
v2       7      22     207    21.75        3941
v3       86     22     19     111.5        30945
v4       57     61     236    73.65        10340.1
v5       12     31     308    33.25        7477.5
v6       37     30     318    58.55        14903.6
v7       15     8      373    34.9         7166

Fitted parameters: A = 96.9396, B = 1.1927, R^2 = 0.928219, P-value = 0.000481

The method provides guidance for organizations conducting maintenance effort estimation based on use cases. It applies use cases as the size metric: the added, modified, reused, and deleted use cases are identified and included in the use case metric for estimating software maintenance effort.
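
The slide does not state how A and B were fitted. A minimal sketch, assuming an ordinary least-squares regression on log-transformed values (ln Effort = ln A + B * ln UCadjusted), applied to the adjusted UC and effort columns copied from the slide; the function name `fit_power_law` is illustrative:

```python
import math

# (adjusted UC, effort) pairs for Qone v1..v7, copied from the slide.
data = [
    (15.8, 2284.5), (21.75, 3941.0), (111.5, 30945.0), (73.65, 10340.1),
    (33.25, 7477.5), (58.55, 14903.6), (34.9, 7166.0),
]

def fit_power_law(pairs):
    """Fit effort = A * uc**B by least squares on log-transformed data."""
    xs = [math.log(uc) for uc, _ in pairs]
    ys = [math.log(effort) for _, effort in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                      # slope in log-log space
    a = math.exp(mean_y - b * mean_x)  # intercept mapped back from log space
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

a, b, r_squared = fit_power_law(data)
print(f"A = {a:.4f}, B = {b:.4f}, R^2 = {r_squared:.4f}")
```

On these seven points the fit comes out close to the reported A, B, and R^2, which suggests, but does not confirm, that a log-log regression was used.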

Propheta: a cost estimation tool
Three cost estimation methods:
- Analogy estimation based on databases from CSBSG and ISBSG
- COCOMO
- Integrated estimation for software products with multiple modules
CSBSG: The China Benchmarking Standards Group
ISBSG: The International Software Benchmarking Standards Group

Agenda
1. Introduction to cost estimation research at ISCAS
2. MVC method for evaluating cost estimation models
3. Ongoing work

MVC method
- Background and motivation
- MVC (mean-variance combination) method
- Experiment results

Background (1/3)
- A wealth of estimation methods exists.
- Evaluation methods are important: they indicate the problems of estimation models and drive their improvement.
- Statistical view of a model's character: bias and variance.

Background (2/3): Bias and Variance
[Figure: x-y plot comparing an ideal model with two fitted models. Model 1: structure y = a*x + b, trained on part of the data. Model 2: structure is a horizontal line, trained on the whole data set.]

Background (3/3)
- The true bias cannot be observed directly.
- The distance between the observed value and the estimated value contains both bias and variance.
- Accuracy indicators: MMRE gives the bias information, while stdmre gives the variation information and part of the bias.

Motivation
- Indicators based on RE or MRE: MMRE, stdmre, PRED(N), MdMRE, etc.
- Evaluation by cross validation (CV): the average value or the interval of the indicators above.
- The traditional mean values of indicators in CV are challenged: they do not combine bias and variance, and the comparison result varies from run to run.
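
The slides name the indicators without defining them. A minimal sketch using the definitions common in the effort estimation literature: MRE = |actual - estimated| / actual, MMRE and MdMRE as its mean and median, and PRED(N) as the fraction of projects with MRE of at most N percent. Using the sample standard deviation for stdmre is an assumption, and the data values are hypothetical.

```python
import statistics

def mre(actual, estimated):
    """Magnitude of relative error for one project."""
    return abs(actual - estimated) / actual

def accuracy_indicators(actuals, estimates, pred_level=25):
    """Compute MRE-based accuracy indicators over a set of projects."""
    mres = [mre(a, e) for a, e in zip(actuals, estimates)]
    within = sum(m <= pred_level / 100 for m in mres)
    return {
        "MMRE": statistics.mean(mres),
        "MdMRE": statistics.median(mres),
        "stdmre": statistics.stdev(mres),  # sample std: an assumption
        "PRED": within / len(mres),
    }

# Hypothetical actual and estimated efforts for four projects.
actuals = [100.0, 200.0, 400.0, 800.0]
estimates = [110.0, 150.0, 400.0, 1200.0]
indicators = accuracy_indicators(actuals, estimates)
print(indicators)
```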

MVC method: the whole process
[Diagram. Inputs: history data, model structure, split ratio, re-sampling times. MVC process: resampling; train and test; generate indicators.]

MVC method: re-sampling process
Input: data, model structure
Output: pairs of (MMRE, stdmre)
1. Fix the ratio of the test set and the number of sampling times N.
2. Randomly split the whole data set N times to get N pairs of (train set, test set).
3. Train and test the current model structure N times using the N pairs.
4. Calculate N pairs of (MMRE, stdmre).
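
The steps above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the `fit` and `predict` callables stand in for any model structure, the ratio model and the history data are hypothetical, and the sample standard deviation is assumed for stdmre.

```python
import random
import statistics

def resample_pairs(data, fit, predict, test_ratio=0.3, n_times=100, seed=0):
    """Return n_times pairs of (MMRE, stdmre) from random train/test splits.

    data is a list of (size, effort) tuples.
    """
    rng = random.Random(seed)
    # stdev needs at least two test points.
    n_test = max(2, round(len(data) * test_ratio))
    pairs = []
    for _ in range(n_times):
        shuffled = rng.sample(data, len(data))  # random permutation
        test, train = shuffled[:n_test], shuffled[n_test:]
        model = fit(train)
        mres = [abs(effort - predict(model, size)) / effort
                for size, effort in test]
        pairs.append((statistics.mean(mres), statistics.stdev(mres)))
    return pairs

# Hypothetical model structure: effort proportional to size.
def fit_ratio(train):
    return sum(e for _, e in train) / sum(s for s, _ in train)

def predict_ratio(model, size):
    return model * size

# Hypothetical history data: (size, effort).
history = [(10, 24), (20, 55), (35, 80), (50, 140), (65, 150),
           (80, 230), (95, 240), (120, 330), (150, 360), (200, 560)]
pairs = resample_pairs(history, fit_ratio, predict_ratio, n_times=50)
print(pairs[:3])
```

Fixing the seed makes the N splits reproducible, so two model structures can be compared on the same splits.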

MVC method: why re-sampling
- The history data is limited and small in size.
- The independent and identically distributed (i.i.d.) assumption may not be satisfied.
- Re-sampling simulates the real situation: train set vs. test set corresponds to history data vs. new data.
- The number of possible combinations, C(n, m), is large.
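
To give a feel for how large C(n, m) gets, here is a short sketch; the data set size and test set size are hypothetical and do not come from the slides.

```python
import math

def n_splits(n_projects, n_test):
    """Number of distinct ways to choose n_test test projects."""
    return math.comb(n_projects, n_test)

# Hypothetical example: 63 projects with 13 held out for testing.
print(n_splits(63, 13))
```

Even a modest data set admits far more splits than can be enumerated, which is why a fixed number of random splits is drawn instead.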

MVC method: generate indicators (paradigm)
- Scatter plot
- Convex hull
- AUC_L (AUC Lower)
- AUC_M (AUC Middle)
- AUC_U (AUC Upper)
AUC: Area Under Curve

MVC method: generate indicators (algorithm)
Input: N pairs of (MMRE, stdmre)
Output: three types of area
1. Get the scatter plot of MMRE and stdmre.
2. Get the convex hull, and split it into an upper part and a lower part.
3. Extend the two parts of the convex hull to three types of area.
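
The slides do not define the three areas precisely, so the sketch below is one possible reading, with every choice an assumption: stdmre is on the x-axis and MMRE on the y-axis, AUC_L and AUC_U are the trapezoidal areas under the lower and upper hull chains, and AUC_M is taken as their average. The hull is built with Andrew's monotone chain algorithm, and the sample pairs are hypothetical.

```python
def cross(o, a, b):
    """Cross product of (a - o) and (b - o); > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_chains(points):
    """Return (lower, upper) convex hull chains, each ordered left to right."""
    pts = sorted(set(points))
    if len(pts) < 2:
        raise ValueError("need at least two distinct points")
    lower, upper = [], []
    for p in pts:
        # Lower chain keeps only left turns; upper chain only right turns.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) >= 0:
            upper.pop()
        upper.append(p)
    return lower, upper

def area_under(chain):
    """Trapezoidal area between a left-to-right chain and the x-axis."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(chain, chain[1:]))

def mvc_areas(pairs):
    """pairs: (MMRE, stdmre) tuples. Assumed axes: x = stdmre, y = MMRE."""
    lower, upper = hull_chains([(std, mmre) for mmre, std in pairs])
    auc_l, auc_u = area_under(lower), area_under(upper)
    auc_m = (auc_l + auc_u) / 2  # assumed definition of the middle area
    return auc_l, auc_m, auc_u

# Hypothetical (MMRE, stdmre) pairs from re-sampling.
pairs = [(0.30, 0.40), (0.45, 0.42), (0.35, 0.50), (0.50, 0.55), (0.40, 0.46)]
print(mvc_areas(pairs))
```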

[Figure: convex hull of the scatter plot, with the areas AUC_U, AUC_M, and AUC_L marked.]

Result: Performance of traditional indicators
[Figure: two line charts plotting MMRE and std_mre against the number of CV times (10 to 110).]

Results: the scatter plot of MMRE and stdmre

Results: MVC's indicators on two models

          COCOMO81 dataset            NASA93 dataset
model     AUC_U   AUC_M   AUC_L       AUC_U   AUC_M   AUC_L
COCOMO    0.6134  0.4197  0.3103      3.2708  2.1228  1.7653
Analogy   0.6403  0.4901  0.3414      0.7075  0.4522  0.3394

Results: variance of indicators divided by mean value
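
A minimal sketch of the stability measure named in this slide title: the variance of an indicator across repeated runs, divided by its mean. Using the population variance is an assumption, and the MMRE values are hypothetical.

```python
import statistics

def variance_to_mean(values):
    """Variance of an indicator across repeated runs, divided by its mean."""
    return statistics.pvariance(values) / statistics.mean(values)

# Hypothetical MMRE values from repeated cross-validation runs.
mmre_runs = [0.42, 0.47, 0.39, 0.51, 0.44]
print(variance_to_mean(mmre_runs))
```

A smaller ratio means the indicator changes less from run to run relative to its size.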

Discussion
Benefits:
- More stable results, because a distribution replaces a single point.
- The combination of bias and variance.

Agenda
1. Introduction to cost estimation research at ISCAS
2. MVC method for evaluating cost estimation models
3. Ongoing work

Ongoing work: improve MVC
Splitting the scatter plot (MMRE vs. stdmre) into four areas, using Mean_threshold and Std_threshold:
- Left-up: high bias and low variance
- Left-bottom: low bias and low variance
- Right-up: high bias and high variance
- Right-bottom: low bias and high variance
How to determine the thresholds for dividing the four regions?
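
The four-region split can be sketched as below. The threshold values are hypothetical; how to choose them is exactly the question the slide leaves open.

```python
from collections import Counter

def classify_region(mmre, stdmre, mean_threshold, std_threshold):
    """Name the region of one (MMRE, stdmre) point.

    Up/bottom follows the bias (MMRE); left/right follows the variance (stdmre).
    """
    vertical = "up" if mmre > mean_threshold else "bottom"
    horizontal = "right" if stdmre > std_threshold else "left"
    return f"{horizontal}-{vertical}"

# Hypothetical (MMRE, stdmre) pairs and thresholds.
pairs = [(0.30, 0.40), (0.45, 0.42), (0.35, 0.50), (0.50, 0.55)]
counts = Counter(classify_region(m, s, 0.40, 0.45) for m, s in pairs)
print(counts)
```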

Ongoing work
- Deal with cross-company data: definition and measurement of local bias (Ye et al., ESEM 2011); build new models to deal with organization ID.
- Measure uncertainty more accurately: reduce the bias under the guidance of the MVC method, and express the variance more accurately.

Thank you!