Interval Quality: Relating Customer-Perceived Quality To Process Quality




Interval Quality: Relating Customer-Perceived Quality To Process Quality
Audris Mockus and David Weiss ({audris,weiss}@avaya.com)
Avaya Labs Research, Basking Ridge, NJ 07920
http://mockus.org/

Motivation: bridge the gap between developer and user, and measure in vivo

- A key software engineering objective is to improve quality via practices and tools that support requirements, design, implementation, verification, and maintenance
- Needs of a user: installability, reliability, availability, backward compatibility, cost, and features
- Primary objectives:
  - Can we measure user-perceived quality in vivo?
  - Can we communicate it to the development team?
  - Is the common wisdom about software quality correct?

Outline

- History of quality in communications systems
- How to observe quality in vivo
- Questions:
  - Can we compare quality among releases?
  - Which part of the life-cycle affects quality the most?
  - Can we approximate quality using easy-to-obtain measures?
  - Does hardware or software have more impact on quality?
- Answers: yes, service, no, it depends
- Discussion

Approaches to measure quality

- Theoretical models [16]
- Simulations (in silico)
- Observing indirectly: test runs, load tests, stress tests, SW defects and failures
- Observing directly in vivo via recorded user/system actions (not opinion surveys) has the following benefits: it is more realistic, more accurate, provides a higher level of confidence, is better suited to observing an overall effect than in vitro research, and is more relevant in practice

History of Communications Quality [6]

- Context: military and commercial communication systems, 1960-present
- Goals: system outage, loss of service, degradation of service
- Downtime of 2 hours over 40 years, later "five nines" (about 5 minutes of downtime per year)
- Degradation of service, e.g., < 0.01% of calls mishandled
- Faults per line per time unit, e.g., errors per 100 subscribers per year
- MTBF for service or equipment, e.g., exchange MTBF, % of subscribers with MTBF > X
- Duplication levels, e.g., standby HW for systems with > 64 subscribers

Observing in vivo: architecture

[Architecture diagram, three levels. Level 0 (raw sources): the inventory system (system ID, customer info, configuration, platform, release, date modified; weekly snapshots of the installed and alarming base), the alarming system (system ID, alarm ID, alarm type, time, other alarm info), and the ticketing system (system ID, ticket ID, resolution, time, other attributes). Level 1 (augmented ticket/alarm records): outage/restart, release/platform, install/runtime, release launch, system ID/configuration. Level 2 (metrics and bounds): MTBF, outage duration, availability, population, survival, hazard.]
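To make the Level 0 to Level 1 step concrete, below is a minimal sketch (not the authors' pipeline; all record types, field names, and helper functions are hypothetical) of how weekly inventory snapshots could be used to augment a service ticket with platform, release, and runtime at the time of failure.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical Level 0 records, mirroring the fields named on the slide.
@dataclass
class InventoryRecord:          # one row of a weekly inventory snapshot
    system_id: str
    platform: str
    release: str
    snapshot_date: date

@dataclass
class Ticket:                   # a service ticket
    system_id: str
    ticket_id: str
    opened: date
    resolution: str

def first_install_dates(snapshots: list) -> dict:
    """Earliest snapshot date per (system, release) approximates the install date."""
    first = {}
    for row in snapshots:
        key = (row.system_id, row.release)
        if key not in first or row.snapshot_date < first[key]:
            first[key] = row.snapshot_date
    return first

def augment(ticket: Ticket, snapshots: list) -> Optional[dict]:
    """Build a Level 1 record: attach platform/release and runtime to a ticket."""
    rows = [r for r in snapshots
            if r.system_id == ticket.system_id and r.snapshot_date <= ticket.opened]
    if not rows:
        return None                                  # system not yet in inventory
    latest = max(rows, key=lambda r: r.snapshot_date)
    installed = first_install_dates(snapshots).get((ticket.system_id, latest.release))
    return {
        "ticket_id": ticket.ticket_id,
        "platform": latest.platform,
        "release": latest.release,
        "runtime_days": (ticket.opened - installed).days if installed else None,
    }
```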

Observing in vivo: primary data sources

- Service tickets
  - Represent requests for action to remedy adverse events: outages, software and hardware issues, and other requests
  - Manual input = not always accurate
  - Some issues may be unnoticed and/or unreported = missing data
- Software alarms
  - Complete list for the systems configured to generate them
  - Irrelevant events may be included, e.g., experimental or misconfigured systems that are not in production use at the time
- Inventory
  - Type, size, configuration, and install date for each release
  - Link between deployment dates and tickets/alarms

Issues with commonly available data and published analyses

- Present:
  - Problem reports by month (hopefully grouped by release)
  - Sales (installations) by month (except for freely downloadable SW)
- Absent:
  - No link between install time and problem report = no way to get accurate estimates of the hazard function (the probability density of observing a failure conditional on the absence of earlier failures)
  - No complete list of software outages = no way to get rough estimates of the underlying rate

Data Remedies

- Only the present state of inventory is kept = collect snapshots to reconstruct history
- The accounting aggregation (by solution, license) is different from the service (by system) or production (by release/patch) aggregation = remap to the finest common aggregation
- Missing data:
  - Systems observed for different periods = use survival curves (see the sketch after this list)
  - Reporting bias = divide into groups according to service levels and practices
  - Quantity of interest not measured = design measures for upper and lower bounds
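As a concrete illustration of the "systems observed for different periods" remedy, the sketch below (hypothetical field names, not the authors' code) turns install and first-failure dates into right-censored (duration, event) pairs that a survival-curve estimator can consume.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

def to_censored_pairs(
    installs: List[Tuple[str, date]],       # (system_id, install_date)
    first_failure: Dict[str, date],         # system_id -> date of first SW failure, if any
    data_cutoff: date,                      # last date covered by the extracted data
) -> List[Tuple[float, bool]]:
    """Return one (duration_in_years, event_observed) pair per system.

    Systems with no recorded failure are right-censored at the data cutoff:
    all we know is that they survived at least that long.
    """
    pairs = []
    for system_id, installed in installs:
        failed_on: Optional[date] = first_failure.get(system_id)
        if failed_on is not None and failed_on <= data_cutoff:
            duration = (failed_on - installed).days / 365.25
            pairs.append((duration, True))      # failure observed
        else:
            duration = (data_cutoff - installed).days / 365.25
            pairs.append((duration, False))     # censored: no failure seen so far
    return pairs
```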

Practical questions

- Can we compare quality among releases to evaluate the effectiveness of QA practices?
- Which part of the production/deployment/service life-cycle affects quality the most?
- Can quality be approximated with easy-to-obtain measures, e.g., defect density?
- Does hardware or software have more impact on quality?

Hazard function (the probability density of observing a failure conditional on the absence of earlier failures)

[Figure: estimated hazard rate as a function of runtime (years).]

- We have to adjust for runtime and separate by platform, or the MTBF will characterize the currently installed base, not release quality
- Therefore: how do we compare release quality?
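For reference, the hazard function can be written in the standard survival-analysis form (my notation, not from the slides): with failure-time density $f(t)$ and survival function $S(t) = \Pr(T > t)$,

$$
h(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t),
\qquad
S(t) = \exp\!\Big(-\int_0^t h(u)\,du\Big),
$$

so any estimate of $S(t)$ (for example the Kaplan-Meier curves on a later slide) immediately yields an estimate of the cumulative hazard.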

Interval Quality

[Figure: post-installation MR rates by release (1.1, 1.3, 2.0, 2.1, 2.2, 3.0, 3.1) for 0-1, 0-3, and 0-6 months after installation, as of the current date.]

- Interval Quality: the fraction of customers that report software failures within the first few months of installation
- Does not account for proximity to launch or platform mix
- Significant differences are marked with *
- We live or die by this measure
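A minimal sketch of how such an interval-quality figure could be computed from augmented ticket data (hypothetical field names; it also assumes each system corresponds to one customer installation, which is a simplification):

```python
from datetime import date, timedelta

def interval_quality(installs, failure_tickets, window_days=90):
    """Fraction of installations with at least one reported SW failure
    within `window_days` of the install date.

    installs:        dict system_id -> install_date
    failure_tickets: iterable of (system_id, ticket_open_date) for SW failures
    """
    failed = set()
    for system_id, opened in failure_tickets:
        installed = installs.get(system_id)
        if installed is None:
            continue                                  # not in inventory: skip
        if installed <= opened <= installed + timedelta(days=window_days):
            failed.add(system_id)
    return len(failed) / len(installs) if installs else float("nan")

# Example: three installs, one failure inside the 0-3 month window -> IQ = 1/3.
installs = {"sysA": date(2005, 1, 10), "sysB": date(2005, 2, 1), "sysC": date(2005, 3, 5)}
tickets = [("sysA", date(2005, 2, 20)), ("sysC", date(2005, 9, 1))]
print(interval_quality(installs, tickets, window_days=90))    # 0.333...
```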

Can we use easy-to-obtain defect density?

[Figure: four quantities plotted for releases r1.1-r2.2, with point labels F1, F3, DM, DL: DefPerKLOC/100, DefPerPreGaMR*10, probability of failure within 1 month, and probability of failure within 3 months.]

Anti-correlated?!

High defect density leads to satisfied customers? What does any organization strive for?

Stability = Predictability!

The rate at which customer problems get to Tier IV is almost constant despite highly varying deployment and failure rates.

[Figure: numbers of field issues and numbers of deployed systems over time (months), by release (r1.1-r2.2).]

Major versus Minor releases

- The defect density numerator is about the same as for IQ because:
  - Major releases are deployed more slowly and to fewer customers
  - For minor releases a customer is less likely to experience a fault, so they are deployed faster and to more customers
- The denominators diverge because:
  - Major releases have more code changed and fewer customers
  - Minor releases have less code changed and more customers

The two measures are contrasted in the formulas below.
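One way to read this slide, written out in my notation (the formulas paraphrase the measures from the earlier slides, with defect density per KLOC as in the DefPerKLOC plot, so details may differ from the paper):

$$
\mathrm{DD} = \frac{\#\,\text{defects reported against the release}}{\text{KLOC changed}},
\qquad
\mathrm{IQ}_k = \frac{\#\,\text{customers reporting a SW failure within } k \text{ months of install}}{\#\,\text{customers who installed the release}}.
$$

If the numerators are roughly comparable across releases, a major release divides them by a large amount of changed code but a small customer base (low DD, high IQ), while a minor release divides them by little changed code but many customers (high DD, low IQ). This is consistent with the anti-correlation seen two slides earlier.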

Hardware vs Software

[Figure: distribution of MTBF, on a log scale, across 15 platform/release combinations for HW-Low, HW-High, SW-Cold, SW-Warm, and SW-All restarts.]

Limitations:
- Durations of SW Warm, SW Cold, and HW events differ by orders of magnitude
- Warm restarts don't drop calls
- High/Critical configurations may be unaffected
- HW-High is ultra conservative
- Variability for each estimate may be high

Which part of the software production and delivery contributes most to quality?

- Development perspective: fraction of MRs removed per stage (rough arithmetic below)
  - Development: features, bugs introduced, and resolved
  - Verification: 40% of development-stage MRs (post unit-test)
  - α/β trials: 7% of development-stage MRs
  - Deployment: 5% in major and 18% in minor releases
- Customer perspective: the probability of observing a failure may drop up to 30 times in the first few months post-launch [15]
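Putting those percentages on a common scale (my arithmetic, not from the slides): if development resolves $N$ MRs before verification, then roughly

$$
\underbrace{0.40\,N}_{\text{verification}}
\;+\; \underbrace{0.07\,N}_{\alpha/\beta\ \text{trials}}
\;+\; \underbrace{0.05\,N \text{ to } 0.18\,N}_{\text{deployment}}
\;\approx\; 0.52\,N \text{ to } 0.65\,N,
$$

so the post-development stages account for roughly a third of all resolved MRs, while the up-to-30x drop in failure probability shortly after launch [15] indicates that what customers actually experience depends heavily on when, relative to launch, they deploy.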

In vivo investigation = new insights

- Methodology
  - Service support systems provide in vivo capability = new insights
  - Results become an integral part of development practices: continuous feedback on production changes/improvements
- Quality insights
  - Maintenance is the most important quality improvement activity
  - The development process view does not represent customer views
  - Software tends to be a bigger reliability issue, with a few exceptions
- Measurement hints
  - Pick the right measure for the objective: no single "quality" exists
  - Adjust for relevant factors to avoid measuring demographics
  - Bound the objective; navigate around missing, biased, and irrelevant data

Thank You.

Limitations

- Different project characteristics, including the number of customers, application domain, software size, and quality requirements, are likely to affect most of the presented values
- Many projects may not have service repositories as detailed and homogeneous as these

Methodology: Validation

- Interview a sample of individuals operating and maintaining relevant systems
- Go over recent cases the person was involved with:
  - to illustrate the practices (what is the nature of the work item, why you got it, who reviewed it)
  - to understand/validate the meaning of attribute values (when was the work done, for what purpose, by whom)
  - to gather additional data: effort spent, information exchange with other project participants
  - to add experimental/task-specific questions
- Augment data via relevant models [8, 11, 1, 12]
- Validate and clean retrieved and modeled data
- Iterate

Methodology: Existing Models

- Predicting the quality of a patch [12]
- Work coordination:
  - What parts of the code can be independently maintained [13]
  - Who are the experts to contact about any section of the code [10]
  - How to measure organizational dependencies [4]
- Effort: estimate MR effort and benchmark practices
  - What makes some changes hard [5]
  - What practices and tools work [1, 2, 3]
  - How OSS and commercial practices differ [9]
- Project models:
  - Release schedule [14]
  - Release readiness criteria [7]
  - Customer perceived quality [15, 8]

Naive reliability estimates

- Naive estimate: (calendar time × installed base) / (# software restarts)
- Naive+ estimate: (runtime of simplex systems) / (# restarts on simplex systems)
- Alarming-system estimate: (runtime of simplex systems generating alarms) / (# restarts on simplex systems)

                 Naive     Naive+    Alarming
  Systems        80000     1011      761
  Restarts       14000     32        32
  Period         .5        .25       .25
  MTBF (years)   3         7.9       5.9
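A small sketch of the arithmetic behind the table (my reconstruction: each estimate is total accumulated runtime divided by the number of restarts, which reproduces the MTBF row from the other three rows):

```python
def mtbf_years(n_systems: float, period_years: float, n_restarts: float) -> float:
    """MTBF estimate = total accumulated runtime / number of restarts."""
    return n_systems * period_years / n_restarts

# Numbers from the table on this slide.
print(round(mtbf_years(80000, 0.5, 14000), 1))   # naive:    ~2.9 years (the slide rounds to 3)
print(round(mtbf_years(1011, 0.25, 32), 1))      # naive+:   ~7.9 years
print(round(mtbf_years(761, 0.25, 32), 1))       # alarming: ~5.9 years
```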

What affects restart rates?

[Figure: Kaplan-Meier estimates of the survival curves for three platforms and two releases.]

Differences between releases are dwarfed by differences among platforms [8].
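A minimal hand-rolled Kaplan-Meier estimator, to show how the (duration, event) pairs from the earlier censoring sketch become survival curves (in practice a statistics package such as R's survival library or Python's lifelines would be used; this is only an illustration):

```python
from collections import Counter
from typing import List, Tuple

def kaplan_meier(pairs: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Return (time, S(time)) points of the Kaplan-Meier survival estimate.

    pairs: (duration, event_observed) per system; event_observed=False means censored.
    """
    deaths = Counter(t for t, observed in pairs if observed)   # failures per time point
    survival, curve = 1.0, []
    for t in sorted(set(t for t, _ in pairs)):
        at_risk = sum(1 for d, _ in pairs if d >= t)           # still under observation at t
        if deaths.get(t, 0):
            survival *= 1.0 - deaths[t] / at_risk              # product-limit step
            curve.append((t, survival))
    return curve

# Tiny example: failures at 0.2 and 0.5 years, one system censored at 1.0 years.
print(kaplan_meier([(0.2, True), (0.5, True), (1.0, False)]))
# -> [(0.2, 0.666...), (0.5, 0.333...)]
```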

Probability of observing a SW issue in the first 3 months

[Figure: probability versus time in years between launch and deployment, for three groups: medium size with no upgrades, medium size with upgrades, and large size with upgrades.]

Quality improves with time elapsed since launch and decreases with system size for new installs.

References

[1] D. Atkins, T. Ball, T. Graves, and A. Mockus. Using version control data to evaluate the impact of software tools: A case study of the version editor. IEEE Transactions on Software Engineering, 28(7):625-637, July 2002.

[2] D. Atkins, A. Mockus, and H. Siy. Measuring technology effects on software change cost. Bell Labs Technical Journal, 5(2):7-18, April-June 2000.

[3] Birgit Geppert, Audris Mockus, and Frank Rößler. Refactoring for changeability: A way to go? In Metrics 2005: 11th International Symposium on Software Metrics, Como, September 2005. IEEE CS Press.

[4] James Herbsleb and Audris Mockus. Formulation and preliminary test of an empirical theory of coordination in software engineering. In 2003 International Conference on Foundations of Software Engineering, Helsinki, Finland, October 2003. ACM Press.

[5] James D. Herbsleb, Audris Mockus, Thomas A. Finholt, and Rebecca E. Grinter. An empirical study of global software development: Distance and speed. In 23rd International Conference on Software Engineering, pages 81-90, Toronto, Canada, May 12-19, 2001.

[6] H. A. Malec. Communications reliability: a historical perspective. IEEE Transactions on Reliability, 47(3):333-345, September 1998.

[7] Audris Mockus. Analogy based prediction of work item flow in software projects: a case study. In 2003 International Symposium on Empirical Software Engineering, pages 110-119, Rome, Italy, October 2003. ACM Press.

[8] Audris Mockus. Empirical estimates of software availability of deployed systems. In 2006 International Symposium on Empirical Software Engineering, pages 222-231, Rio de Janeiro, Brazil, September 21-22, 2006. ACM Press.

[9] Audris Mockus, Roy T. Fielding, and James Herbsleb. Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3):1-38, July 2002.

[10] Audris Mockus and James Herbsleb. Expertise Browser: A quantitative approach to identifying expertise. In 2002 International Conference on Software Engineering, pages 503-512, Orlando, Florida, May 19-25, 2002. ACM Press.

[11] Audris Mockus and Lawrence G. Votta. Identifying reasons for software change using historic databases. In International Conference on Software Maintenance, pages 120-130, San Jose, California, October 11-14, 2000.

[12] Audris Mockus and David M. Weiss. Predicting risk of software changes. Bell Labs Technical Journal, 5(2):169-180, April-June 2000.

[13] Audris Mockus and David M. Weiss. Globalization by chunking: a quantitative approach. IEEE Software, 18(2):30-37, March 2001.

[14] Audris Mockus, David M. Weiss, and Ping Zhang. Understanding and predicting effort in software projects. In 2003 International Conference on Software Engineering, pages 274-284, Portland, Oregon, May 3-10, 2003. ACM Press.

[15] Audris Mockus, Ping Zhang, and Paul Li. Drivers for customer perceived software quality. In ICSE 2005, pages 225-233, St. Louis, Missouri, May 2005. ACM Press.

[16] J. D. Musa, A. Iannino, and K. Okumoto. Software Reliability. McGraw-Hill Publishing Co., 1990.

Abstract

We investigate relationships among software quality measures commonly used to assess the value of a technology, and several aspects of customer perceived quality measured by Interval Quality (IQ): a novel measure of the probability that a customer will observe a failure within a certain interval after software release. We integrate information from development and customer support systems to compare defect density measures and IQ for six releases of a major telecommunications system. We find a surprising negative relationship between the traditional defect density and IQ. Four years of use in several large telecommunication products demonstrate how a software organization can control customer perceived quality not just during development and verification, but also during deployment, by changing the release rate strategy and by increasing the resources to correct field problems rapidly. Such adaptive behavior can compensate for the variations in defect density between major and minor releases.

Audris Mockus
Avaya Labs Research
233 Mt. Airy Road, Basking Ridge, NJ 07920
ph: +1 908 696 5608, fax: +1 908 696 5402
http://mockus.org, mailto:audris@mockus.org

Audris Mockus is interested in quantifying, modeling, and improving software development. He designs data mining methods to summarize and augment software change data, interactive visualization techniques to inspect, present, and control the development process, and statistical models and optimization techniques to understand the relationships among people, organizations, and characteristics of a software product. Audris Mockus received a B.S. and an M.S. in Applied Mathematics from the Moscow Institute of Physics and Technology in 1988. He received an M.S. in 1991 and a Ph.D. in 1994 in Statistics from Carnegie Mellon University. He works in the Software Technology Research Department of Avaya Labs. Previously he worked in the Software Production Research Department of Bell Labs.