Quantitative and Qualitative Methods in Process Improvement and Product Quality Assessment

Anna Bobkowska

Abstract. Successful improvement of the development process and product quality assurance should take advantage of the complementary use of both quantitative and qualitative methods. This paper presents the experience of such integrated activities during a students' quality lab.

1. Introduction

One of the recommendations of TQM (Total Quality Management) says: Measure your success! [4] However, the things defined by a business case or by external problems, such as cancelled projects, late deliveries, exceeded budgets, and useless software, are just the tip of an iceberg of issues in the complex project reality. To make improvements effectively, a good understanding of those subtleties is needed. One can argue that quantitative methods are not sufficient for successful software quality assurance. Intentional and systematic use of qualitative methods can provide the missing part of the solution.

2. Complementary applications of quantitative and qualitative methods

Quantitative and qualitative issues are closely related, but in software engineering qualitative methods are treated rather informally. A comparison of both approaches, in terms of techniques and applications, is presented in Figure 1.

Quantitative methods. Techniques: measurement, quantification, computations. Applications: statistical results; control and management; comparison of similar products; integration with other indicators.

Qualitative methods. Techniques: comments, notes and schemas; discussions and observations; questionnaires and interviews; analysis and reasoning. Applications: problem identification and removal; theory generation and improvement; understanding of project reality; enhancement of the measurement structure; cross-case analysis; focus on details and complexity; non-technical aspects.

Figure 1. Comparison of quantitative and qualitative methods in terms of techniques and applications.
2.1. Quantitative methods

Quantitative methods use numbers. Measures are easy to take when there are countable objects, e.g. the number of persons or the number of elements of a given type on a diagram, or when there are standard devices or agreements for objectively measuring a given feature, e.g. time or cost. When those conditions are not satisfied, but a number representing a state according to a defined goal is needed, the feature must be quantified. Examples include ease of use and understandability, and cover all the cases in which it is very difficult to find objective metrics. Quantification is concerned with assigning numbers to states of reality according to appropriate rules of representation [3]. Quantification suffers from subjectivity and imprecision of the obtained data, even though they are represented by numbers.

Quantitative methods are necessary for the comparison of similar products, and for control and management. Quantitative data readily yield statistical results and can be processed automatically. Another advantage of their use is the ease of integration with systems covering other indicators. It is important to define the goals of measurement and prepare an appropriate measurement plan. This can be done with the GQM (Goal-Question-Metric) approach [2], the FCM (Factor-Criteria-Metrics) structure [3], or a Goal-Subgoal-Metrics structure [6].

In software quality assessment the defect density metric is used. Although defects are countable entities, it is not a fully objective metric, since defect classifications are made by humans. Apart from the problem of undiscovered defects, the importance of a defect can differ depending on whether it is a serious defect, a mistake, or just a misunderstanding. Another approach uses the FCM structure and performs computations to obtain numbers that represent higher-level quality indicators. This is usually a more holistic approach, which allows non-technical (e.g. human) aspects of quality to be included.
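As a concrete illustration, a severity-weighted defect density computation might be sketched as follows. This is a hypothetical example: the severity weights and defect counts are invented, since the text only notes that the importance of a defect differs between a serious defect, a mistake and a misunderstanding.

```python
# Hypothetical sketch of a severity-weighted defect density metric.
# The weights below are invented for illustration; the paper only
# observes that defect importance varies with its classification.
SEVERITY_WEIGHTS = {"serious": 1.0, "mistake": 0.5, "misunderstanding": 0.2}

def defect_density(defect_severities, size):
    """Weighted number of defects per element of product size."""
    weighted = sum(SEVERITY_WEIGHTS[s] for s in defect_severities)
    return weighted / size

# e.g. two serious defects and two misunderstandings on a 20-element diagram
density = defect_density(
    ["serious", "serious", "misunderstanding", "misunderstanding"], size=20)
```

Because the classification itself is made by humans, the resulting number inherits their subjectivity, exactly as noted above.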
The FCM approach is used and extended in the presented research.

2.2. Qualitative methods

Qualitative methods use objects other than numbers, e.g. words and pictures. The techniques include interviews, discussions, observations, comments, notes, questionnaires, and schemas. The results are usually more subjective and more difficult to process, and thus require more work during analysis, but they are richer and more informative. Qualitative methods can be applied to theory generation and improvement, problem identification and removal, and cross-case analysis [8]. They are necessary in situations where a better understanding of the reality or the system is required, and where complexity, rather than reduction and quick processing, is the value. Searching for the reasons behind things, and non-technical aspects, are good fields for them. Although they seem intuitive, their application also requires infrastructure preparation and special methods of result analysis.

Qualitative methods can be helpful in optimising a structure of measurements. On the other hand, metrics can provide assessments or indicate problems for qualitative methods to focus on. After the reasons for problems have been identified and the problems removed, quantitative methods can be used again to track the project.
3. Development process improvement

There is a relationship between development process parameters and software quality. While attempting to achieve high quality products, one should also provide an optimal development process. The optimal process, being the goal of this research, is characterised by: an appropriate schedule, in which deadlines are set to avoid late deliveries and the weekly workload is uniform over the whole semester; an infrastructure for efficient co-operation between participants with different roles in the project, with some facilitators for internal organisation in the group, e.g. task sequencing and work distribution; and product templates for each phase, defined to fit the type of the system and to avoid over-documenting.

Before the changes, the OMT method [7] without measurements and reviews was used. There were problems with late deliveries, poor quality products, and students complaining about the workload, but there was no basis on which to verify the workload and find the reasons for the problems. So the development process was replaced with a more controlled process, described briefly in section 3.1, and then optimised using a combination of quantitative and qualitative methods over the three following years.

3.1. Making a difference

Any set of improvement activities must be based on a development process definition. The software development process, in this case, was tailored for Internet applications and covered the phases of requirements specification, object-oriented analysis, design, implementation and testing. The products of the analysis and design were documented with UML [5] and structured text. The implementation technology was a design decision. Additionally, the method of playing roles by groups of project participants [1] was introduced, with the following roles: Developer, responsible for the project and the delivery of products; Customer, responsible for the problem domain; and Quality expert, responsible for quality assurance.
It was assumed that the groups playing the roles co-operate in partnership. The process included reviews after each phase, and a technical meeting of all participants after the requirements specification review phase. A measurement plan was established. The collected data contained information about: the time of work spent on project activities by the participants, late deliveries, and evaluations of the quality of the products delivered by the developer (self-evaluation), by the quality expert, and by the teacher. Apart from that, the participants described the way of work distribution, internal communication, and the main problems connected with the phase. Those comments, as well as observations and discussions, were used to find problems and look for solutions.

The introduction of measurements and comments freed the author from relying on opinions and beliefs, and allowed reliance on facts. Metrics easily verified concepts, for example the real time of work, the quality of products, and the scale of late deliveries (whether those were problems of all groups or just exceptions). On the other hand, the descriptions gave an understanding of the activities performed by the students and of their problems. A lot of problems were connected with poor organisation or misunderstandings and had simple solutions, but there were also serious problems that needed more advanced activities. In the task of searching for suggestions that facilitate organisation, or for explanations that allow project issues to be better understood, metrics and descriptions did complementary work.
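The kind of per-phase record implied by this measurement plan can be sketched as below. The field names are hypothetical, chosen only to mirror the quantitative and qualitative data listed above; the lab itself used questionnaires, not code.

```python
# Hypothetical per-phase record combining quantitative fields (time of
# work, late deliveries, quality evaluations on a 1..5 scale) with
# qualitative free-text fields (work distribution, communication, problems).
from dataclasses import dataclass, field

@dataclass
class PhaseRecord:
    phase: str
    hours_of_work: float          # time spent on project activities
    weeks_late: int               # scale of late delivery, 0 if on time
    evaluations: dict             # e.g. {"developer": 4, "expert": 3, "teacher": 4}
    comments: list = field(default_factory=list)  # qualitative descriptions

rec = PhaseRecord("analysis", hours_of_work=23.8, weeks_late=0,
                  evaluations={"developer": 4, "expert": 3, "teacher": 4},
                  comments=["work split by use case", "unclear requirements"])
```

Keeping both kinds of data in one record is what lets a metric (e.g. weeks_late) point at a problem while the comments explain it.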
3.2. Process optimisation

The optimisation activity aimed to change process parameters in those places where problems occurred. The collected data enabled tracking the results of the new development process. The number of weeks set for the development tasks (#weeks), the average time of work per group (ATW) and the deviation (d), which indicates how the results differ among the groups, are presented in Table 1. Symbols such as *1, *2 mark different development tasks carried out during the same period of time, e.g. in the first iteration there was a design phase that covered both system design and component design. Data are taken from the four best projects of each iteration.

Table 1. Number of weeks for a development task (#weeks), average time of work (ATW) and deviation (d) in the iterations.

Development task           | iter.1             | iter.2             | iter.3
                           | #weeks  ATW    d   | #weeks  ATW    d   | #weeks  ATW    d
Vision                     |                    |                    | 1       10.3   2.1
Requirements specification | 1       15.6   4.7 | 1       15.8   6.3 | 2 *3    28.5  14.7
Review                     | 1        6.7   1.45|                    |
Analysis                   | 3       23.8   6.3 | 2       32.6  10.4 | 2 *3
Review                     | 1       12     2   | 1        8.3   1.4 | 1       11     3.5
System design              | 3 *1    24.9  14.5 | 2 *2    24.5   8.8 | 2 *4    17     7
User interface prototype   |                    | 2 *2               | 2 *4
Review                     |                    | 1        6.6   1.4 | 1        7.7   2
Component design           | 3 *1               | 2       37.5  14.3 | 2 **    27.3   8.8
Review                     | 1       10.3   3.1 | 1        4.2   1.1 | 1        5.6   1.2
Implementation             | 3       52.3  19   | 3      119.4  33.9 | 3 **    22.2   8.8
Tests                      | 1        8.2   4   | 1        6     1.6 | 1        7.9   1.6
Total                      | 14     161.8  29.9 | 14     251.8  55.1 | 14     133.7  24.3

Time for review was not limited. The quality expert's task was to find as many defects as possible and then make evaluations and quality predictions. Just looking through the documentation, one could see it was more precise and reader-friendly than before the change. Late deliveries (except for the design phase) also did not occur across the student groups. However, some problems appeared. The results of the revision of the first iteration, and the improvement activities undertaken to avoid the problems in the next iteration, are presented in Table 2.

Table 2.
Problems collected in the first iteration and improvements in the second iteration

Problem: Large time of work, especially in the implementation phase (this correlates with the measured time of work).
Improvements: Establish groups of 3 persons instead of 2. Support the internal organisation of the group by delivering suggestions for task sequencing and work distribution. Simplify the questionnaires.

Problem: Late deliveries in the design phase (each group was late 1 or 2 weeks).
Improvement: Split design (as one task lasting a long time) into two: system design and component design.

Problem: Medium level of customer satisfaction at the end of the project.
Improvement: Introduce a user interface prototyping task with a review made by the customer.
The improvements gave satisfactory results. In the second iteration the customer satisfaction level was higher, and there were no late deliveries of system design, but there were still problems with component design. It was possible to decrease the time of work (see the large deviation in the second iteration); the average results are so large because of a very good group that worked a lot, but also delivered excellent products. The list of problems found in the second iteration and the improvement activities for the next iteration are presented in Table 3.

Table 3. Problems collected in the second iteration and improvements in the third iteration

Problem: Is there a possibility to make the process more efficient? Still a large time of work was reported.
Improvement: Simplify the Requirements Specification document: the vision of the system is made during the first week, and non-functional requirements selected to best fit this type of system are delivered together with the analysis phase.

Problem: Late deliveries in the component design phase.
Improvement: Introduce two phases of component design together with component prototyping, implementation and the writing of technical documentation (the second one also includes system integration), instead of the component design phase and implementation phase.

Problem: Different understanding of the system by the quality expert, the customer and the developer after the system design review and the user interface prototype review.
Improvement: Introduce a meeting of all participants after the system design review and the user interface prototype review, to reach a compromise on the design and implementation.

After those improvements there were no late deliveries, the time of work decreased, and the level of customer satisfaction was good. This optimal process enabled observations on human factors, e.g. the adequacy of assigning persons with certain competencies and personality to the development tasks.
The presented work describes the activities taken to achieve the optimal process: no late deliveries, a unified time of work across the whole semester, a good infrastructure supporting the development process, and optimal product templates. Each iteration allowed more detailed problems to be removed. Removal of the most visible problems resulted in the appearance of more hidden ones; e.g. the different views of the expert and the customer could not have been found without the review of the user interface prototype, and without the split into system and component design it would have been difficult to locate the problem in component design. It is also interesting that problems indicated by measurements usually have their explanations, e.g. late deliveries in component design are caused by poor knowledge of the implementation environment and the difficulty of imagining how the system works.

4. Product quality assessment

Inspections and tests are the main product-focused quality assurance techniques. Inspections can be applied to the documentation in early phases to avoid defect propagation and thus minimise the time and cost of improvements. In the final version of this work, reviews of the analysis and requirements specification, the system design, and the detailed design with component implementation were made by the quality experts, and the user interface prototype was verified by the customer. Questionnaire templates with metrics and comment collection as well as
reasoning about quality were designed to support the reviews. Tests were made by both the quality experts and the customers at the end of the project.

4.1. Questionnaires for reviews

The first activity in product quality assurance, and especially in review design, is quality definition. The FCM structure was used for this purpose, and for this type of application the desired quality was defined with the following factors: Functionality, with criteria of completeness, adequacy and coherency; Satisfaction, verified by checking ease of use, productivity and general aesthetics; Maintainability, decomposed into documentation quality, portability and modifiability; and Dependability, in the aspects of security and error tolerance. It was assumed that the quality metrics assigned to criteria and factors were on a scale of [1..5], where 1 meant very low quality and 5 very high. In order to obtain those quality metrics, data collection based on the following elements was used:

Diagram metrics - these covered counting the elements on the diagrams, and allowed those numbers to be compared with expected values, which could be derived from historical data about similar projects. Those expected values were not available for all projects, so they had to be estimated intuitively. Diagram metrics were also a basis for local size calculation, which was used for defect metric calculation. There was a possibility to comment on those values. This is a potential place for automation but, in the author's opinion, this kind of calculation brings very uncertain information. Diagram metrics are objective, but quite a lot of subjectivity is associated with the application of the expected values.

Evaluations with comments - evaluations are numbers that represent subjective impressions of some aspects of the work, e.g. ease of understanding, aesthetics, precision, etc. Sometimes they are more efficient than metrics; e.g.
instead of counting all the attributes and comparing them with fuzzy expected values, one can evaluate whether (always, usually, ..., never) the attributes are complete and adequate, and thus whether the description is detailed enough (or not). They could be given on the [1..5] scale described above, or expressed as comments.

Defect collection according to the defect classification - these defects can then be counted and combined with the diagram metrics. Such metrics are good quality indicators.

The templates were prepared to integrate quality issues with documentation fragments, in order to focus the reviewer's attention on a limited fragment of documentation, e.g. one type of diagram with its descriptions, together with the relevant metrics, evaluations and potential defects. Those elements were then integrated into quality metrics representing the factors, and recommendations for improvement were generated. During the lab, intuitive reasoning supported by questionnaires was used, but in this part quantitative calculations, with formulas that model quality relationships mathematically, could be applied to partially automate the process.

To summarise the use of qualitative and quantitative methods: comments were used during the project for data collection, for reasoning about quality factors and criteria, and for the generation of recommendations for improvement by the quality experts. Questionnaire analysis and optimisation over the iterations was also possible thanks to those comments. Metrics of diagrams and numbers of defects were used for the calculation of the defect metrics. Quantified values of criteria and factors allowed products to be compared and gave an idea of the time needed to make corrections.
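How the criterion-level numbers might be rolled up into factor-level quality metrics can be sketched as below. The simple average is an assumed aggregation rule, not the procedure actually used during the lab, which relied on intuitive, questionnaire-supported reasoning.

```python
# Hypothetical FCM roll-up: criterion scores on the [1..5] scale are
# averaged into a factor score. The averaging rule is an assumption;
# the invented scores below stand in for a reviewer's evaluations.
def factor_score(criteria_scores):
    """Aggregate criterion scores (dict of name -> 1..5) into one factor score."""
    return round(sum(criteria_scores.values()) / len(criteria_scores), 1)

functionality = {"completeness": 4, "adequacy": 3, "coherency": 5}
score = factor_score(functionality)  # (4 + 3 + 5) / 3 = 4.0
```

A weighted average, or a rule penalising any criterion below a threshold, would be equally plausible; the point is only that the roll-up can be made explicit and repeatable.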
4.2. Results

One of the constraints of this research is that the participants are students, not real customers, developers and quality experts. In order to obtain the most reliable results possible, the ten best projects are analysed. In Table 4, the metrics of diagrams and defects are presented, together with the calculated average value (AV) and deviation (d). Since the analysis consists of the use case model, the class diagram and the sequence diagrams with descriptions, the metrics and defects concern these artefacts. For the use case diagram, there are the following metrics: number of use cases (# use cases), number of actors (# actors), number of elements including use cases, actors and relationships (# elements), number of elements incoherent with the Software Requirements Specification (# incoherent), number of inadequate elements (# inadequate), which includes missing elements, wrong-scope elements and unneeded elements, and number of elements difficult to understand (# dif. to understand), which includes ambiguous and surprising elements and statements.

Table 4. Metrics of diagrams and defects of the analysis phase for ten projects (P1-P10)

Description               P1    P2    P3    P4    P5    P6    P7    P8    P9    P10   AV     d
# use cases               8     11    13    5     6     14    14    12    17    9     10.9   3.12
# actors                  2     2     3     4     1     4     2     3     2     3     2.60   0.80
# elements                20    30    37    16    17    38    36    31    36    34    29.5   7.10
# incoherent              2     7     7     6     3     8     1     3     2     2     4.10   2.32
# inadequate              3     8     4     1     2     9     3     2     2     6     4.00   2.20
# dif. to understand      2     7     3     2     3     4     1     3     3     0     2.80   1.24
% inadequate              15%   27%   11%   6%    12%   24%   8%    6%    6%    18%   13%    6%
% dif. to understand      10%   23%   8%    13%   18%   11%   3%    10%   8%    0%    10%    5%
# classes                 13    11    4     14    10    24    16    13    9     9     12.3   3.70
Size of the class diagr.  30    24    8     35    19    48    41    24    19    17    26.5   9.60
# imprecision             3     8     4     3     4     4     1     4     3     1     3.50   1.30
Precision evaluation      3     3     3.5   5     4     4     3.5   4     3.5   3     3.65   0.48
# inadequate              2     6     5     2     2     6     4     1     2     3     3.30   1.56
# dif. to understand      2     4     2     0     2     2     1     1     0     2     1.60   0.88
% imprecision             10%   33%   50%   9%    21%   8%    2%    17%   16%   6%    17%    11%
% inadequate              7%    25%   63%   6%    11%   13%   10%   4%    11%   18%   17%    11%
% dif. to understand      7%    17%   25%   0%    11%   4%    2%    4%    0%    12%   8%     6%
# sequence diagr.         8     11    8     5     2     14    10    11    11    1     8.10   3.30
Size (# interactions)     48    49    134   36    6     84    37    20    33    17    46.4   25.9
# incorrect interactions  4     3     3     0     2     3     3     2     2     2     2.40   0.80
# m_exceptions            x     3     5     1     4     4     2     x     x     x     x      x
% incorrect interactions  8%    6%    2%    0%    33%   4%    8%    10%   6%    12%   9%     6%
# inco A-SRS              3     4     3     0     2     1     0     2     0     0     1.50   1.30
# inco CD-UCD             0     3     5     0     1     0     0     0     2     3     1.40   1.48
# inco CD-SD              0     3     3     1     0     0     2     1     0     1     1.10   0.94
# inco UCD-SD             2     6     6     1     4     1     3     2     6     5     3.60   1.80
The defect metrics are calculated by dividing the number of inadequate elements by the number of elements (% inadequate), and the number of elements difficult to understand by the number of elements (% dif. to understand); the number of elements stands here for the size of the use case diagram. Similar metrics are collected for the class diagram, but in this case precision is also important and difficult to capture with metrics, so apart from discovering imprecise elements (# imprecision) there is a precision evaluation on the [1..5] scale. The size of the class diagram is defined here as the sum of classes and relationships. For the sequence diagrams there are the following metrics: the number of sequence diagrams (# sequence diagr.), the size counted as the number of interactions, and missing exceptions (# m_exceptions), where x is used when many exceptions are missing. Finally, the coherency between the diagrams is checked: incoherence between the analysis diagrams and the requirements specification (# inco A-SRS), between the class diagram and the use case diagram (# inco CD-UCD), between the class diagram and the sequence diagrams (# inco CD-SD), and between the use case diagram and the sequence diagrams (# inco UCD-SD).

In Table 5, quality metrics for the same projects are presented; they result from reasoning on the basis of the metrics described above, together with the evaluations and comments.

Table 5. Quality metrics for the criteria in ten projects

Description    P1    P2   P3   P4   P5   P6   P7    P8    P9   P10  AV
Completeness   3     3    3    3    5    4    5     4     5    4    3.9
Adequacy       3.5   2    4    4    4    3    4.5   3     3    3    3.4
Precision      3.5   3    3    5    4    4    3.5   4.5   4    2    3.7
Functionality  4     3    4    5    4    5    4     4     4    3    4.0

The questionnaires support finding defects and help to integrate information taken from different sources. For more precise reasoning it would be useful to introduce some standard rules.
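The calculations behind Table 4 can be reproduced directly from its published rows, as the sketch below shows. Note that the deviation d matches the mean absolute deviation (not the standard deviation) for the rows checked, so that is the formula used here.

```python
# Reproducing the Table 4 statistics from its "# classes" and
# use-case-diagram rows. AV is the arithmetic mean; d is taken to be
# the mean absolute deviation, which matches the published values
# (AV = 12.3, d = 3.70 for the "# classes" row).
def mean(values):
    return sum(values) / len(values)

def mean_abs_deviation(values):
    avg = mean(values)
    return sum(abs(v - avg) for v in values) / len(values)

# "# classes" row for projects P1..P10 (Table 4)
classes = [13, 11, 4, 14, 10, 24, 16, 13, 9, 9]
print(round(mean(classes), 1), round(mean_abs_deviation(classes), 2))  # 12.3 3.7

# Defect metric: % inadequate = # inadequate / # elements (use case diagram)
inadequate = [3, 8, 4, 1, 2, 9, 3, 2, 2, 6]
elements = [20, 30, 37, 16, 17, 38, 36, 31, 36, 34]
pct_inadequate = [round(100 * i / e) for i, e in zip(inadequate, elements)]
print(pct_inadequate)  # [15, 27, 11, 6, 12, 24, 8, 6, 6, 18]
```

Both outputs agree with the corresponding Table 4 rows, which gives some confidence in the reconstruction of how d was computed.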
Concrete defects are closely connected with the application domain and are used during the project to improve quality, so considerations other than the classified defect metrics make no sense in this analysis. Statistics of the diagram metrics can be applied to gather expected values. Let us assume that the average value is the expected value, and that the deviation sets the range of accepted values. In this case, the range of expected values for the number of classes is [8..16], and it is possible to find anomalies automatically: project P3 has too few classes (4), and project P6 has too many (24).

5. Conclusions

This paper presents the experience of process improvement and product quality assessment in the early phases of software development with the use of quantitative and qualitative methods. The defined process and the data collection plan were the basis for making improvements. They made it possible to rely on facts, not opinions, during revisions. Quantitative methods were used to verify concepts, and qualitative ones for problem identification and removal and for deeper analysis of project issues. After the change, which introduced the new method with reviews and measurements, the development process was optimised over three iterations. Every iteration allowed more and more detailed problems to be identified and removed. In product quality assessment, qualitative and quantitative methods were used by the quality experts for data collection, for reasoning about quality factors and criteria, and for the generation of recommendations for improvement. They were also used in questionnaire optimisation.
6. References

[1] Bobkowska A., Training on High Quality Software Development with the 3 Roles Playing Method, in: SCI'98/ISAS'98 Conference Proceedings, Orlando, USA, July 1998.
[2] Briand L. C., Differding C. M., Rombach D. H., Practical Guidelines for Measurement-Based Process Improvement, Technical Report of the International Software Engineering Research Network (ISERN-96-05), 1996.
[3] Fenton N. E., Software Metrics: A Rigorous Approach, Chapman & Hall, 1993.
[4] Grudowski P., Kolman R., Meller A., Preihs J., Zarządzanie jakością (Quality Management), Wydawnictwo Politechniki Gdańskiej, Gdańsk, 1996.
[5] Rational Software Corporation, UML Notation Guide, www.rational.com
[6] Rational Software Corporation, Rational Unified Process, www.rational.com
[7] Rumbaugh J., Blaha M., Premerlani W., Eddy F., Lorensen W., Object-Oriented Modeling and Design, Prentice-Hall, 1991.
[8] Seaman C. B., Qualitative Methods in Empirical Studies of Software Engineering, IEEE Transactions on Software Engineering, vol. 25, 1999, pp. 557-572.