MTAT.03.159: Software Testing Lecture 07: Tools, Metrics and Test Process Improvement / TMMi (Textbook Ch. 14, 9, 16) Spring 2013 Dietmar Pfahl email: dietmar.pfahl@ut.ee
Structure of Lecture 07 Test Tools Test Measurement Test Process Improvement SWT Exam
Tools: the Workbench. Good at repeating tasks; good at organising data; requires training; introduced incrementally; no silver bullet. Evaluation criteria: ease of use, power, robustness, functionality, ease of insertion, quality of support, cost, company policies and goals.
Test Tools in the Process (V-model). Development phases: requirement specification, architectural design, detailed design, code. Test levels: unit test, integration test, system test, acceptance test. Tools used along the way: test design tools, test management tools, static analysis tools, dynamic analysis tools, coverage tools, debugging tools, test execution and comparison tools, performance simulator tools.
Test Tools by Test Maturity. TMM level 1: debuggers, configuration builders, LOC counters. TMM level 2: test/project planners, run-time error checkers, test preparation tools, coverage analyzers, cross-reference tools. TMM level 3: configuration management, requirements tools, capture-replay tools, comparators, defect trackers, complexity measurers, load generators. TMM level 4: code checkers, code comprehension tools, test harness generators, performance/network analyzers, simulators/emulators, test management tools. TMM level 5: test library tools, advanced
There is no shortage of test tools (counts from http://www.testingfaqs.org/): Defect Tracking (98), GUI Test Drivers (71), Load and Performance (52), Static Analysis (38), Test Coverage (22), Test Design Tools (24), Test Drivers (17), Test Implementation (35; tools that assist with testing at runtime: memory leak checkers, comparators, and a wide variety of others), Test Case Management (24), Unit Test Tools (63), plus 3 categories of other tools. Other links to test tool overviews: http://www.aptest.com/resources.html http://www.softwareqatest.com/qatweb1.html
Test data generator: Generatedata.com, an online service for generating test data.
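A minimal local equivalent of such a generator can be sketched in a few lines. The field names and value pools below are invented for illustration, not taken from generatedata.com; a fixed seed keeps the generated data reproducible:

```python
import random
import string

def generate_rows(n, seed=0):
    """Generate n rows of (first, last, account) test data.
    Field names and value pools are illustrative only."""
    rng = random.Random(seed)  # fixed seed -> reproducible test data
    firsts = ["Pekka", "Teemu", "Anna", "Mari"]
    lasts = ["Pukaro", "Tekno", "Kask", "Tamm"]
    rows = []
    for _ in range(n):
        # 7-digit account number, mimicking the example data in these slides
        account = "".join(rng.choice(string.digits) for _ in range(7))
        rows.append((rng.choice(firsts), rng.choice(lasts), account))
    return rows
```

Reproducibility matters in test data generation: the same seed must yield the same data set, so a failing test can be re-run with identical inputs.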
[Screenshot: output generated from the input fields on the previous slide]
Evolution of System Testing approaches: 1. Recorded Scripts, 2. Engineered Scripts, 3. Data-driven Testing (example data, First/Last/Data: Pekka Pukaro 1244515; Teemu Tekno 587245), 4. Keyword-driven Testing, 5. Model-based Testing.
Recorded Scripts: unstructured scripts generated using capture-and-replay tools. Relatively quick to set up; mostly used for regression testing. In practice the scripts are non-maintainable: if the system changes, they need to be captured again. Capture-replay tools record the user's actions (keyboard, mouse) to a script in a tool-specific scripting language. The scripts access the (user) interface of the software: input fields, buttons and other widgets. Simple checks can be created in the scripts: existence of texts and objects in the UI, data of GUI objects.
Engineered Scripts: scripts are well-designed (following a systematic approach), modular, robust, documented, and maintainable. Common tasks are separated out, e.g. setup, cleanup/teardown, and defect detection. Test data is still embedded in the scripts, with one driver script per test case. The code is mostly written manually, so implementation and maintenance require programming skills which testers (test engineers) might not have; it is just like any other software development project.
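An engineered script can be sketched in Python's unittest style, with setup and teardown separated from the test logic. The CustomerDb class is an invented stand-in for the system under test:

```python
import unittest

class CustomerDb:
    """Toy stand-in for the system under test (illustrative only)."""
    def __init__(self):
        self.customers = {}
    def add(self, name, account):
        self.customers[name] = account
    def remove(self, name):
        del self.customers[name]

class CustomerTests(unittest.TestCase):
    def setUp(self):
        # common setup, separated from the individual test cases
        self.db = CustomerDb()
        self.db.add("Pekka Pukaro", "1244515")
    def tearDown(self):
        # common cleanup/teardown
        self.db = None
    def test_add(self):
        self.db.add("Teemu Tekno", "587245")
        self.assertEqual(self.db.customers["Teemu Tekno"], "587245")
    def test_remove(self):
        self.db.remove("Pekka Pukaro")
        self.assertNotIn("Pekka Pukaro", self.db.customers)
```

Note that the test data ("Pekka Pukaro", "587245", ...) is still embedded in the code, which is exactly the limitation that data-driven testing addresses next.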
Data-Driven Testing: test inputs and expected outcomes stored as data, normally in a tabular format (e.g. First/Last/Data: Pekka Pukaro 1244515; Teemu Tekno 587245). Test data are read from an external data source; one driver script can execute all of the designed test cases, and the external test data can be edited without programming skills. Test design and framework implementation are now separate tasks: the former can be given to someone with domain knowledge (business people, customers) and the latter to someone with programming skills. This avoids the problems of embedded test data: data are hard to understand in the middle of all the scripting details, and updating tests or creating similar tests with slightly different test data always requires programming, which leads to copy-paste scripting.
Data-Driven Testing
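The idea can be sketched as one driver that runs every row of an external data table. The data file is inlined here to stay self-contained, and the lookup function is an invented stand-in for the system under test:

```python
import csv
import io

# External test data in tabular format (inlined for a self-contained sketch;
# in practice this would live in a CSV or spreadsheet file)
TEST_DATA = """first,last,expected_id
Pekka,Pukaro,1244515
Teemu,Tekno,587245
"""

def lookup_id(first, last):
    """Toy stand-in for the system under test (illustrative only)."""
    ids = {("Pekka", "Pukaro"): 1244515, ("Teemu", "Tekno"): 587245}
    return ids[(first, last)]

def run_data_driven_tests(data):
    """One driver script executes every row of the test data."""
    results = []
    for row in csv.DictReader(io.StringIO(data)):
        actual = lookup_id(row["first"], row["last"])
        results.append(actual == int(row["expected_id"]))
    return results  # one pass/fail verdict per data row
```

Adding a new test case now means adding a row to the table, with no change to the driver code.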
Keyword-Driven Testing. Keywords are also known as action words. Keyword-driven testing improves on data-driven testing: keywords abstract the navigation and actions out of the scripts, and keywords and test data are read from an external data source. When test cases are executed, the keywords are interpreted by a test library which is called by a test automation framework; the test library = the test scripts. Example: Login: admin, t5t56y; AddCustomers: newcustomers.txt; RemoveCustomer: Pekka Pukaro. More keywords (= action words) can be defined based on existing keywords. Keyword-driven testing ~= domain-specific languages (DSL). Details: http://doc.froglogic.com/squish/4.1/all/how.to.do.keyword.driven.testing.html Another tool: http://code.google.com/p/robotframework/
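A minimal interpreter sketch follows. The keyword names, test library, and test steps are invented for illustration; real frameworks such as Robot Framework read the keywords and data from external tables instead of Python tuples:

```python
class TestLibrary:
    """Invented test library: keyword implementations for a toy system."""
    def __init__(self):
        self.customers = set()
        self.logged_in = False
    def login(self, user, password):
        self.logged_in = (user == "admin" and password == "t5t56y")
    def add_customer(self, name):
        self.customers.add(name)
    def remove_customer(self, name):
        self.customers.discard(name)

# Mapping from keyword (action word) to test library method
KEYWORDS = {
    "Login": "login",
    "AddCustomer": "add_customer",
    "RemoveCustomer": "remove_customer",
}

def run_keyword_test(steps):
    """Framework core: interpret (keyword, args) steps via the library."""
    lib = TestLibrary()
    for keyword, args in steps:
        getattr(lib, KEYWORDS[keyword])(*args)
    return lib
```

Because the test case is just a sequence of keywords and arguments, it can be written and maintained by someone with domain knowledge rather than programming skills.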
Architecture of a Keyword-Driven Framework. Pekka Laukkanen: Data-Driven and Keyword-Driven Test Automation Frameworks. Master's Thesis, Helsinki University of Technology, 2006.
Model-based Testing
Model-based Testing. The system under test is modelled, e.g. with UML state machines or domain-specific languages (DSL), and test cases are automatically generated from the model. The model can also provide the expected results for the generated test cases; a more accurate model -> better test cases. A large number of tests can be generated to cover the model, with many different criteria for covering it; the execution time of the test cases might be a factor. Challenges: personnel competencies; data-intensive systems (which cannot be modelled as a state machine). Simple MBT tool: http://graphwalker.org/
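Test generation from a state-machine model can be sketched briefly. The login model below is invented, and the coverage criterion shown is transition coverage (one generated test per model transition), one of the many possible criteria mentioned above:

```python
from collections import deque

# Invented login model: (state, event) -> next state
MODEL = {
    ("logged_out", "login_ok"): "logged_in",
    ("logged_out", "login_fail"): "logged_out",
    ("logged_in", "logout"): "logged_out",
}

def path_to(model, start, target):
    """BFS for a shortest event sequence from start to target state."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for (src, event), dest in model.items():
            if src == state and dest not in seen:
                seen.add(dest)
                queue.append((dest, path + [event]))
    return None

def generate_tests(model, start):
    """One test case per transition (transition-coverage criterion):
    reach the transition's source state, then fire its event."""
    return [path_to(model, start, src) + [event] for (src, event) in model]
```

Each generated test is an event sequence to replay against the real system; the model's next-state function supplies the expected state after each event, i.e. the oracle.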
Evolution of System Testing approaches: 1. Recorded Scripts (cheap to set up, quick & dirty), 2. Engineered Scripts (structured), 3. Data-driven Testing (data separation), 4. Keyword-driven Testing (action separation, DSL), 5. Model-based Testing (modeling & automatic test case generation).
Automation and Oracles. Automated testing depends on the ability to detect automatically (via a program) when the software fails. An automated test is not equivalent to a similar manual test: automatic comparison is typically more precise, but it will also be tripped by irrelevant discrepancies. A skilled human comparer samples a wider range of dimensions, noting oddities that one wouldn't program the computer to detect. "Our ability to automate testing is fundamentally constrained by our ability to create and use oracles." (Cem Kaner)
Types of outcome to compare. Screen-based: character-based applications (correct message, display attributes, displayed correctly); GUI applications (GUI components and their attributes; graphical images: avoid bitmap comparisons). Disk-based: comparing text files; comparing non-textual forms of data; comparing databases and binary files. Others: multimedia applications (sounds, video clips, animated pictures); communicating applications. Simple vs. complex comparison.
Test case sensitivity in comparisons: robust tests and sensitive tests differ in susceptibility to change, implementation effort, missed defects, failure analysis effort, and storage space. A sensitive test case compares many elements and is likely to notice that something breaks; however, it is also more sensitive to change and causes rework in test automation. A robust test checks less and is more change-resilient, but also misses potential defects. Striking a balance is the challenge. Redrawn from Fewster and Graham, Software Test Automation, 1999.
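The trade-off can be illustrated with two comparison functions. This is a sketch: the normalisation rules (masking clock times, collapsing whitespace) are invented examples of "irrelevant discrepancies", not a general recipe:

```python
import re

def sensitive_compare(actual, expected):
    """Sensitive test: literal comparison notices every change,
    including irrelevant ones such as timestamps."""
    return actual == expected

def robust_compare(actual, expected):
    """Robust test: mask volatile details before comparing
    (the normalisation rules here are illustrative)."""
    def norm(text):
        text = re.sub(r"\d{2}:\d{2}:\d{2}", "<TIME>", text)  # mask clock times
        return " ".join(text.split())  # collapse whitespace differences
    return norm(actual) == norm(expected)
```

The robust version never fails on a changed timestamp, but for the same reason it would also miss a genuine defect in the timestamp logic: exactly the balance the slide describes.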
Effect of automation on the goodness of a test case. Goodness has four attributes: effective, exemplary, economic, evolvable. Compare a manual test, the first run of automated tests, and an automated test after many runs: the evolvability (maintainability) of an automated test does not change, but its economy improves with repeated runs, increasing the overall goodness of the test case. Redrawn from Fewster and Graham, Software Test Automation, 1999.
Scope: Automating different steps. Manual process (automated tests): select/identify test cases to run; set up test environment (create test environment, load test data); repeat for each test case: set up test prerequisites, execute, compare results, log results, analyze test failures, report defect(s), clear up after test case; clear up test environment (delete unwanted data, save important data); summarize results. Automated process (automated testing): select/identify test cases to run; set up test environment (create test environment, load test data); repeat for each test case: set up test prerequisites, execute, compare results, log results, clear up after test case; clear up test environment (delete unwanted data, save important data); summarize results; analyze test failures; report defects. Redrawn from Fewster et al., Software Test Automation, 1999.
Relationship of testing activities over time: edit tests (maintenance), set up, execute, analyze failures, clear up. Compared for manual testing, the same tests automated, and more mature automation. Redrawn from Fewster et al., Software Test Automation, 1999.
Test automation promises 1. Efficient regression test 2. Run tests more often 3. Perform difficult tests (e.g. load, outcome check) 4. Better use of resources 5. Consistency and repeatability 6. Reuse of tests 7. Earlier time to market 8. Increased confidence
Common problems: 1. Unrealistic expectations. 2. Poor testing practice (automating chaos just gives faster chaos). 3. Expected effectiveness. 4. False sense of security. 5. Maintenance of automated tests. 6. Technical problems (e.g. interoperability). 7. Organizational issues.
What can be automated? Steps: 1. Identify, 2. Design (intellectual, performed once); 3. Build; 4. Execute, 5. Check (clerical, repeated).
Limits of automated testing. It does not replace manual testing: manual tests find more defects than automated tests. It does not improve test effectiveness, and it puts greater reliance on the quality of the tests. The oracle problem remains. Test automation may limit the software development, and maintaining automated tests costs effort.
What to automate first? The most important tests; a set of breadth tests (sampling each system area overall); tests for the most important functions; tests that are easiest to automate; tests that will give the quickest payback; tests that are run most often.
Structure of Lecture 07 Test Tools Test Measurement Test Process Improvement SWT Exam
Test Management Monitoring (or tracking) Check status Reports Metrics Controlling Corrective actions
Purpose of Measurement Test monitoring check the status Test controlling corrective actions Plan new testing Measure and analyze results The benefit/profit of testing The cost of testing The quality of testing The quality of the product Basis of improvement, not only for the test process
Cost of Testing. How much does testing cost? As much as the resources we have! Ericsson's mobile phones did not do too well because the company did too much testing instead of getting the product to market.
Test Monitoring. Status: coverage metrics; test case metrics (development and execution); test harness development. Efficiency/cost metrics: How much time have we spent? How much money/effort have we spent? Failure/fault metrics: How much have we accomplished? What is the quality status of the software? Effectiveness metrics: How effective are the testing techniques in detecting defects? These metrics feed estimation, cost tracking, and the decision whether to stop.
Selecting the right metrics. What is the purpose of the collected data? What kinds of questions can they answer? Who will use the data, and how? When and by whom are the data needed? Which forms and tools are used to collect them? Who will collect them? Who will analyse the data? Who has access to the data?
Goal-Question-Metric Paradigm (GQM) Goals What is the organization trying to achieve? The objective of process improvement is to satisfy these goals Questions Questions about areas of uncertainty related to the goals You need process knowledge to derive the questions Metrics Measurements to be collected to answer the questions Goal example: Analyze <object(s) of study> the detection of design faults using inspection and testing for the purpose of <purpose> evaluation with respect to their <quality focus> effectiveness and efficiency from the point of view of the <perspective> managers in the context of <context> developers, and in a real application domain [van Solingen, Berghout, The Goal/Question/Metric Method, McGraw-Hill, 1999]
Measurement Basics Basic data: Time and Effort (calendar- and staff-hours) Failures / Faults Size / Functionality Basic rule: Feedback to origin Use data or don t measure
Test metrics: Coverage What? % statements covered % branches covered % data flow % requirements % equivalence classes Why? Track completeness of test
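The computation behind any of these coverage percentages is the same ratio; a minimal sketch for statement coverage follows. Real tools (e.g. coverage analyzers such as coverage.py or gcov) instrument the program to collect the executed-line set, which is simply assumed as input here:

```python
def statement_coverage(executed_lines, all_lines):
    """% statements covered = covered statements / total statements * 100.
    Inputs are sets of statement identifiers, assumed to come from an
    instrumentation tool."""
    covered = len(set(executed_lines) & set(all_lines))
    return 100.0 * covered / len(all_lines)
```

Branch, data-flow, requirements, and equivalence-class coverage follow the same pattern with different items being counted.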
Test metrics: Development status Test case development status Planned Available Unplanned (not planned for, but needed) Test harness development status Planned Available Unplanned
Test metrics: Test execution status What? # faults/hour # executed tests Requirements coverage Why? Track progress of test project Decide stopping criteria
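The progress-tracking part can be reduced to two simple ratios; the metric definitions below are illustrative, not from the textbook:

```python
def execution_status(planned, executed, passed):
    """Simple test-execution progress metrics (illustrative definitions)."""
    return {
        "executed_pct": round(100.0 * executed / planned, 1),  # project progress
        "pass_rate": round(100.0 * passed / executed, 1),      # quality signal
    }
```

Tracked over time, executed_pct supports progress reporting while the pass rate (together with faults/hour) feeds the stopping decision.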
Test metrics: Size/complexity/length What? Size/Length LOC Functionality Function Points Complexity McCabe Difficulty Halstead Cohesion, Coupling,... Why? Estimate test effort
Test metrics: Efficiency What? # faults/hour # faults/test case Why? Evaluate efficiency of V&V activities
Test metrics: Faults/Failures (Trouble reports) What? # faults/size repair time root cause Why? Monitor quality Monitor efficiency Improve
Test metrics: Effectiveness What? % found faults per phase % missed faults Why? Evaluate effectiveness of V&V activities
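"% found faults per phase" can be sketched as a small computation; the phase names and counts below are invented example data:

```python
def phase_effectiveness(found_by_phase, missed_after_release):
    """Percentage of all known faults found in each V&V phase
    (defect detection percentage). Inputs are illustrative counts."""
    total = sum(found_by_phase.values()) + missed_after_release
    return {phase: round(100.0 * n / total, 1)
            for phase, n in found_by_phase.items()}
```

The faults missed until after release are what make this an effectiveness measure: a phase that finds many faults cheaply but lets many escape is efficient without being effective.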
How good are we at testing? (Quadrant: test quality vs. product quality.) Finding few faults is ambiguous: are we here (high test quality applied to a product with few faults), or are we here (low test quality missing the faults of a poor product)?
When to stop testing? All planned tests are executed and passed All coverage goals met (requirements, code,...) Detection of specific number of failures Rates of failure detection fallen below a specified level Fault seeding ratios are favourable Reliability above a certain value Cost has reached the limit
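The fault-seeding criterion above can be turned into a concrete estimate. This is a sketch of the classic Mills-style calculation, with illustrative numbers: if testing found a given fraction of deliberately seeded faults, assume the same detection ratio holds for real faults:

```python
def estimate_remaining_faults(seeded, seeded_found, real_found):
    """Fault-seeding estimate: total real faults is approximately
    real_found * seeded / seeded_found; return the estimated number
    of real faults still undetected."""
    estimated_total = real_found * seeded / seeded_found
    return estimated_total - real_found  # faults estimated to remain
```

For example, if 8 of 10 seeded faults were found alongside 40 real faults, the estimate is 50 real faults in total, i.e. about 10 still latent. A favourable seeding ratio (most seeded faults found) thus supports the decision to stop.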
Example. [Chart: number of detected failures and number of failures per day, plotted against the number of executed test cases.] Interpretation?
Example. [Chart: number of detected failures vs. number of executed test cases.] Interpretation?
Example. [Chart: number of detected failures vs. number of executed test cases.] Interpretation?
Structure of Lecture 07 Test Tools Test Measurement Test Process Improvement SWT Exam
Process quality and product quality. Quality in the process -> quality in the product; a project is an instantiated process. Quality according to ISO 9126: process quality contributes to improving product quality, which in turn contributes to improving quality in use. (Process -> Project -> Product)
Principles Test organisation Assess Improve Maturity Model
Process improvement models: Capability Maturity Model and its integrated successor (CMM, CMMI); Software Process Improvement and Capability Determination (SPICE); ISO 9001; Bootstrap; Test Maturity Model (TMM); Test Process Improvement (TPI) model; Test Improvement Model (TIM); Minimal Test Practice Framework (MTPF).
CMMI (Capability Maturity Model Integration). Level 1 Initial. Level 2 Repeatable: requirements management; software project planning; software project tracking and oversight; software subcontract management; software quality assurance; software configuration management. Level 3 Defined: organization process focus; organization process definition; training programme; integrated software management; software product engineering; intergroup coordination; peer reviews. Level 4 Managed: quantitative process management; software quality management. Level 5 Optimizing: defect prevention; technology change management; process change management.
Test Maturity Model (TMM) Levels Maturity goals and sub-goals Scope, boundaries, accomplishments Activities, tasks, responsibilities Assessment model Maturity goals Assessment guidelines Assessment procedure
Level 2: Phase Definition Institutionalize basic testing techniques and methods Initiate a test planning process Develop testing and debugging tools
Level 3: Integration Control and monitor the testing process Integrate testing into software lifecycle Establish a technical training program Establish a software test organization
Level 4: Management and Measurement Software quality evaluation Establish a test management program Establish an organization-wide review program
Level 5: Optimizing, Defect Prevention, and Quality Control Test process optimization Quality control Application of process data for defect prevention
Can the organization be too mature?
Clausewitz: armor and mobility alternate dominance (DeMarco). Armor: Romans, Franks, Castles, Maginot Line. Mobility: Greeks, Vandals and Huns, Mongols, Field Artillery, Tanks.
Birth of the castle (CMMI) and the tiger (Agile). Castle: U.S. Department of Defense; scientific management; statistical process control; management & control; large team & low skill. Tiger: leading industry consultants; team creates its own process; working software; software craftsmanship; productivity; small team & high skill.
Plan-driven vs. Agile (Boehm & Turner, 2003, IEEE Computer, 36(6), pp 64-69)
Software quality assurance comparison: castle vs. tiger. Organisation: independent QA team vs. integrated into the project team. Ensuring: compliance to documented processes vs. applicability and improvement of the current processes and practices. Evaluation criteria: against predefined criteria vs. identifying issues and problems. Focus: documents & processes & control vs. productivity & quality & customer. Communication: formal (reporting to management) vs. informal (supporting the team).
General advice Identify the real problems before starting an improvement program What the customer wants is not always what it needs Implement easy changes first Involve people Changes take time!
Recommended Textbook Exercises Chapter 14 2, 4, 5, 6, 9 Chapter 9 2, 3, 4, 5, 8, 12 Chapter 16 No exercises
Structure of Lecture 07 Test Tools Test Measurement Test Process Improvement SWT Exam
Final Exam Written exam (40%) Based on textbook, lectures and lab sessions Open book 90 min Dates: Exam 1: 30-May-2013 10:15-11:45 (J. Liivi 2-405) Exam 2: 10-June-2013 14:15-15:45 (J. Liivi 2-403)
Thank You!