Universidade de Aveiro
Departamento de Electrónica, Telecomunicações e Informática
Evaluation in Data and Information Visualization
Beatriz Sousa Santos, 2011/2012
Definition
Visualization is the process of exploring, transforming and representing data as images (or other sensory forms) to gain insight into phenomena.
There are several expressions used to designate different areas of Visualization:
- Scientific Visualization
- Data Visualization
- Information Visualization
The differences among these areas are not completely clear.
Framework
Visualization includes not only image production from the data, but also their transformation and manipulation (and, if possible, their acquisition).
[Diagram elements: Data acquisition, Data, Hypothesis, Understanding, User, Computing, Results (Brodlie et al., 1992)]
It is a human-in-the-loop problem.
Data Visualization Reference Model
- Simulated data: finite element analysis, numerical models, ...
- Measured data: CT, MRI, ultrasound, lasers, satellite imaging, ...
Data -> Transform -> Map -> Display (Transform, Map and Display are the phases of the visualization technique)
(adapted from Schroeder et al., 2006)
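The Transform, Map and Display phases of the reference model can be sketched as three small functions chained together. This is only an illustration: the normalisation, the grey-level mapping and the text "display" are hypothetical choices made for the example, not part of the model by Schroeder et al.

```python
from typing import List, Sequence


def transform(samples: Sequence[float]) -> List[float]:
    """Transform phase: normalise raw samples to [0, 1] (illustrative choice)."""
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0  # avoid dividing by zero for constant data
    return [(s - lo) / span for s in samples]


def map_to_gray(values: Sequence[float]) -> List[int]:
    """Map phase: turn normalised values into 8-bit grey levels."""
    return [round(v * 255) for v in values]


def display(levels: Sequence[int]) -> str:
    """Display phase: render grey levels as characters, dark to bright."""
    ramp = " .:-=+*#%@"
    return "".join(ramp[level * (len(ramp) - 1) // 255] for level in levels)


def visualize(samples: Sequence[float]) -> str:
    """Chain the three phases: Data -> Transform -> Map -> Display."""
    return display(map_to_gray(transform(samples)))


if __name__ == "__main__":
    print(repr(visualize([0, 2, 4, 6, 8, 10])))
```

Keeping the phases separate makes it possible to evaluate each one on its own (e.g. checking the accuracy of the transform independently of how the result is displayed).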
Data and Information Visualization
In general:
- Data Visualization (DV): data having an inherent spatial structure (e.g., CAT, MR, geophysical, meteorological, fluid dynamics data)
- Information Visualization (IV): data not having an inherent spatial structure (e.g., stock exchange, S/W, Web usage patterns, text)
These designations may be misleading, since both DV and IV start with (raw) data and allow information to be extracted.
The borders between these areas are not well defined, nor is it clear whether there is any advantage in separating them (Rhyne, 2003).
Information Visualization Reference Model
In Information Visualization, interaction is generally given more consideration and the role of the user is more explicitly represented.
Raw data -> (Data Transformations) -> Data tables -> (Visual Mappings) -> Visual structures -> (View Transformations) -> Views -> Task
Human interaction can act on the data transformations, the visual mappings and the view transformations.
Visualization can be described as the mapping of data to visual form that supports human interaction in a workspace for visual sense making (Card et al., 1999).
How can we evaluate a Visualization?
A correct definition of the goal is fundamental.
[Two images of the simulation of an astrophysical phenomenon, one meant to reveal shape and the other to analyze structure (Keller & Keller, 1993)]
Answering two questions. How well does the final visualization:
- represent the underlying phenomenon?
- help the user understand it?
Which implies:
A) Low level: evaluating the representation of the phenomenon
B) High level: evaluating the users' performance in their tasks (involving understanding the phenomenon) while using the visualization
Evaluating a visualization technique should involve the evaluation of all phases (transform, map, display), e.g.:
- low level: accuracy, repeatability of methods (errors, artifacts, ...)
- high level: efficacy and efficiency in supporting users' tasks, learnability, memorability, ...
Not forgetting the interaction (not only visual) aspects!
Main issues for evaluation planning
- Motivation / goal (why? what for?)
- Test data (which data sets? how many?)
- Evaluation methods (which?)
- Collected data (which measures? which observations?)
- Data analysis (which methods?)
Several of these issues are closely related to the methods chosen.
Motivation and goal are the starting point of an evaluation
For example:
- Which is the best representation of specific data to support specific users while performing specific tasks?
- Which is the best segmentation algorithm?
Along with the constraints, the goal influences the choice of methods, data sets, ...
Test data can be real, synthetic (or in between)
For instance, in Medical Data Visualization it is common to use (from more accuracy to more realism):
Synthetic data -> "Phantoms" -> Cadavers -> In vivo
Synthetic data allow a better knowledge of the ground truth.
Data should:
- be sufficient
- be representative
- include especially difficult cases
Collected data have a fundamental impact on the information we can get from the evaluation.
The analysis of the collected data has an impact on the credibility of the results.
Selecting analysis methods should take into consideration:
- nature, level of representation and scale of the collected data
- size of the sample
- statistical distribution
- etc.
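As a rough illustration of how the scale, sample size and distribution of the collected data constrain the analysis, the rule of thumb below picks a family of statistical methods. The function name, the categories and the `min_n` threshold are assumptions made for this sketch; they are a common textbook simplification, not a procedure taken from the slides.

```python
def suggest_analysis(scale: str, n: int, roughly_normal: bool, min_n: int = 30) -> str:
    """Suggest a family of analysis methods (rule of thumb, not a prescription).

    scale          -- "nominal", "ordinal", "interval" or "ratio"
    n              -- sample size
    roughly_normal -- whether the data look approximately normally distributed
    min_n          -- assumed minimum sample size for parametric methods
    """
    scale = scale.lower()
    if scale == "nominal":
        return "frequency-based tests (e.g. chi-square)"
    if scale == "ordinal":
        return "rank-based non-parametric tests"
    if scale in ("interval", "ratio"):
        if roughly_normal and n >= min_n:
            return "parametric tests (e.g. t-test, ANOVA)"
        return "rank-based non-parametric tests"
    raise ValueError(f"unknown measurement scale: {scale!r}")


if __name__ == "__main__":
    # Questionnaire answers on a 5-point scale are ordinal data.
    print(suggest_analysis("ordinal", n=60, roughly_normal=False))
    # Task times are ratio data; with few users a rank-based test is safer.
    print(suggest_analysis("ratio", n=6, roughly_normal=True))
```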
Methods
Methods from other disciplines can be adapted, e.g. methods used in Human-Computer Interaction (Dix, 2004):
- Controlled experiments with users
- Observation
- Query methods (questionnaires, interviews)
- Inspection methods (heuristic evaluation)
Methods can be empirical (involving users) or analytical (not involving users).
Specific methods are appearing (e.g. insight-based methods).
Controlled experiments
- The workhorse of experimental science (Carpendale, 2008)
- With benchmark tasks, the primary method for rigorously evaluating visualizations (North, 2006)
They involve:
- Hypothesis
- Independent (input) variables (what is controlled)
- Dependent (output) variables (what is measured)
- Secondary variables (what else could influence the results)
- Experimental design (between groups / within groups)
- Protocol (sequence and characteristics of actions)
- Statistical analysis
Observation
- A very useful method, widely used in usability evaluation
- Can be done in different ways:
  - very simple (e.g. just observing the user doing some tasks)
  - very sophisticated (e.g. using a usability lab, logging, video, ...)
- Think aloud
Usability testing includes observation and query techniques (engineering approach).
Query methods
Also very useful and widely used in usability evaluation. Two types:
- Questionnaires: easier to apply to more people; less flexible
- Interviews: more flexible; reach fewer people
Both must be carefully designed (types of questions, scale of responses, ...) and should be evaluated before they are applied.
Heuristic evaluation
- Widely used in usability evaluation
- Its application in Visualization evaluation is not as common (few heuristics)
- It is a structured analysis assessing whether a set of heuristics is followed
- It should be performed by expert analysts
- Has the advantage of not involving users
- Can be performed even before any prototype exists
Evaluating Visualizations: examples
- CardioAnalyser: Left Ventricle (LV) visualization from Angio Computed Tomography (CT) data
- Pedigree tree visualization using the H-layout
CardioAnalyser: Visualizing the Left Ventricle (LV) and quantifying its performance from Angio Computed Tomography data
Goal: help users better understand the performance of the Left Ventricle from AngioCT through interactive visualization methods/tools
CardioAnalyser:
- CT exam: ~12 phases x (512 x 512 x 256) volume
- segment the endocardium and epicardium in every phase
- edit the segmentations (if necessary)
- visualize and quantify
How should we evaluate?
CardioAnalyser: how should we evaluate? (Video)
1- the segmentation method/tool
2- the functional analysis method/tool
3- the perfusion analysis method/tool
Low-level evaluation: preliminary evaluation of the segmentation method (observer study, query)
High-level evaluation: evaluation of a 3D segmentation editing tool (user study, observation, query)
The team:
At the University:
- Samuel Silva, PhD student
- Joaquim Madeira, PhD
- Carlos Ferreira, PhD (Math)
At Gaia Hospital:
- 1 radiologist (MD)
- 3 experienced radiographers
Is the CardioAnalyser LV segmentation tool adequate to support radiographers in their segmentation tasks?
1- qualitative evaluation of the segmentation method
2- qualitative evaluation of the 3D editing tool
3- selection of a measure to compare segmentations
4- quantitative evaluation of the LV segmentation tool
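Step 3 asks for a measure to compare segmentations. The slides do not say which measure was selected; as an illustration only, the sketch below uses the Dice coefficient, a widely used overlap measure, on segmentations represented as sets of voxel coordinates (a representation assumed here for simplicity).

```python
from typing import Set, Tuple

Voxel = Tuple[int, int, int]


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice coefficient between two voxel sets: 2|A n B| / (|A| + |B|).

    Returns 1.0 for identical segmentations and 0.0 for disjoint ones.
    Two empty segmentations are treated as identical (a convention chosen here).
    """
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical toy segmentations, not data from the study.
    automatic = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
    edited = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
    print(f"Dice = {dice(automatic, edited):.3f}")
```

An overlap measure such as this one is insensitive to where the disagreement occurs, which is one reason a regional, expert-based classification can complement it.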
Constraints during evaluation
- A lot of data for each exam
- High patient/image variability
- Very busy domain experts
- Distant hospital
These implied:
- a careful choice of test data set and methods
- the development of specific applications
1- LV segmentation method
Accurate segmentations are needed to:
- compare structures
- perform quantitative measurements
In medical applications, segmentations must be validated by the expert.
A segmentation method was developed that starts from one phase (60%) and uses that first segmentation to help segment the other phases.
As segmentations are to be validated, the emphasis is on making the whole process easy and fast.
Qualitative evaluation of the segmentation method
Preliminary qualitative evaluation after developing the first prototype, meant to:
- detect serious segmentation problems
- inform further fine tuning of the method
Participants and data:
- 3 radiographers
- 7 exams, 3 phases per exam (ED, ES, 60%)
- endocardium and epicardium segmentations
Radiographers classified the segmentations (without any editing) as if they were final (i.e., usable for diagnostic purposes), using a regional classification:
Endocardium, four anatomical regions:
- apex
- mid-ventricle
- mitral valve
- outflow
Epicardium, five anatomical regions:
- apex
- mid-ventricle lateral and septal regions
- basal lateral and septal regions
Scale:
- OK (optimum segmentation)
- EXCESS (3 levels): +, ++, +++
- SHORTAGE (3 levels): -, --, ---
Worst cases: segmentation classification
1- low significance: very good; could include/exclude a very small region
2- moderate: good; could be significantly improved by including/excluding a small region
3- serious: cannot be used without the inclusion/exclusion of important regions
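The regional scale (OK, one to three "+" for excess, one to three "-" for shortage) can be encoded as signed integers so that the worst region of a segmentation is easy to find. This is a hypothetical encoding written for illustration; the function names and the example marks are not from the study.

```python
from typing import Dict, Tuple

# Severity labels for the three levels, as described on the slide.
SEVERITY = {0: "OK", 1: "low significance", 2: "moderate", 3: "serious"}


def parse_grade(mark: str) -> int:
    """Convert 'OK', '+'..'+++' or '-'..'---' to an integer in -3..3.

    Positive values mean excess, negative values mean shortage.
    """
    mark = mark.strip()
    if mark.upper() == "OK":
        return 0
    if 1 <= len(mark) <= 3 and set(mark) in ({"+"}, {"-"}):
        return len(mark) if mark[0] == "+" else -len(mark)
    raise ValueError(f"invalid grade: {mark!r}")


def worst_case(marks: Dict[str, str]) -> Tuple[str, int]:
    """Return (region, signed level) for the region with the largest deviation."""
    region, mark = max(marks.items(), key=lambda item: abs(parse_grade(item[1])))
    return region, parse_grade(mark)


if __name__ == "__main__":
    # Made-up classification of one endocardium segmentation.
    endocardium = {"apex": "OK", "mid-ventricle": "+", "mitral valve": "--", "outflow": "OK"}
    region, level = worst_case(endocardium)
    kind = "excess" if level > 0 else "shortage" if level < 0 else "none"
    print(region, kind, SEVERITY[abs(level)])
```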
Results of the preliminary evaluation of the segmentation method
- Endocardium segmentation: apex and mid-ventricular slices well segmented
- Epicardium segmentation clearly needed further improvements
- Most problems were in the septal sections of the mid-ventricular and basal regions
[Figure: example of an epicardium segmentation problem in the septal section]
2- A tool to edit LV segmentations in 3D
Even robust segmentation methods cannot deal with the wide range of variation of anatomical structures, e.g. in:
- shape
- orientation
- texture
Tools that ease the editing/correction of segmentations by experts are needed.
Correcting a segmentation of volume data by editing several slices may be a tiresome task.
The tool should be:
- intuitive
- easy for radiographers to use to correct the most common segmentation problems
Two alternatives:
- voxel mask (ADD/REMOVE)
- 3D surface (deform)
Simple evaluation of the tool to edit LV segmentations in 3D
- Three radiographers
- Explanation and practice
- Two typical tasks:
  - task 1: adjusting the segmentation to the mitral valves (removing)
  - task 2: adjusting the segmentation to the LV wall (adding)
- Time to perform the tasks using:
  - voxel mask (3DV)
  - surface editing (3DS)
  - the 2D editing tool (from MITK)
- Preferences, comments
Results of the 3D editing tool evaluation
- Average task times for both 3D editing modes were much smaller than for the 2D tool
- Users preferred the simplicity of voxel editing, but surface editing does not occlude the image
- The option of showing just the outline was added to the voxel method
[Chart: time (s) to complete an editing task using the 2D tool, 3DV (voxel editing) and 3DS (surface editing)]
Comparing a modified pedigree tree visualization method with the original method
João Miguel Santos, MSc student
Paulo Dias, PhD
H-Tree method (Tuttle et al., 2010)
Visualization techniques capable of representing large pedigree trees are useful.
An H-Tree layout has recently been proposed to overcome some of the limitations of traditional representations.
Traditional representations of pedigree trees (used in commercial S/W)
Binary trees with several layouts (horizontal, vertical, bow):
- generations easily understandable
- space needs grow fast with generations
Fan trees:
- generations still understandable
- space needs attenuated
- impractical for more than 5 or 6 generations
Pedigree H-layout representation
To overcome space limitations, Tuttle et al. (2010) proposed a method based on the H-Tree layout:
- it allows the representation of a greater number of generations simultaneously
However:
- it is more difficult to identify relations among individuals
[Figure: H-layout of a five-generation pedigree, each node labelled with its generation number (1 at the centre, 5 at the outermost nodes)]
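To give an idea of how an H-tree places ancestors, the sketch below computes node positions with the classic H-tree construction: the two parents of a node are placed on either side of it, alternating between horizontal and vertical branching, with the branch length shrinking by a factor of sqrt(2) per generation. The numbering of individuals (father 2k, mother 2k+1) and the shrink factor are assumptions of this example; the layout by Tuttle et al. may differ in its details.

```python
import math
from typing import Dict, Tuple


def h_tree_layout(generations: int, length: float = 1.0) -> Dict[int, Tuple[float, float]]:
    """Return {individual number: (x, y)} for a pedigree with the given depth.

    Individual 1 is the central individual; the parents of individual k are
    numbered 2k and 2k + 1 (an assumption made for this sketch).
    """
    positions: Dict[int, Tuple[float, float]] = {}

    def place(number: int, generation: int, x: float, y: float, step: float) -> None:
        positions[number] = (x, y)
        if generation >= generations:
            return
        # Odd generations branch horizontally, even generations vertically.
        dx, dy = (step, 0.0) if generation % 2 == 1 else (0.0, step)
        shorter = step / math.sqrt(2)
        place(2 * number, generation + 1, x - dx, y - dy, shorter)
        place(2 * number + 1, generation + 1, x + dx, y + dy, shorter)

    place(1, 1, 0.0, 0.0, length)
    return positions


if __name__ == "__main__":
    for number, (x, y) in sorted(h_tree_layout(3).items()):
        print(f"{number}: ({x:+.3f}, {y:+.3f})")
```

Because the branch length shrinks geometrically, the whole tree stays inside a bounded area however many generations are shown, which is the space advantage mentioned above; the price is that individuals of the same generation are no longer aligned.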
Enhancing the Pedigree H-layout
Objectives:
- simplify the understanding of the family structure inherent in the pedigree
- allow downward interactive navigation
(Demo)
Enhancing the Pedigree H-layout
New functionality proposed:
- complementary information on the tooltip, with the relation to the central individual
- "generation emphasis", which highlights the individuals belonging to generation n in relation to the individual under the cursor
- contextual menu allowing downward navigation to direct descendants
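The tooltip and "generation emphasis" features can be illustrated with a small sketch. It assumes individuals are numbered as in an Ahnentafel (central individual 1, father of k is 2k, mother is 2k+1), and it interprets "generation n in relation to the individual under the cursor" as the ancestors n generations above that individual; both are assumptions of this example, not details given in the slides.

```python
from typing import List


def relation(number: int) -> str:
    """Describe the relation of an ancestor to the central individual (number 1)."""
    if number < 1:
        raise ValueError("individual numbers start at 1")
    if number == 1:
        return "self"
    # The binary digits after the leading 1 give the path: 0 = father, 1 = mother.
    steps = ["mother" if bit == "1" else "father" for bit in bin(number)[3:]]
    return "'s ".join(steps)


def generation_emphasis(hovered: int, n: int, max_generations: int) -> List[int]:
    """Numbers of the ancestors n generations above the hovered individual.

    Only individuals that fit in a pedigree of max_generations are returned.
    """
    first = hovered << n  # leftmost ancestor n generations up
    return [k for k in range(first, first + (1 << n))
            if k.bit_length() <= max_generations]


if __name__ == "__main__":
    print(relation(5))                    # tooltip text for individual 5
    print(generation_emphasis(2, 1, 5))   # parents of the father
```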
Evaluating the Enhanced Pedigree H-Tree
- Does the enhanced method better support understanding the family structure? (comparative evaluation)
- How good is the enhanced method (for specific tasks/users)? (outright evaluation)
Two types: analytical and empirical
Empirical evaluation characterization
- Data: public real data
- Users: InfoVis/HCI students; experts (MDs, animal breeders)
- Tasks: simple, complex; interaction, visual
- Methods: observation, logging, questionnaire, interview, controlled experiment, insight-based evaluation
- Measures: user performance, efficiency, efficacy, satisfaction
Measures/methods:
- Task completion: observation, logging
- Difficulty, disorientation: questionnaire, observation
- Times: observation/logging
- Satisfaction: questionnaire, interview
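Once observations and logs are collected, measures such as completion rate, success rate and mean time can be derived from per-task records. The record fields and the example values below are hypothetical, chosen only to show the computation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class TaskRecord:
    """One user performing one task (hypothetical log format)."""
    user: str
    method: str
    completed: bool
    correct: bool
    seconds: float


def summarize(records: List[TaskRecord]) -> Dict[str, Optional[float]]:
    """Compute completion rate, success rate and mean time of completed tasks."""
    if not records:
        raise ValueError("no records to summarize")
    done = [r for r in records if r.completed]
    return {
        "completion_rate": len(done) / len(records),
        "success_rate": sum(r.correct for r in records) / len(records),
        "mean_time_s": mean(r.seconds for r in done) if done else None,
    }


if __name__ == "__main__":
    # Made-up records, for illustration only.
    log = [
        TaskRecord("u1", "original + tooltips", True, True, 10.0),
        TaskRecord("u2", "original + tooltips", True, False, 20.0),
        TaskRecord("u3", "original + tooltips", True, True, 30.0),
        TaskRecord("u4", "original + tooltips", False, False, 45.0),
    ]
    print(summarize(log))
```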
Evaluation: four/five phases
- Pilot usability test: a few users
- Usability test: 6 InfoVis students
- Pilot test for the controlled experiment: 6 InfoVis students
- Controlled experiment: 60 HCI students
- Evaluation with domain experts
Notes:
- no logging
- only comparative
- informally confirmed the usefulness of the enhancements
- allowed improving the application and the protocol
For academic purposes:
- further improvement
- formal comparison
- guidelines
Usability test (including pilot)
- General explanation concerning the application and the test
- Practice until each user feels ready
- Users performed 6 tasks
- An observer registered: task completion, correct answers, times, difficulty, and whether the user asked for help or seemed lost
- Users answered a questionnaire
- Users were informally interviewed
Documents involved in the protocol
- List of tasks
- Observer notes
- Questionnaire
- Teste.exe
Results of the usability test
- Efficacy: more correct answers with tooltips and with generation emphasis
- Efficiency: times were difficult to register manually (tasks too simple?)
- Tooltips were considered the most helpful feature for understanding the family structure
- Specific suggestions (e.g. increase the size of the arrows)
- The protocol was modified to be used in the controlled experiment
Another test: is the test application colorblind-friendly?
- Simulations were done using http://vischeck.com
- The choice of colours was confirmed by a colorblind user
Other tested alternatives
Design of the controlled experiment
Question: do users understand the family structure better while using the enhanced method (compared with the original method)?
It can be divided into the following two hypotheses:
- Hypothesis 1: tooltips improve users' performance in understanding the family structure, when compared with the original method
- Hypothesis 2: generation emphasis improves users' performance in understanding the family structure, when compared with the original method
Variables:
Input (independent) variable: method, with 3 levels:
- original
- original + tooltips
- original + generation emphasis
Output (dependent) variables:
- times
- task completion rate; success rate
- disorientation, difficulty
- satisfaction
Secondary variables:
- learning or fatigue effect (controlled through the sequence of tasks)
Experimental design
Within-groups: all users perform the same tasks in all experimental conditions (i.e., with all methods)
Advantages over a between-groups design:
- more data with the same users
- less user profile variation
Caution: randomize and register the order of tasks
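One simple way to randomize and register the order in a within-groups design is to shuffle the participants and then rotate the list of conditions, so that each condition appears equally often in each position. The sketch below is an assumption about how this could be done, not the procedure used in the study; the same idea can be applied to the order of tasks.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence

METHODS = ["original", "original + tooltips", "original + generation emphasis"]


def assign_orders(participants: Sequence[str], conditions: Sequence[str],
                  seed: int = 0) -> Dict[str, List[str]]:
    """Return {participant: ordered list of conditions}.

    Participants are shuffled (seeded, so the plan can be reproduced and
    registered), then given cyclic rotations of the condition list.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    plan: Dict[str, List[str]] = {}
    for index, person in enumerate(shuffled):
        shift = index % len(conditions)
        plan[person] = list(conditions[shift:]) + list(conditions[:shift])
    return plan


def first_position_counts(plan: Dict[str, List[str]]) -> Counter:
    """Count how often each condition is performed first."""
    return Counter(order[0] for order in plan.values())


if __name__ == "__main__":
    plan = assign_orders([f"P{i}" for i in range(1, 7)], METHODS, seed=42)
    for person, order in sorted(plan.items()):
        print(person, "->", order)
    print(first_position_counts(plan))
```

Cyclic rotation balances the position of each condition but not every possible sequence of conditions; when carry-over between specific methods is a concern, all permutations can be used instead.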
Protocol of the controlled experiment
- General explanation concerning the application and the test
- Practice until each user feels ready
- Users perform 10 tasks
- An observer registers: task completion, difficulty, errors, and whether the user asked for help or felt lost
- The application logs times
- Users answer a questionnaire
- Users are informally interviewed
In these examples (but more generally):
Formative evaluation came first, then summative evaluation (they are not totally disjoint).
It was important to:
- start thinking about evaluation as soon as possible
- do several evaluation rounds
- use more than one method
- carefully choose the methods, data, users, tasks, measures and data analysis methods
- learn as much as possible from each evaluation round, to:
  - improve the methods/applications
  - improve the next evaluation
About evaluating Visualization methods/applications:
- Evaluating visualizations is challenging
- It will become more challenging as Visualization evolves to be more interactive, collaborative, distributed, multi-sensorial, mobile
It is fundamental to:
- evaluate solutions to specific cases
- develop new visualization methods/systems
- establish guidelines
to make Visualization more useful, more usable, and more used
Bibliography - books
- Brodlie, K., L. Carpenter, R. Earnshaw, J. Gallop, R. Hubbold, A. Mumford, C. Osland, P. Quarendon, Scientific Visualization, Techniques and Applications, Springer Verlag, 1992
- Card, S., J. Mackinlay, B. Shneiderman (eds.), Readings in Information Visualization: Using Vision to Think, Morgan Kaufmann, 1999
- Carpendale, S., Evaluating Information Visualization, in Information Visualization: Human-Centered Issues and Perspectives, A. Kerren, J. Stasko, J.D. Fekete, C. North (eds.), LNCS vol. 4950, pp. 19-45, Springer, 2008
- Dix, A., J. Finlay, G. Abowd, R. Beale, Human-Computer Interaction, 3rd ed., Prentice Hall, 2004
- Hansen, C., C. Johnson (eds.), The Visualization Handbook, Elsevier, 2005
- Johnson, C., R. Moorhead, T. Munzner, H. Pfister, P. Rheingans, T. Yoo, Visualization Research Challenges, NIH/NSF, January 2006
- Keller, P., M. Keller, Visual Cues, IEEE Computer Society Press, 1993
- Schroeder, W., K. Martin, B. Lorensen, The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics, 4th ed., Prentice Hall, 2006
- Spence, R., Information Visualization: Design for Interaction, 2nd ed., Addison Wesley, 2006
- Ware, C., Information Visualization: Perception for Design, 2nd ed., Academic Press, 2004
Bibliography - papers
- Rhyne, T. M., "Does the Difference between Information and Scientific Visualization Really Matter?", IEEE Computer Graphics and Applications, May/June 2003, pp. 6-8
- Rhyne, T. M., "Scientific Visualization in the Next Millennium", IEEE Computer Graphics and Applications, Jan./Feb. 2002, pp. 20-21
- Hibbard, B., "Top Ten Visualization Problems", SIGGRAPH Computer Graphics Newsletter, VisFiles, vol. 33, no. 2, May 1999
- Johnson, C., "Top Scientific Visualization Research Problems", IEEE Computer Graphics and Applications: Visualization Viewpoints, July/August 2004, pp. 13-17
- Eick, S., "Information Visualization at 10", IEEE Computer Graphics and Applications, vol. 25, no. 1, Jan./Feb. 2005, pp. 12-14
- Keefe, D., "Integrating Visualization and Interaction Research to Improve Scientific Workflows", IEEE Computer Graphics and Applications, vol. 30, no. 2, Mar./April 2010, pp. 8-13
- Globus, A., E. Raible, "Fourteen Ways to Say Nothing With Scientific Visualization", Computer, vol. 27, no. 7, July 1994, pp. 86-88