Usability Testing of Jeliot 3, a Program Visualization Tool: Evaluation Using Eye-Movement Tracking
Roman Bednarik, University of Joensuu
Connet course 281: Usability in Everyday Environment, February 2005
Contents
1. Introduction
2. Eye-Movement Tracking and Usability
3. Jeliot 3
4. Experiment
   Participants
   Materials and Apparatus
   Procedure and Design
5. Results and Discussion
   Fixation behaviour
   Switching behaviour
6. Conclusion
References
1. Introduction

This paper presents an experiment in which sixteen participants' eye movements were tracked while they interacted with Jeliot 3, a Java program visualization tool [1]. The purpose of the experiment was to link the patterns of eye movements during animation to basic usability principles. From this link, the usability of the interface could be evaluated to some extent. Distributions of fixations and transition matrices were computed for the areas of interest, which were defined to match the main areas of the tool's window.

[1] A part of the experiment was previously reported in Bednarik et al. (2005).

2. Eye-Movement Tracking and Usability

Relating eye-movement patterns to usability issues is still an open and problematic question. In the present usability test, two eye-movement metrics were used. The distribution of fixations is thought to be related to the importance of an element in an interface: as the importance of an element increases, more fixations are expected to land on it. The second measure used in the experiment is the transition matrix, a representation of all attention transitions between the elements of an interface during a session. Whenever a switch in visual attention is registered between two elements, the value in the corresponding cell is increased by one. To provide a relative measure, the values are divided either by the total number of switches or by the duration of the session.

3. Jeliot 3
In the present experiment, we used Jeliot 3, a program visualization system. Jeliot 3 can visualize a large subset of novice-level Java programs (see http://cs.joensuu.fi/jeliot/). The user interface of Jeliot 3 is shown in Figure 1.

Figure 1: User interface of Jeliot 3. (1) code editor, (2) animation frame, (3) control panel, (4) output console.

In a typical session with Jeliot, a user either writes or loads a program and compiles it through Jeliot's user interface. When the program has been compiled, an animation frame opens in which the user can animate the program's execution. Jeliot shows the execution either step by step or continuously; the user can control the speed of the animation and stop it at any point. In the visualization/animation view, the user can see the method frame, local variables, expression evaluation, arrays, and objects. Furthermore, there are separate specialized visualizations that show only a call tree of the program or a history of its execution.

The goal of the tool is to support learning to program, so any usability problem can be seen as an obstacle. However, general usability criteria, such as a low error rate, may not always align with effective learning; on the contrary, errors can contribute to the learning process.
4. Experiment

Participants

Eighteen participants were recruited from high-school students attending a university-level programming course and from computer science students at a local university. Due to technical problems, data from two participants had to be discarded; the results are therefore based on data collected from 16 subjects (13 male, 3 female). The mean age was 22.8 years (range 16-45, SD=7.7). All subjects reported normal or corrected-to-normal vision. The mean programming experience was 46.7 months (SD=53.3) and the mean Java experience 12.7 months (SD=12.4). Six participants had previous experience with Jeliot, and another two had industrial programming experience.

Materials and Apparatus

Three short Java programs, a factorial computation, a naïve string matching, and a recursive binary search, were presented to the participants. The lengths of the programs were 15, 34, and 38 lines of code, respectively. The names of methods and variables were altered so that recognizing a program from these surface features would be difficult.

We used an adapted version of Jeliot 3 in our study. The user interface of Jeliot 3 is shown in Figure 1. The interface consists of four main areas of interest. A code editor (1) on the left-hand side shows the program code; during program visualization, the currently executed statement or expression is highlighted. A control panel (3) at the bottom left corner lets the user control the animation with VCR-like buttons. On the right-hand side of the window, an animation frame (2) shows the execution state of the program. The frame is divided into four separate areas (from left to right and top to bottom): a method area containing method frames and local variables, an expression evaluation area, a constant area containing constants and static variables, and an objects and arrays area. Finally, an output console
(4) at the bottom right corner of Jeliot's window shows the output of the executed program.

Jeliot 3 was modified to collect information about user actions during the experiment. Furthermore, all changes in the visualizations of the programs were recorded so that they could be compared with the eye-tracking data. That is, for the same timeslot we were able to compare the participant's focus of visual attention to the actual location of the current change in the animation. The specialized visualization views of Jeliot were disabled during the experiment.

A remote Tobii ET-1750 eye tracker (50 Hz) was used to track the participants' eye movements; interaction protocols (such as mouse clicks) were collected for all the target programs, and audio and video were recorded for the whole session. Fixations shorter than 100 ms were excluded from the analysis.

Procedure and Design

The experiment was conducted in a quiet usability lab. Participants were seated in an ordinary office chair, near the experimenter, facing a 17-inch TFT display. Every participant then passed an automatic eye-tracking calibration. After the calibration, participants performed three sessions, each consisting of a comprehension phase using Jeliot 3 and a program-summary writing phase. Participants were instructed to comprehend the program as well as possible and could use Jeliot as they found necessary. The duration of a session was not limited. The first program, the factorial computation, was used as a warm-up and its data were discarded. The order of the two actual comprehension tasks was randomized so that half of the participants started with the naïve string matching program and the other half with the recursive binary-search program. A pilot test discovered only minor problems in the experiment design. The stimulus window was set to a fixed size, and additional views of the visualization were made unavailable.
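As a minimal illustration of the 100 ms threshold described above, fixation filtering can be sketched as follows. The record format, a (start_ms, duration_ms, area) tuple, is an assumption for illustration, not the actual Tobii or Jeliot data format:

```python
# Sketch: discard fixations shorter than a minimum duration.
# The (start_ms, duration_ms, aoi) record format is an illustrative assumption.

def filter_fixations(fixations, min_duration_ms=100):
    """Keep only fixations lasting at least min_duration_ms milliseconds."""
    return [f for f in fixations if f[1] >= min_duration_ms]

raw = [
    (0,   80,  "code"),           # shorter than 100 ms: dropped
    (120, 250, "code"),
    (400, 95,  "control"),        # shorter than 100 ms: dropped
    (520, 300, "visualization"),
]
kept = filter_fixations(raw)      # two fixations survive
```
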
5. Results and Discussion

Fixation behaviour

In terms of fixation-count distribution, we could recognize two main areas of interest for our participants (Figure 2, shown for the string matching program). One area is located in the visualization frame, showing that most participants paid attention to the expression evaluation. The second most fixated area was found in the source code; its centre coincided with the most important part of the source code, which could be linked to the central ideas of the programs. The distribution of fixations and the corresponding areas of high interest were similar for both programs in the experiment. Moreover, the fixation durations at these locations were the longest. Altogether, this is important evidence of cognitive processing and information search during a task (Goldberg and Kotval, 1999), with implications for the usability of the Jeliot interface.

The overall distribution of fixations was 57.1% on the code, 40.4% on the visualization, 0.1% on the output, and 2.3% on the controls. The low number of fixations on the output area is explained by the low output production of the programs. When we decomposed the fixations within the visualization area, on average 36% of them fell on the method area, 43% on the expression evaluation area, 17% on the instances area, and 4% on the constant area.

Considering the positions users fixate most, an obvious improvement would be to move the expression evaluation area closer to the source code area, since most attention switches occur between those two areas. When the areas are distant from each other, a usability problem appears: users have to exert more effort to follow the animation. Our goal is, of course, to reduce the workload and thereby allow users to spend more resources on learning.
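The percentage breakdown above is a straightforward computation over the per-fixation area labels. A minimal sketch, with illustrative counts rather than the study's data:

```python
# Sketch: percentage distribution of fixation counts over areas of interest.
from collections import Counter

def fixation_distribution(aoi_per_fixation):
    """Return the share (in %) of fixations falling on each AOI."""
    counts = Counter(aoi_per_fixation)
    total = sum(counts.values())
    return {aoi: 100.0 * n / total for aoi, n in counts.items()}

# Illustrative sequence of 100 fixation labels, not the experiment's data.
sample = ["code"] * 57 + ["visualization"] * 40 + ["control"] * 3
dist = fixation_distribution(sample)
```
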
Together with the expression area and the code area, the control panel is in the group of most attended areas during animation. We therefore propose moving the control buttons closer to the code and expression areas. All the proposed changes could significantly reduce the distances between fixations and possibly decrease the cognitive load that does not contribute to effective viewing of the animation. Moreover, if gaze data were available during the animation, gaze could also be employed as an interaction modality for controlling the course of the animation.

Figure 2: Visualization of the fixation distribution for the string matching program.

Switching behaviour

We measured the number of switches per minute as an indication of the dynamics of attention allocation. By the term switch we mean any change of the visual attention focus between two areas of interest: code, visualization, output, and control. On average, participants performed 20.47 (SD=9.3) attention switches per minute, whereas Jeliot promoted 59.53 (SD=25.42) switches per minute during the animation. We considered that Jeliot promoted a switch every time a part of the source code was highlighted after some animation had happened in the visualization area, or vice versa. Figure 3 illustrates the attention-switching behaviour, in switches per minute, between the four main areas of interest.
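Both switching measures used in this paper, the transition matrix and the switches-per-minute rate, can be derived from the sequence of fixated areas of interest. A minimal sketch, not the study's analysis code; the AOI names and the example sequence are illustrative:

```python
# Sketch: build a transition matrix of attention switches between AOIs
# and derive the overall switches-per-minute rate from the same sequence.
AOIS = ["code", "visualization", "output", "control"]

def switch_analysis(aoi_sequence, session_seconds):
    """Count AOI changes between consecutive fixations.

    Returns (matrix, rate): the raw transition-count matrix and the
    overall number of switches per minute.
    """
    matrix = {a: {b: 0 for b in AOIS} for a in AOIS}
    total = 0
    for prev, cur in zip(aoi_sequence, aoi_sequence[1:]):
        if prev != cur:                    # only actual changes of focus count
            matrix[prev][cur] += 1
            total += 1
    return matrix, total / (session_seconds / 60.0)

# Illustrative one-minute session with four attention switches.
seq = ["code", "visualization", "code", "code", "control", "code"]
matrix, rate = switch_analysis(seq, session_seconds=60)
```
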
An apparent conclusion is that Jeliot should promote fewer switches between the source code area and the animation. Decreasing the number of promoted switches is possible, but the problem is that at present we do not know which of the promoted switches are relevant to the user and which are not. However, if an eye tracker were used as a real-time input device for Jeliot, it would be possible to infer which of the promoted switches were consumed and which were not. An adaptive engine in Jeliot could then stop promoting switches that users did not consume, provided they are not considered important for fully comprehending the animated program.

Figure 3: Switches per minute between the main areas of interest.

Another explanation could be that, because users could control the speed of the animation, the high number of promoted switches was due to users playing the animation faster when they wanted to skip some part of it. This indicates that we should consider making it easy to skip over some parts of the animation. Again, based on the results on the overall fixation distributions, we propose using an eye tracker to estimate the source code locations of highest importance and thereby increase the tool's awareness of the actual learning process.

6. Conclusion

This study was conducted to discover potential usability problems in Jeliot 3, a program visualization tool. Sixteen subjects
with varying levels of experience interacted with the tool while comprehending three Java programs. According to the distribution of the most attended areas of interest, namely the central part of the code and the expression evaluation frame, it can be concluded that these areas are located too far from each other. During comprehension, a user switches between these two areas. When the areas are far apart, a switch takes longer than when they are close together. Consequently, cognitive effort is spent more on switching (reallocating the attention focus) than on comprehension itself.
References

Bednarik, R., Myller, N., Sutinen, E., & Tukiainen, M. (2005). Applying Eye-Movement Tracking to Program Visualization. Submitted.

Goldberg, J. H., & Kotval, X. P. (1999). Computer interface evaluation using eye movements: methods and constructs. International Journal of Industrial Ergonomics, 24, 631-645.