User interface evaluation experiences: A brief comparison between usability and communicability testing

Kern, Bryan; B.S.; The State University of New York at Oswego
Tavares, Tatiana; PhD; Universidade Federal da Paraíba
Schofield, Damian; PhD; The State University of New York at Oswego

Abstract

This paper compares multiple evaluation methods within the domain of Human-Computer Interaction. Several evaluation methods are discussed and applied in practice. The debate over whether to use usability evaluation methods rather than communicability evaluations is also discussed.

Keywords: communicability; usability; comparison.
Introduction

Using different types of evaluation methods for interfaces helps gain a broader understanding of what the program is doing from the user's perspective and helps find design flaws that exist within the program. When creating and conducting usability tests, there are five important user-centered attributes:

1.) Learnability - the program must be easy to learn and use.
2.) Efficiency - the user must understand what the software is doing in order to use it.
3.) Memorability - the user should remember how to use the software when leaving and coming back to it.
4.) Errors - a low error rate makes a happy user.
5.) Satisfaction - the users must enjoy using the software.

All of these attributes can be measured, on either a qualitative or a quantitative scale (BHATNAGAR, S.; DUBEY, S.K., 2012).

Other researchers have proposed a system that looks at communicability. Instead of focusing on how users rank the qualities of a program, communicability focuses on the conversation or internal dialogue that occurs within users (DE SOUZA, C.S.; LAFFRON, R.; LEITAO, C.F., 2007). With these different types of evaluation methods, both the software and the methods themselves can be analyzed, which can further validate a study. Incorporating both types of methods (usability and communicability) gives a deeper understanding of which evaluation methods should be used together. Combining and comparing the two methods may also improve the reliability and validity of the study.

User Interface Evaluation Methods

Jakob Nielsen stresses that if an application (he uses websites as an example) is not usable, users will stop using it (NIELSEN, J., 2003). Evaluation methods are used to uncover the problems that users encounter with an application, so that fixing them makes the application more usable. This study used multiple evaluation methods, and background is given for each.
Usability is a term that encompasses many different types of evaluations (NIELSEN, J., 2003). This study utilized three usability evaluation methods: Heuristic Evaluation, Naturalistic Observation and Task-Driven testing. These were chosen because they are commonly used and easy to combine for testing.

A Heuristic Evaluation can be an effective tool for user testing. Nielsen describes a Heuristic Evaluation as "...[the ability to] find the usability problems in the design so that they can be attended to as part of an iterative design process." Nielsen states that these tests are done by HCI specialists (NIELSEN, J., 2003). When prompted to analyze a user interface, a Heuristic Evaluation is the proper method to start with. Using a Heuristic Evaluation as a starting point does two things for a study: it helps the evaluator become familiar with the system, and it finds major problems with the system early on (JEFFRIES, R.; MILLER, J.R.; WHARTON, C.; UYEDA, K.M., 1991). For this study, an HCI Master's student evaluated each program using a Heuristic Evaluation.
A Naturalistic Observation method is meant to gain qualitative data about how users interact with a program when given minimal information about it (LANDAUER, T.K., 1988). This method prompts a user to sit in front of an application and figure out what the application is meant to do. Both applications were tested using Naturalistic Observation for initial testing purposes.

A Coaching evaluation method was also utilized for testing each of the applications. In this method, the experimenter answers any questions the user has regarding system-related issues that occur during testing (BHATNAGAR, S.; DUBEY, S.K., 2012). Using this method helped users get past functional issues that might have come up during testing.

The last evaluation method utilized was communicability. Communicability is best defined as "...messages from designers to users" (DE SOUZA, C.S.; PRATES, R.O.; BARBOSA, S.D.J., 1999). This means that the focus of interaction is on conversations about solutions, rather than on judging the quality of solutions. In essence, the question is what the designer affords to the user of the program, rather than what the program affords to usability (DE SOUZA, C.S.; LAFFRON, R.; LEITAO, C.F., 2007). This evaluation method ascribes tags to the communicability problems that occur (see figure 1).

Figure 1. Communicability Evaluation Method Tags (http://ugosan.org/tag/semiotic-engineering/).

Programs to be Evaluated

To apply usability and communicability evaluation methods, applications that are still in development are needed. Two distinct applications being created at UFPB's LAVID were evaluated for this study. The first is an application for the Kinect, whose purpose is the 3D manipulation of objects for health professionals. The other is the GTAVCS Arthron Server website client for video collaboration, which is also intended for health professionals. Both of these applications are meant to be used in a surgical setting.
Using a Kinect UI for 3D Manipulation

Heuristic Evaluation

The initial application was reviewed using the heuristic evaluation method. A heuristic evaluation involves an HCI expert evaluating a system and giving feedback on what needs to be changed. This is a cheap method of evaluation, because it does not need much time or many people to test the system (NIELSEN, 1994). This type of evaluation was undertaken to get an initial opinion on the application as it was. The evaluation is meant to uncover design issues, so that changes can be implemented before initial user testing.

User Testing

The initial testing done for the Kinect UI was a naturalistic study, with the use of a coaching method (BHATNAGAR, S.; DUBEY, S.K., 2012). The users were asked to step in front of the Kinect and try to figure out what they had to do to interact with the program. If a user became stuck, they could ask a question and the experimenter would try to help them, without giving away too much about the tasks to be done. Each user was given as much time as they needed to complete the given task. The overall task was to see whether the user could activate both of their hands and then manipulate the object on screen. Each user was videotaped using an iPad, and all users gave consent to be videotaped. A verbal questionnaire was given to the users after the test was complete. The video was then analyzed using the Communicability Evaluation Method (CEM) (de Souza et al., 1999).

Arthron Server Video Collaboration UI

Heuristic Evaluation

Similar to the initial heuristic evaluation done for the Kinect UI, the website was also evaluated. Once again, it was an informal evaluation, conducted between one developer and an HCI expert. The developer asked questions about the design of the website and what improvements could be made. In web design, a good rule of thumb is to try to tell a story to the users, for example by leading the user through the website as though they were reading a book.

User Testing

The initial testing for this UI was task driven. The users were asked to sign up to receive a username and password. The users were then asked to start a video collaboration session; this is the main component of the website. After starting a session, the user was asked to initialize the session with two other participants. These users uploaded an encoder (video stream provider) and a decoder (video stream receiver). After this, the users took a short questionnaire assessing the website.
OBTAINED RESULTS

1.) Kinect UI

1. Heuristic Evaluation

The main change made to the application was to add on-screen buttons that give the user feedback about what they are actually manipulating. Before this change, the user switched between rotation and zoom by opening and closing their hand. This was not working well, so the developers decided to try the button layout. Going into the initial testing, the application allowed the user three tasks. The user could 1.) pick the object to manipulate, 2.) activate the Kinect using their hands and 3.) manipulate the object using the three buttons (rotate, zoom and stop).

2. Initial User Testing

The application used was the second design iteration, to which the three buttons (zoom, rotate and stop) had been added. There were four participants in the initial testing, three males and one female. Two students were from UFPB and two were from the STEM foreign exchange program in the USA. All four of the users generally enjoyed using the application. Participant one wanted to keep using the keyboard and mouse, which can be a problem for users who have minimal exposure to technology like the Kinect. Participant three had trouble with tracking; the participant's hands kept switching on the screen due to their proximity to each other. Participant three greatly enjoyed using the program, not having had much experience with the Kinect or with a program such as this one before the user test. Also, all of the participants stated that if they were in a situation where they needed to use a program such as this one, it would help their work more than hinder it. Some communicability tags seen when analyzing the video for the Kinect were "What happened?", "Why doesn't it?" and "What now?".
These tags are attributed to the users not being completely comfortable in front of the Kinect, which forces the user to interact with something novel to them. See figure 3 for a picture of user testing for the Kinect.

2.) GTAVCS (Arthron Server) UI

1. Heuristic Evaluation

This method led to changes in certain navigation components on the website, as well as in the content of parts of the site. For example, there were three tabs that led to webpages with identical content when clicked. The user (the evaluator, in this case) did not understand what the developer was trying to display until more items on the page were clicked. In communicability, this would be considered a "What's this?" tag, which could lead to either a "Help!" tag or a "What now?" tag (de Souza et al., 1999). From a communicability standpoint, it is essential to avoid these cases. After this evaluation, the developers improved the navigation components of the UI and added directions to guide the users through each page of the website.

2. Testing

There were four users who tested the UI. Subsequent to this test, a number of changes were made:

1.) Improve upon the drag-and-drop page, since it is very ambiguous.
2.) Explicitly tell the users where they are able to play, pause, and stop the video stream.
3.) Produce directions for each page on which the user would need help.

One communicability tag that kept appearing throughout the testing was the "What now?" tag. Not knowing what to do next discourages the user from using the program.

Discussion

Comparing three different types of Human-Computer Interaction evaluation methods, one can see that each has both pros and cons. Other analyses of evaluation methods within Human-Computer Interaction deal mainly with usability methods, or tend to join many different methods and use them while testing the same application (BHATNAGAR, S.; DUBEY, S.K., 2012). The difference between usability and communicability is clear. One is meant to find mistakes directly within the system itself, while the other is meant to find the semiotic mistakes between designer and user. Combining different methods within usability is usually taken lightly, or the methods are seen as incompatible (BHATNAGAR, S.; DUBEY, S.K., 2012). This is not the case, as shown here. Both usability evaluation methods and communicability methods were used and tested in this study. Not only were both usability and communicability tested, but two distinct methods of usability evaluation were used.

The reason for comparing these two methods of evaluation was to see whether they are supposed to be used separately or whether they can be used together. This study found that utilizing both methods while testing the same application was effective. Communicability was able to surface issues that usability might not have. For example, communicability was important when discussing the Arthron Server UI because the website was not telling a story to the user. There was a design flaw that the programmers did not notice until the user tests, even though it had been pointed out to them during the heuristic evaluation.
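One way to make the tag observations from the results comparable across the two applications is to count them. The following is a minimal Python sketch under stated assumptions: the function name tally_tags, the tuple layout and the example observations are hypothetical and are not data from this study. Only the tag names are taken from the text above, and the full CEM tag set is larger.

```python
from collections import Counter

# Tags named in this paper; the full CEM tag set contains more.
KNOWN_TAGS = {
    "What happened?",
    "Why doesn't it?",
    "What now?",
    "What's this?",
    "Help!",
}


def tally_tags(observations):
    """Count tag occurrences per application.

    observations: iterable of (application, participant, tag) tuples.
    Returns a dict mapping each application name to a Counter of tags.
    """
    totals = {}
    for application, _participant, tag in observations:
        if tag not in KNOWN_TAGS:
            raise ValueError(f"unknown tag: {tag!r}")
        totals.setdefault(application, Counter())[tag] += 1
    return totals


# Hypothetical observations for illustration only; NOT data from the study.
example = [
    ("Kinect UI", "P1", "What now?"),
    ("Kinect UI", "P3", "What happened?"),
    ("Kinect UI", "P3", "What now?"),
    ("Arthron Server UI", "P2", "What now?"),
    ("Arthron Server UI", "P2", "What's this?"),
]

if __name__ == "__main__":
    for application, counts in tally_tags(example).items():
        print(application, counts.most_common())
```

Keeping the participant in each record, even though this tally ignores it, would allow a later breakdown per participant without changing how the video annotations are logged.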
Since this was an initial investigation comparing two different evaluation methods, much time was spent on how to fit both tests into one testing session. Time was a constraint, since only a small amount of time was available to assess the programs, pick the right evaluation methods, and test the programs. This study lays the groundwork for future studies utilizing both usability and communicability evaluation methods.

Conclusion

By using different methods of evaluation within the Human-Computer Interaction domain, many different types of issues can be identified and resolved. Being able to use multiple methods of evaluation during one test is also a benefit. This makes finding issues more efficient and reduces the amount of time it takes to test. The programs improve because actual users sat down and tested them, and the results were assessed by an HCI expert and the developers. Future work will include utilizing the different evaluation methods in tandem; this should further establish the reliability and validity of using both methods at once. Further research on both of these projects is under way, using a more quantitative approach with both the usability methods and the communicability method (such as explicitly counting tags). As stated, utilizing both usability and communicability can create better programs, provided that the procedures used are the right ones for a given application. Further research using both methods can inform future evaluation procedures.

Acknowledgments

We thank all the volunteers, and all publications support and staff, who wrote and provided helpful comments on previous versions of this document. We especially thank Anna Medeiros, Rafael de Castro, José Ivan and Maria Clara.

References

BHATNAGAR, S.; DUBEY, S.K. Analytic study of usability evaluation methods. UNIASCIT, p
DE SOUZA, C.S.; PRATES, R.O.; BARBOSA, S.D.J. Methods and tools: a method for evaluating the communicability of user interfaces. Magazine interactions, p
DE SOUZA, C.S.; LAFFRON, R.; LEITAO, C.F. Communicability in multicultural contexts: A study with the International Children's Digital Library. Human-Computer Interaction Symposium, p
JEFFRIES, R.; MILLER, J.R.; WHARTON, C.; UYEDA, K.M. User interface evaluation in the real world: a comparison of four techniques. Proceedings of CHI '91 ACM Computer Human Interaction, p
LANDAUER, T.K. Research methods in human computer interaction. Handbook of Human Computer Interaction, chapter 42. Elsevier Science Publishers, p
NIELSEN, J. Heuristic evaluation. In Nielsen, J., and Mack, R.L. (Eds.), Usability Inspection Methods. John Wiley & Sons, New York, NY,
NIELSEN, J. Usability 101: introduction to usability. In: