Technical paper review
Program visualization and explanation for novice C programmers, by Matthew Heinsen Egan and Chris McDonald
Garvit Pahal
Indian Institute of Technology, Kanpur
October 28, 2014
Overview
1 Introduction
2 Prior Work
3 Foundation project - SeeC
4 System developed
   Graph Visualization
   Natural Language Explanation
5 Integration with SeeC
6 Problems with the system
7 Overview
Introduction
Task: Introduce novice-focused systems for creating graphical visualizations of the runtime memory state of C language programs, and for generating natural language explanations of C program fragments.
Introduction
Why: Program visualization and natural language explanations of program behaviour have been shown to assist novice programmers with improving their programming knowledge, correcting misunderstandings, and debugging programs. These techniques have been used in several novice-focused debugging systems, but few have been developed for the C programming language despite it being widely reported as a difficult language for novices.
Systems developed
- Zimmermann and Zeller (2002): Their tool extracts information about a program's memory state using the GNU Project Debugger.
- VIP (2005): A novice-focused program visualization system that supports a subset of the C++ programming language.
- ALVIS (Hundhausen and Brown, 2007): A "radically dynamic" programming environment that updates the visualization on every change. It also supports only a subset of C.
- HDPV (2008): A data structure visualization system for programs written in C, C++, or Java.
- ITEM/IP-II: A program visualization system that supports an educational mini-language named Tortoise and generates textual explanations of program execution.
- Bradman (1995): A system designed to assist novice programmers learning C, presented by Smith and Webb.
Evaluation of these systems
In most of the evaluations, graphical program visualizations and automatically generated explanations of program behaviour have been shown to assist novice programmers with constructing knowledge and debugging programs.
SeeC
- The SeeC project was introduced by Heinsen Egan and McDonald (2013).
- It is a novice-focused system for the standard C programming language that provides execution tracing and runtime error detection.
- It is built upon the Clang project, a modular collection of libraries that implement a front-end for compiling C, C++, Objective-C, and Objective-C++.
SeeC
Clang's parsing and semantic analysis libraries are used to create an Abstract Syntax Tree (AST) from a program's source code.
Figure: Source code
Figure: Abstract syntax tree
SeeC
- Each node in the AST represents a declaration or statement in the program and provides rich semantic information.
- When an execution trace is loaded, the program's AST is reconstructed, allowing us to link runtime states to the relevant AST nodes.
- This provides a mapping between the program's static source code and its dynamic state.
SeeC
The AST nodes can be used to retrieve Value objects, each of which is one of the following types:
- Scalars
- Arrays
- Records
- Pointers
- File pointers
Graph Visualization
The graph visualization system is built upon SeeC's representation of recreated state (Value objects). The system produces a graph in the DOT language.
Figure: Source code
Figure: Graph visualization
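To make the DOT output concrete, here is a minimal sketch of turning a toy memory state into DOT text. The function `state_to_dot` and its two dictionary arguments are hypothetical stand-ins invented for this illustration; they are not SeeC's Value objects or API, and variable names are assumed to be valid DOT identifiers.

```python
def state_to_dot(variables, pointers):
    """Render a toy memory state as DOT text.

    variables: dict mapping a variable name to a printable value.
    pointers:  dict mapping a pointer variable's name to the name of
               the variable it points at.
    Both structures are hypothetical simplifications for this sketch.
    """
    lines = ["digraph state {", "  node [shape=record];"]
    for name, value in variables.items():
        # One record-shaped node per variable: "name | value".
        lines.append(f'  {name} [label="{name} | {value}"];')
    for source, target in pointers.items():
        # A pointer is drawn as an edge from the pointer to its target.
        lines.append(f"  {source} -> {target};")
    lines.append("}")
    return "\n".join(lines)


# State after something like: int x = 42; int *p = &x;
dot = state_to_dot({"x": 42, "p": ""}, {"p": "x"})
print(dot)
```

The resulting text can be passed to a Graphviz layout tool such as `dot` to obtain an image.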
Layout generation for different value types
- Scalar: Fill the cell with the string description of the Value.
- Array: Create a new sub-table in the cell, with two columns and one row for each element in the array. Place the indices of the elements in the left column's cells, and then recursively lay out the right column's cells using the elements' Values.
- Record: Create a new sub-table in the cell, with two columns and one row for each member of the record. Place the names of the members in the left column's cells, and then recursively lay out the right column's cells using the members' Values.
- Pointer: If the pointer is uninitialized, fill the cell with the placeholder "?". If the pointer's raw value is zero, fill the cell with the text "NULL". If the pointer has no valid dereferences, fill the cell with the placeholder "!". Otherwise, leave the cell empty; it will be connected appropriately when edges are created.
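The layout rules above can be sketched as a small recursive function. This is an illustration under assumptions, not the authors' implementation: the classes `Scalar`, `Array`, `Record`, and `Pointer` are hypothetical simplifications of SeeC's Value objects, the output uses Graphviz-style HTML-like table markup as an assumed cell format, and file pointers are left out because their layout is not described here.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Tuple, Union


@dataclass
class Scalar:
    text: str  # string description of the value


@dataclass
class Array:
    elements: List["Value"]


@dataclass
class Record:
    members: List[Tuple[str, "Value"]]  # (member name, member value)


@dataclass
class Pointer:
    initialized: bool
    raw: int  # raw address; 0 means NULL
    dereferenceable: bool


Value = Union[Scalar, Array, Record, Pointer]


def layout(value: Value) -> str:
    """Return the contents of one table cell for the given value."""
    if isinstance(value, Scalar):
        return escape(value.text)
    if isinstance(value, Array):
        # Left column: element index. Right column: recursive layout.
        rows = [(str(i), e) for i, e in enumerate(value.elements)]
        return _sub_table(rows)
    if isinstance(value, Record):
        # Left column: member name. Right column: recursive layout.
        return _sub_table(value.members)
    if isinstance(value, Pointer):
        if not value.initialized:
            return "?"
        if value.raw == 0:
            return "NULL"
        if not value.dereferenceable:
            return "!"
        return ""  # left empty; an edge is attached in a later pass
    raise TypeError(f"unsupported value: {value!r}")


def _sub_table(rows) -> str:
    """Build a two-column sub-table with one row per (label, child) pair."""
    body = "".join(
        f"<TR><TD>{escape(label)}</TD><TD>{layout(child)}</TD></TR>"
        for label, child in rows
    )
    return f"<TABLE>{body}</TABLE>"


# A record such as: struct node { int data; struct node *next; }
node = Record([("data", Scalar("5")), ("next", Pointer(True, 0, False))])
print(layout(node))
```

Because arrays and records both reduce to "label on the left, recursive layout on the right", nested aggregates such as an array of records need no special handling.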
Natural Language Explanation
- Previous studies have shown that automatically generated natural language explanations of program source code can be useful for novice programmers.
- Unfortunately, this area lacks new developments for the C programming language.
- This may be due to the difficulties of developing tools for the C programming language.
Natural Language Explanation
- The explanatory system is built upon the Clang libraries, which provide robust and sustainable parsing and semantic analysis of the C programming language.
- Clang produces the AST (Abstract Syntax Tree), and the system creates natural language explanations for individual nodes in Clang's AST.
Natural Language Explanation
Figure: Source code
Figure: Abstract syntax tree
Natural language explanation for IfStmt: It consists of a condition, a body, and an else.
Natural Language Explanation
- The system can optionally use information about the runtime state of the program when generating explanations.
- This information is provided to the message formatting system in the same manner as the semantic information provided by the AST nodes.
- To return to our example, the explanation of an if statement can say whether the body or the else statement is executed, based on the value that was produced by the condition.
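As a rough illustration of how an explanation might combine static and runtime information, the sketch below explains an if statement with or without a recorded condition value. The `IfStmt` class and `explain_if` function are hypothetical names made up for this example, and the wording of the messages is invented; neither reflects Clang's AST classes nor the system's actual message format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IfStmt:
    """Hypothetical stand-in for an if-statement AST node."""
    condition: str  # source text of the condition
    has_else: bool


def explain_if(node: IfStmt, condition_value: Optional[int] = None) -> str:
    """Explain an if statement, optionally using the recorded condition value."""
    if node.has_else:
        structure = "a condition, a body, and an else"
    else:
        structure = "a condition and a body"
    text = f"This if statement consists of {structure}."
    if condition_value is None:
        # No runtime state available: static explanation only.
        return text
    text += f" The condition ({node.condition}) evaluated to {condition_value},"
    if condition_value != 0:
        # In C, any non-zero value counts as true.
        return text + " which is non-zero, so the body is executed."
    if node.has_else:
        return text + " so the else is executed."
    return text + " so the body is skipped."


stmt = IfStmt(condition="x > 0", has_else=True)
print(explain_if(stmt))
print(explain_if(stmt, condition_value=0))
```

Passing the runtime value as an optional argument keeps one code path for both the static and the trace-aware explanation.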
Integration with SeeC
The graphical visualization system and the explanation generation system are integrated into SeeC's graphical trace viewer.
Figure: Integrated system
Features of the integrated system
- Executions of programs are recorded in trace files.
- The graphical trace viewer can load these trace files, allowing students to inspect the recorded state of the program at any point during its execution.
- The system also supports contextual navigation based on particular items in the state.
- A student may also select a particular function call and rewind to the beginning of the call, or move forwards until the call is complete.
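The call-based navigation described above can be sketched over a toy trace. The flat list of event strings and the function `call_bounds` are assumptions made for this illustration; SeeC's real trace format and viewer API are not shown in this review.

```python
def call_bounds(events, position):
    """Return (start, end) indices of the innermost call containing position.

    events is a hypothetical flattened trace: a list of strings in which
    "call" marks entry into a function, "return" marks its exit, and any
    other string is an ordinary step.
    """
    # Rewind: scan backwards, skipping over calls that already returned.
    depth = 0
    start = None
    for i in range(position, -1, -1):
        if events[i] == "return" and i != position:
            depth += 1
        elif events[i] == "call":
            if depth == 0:
                start = i
                break
            depth -= 1
    if start is None:
        raise ValueError("position is not inside any recorded call")

    # Move forwards: scan from the call until its matching return.
    depth = 0
    for j in range(start, len(events)):
        if events[j] == "call":
            depth += 1
        elif events[j] == "return":
            depth -= 1
            if depth == 0:
                return start, j
    # The call was still active when the trace ended.
    return start, len(events) - 1


trace = ["call", "stmt", "call", "stmt", "return", "stmt", "return"]
print(call_bounds(trace, 3))  # position inside the nested call
print(call_bounds(trace, 5))  # position back in the outer call
```

Rewinding corresponds to jumping to `start`, and moving forwards until the call is complete corresponds to jumping to `end`.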
Problems with the system and modifications required
There are some problems with this approach:
- The system must determine which of multiple competing types should be rendered for a particular area of memory, for example in unions. With some modifications, this can be solved.
- Even relatively simple statements in the C programming language may consist of several AST nodes, so a student considering an entire statement must view the explanations for the individual AST nodes. It may be possible to create a system that combines fragments of explanations into a unified explanation for an entire statement.
Overview
- Program visualization and natural language explanations of program behaviour have been used in several debugging systems, but very few have been developed for novices, and especially for novice C programmers.
- The system described in the paper might become a useful tool for novice C programmers, helping them improve their programming knowledge and debug their programs.
Thank You!
Questions please