Visualizing Graphical Probabilistic Models

Chih-Hung Chiang*, Patrick Shaughnessy, Gary Livingston, Georges Grinstein
Department of Computer Science, University of Massachusetts Lowell, Lowell, MA 01854

ABSTRACT

Complex probabilistic models are difficult to evaluate, not only for consistency with the domain theory but also for novelty and significance. In this paper we describe the current visual representations of Bayesian networks and then present an implementation that enhances these visualizations by integrating several well-known techniques. We then apply our approach to the exploration of graphical models, using Bayesian networks as an illustration.

Keywords: Graphical probabilistic model, Bayesian network, visualization

1. INTRODUCTION

For many applications, merely learning or manually developing probabilistic models and then evaluating them quantitatively is not enough. These models also need to be explored qualitatively in order to identify the portions that are consistent with the domain theory, the portions that are inconsistent with it, and the portions that represent potentially novel and significant discoveries. Methods for visualizing probabilistic models already exist in a variety of areas, including geospatial data and flow visualization (Pang et al. [1]). We present interactive methods for visualizing the probabilities contained in graphical models and use Bayesian networks to illustrate these techniques.

Bayesian networks (Pearl [2]) are graphical structures for representing probabilistic relationships among variables and for performing probabilistic inference with those variables. They have proven to be a valuable tool for encoding, learning and reasoning with probabilistic relationships.
A Bayesian network consists of two primary components. The qualitative component is a directed acyclic graph (DAG) in which each node represents an attribute of the data and the edges represent the dependencies among the attributes. The quantitative component is a set of conditional probability distributions that give the probabilities for the value of each node given the values of its parents.

Figure 1 shows a Bayesian network with three Boolean attributes and their corresponding conditional probability tables (CPTs). In this network, attribute A has a 50% chance of having the value T, given no other information. Attribute C has a 37.5% chance of having the value T if A's value is T and a 62.5% chance of having the value T if A's value is F. When A and C are both T or both F, B has a 50% chance of having the value T; when A and C have different values, B has a 90% chance of having the value T. Table 1 shows the joint probability distribution for the Bayesian network in Figure 1.

Many model analysis issues can benefit from visual analysis. Card et al. [3] describe numerous information visualization techniques. Thearling et al. [4] emphasize that visualizing a model should allow a user to understand, discuss and explain the logic behind the model, thereby gaining the user's trust. We therefore hypothesize that information visualization techniques should prove useful for harnessing Bayesian networks for modeling and analyzing the underlying data.

The visualization methods presented in this paper are implemented in an interactive visualization tool for the exploratory analysis of Bayesian networks. This tool allows users to explore the cause-effect relationships represented in the conditional probability tables through simple, interactive manipulations of the visual representations, thus facilitating the discovery of meaningful causal features.

*[email protected]; phone
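As a worked illustration of how the CPTs determine the joint distribution, the following minimal sketch multiplies the conditional probabilities stated above for the three-node network (A is a parent of C, and A and C are parents of B). The function and variable names are illustrative assumptions and are not part of the tool described in this paper.

```python
from itertools import product

# CPT entries as stated in the text (probability that the node is T).
P_A_TRUE = 0.5
P_C_TRUE_GIVEN_A = {True: 0.375, False: 0.625}


def p_b_true(a, c):
    """P(B=T | A=a, C=c): 0.5 when A and C agree, 0.9 when they differ."""
    return 0.5 if a == c else 0.9


def bernoulli(p_true, value):
    """Probability of `value` for a Boolean variable with P(T) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(a, b, c):
    """Chain rule for this network: P(a, b, c) = P(a) * P(c | a) * P(b | a, c)."""
    return (bernoulli(P_A_TRUE, a)
            * bernoulli(P_C_TRUE_GIVEN_A[a], c)
            * bernoulli(p_b_true(a, c), b))


# Enumerate all eight assignments (a, b, c).
table = {abc: joint(*abc) for abc in product([True, False], repeat=3)}

for (a, b, c), p in table.items():
    label = " ".join("T" if v else "F" for v in (a, b, c))
    print(label, p)
```

The eight products sum to 1, which is a quick sanity check that the CPTs define a valid joint distribution.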

Figure 1. Simple Bayesian network with CPTs.

Table 1. Joint probability distribution for the network in Figure 1

a  b  c  P(A=a AND B=b AND C=c)
T  T  T
T  T  F
T  F  T
T  F  F
F  T  T
F  T  F
F  F  T
F  F  F

2. RELATED WORK

For the past twenty years, much research has focused on developing Bayesian networks (learning the structure of the network) and on performing inference. For example, Friedman et al. [5] used Bayesian networks to analyze expression data. Different software packages serve different purposes, with some tools focusing on learning while others emphasize inference. Murphy [6] surveyed thirty-four popular software packages for graphical models based on several features, of which the following are the most relevant to our work:

1. Does the package support continuous random variables and does it support sampling?
2. Does the package support undirected graphs?
3. Does the package learn CPTs and the structure of the network?
4. Does the package support utility/action nodes?

Murphy's analysis motivated the development of the Bayes Net Toolbox (BNT) [6]. BNT is a widely used open-source Matlab package for directed graphical models. One of its strengths is that it offers a variety of inference and learning algorithms that can be used for different kinds of models.

Omitted from Murphy's survey was whether or not a package provided support for visualizing Bayesian networks. Table 2 is the result of a survey we performed on the Bayesian network visualization of some of the most popular software packages. All of the chosen tools provide visualization of the edges in the models. Some of these tools (Hugin Expert [7], BayesBuilder [8], Netica [9] and GeNIe/SMILE [10]) display the probability distribution as a bar chart within each node, providing an overview of the probability distributions of the values of all nodes at a glance (Figure 2), while other tools only show the node as a simple geometric object (e.g., a circle). GeNIe/SMILE (Figure 3) also provides an interface for users to display bar chart or pie chart distributions of the probabilities for a selected column. While these tools provide visualizations of the probabilities of a node's values, none provide graphical views of the entire conditional probability tables.

Table 2. Comparison of software for visualizing Bayesian networks

Name | Visual representation of the model | Graphical visualization of all nodes' probability distributions of values | Graphical visualization of conditional probability tables
Bayes Net Toolbox | Yes | No | No
Hugin Expert | Yes | Bar charts | No
BayesBuilder | Yes | Bar charts | No
WinMine | Yes | No | No
BayesianLab | Yes | No | No
Netica | Yes | Bar charts |
MSBNx | Yes | No | No
Analytica | Yes | No | No
GeNIe/SMILE | Yes | Bar charts | Displays a pie chart distribution of the probabilities for the selected column

Figure 2. The bar chart visualization of the probability distribution in each node in Netica

Figure 3. The pie chart visualization for the selected column of a CPT in GeNIe/SMILE

A major issue in the elicitation of numerical parameters in Bayesian networks is navigation through large CPTs. Wang et al. [11] developed two user interface tools, the CPTREE (Conditional Probability Tree) and the sCPT (shrinkable Conditional Probability Table), that aim at improving navigation through large CPTs and at improving the interactive assessment of discrete conditional probability distributions. The CPTREE is a tree view of a CPT that allows users to shrink any of the conditioning parents, while the sCPT is a table view of a CPT with a shrinkable structure for any dimension of the table. Both reduce the size of the table displayed on the screen and allow a user to navigate efficiently through CPTs. GeNIe/SMILE has incorporated these navigation techniques into its package.

Despite the large number of software tools available for modeling Bayesian networks, surprisingly little work has been done on causal relation visualization, as Elmqvist and Tsigas [12] observe. They proposed the use of partitioned polygons with color-coded segments to show the dependencies between the variables. Zapata-Rivera et al. [13] developed a Bayesian network visualization tool (VisNet), which uses temporal order, color, size, proximity and animation to visualize cause-effect relationships, marginal probabilities and probability propagation. However, none of these tools provide for visualization of the conditional probability tables. This is the focus of our work, and the next two sections discuss the visual representations of causality and of conditional probability tables.

3. VISUALIZATION OF CAUSALITY

Graphs with nodes and directed edges are widely used to model dependency relationships (Heckerman et al. [14]), and Bayesian networks use directed acyclic graphs to represent causalities. Nodes represent variables, and directed edges represent direct probabilistic dependences.
Under these circumstances, the layout of the DAG plays an important role in the depiction of causality. For instance, the temporal order of the nodes can offer an intuitive notion of the cause-effect relationships. Moreover, the use of visualization attributes such as color and size in the representation of the nodes and edges can also provide valuable information about the networks.

3.1 Layout of Bayesian Networks

Although a number of software tools have been created to build and visualize Bayesian networks, Marriott and Moulder [15] mention in their study that the layout provided by these Bayesian network visualization tools is generally poor. DAG layout is a well-studied area (Sugiyama et al. [16]). Our focus was not on creating a new DAG drawing algorithm, but instead on finding a layout algorithm for an acyclic directed graph G = (V, E), with nodes V and edges E, whose design rules are based on the following aesthetic principles:

• Use temporal order to build a hierarchical structure for the cause-effect relationships
• Minimize the number of edge crossings
• Keep the edges short and keep the drawing area as small as possible without compromising the readability of the network
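These principles map naturally onto a layered drawing pipeline: place each child below its parents, order the nodes within each layer to reduce crossings, and derive coordinates from layer and order. The sketch below is a minimal illustration under stated assumptions, not the implementation used in our tool: it uses longest-path layering from the root nodes and a single top-down barycenter pass as the crossing-reduction heuristic, and all function names and the spacing parameters `dx` and `dy` are hypothetical.

```python
from collections import defaultdict


def assign_layers(nodes, edges):
    """Longest-path layering: a node sits one layer below its deepest parent."""
    parents = defaultdict(list)
    for u, v in edges:
        parents[v].append(u)
    layer = {}

    def depth(n):
        if n not in layer:
            layer[n] = 1 + max((depth(p) for p in parents[n]), default=-1)
        return layer[n]

    for n in nodes:
        depth(n)
    return layer


def order_layers(nodes, edges, layer):
    """One top-down barycenter pass (an assumed heuristic): sort each layer
    by the mean position of each node's parents in the layers above."""
    parents = defaultdict(list)
    for u, v in edges:
        parents[v].append(u)
    rows = [[] for _ in range(max(layer.values()) + 1)]
    for n in nodes:
        rows[layer[n]].append(n)
    for k in range(1, len(rows)):
        pos = {n: i for row in rows[:k] for i, n in enumerate(row)}
        # Every node below layer 0 has at least one parent, so len(...) > 0.
        rows[k].sort(key=lambda n: sum(pos[p] for p in parents[n]) / len(parents[n]))
    return rows


def count_crossings(rows, edges, layer):
    """Count crossing pairs among edges that join the same two layers."""
    pos = {n: i for row in rows for i, n in enumerate(row)}
    total = 0
    for i, (u1, v1) in enumerate(edges):
        for u2, v2 in edges[i + 1:]:
            same_layers = layer[u1] == layer[u2] and layer[v1] == layer[v2]
            if same_layers and (pos[u1] - pos[u2]) * (pos[v1] - pos[v2]) < 0:
                total += 1
    return total


def layout(nodes, edges, dx=100, dy=80):
    """Layer index gives the y-coordinate, order within the layer gives x."""
    layer = assign_layers(nodes, edges)
    rows = order_layers(nodes, edges, layer)
    return {n: (i * dx, k * dy) for k, row in enumerate(rows) for i, n in enumerate(row)}


# The three-node network of Figure 1: A -> C, A -> B, C -> B.
print(layout(["A", "B", "C"], [("A", "C"), ("A", "B"), ("C", "B")]))
```

A single barycenter pass does not guarantee a minimum number of crossings; production layout engines typically iterate upward and downward sweeps and add local transpositions.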

One popular approach in graph drawing is the hierarchical approach (Battista et al. [17]). It has many variants for drawing DAGs and matches our design rules very well. Figure 4 illustrates the three steps we used to build the network.

Figure 4. Steps in building the layout of the networks

Step 1: layer assignment. During this step, the nodes are assigned to layers L1, L2, L3, ..., Lh based on their cause-effect relationships. The layer number of node j is larger than that of node i whenever there is an edge from parent node i to child node j. We first find the longest path and assign increasing layer numbers to the nodes along it. Nodes that do not appear on this path are assigned layer numbers based on their relationships with the nodes that already have one.

Step 2: crossing reduction. Nodes in each layer are ordered to reduce the number of edge crossings. We use a heuristic method to search for a good ordering, building on the approach of Gansner et al. [18].

Step 3: x- and y-coordinate assignment. After layer assignment and ordering, the layer number is used to assign the y-coordinate and the ordering is used to assign the x-coordinate.

3.2 Visualization of Nodes and Edges

As mentioned earlier, the nodes in Bayesian networks represent variables of the data, and these variables often have additional properties. Some are obvious, such as the number of parent variables and the number of children. Some properties, such as the name of the variable, can easily be added to the display. However, there are some properties that are hidden or not so obvious. We use color and size to visualize these hidden properties. For edges, we use thickness and color to highlight properties related to causal relationships, such as the correlation value and the computed confidence level.

4. VISUALIZATION OF CONDITIONAL PROBABILITY TABLES

We believe our most significant contribution is the visualization of conditional probability tables.
Currently there are two common methods for embedding conditional probability tables into the graph, both with limitations. The first method uses formal mathematical descriptions of the conditional probabilities and places these descriptions either beside or under the node they belong to. The most obvious problem with this approach is the size of the descriptions, which makes it difficult to fit all of them into the graph when the variables have multiple values. Even with a modest number of nodes and a small number of values, these visualizations quickly become crowded.

The second common method is the use of separate probability tables. These views range from one table per node, showing that node's conditional probabilities, to a single large table placed beside the graph that shows the conditional probabilities for all nodes, a compact representation of the joint probability distribution made possible by conditional independence.

Besides the size issue, neither of the approaches above provides an easy or effective way to quickly perceive conditional relationships between two variables. In our method, we use colormaps (or heatmaps) to visualize the conditional probabilities, which avoids the major limitations described above. The colormap, a common visualization that is popular in gene expression and microarray data analysis, is an alternative to table visualization. Instead of displaying each table value directly, it uses an icon, glyph or color to represent the cell value visually.

One of the important advantages of colormaps, and the main reason we use them instead of tabular or mathematical descriptions, is the efficiency of visual quantitative comparisons. A colormap view of conditional probability tables quickly provides a user with the overall context, including correlations that are not readily visible in the usual table view or in mathematical descriptions. Size is clearly another benefit: each colormap cell is smaller than the corresponding table cell with a numeric value. The difference is even more pronounced when colormap displays are compared to mathematical descriptions. Figure 5 shows an example of correlations between parent and child nodes, ranging from strongly negative on the left side to strongly positive on the right side.

Figure 5. Colormap visualization of a CPT

5. IMPLEMENTATION

Our Bayesian network visualization package includes two major components: a Bayesian model learning toolbox, which implements Bayesian network learning algorithms, and our visualization tool, BayesViz, which is specifically targeted at Bayesian networks. BayesViz is written in Java and based on the Universal Visualization Platform (UVP) [19]. UVP is a general-purpose platform for building visualization and analysis applications. It is composed of a central framework and a large number of plug-in tools, which allowed us to focus on the design of the visualizations and resulted in a rapid implementation of experimental tools.

Figure 6 shows the visualization process. The Bayesian model learning toolbox constructs the Bayesian network structure and computes the conditional probability tables using the learning algorithms the user selects. The learned causality relationships and CPT information are passed to the visualization tool. A user can interact with the visualization tool to obtain more detailed information about the model through, for example, a probing panel that shows the corresponding numeric values when the user hovers the mouse over a colormap.

Figure 6. The integrated visualization process
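The colormap encoding and the probing lookup can be sketched in a few lines. This is a minimal Python illustration rather than the Java code of BayesViz: the linear black-to-yellow color scale, the row and column layout of the CPT, and the names `prob_to_rgb`, `cpt_to_colormap` and `probe` are all assumptions made for the example.

```python
def prob_to_rgb(p, low=(0, 0, 0), high=(255, 255, 0)):
    """Map a probability in [0, 1] to an RGB triple by linear interpolation.

    The black-to-yellow scale is an assumed color scheme.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return tuple(round(lo + p * (hi - lo)) for lo, hi in zip(low, high))


def cpt_to_colormap(cpt):
    """Convert a CPT (rows: parent combinations, columns: child values)
    into a grid of colors with the same shape."""
    return [[prob_to_rgb(p) for p in row] for row in cpt]


def probe(cpt, row, col):
    """Return the numeric value behind a colormap cell, as a probing panel would."""
    return cpt[row][col]


# CPT of node B from Figure 1; rows are (A, C) = TT, TF, FT, FF,
# columns are B = T and B = F.
cpt_b = [
    [0.5, 0.5],
    [0.9, 0.1],
    [0.9, 0.1],
    [0.5, 0.5],
]

colors = cpt_to_colormap(cpt_b)
print(colors[1][0], probe(cpt_b, 1, 0))
```

Keeping the numeric table alongside the color grid means the probe is a plain index lookup, so the colors never need to be inverted back into probabilities.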

6. EXAMPLE

Figures 7 and 8 show BayesViz's visualization of a model generated using random hill-climbing on 50 genes from the yeast dataset presented by Spellman et al. [20]. Figure 7 presents the inferred network with edges colored by correlation coefficient (green indicates a negative correlation coefficient and red a positive one) and with colormap tables representing the conditional relationships between the values of parent and child nodes. A strong positive correlation between a parent and a child may be recognized by a colormap with strong yellow banding from lower left to upper right, and a strong negative correlation is signaled by strong yellow banding from upper left to lower right. For example, the colormap for the edge from parent YNL031C to child YBR010W (A in Figure 7) indicates a strong positive correlation.

With this visualization method, positive correlations may be quickly identified by looking for red edges and negative correlations by looking for green edges. The quality of the correlations is identified by viewing the banding in the colormaps. This view thus provides for the quick identification of the types and strengths of the dependencies between parent and child nodes. This type of analysis is useful to a scientist in identifying relationships for more detailed analysis or experimentation, and it helps in suggesting new hypotheses to be tested.

The automatically generated layout of the network can be adjusted by dragging nodes, so that any undesired choices made by the layout algorithm can be easily overcome, and the sizes and color schemes of the conditional probability tables can be adjusted (B in Figure 7). The color and size of the nodes and edges can be made to depend on network properties selected by the user, allowing such properties to be easily explored.

Figure 7. Colormap on each edge
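The edge coloring described in this section can be illustrated with a short sketch that computes a Pearson correlation coefficient for a parent-child pair and maps its sign to the red/green scheme. The width scaling, the gray color for a zero coefficient, and the names `pearson` and `edge_style` are assumptions for illustration only; the sample values are made up and are not taken from the yeast dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def edge_style(r, max_width=6.0):
    """Red for positive, green for negative correlation (as in the text).

    Gray for r == 0 and the linear width scaling are assumptions.
    """
    color = "red" if r > 0 else "green" if r < 0 else "gray"
    return {"color": color, "width": 1.0 + (max_width - 1.0) * abs(r)}


# Hypothetical discretized expression levels for a parent and a child gene.
parent_levels = [0, 1, 2, 2, 1, 0]
child_levels = [0, 1, 2, 1, 1, 0]

r = pearson(parent_levels, child_levels)
print(round(r, 3), edge_style(r))
```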

In Figure 8, we see the colormaps for the conditional probability tables of the nodes. Each row represents a combination of parent conditions, and the rows are sorted by the values of the first parent (selected arbitrarily), then by the values of the second parent, and so forth. Changes in the value of the first parent are indicated by a small gap in the colormap. This allows quicker recognition of how changes in one parent affect the conditional dependencies between the remaining parents' values and the child's values. For instance, the colormap for the node YLR049C, shown in detail with the associated probabilities in C in Figure 8, shows that the conditional relationship between the second parent and YLR049C varies considerably with different values of the first parent.

Figure 8. Colormap on each node

7. CONCLUSION

The ability to analyze a model inferred from data by a learning system is important. Some portions of the model may be incorrect or irrelevant, and interactive visualization provides a powerful tool supporting this analysis. We have presented an integrated Bayesian network analysis and interactive visualization tool that uses layering, color, and colormaps. We have used this implementation in a microarray gene expression analysis activity, and we suggest that these methods allow the strength and quality of conditional relationships to be quickly identified and analyzed.

REFERENCES

1. A. T. Pang, C. M. Wittenbrink, S. K. Lodha, (1996). Approaches to Uncertainty Visualization, Technical Report, UCSC-CRL University of California, Santa Cruz.
2. J. Pearl, (1997). Graphical Models for Probabilistic and Causal Reasoning in The Computer Science and Engineering Handbook, A. Tucker, Editor. CRC Press: Boca Raton, FL. p
3. S. K. Card, J. D. Mackinlay, B. Shneiderman, (1996). Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann.
4. K. Thearling, B. Becker, D. DeCoste, B. Mawby, M. Pilote, D. Sommerfield, (2002). Visualizing Data Mining Models, in Information Visualization in Data Mining and Knowledge Discovery, G. G. Grinstein, U. Fayyad and A. Wierse, Editors. Morgan Kaufmann: San Francisco, CA.
5. N. Friedman, I. Nachman, M. Linial, D. Pe'er, (2000). Using Bayesian Networks to Analyze Expression Data. Journal of Computational Biology, 7(3-4): p
6. K. P. Murphy, (2001). The Bayes Net Toolbox for Matlab in the 33rd Symposium on the Interface. Costa Mesa, California.
7. Hugin Expert, Infosys Technologies Limited
8. M. Nijman, E. Akay, W. Wiegerinck, SNN Nijmegen, BayesBuilder
9. Netica, Norsys Software Corp
10. GeNIe/SMILE, Decision Systems Laboratory, University of Pittsburgh
11. H. Wang, M. J. Druzdzel, (2000). User Interface Tools for Navigation in Conditional Probability Tables and Elicitation of Probabilities in Bayesian Networks. In Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-2000). San Francisco, CA, USA.
12. N. Elmqvist, P. Tsigas, (2003). Causality Visualization Using Animated Growing Polygons in IEEE Symposium on Information Visualization. Seattle, Washington, USA.
13. J.-D. Zapata-Rivera, E.N., J. E. Greer, (1999). Visualization of Bayesian Belief Networks in IEEE Visualization
14. D. Heckerman, D. M. Chickering, C. Meek, R. Rounthwaite, C. Kadie, (2000). Dependency Networks for Inference, Collaborative Filtering, and Data Visualization. Journal of Machine Learning Research, 1: p
15. K. Marriott, P. Moulder, L. Hope, C. Twardy, (2005). Layout of Bayesian Networks in the 28th Australian Computer Science Conference. The University of Newcastle, Australia: Estivill-Castro, Ed.
16. K. Sugiyama, S. Tagawa, M. Toda, (1981). Methods for Visual Understanding of Hierarchical System Structures. IEEE Transactions on Systems, Man, and Cybernetics, 11: p
17. G. D. Battista, P. Eades, R. Tamassia, I. G. Tollis, (1999). Graph Drawing, Algorithms for The Visualization of Graphs. Prentice Hall.
18. E. R. Gansner, E. Koutsofios, S. C. North, K. Vo, (1993). A Technique for Drawing Directed Graphs. IEEE Transactions on Software Engineering, 19(3): p
19. A. G. Gee, H. Li, M. Yu, M. B. Smrtic, U. Cvek, H. Goodell, V. Gupta, C. Lawrence, J. Zhou, C. Chiang, G. G. Grinstein, (2005). Universal visualization platform. In SPIE Visualization and Data Analysis. San Diego, California.
20. P. T. Spellman, G. Sherlock, M. Q. Zhang, V. R. Iyer, K. Anders, M. B. Eisen, P. O. Brown, D. Botstein, B. Futcher, (1998). Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization. Molecular Biology of the Cell, 9: p

Chapter 14 Managing Operational Risks with Bayesian Networks

Chapter 14 Managing Operational Risks with Bayesian Networks Chapter 14 Managing Operational Risks with Bayesian Networks Carol Alexander This chapter introduces Bayesian belief and decision networks as quantitative management tools for operational risks. Bayesian

More information

Visualizing e-government Portal and Its Performance in WEBVS

Visualizing e-government Portal and Its Performance in WEBVS Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR [email protected] Abstract An e-government

More information

Information Visualization of Attributed Relational Data

Information Visualization of Attributed Relational Data Information Visualization of Attributed Relational Data Mao Lin Huang Department of Computer Systems Faculty of Information Technology University of Technology, Sydney PO Box 123 Broadway, NSW 2007 Australia

More information

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

More information

Agenda. Interface Agents. Interface Agents

Agenda. Interface Agents. Interface Agents Agenda Marcelo G. Armentano Problem Overview Interface Agents Probabilistic approach Monitoring user actions Model of the application Model of user intentions Example Summary ISISTAN Research Institute

More information

шли Information Visualization in Data Mining and Knowledge Discovery Edited by digimine, Inc. University of Massachusetts, Lowell

шли Information Visualization in Data Mining and Knowledge Discovery Edited by digimine, Inc. University of Massachusetts, Lowell Information Visualization in Data Mining and Knowledge Discovery Edited by USAMA FAYYAD digimine, Inc. GEORGES G. GRINSTEIN University of Massachusetts, Lowell ANDREAS WIERSE VirCinity IT-Consulting GmbH

More information

VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc., http://willsfamily.org/gwills

VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc., http://willsfamily.org/gwills VISUALIZING HIERARCHICAL DATA Graham Wills SPSS Inc., http://willsfamily.org/gwills SYNONYMS Hierarchical Graph Layout, Visualizing Trees, Tree Drawing, Information Visualization on Hierarchies; Hierarchical

More information

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia

More information

Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results

Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results , pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department

More information

Interactive Exploration of Decision Tree Results

Interactive Exploration of Decision Tree Results Interactive Exploration of Decision Tree Results 1 IRISA Campus de Beaulieu F35042 Rennes Cedex, France (email: pnguyenk,[email protected]) 2 INRIA Futurs L.R.I., University Paris-Sud F91405 ORSAY Cedex,

More information

In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999

In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999 In Proceedings of the Eleventh Conference on Biocybernetics and Biomedical Engineering, pages 842-846, Warsaw, Poland, December 2-4, 1999 A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka

More information

Network-Based Tools for the Visualization and Analysis of Domain Models

Network-Based Tools for the Visualization and Analysis of Domain Models Network-Based Tools for the Visualization and Analysis of Domain Models Paper presented as the annual meeting of the American Educational Research Association, Philadelphia, PA Hua Wei April 2014 Visualizing

More information

Analecta Vol. 8, No. 2 ISSN 2064-7964

Analecta Vol. 8, No. 2 ISSN 2064-7964 EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

More information

A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka Onisko, M.S., 1,2 Marek J. Druzdzel, Ph.D., 1 and Hanna Wasyluk, M.D.,Ph.D.

A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka Onisko, M.S., 1,2 Marek J. Druzdzel, Ph.D., 1 and Hanna Wasyluk, M.D.,Ph.D. Research Report CBMI-99-27, Center for Biomedical Informatics, University of Pittsburgh, September 1999 A Bayesian Network Model for Diagnosis of Liver Disorders Agnieszka Onisko, M.S., 1,2 Marek J. Druzdzel,

More information

Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

More information

SuperViz: An Interactive Visualization of Super-Peer P2P Network

SuperViz: An Interactive Visualization of Super-Peer P2P Network SuperViz: An Interactive Visualization of Super-Peer P2P Network Anthony (Peiqun) Yu [email protected] Abstract: The Efficient Clustered Super-Peer P2P network is a novel P2P architecture, which overcomes

More information

MicroStrategy Analytics Express User Guide

MicroStrategy Analytics Express User Guide MicroStrategy Analytics Express User Guide Analyzing Data with MicroStrategy Analytics Express Version: 4.0 Document Number: 09770040 CONTENTS 1. Getting Started with MicroStrategy Analytics Express Introduction...

More information

Exercise with Gene Ontology - Cytoscape - BiNGO

Exercise with Gene Ontology - Cytoscape - BiNGO Exercise with Gene Ontology - Cytoscape - BiNGO This practical has material extracted from http://www.cbs.dtu.dk/chipcourse/exercises/ex_go/goexercise11.php In this exercise we will analyze microarray

More information

JustClust User Manual

JustClust User Manual JustClust User Manual Contents 1. Installing JustClust 2. Running JustClust 3. Basic Usage of JustClust 3.1. Creating a Network 3.2. Clustering a Network 3.3. Applying a Layout 3.4. Saving and Loading

More information

A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data

A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data White Paper A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data Contents Executive Summary....2 Introduction....3 Too much data, not enough information....3 Only

More information

DETERMINING THE CONDITIONAL PROBABILITIES IN BAYESIAN NETWORKS

DETERMINING THE CONDITIONAL PROBABILITIES IN BAYESIAN NETWORKS Hacettepe Journal of Mathematics and Statistics Volume 33 (2004), 69 76 DETERMINING THE CONDITIONAL PROBABILITIES IN BAYESIAN NETWORKS Hülya Olmuş and S. Oral Erbaş Received 22 : 07 : 2003 : Accepted 04

More information

Graduate Co-op Students Information Manual. Department of Computer Science. Faculty of Science. University of Regina
