3D-SE Viewer: A Text Mining Tool based on Bipartite Graph Visualization
|
|
|
- Solomon Thompson
- 9 years ago
- Views:
Transcription
1 Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August 12-17, D-SE Viewer: A Text Mining Tool based on Bipartite Graph Visualization Shiro Usui, Antoine Naud, Naonori Ueda, Tatsuki Taniguchi Abstract A new interactive visualization tool is proposed for textual data mining based on bipartite graph visualization. Applications to three text datasets are presented to show the capability of this interactive tool to visualize complex relational information between two sets of items by embedding their graph in a 3-dimensional space. Information extracted from texts, such as keywords, indexing terms or topics are visualized to allow interactive browsing of a field of research featured by keywords, topics or research teams. This 3-D visualization tool conveys more information than planar or linear displays of graphs. I. INTRODUCTION Very often when dealing with textual data, people are interested in the relationships between entities belonging to two distinct categories: e.g. relationships between words and documents, between topics and documents or between authors and documents. The most widely used approach in Natural Language Processing is the vector space model. In this model, a set of terms T is first built by extracting words from a collection of documents D followed by stop words removal and stemming [6]. The numbers of occurrences of each term in each document (usually called frequency) are counted and denoted f ij. A matrix F is built, with one row for each term, one column for each document, and with the frequencies f ij as entries. When the number of documents N in the collection is in the range of a few thousands (as it is in the examples presented below), the number of terms extracted is often larger than a few tens of thousands, leading to very high dimensional space for the documents. In order enable further processing of matrix F, we reduce its size by selecting a number M of terms using a ranking scheme. This is done by ranking the terms according to a term weighting scheme and retaining the top M terms (M 1000). Different term weighting schemes can be found in the Information Retrieval literature, which catch different desired properties for the terms (see [2] for a review and comparison of term weighting schemes). The most popular one is probably TF.IDF, the term frequency inverse document frequency which has been used in this work. The matrix of frequencies F is usually sparse because most of the terms occur only in a few documents. In this case, it is convenient to regard the data as a graph which vertices represent both terms and documents, and each edge connects one term to one document if the term occurs at least once in the document. The frequencies in matrix F are Shiro Usui and Antoine Naud are with the Laboratory for Neuroinformatics, RIKEN Brain Science Institute, 2-1 Hirosawa, Wako City, Saitama , JAPAN, [email protected], [email protected], Naonori Ueda is with the NTT Communication Science Laboratories, Kyoto, JAPAN, and Tatsuki Taniguchi is with IVIS Corp., JAPAN. then converted to binary entries to build a second occurrence matrix O (o ij = sgn(f ij )), from which a bipartite graph is defined. In such a graph, each term is connected to all the documents in which it occurs, and each document is connected to all the terms from set T it contains, but there is no connection between terms, nor between documents. II. BIPARTITE GRAPH VISUALIZATION The purpose of bipartite graph visualization is to display simultaneously two types of relationships: the similarities existing between items within each of two subsets, on the basis of the relationships defined by the graph edges. In the terms and documents application introduced above, we are interested in seeing the similarities between terms, as well as similarities between documents, based on the occurrences of terms in the documents. In each of the applications presented in section III, the most important information is the configuration of the graph s vertices. The edges of the graph are not of primary interest here, they are not visualized by default, although the 3D-SE viewer allows to display them. A. Formal definition of a bipartite graph A graph G is defined as G := {V, E}, where V is the set of vertices or nodes and E the set of edges. Graph G is undirected if the pairs in E are unordered. An undirected graph G is called a bipartite graph if there exist a partition of the vertex set V = V A V B, so that there is no edge in E connecting V A to V B. B. The Spherical Embedding algorithm The Spherical Embedding (SE) algorithm [8] was primarily designed for the visualization of bipartite graphs. The items of the two subsets V A and V B are represented as nodes positioned on 2 concentric spheres in a 3-dimensional Euclidean space. 1 Items from subset V A are mapped on the inner sphere Θ A (with radius r A = 1), whereas items from V B are mapped on the outer sphere Θ B (with radius r B = 2). Items positions are defined in such a way that similar items in V A are close to each other on Θ A, and similar items in V B are close to each other on Θ B. Figure 1 illustrates the process of bipartite graph construction and visualization in a 2-dimensional space using the Spherical Embedding algorithm. To achieve this goal, we minimize a sum over the edges in E of the Euclidean distance between the corresponding nodes in the 3-D space. The minimization is performed through a gradient descent procedure, under 1 The number of dimensions of the embedding space was set to 3 because more information can be visualized in 3-D than in 2-D X/07/$ IEEE
2 Occurrence matrix Bipartite graph 2-D embedding b 1 b 2 b 3 b 4 b 5 b 6 b 7 a a a a a a a a a a 10 1 a a a a a V B set S.E. + V A set V A set V B set Θ B Θ A Fig. 1. Visualization process: from the binary occurrence matrix O to the bipartite graph and its visualization using the Spherical Embedding algorithm. constraints requiring that the points lie on the two spheres. This constrained optimization problem is converted to an unconstrained one using some sufficient statistics results. The whole process amounts to minimizing the sum of Euclidean distances between pairs of points along all the edges in E, that is find the nodes coordinates {x i } subject to x i T x i = r i 2 that minimize E = 1 2 M+N i<j w ij ( aij r i r j x i T x j ) 2 (1) where a ij = +1 if nodes i and j are connected and 1 otherwise, and r i = r A (resp. r B ) for nodes from subset V A (resp. V B ). The {w ij } are weights that can be used to give more emphasis on pairs of nodes belonging to E. C. Related approaches Although graph drawing is a very active field of research, very few work exist on the visualization of bipartite graphs. An interesting method called Anchor Maps [5] has been proposed recently. It provides a visualization of the graph in a 2-dimensional space, proceeding in two steps: the items of the first subset of vertices V A are plotted on a circle, after which the vertices of the second subset V B are added to the plot by allocating them with respect to the vertices of V A. A different approach is proposed by Zheng et al. [12], in which a layout of points on two parallel planes is sought for, such that a view in three dimensions from which the number of observed crossings will be minimal. Drawing the vertices on planar curves, as proposed by Di Giacomo et al. [1], is another interesting approach. Hong et al. proposed [3] a layered drawing of bipartite graphs in 2 1/2 dimensions, that is the vertices are allocated on two surfaces embedded in a three dimensional Euclidean space. In all these approaches, the ultimate goal is the visualization of the graph itself, so that nodes are displayed together with lines representing the edges. In our approach, the focus is set primarily on the visualization of the graph s vertices; whereas its edges can be displayed on demand by selecting interactively the corresponding nodes. D. The 3D-SE viewer visualization tool The 3D-SE viewer 2 visualization tool has been designed and developed for the general purpose of bipartite graphs visualization. In order to build an interactive tool available on web pages, it has been implemented as a Java applet. The visualized items are represented as colored nodes with labels, embedded in a 3-D Euclidean space. Their positions are first calculated by the SE algorithm, then they are viewed on a pseudo 3-D layout implemented using standard Java graphics context Java.awt.graphics. Interactively, the viewpoint can be modified by the user (rotation of the spheres around their center, zooming in or out, translation of the center). The nodes of subset V A (respectively V B ) are displayed on the inner sphere Θ A (resp. Θ B ) and they are listed in the list panel on the right (resp. left) side of the central view. A node can be selected by clicking it directly in the central view or in the side panel (several nodes can be selected by pressing the Shift or Controls key while clicking nodes). When a node is selected, all the edges connected to it are displayed, and the nodes from these edges second end are also selected. Finally, one can search a node by entering a search phrase matching its name in one of the two search text fields on top of the listing panels. III. APPLICATIONS Three different data sets have been used to test visually the performance of the 3D-SE viewer tool. These data sets are all based on relationships between words and other entities (research teams, documents or conference sessions), expressed in a bipartite graph. The data sets differ in size (numbers of nodes in each subset) that is, V A and V B and 2 3D-SE viewer c BSI-NI and NTT-CS Labs
3 Fig. 2. BSI-Team map: 3D-SE viewer-based visualization of RIKEN Brain Science Institute research teams. The nodes are colored according to the teams Units listed in the left panel. It can be seen that nodes with the same color appear in the same region on the outer sphere. also in their sparseness ratio (percentage of empty cells in occurrence matrix) defined as S = E /( V A V B ). A. The BSI-Team Map The first application of 3D-SE viewer was the visualization of the structure of a research organization: the RIKEN Brain Science Institute (BSI), by showing relationships between its laboratories and research units. The resulting tool is accessible on the Internet, it allows visitors to see at a glance all the teams of BSI, as well as search facilities and direct access to a chosen team. Such a representation convey much more information on inter-team similarities than a simple list of names or a planar graph does: In 3-dimensions, there is larger degree of freedom to position the teams in a way that reflects similarities of interests. In order to feature research interests of the different units of the Institute, a questionnaire has been sent to the 53 research team leaders. Based on their answers, a list of 123 keywords has been established for the whole Institute. This list was then sent back to team leaders who were asked select keywords that best correspond to their teams research interests, and distinguish keywords of primary and secondary interest. Gathering finally the answers, a table was formed with the keywords on rows, teams on columns and numbers in entries (1 or 2 when the keyword was selected as primary or secondary keywords, and 0 otherwise). From this frequency matrix, an binary occurrence matrix was derived (sparseness: 85.12%) and the bipartite graph was build and visualized using the 3D-SE viewer. The inner sphere represents the keywords and the outer sphere contains teams. The resulting interactive exploration tool was named BSI-Team Map 3 and it is accessible from RIKEN BSI s main webpage ( english/teammap/index.html). Since its public accessibility in December 2006, some positive comments was received about this tool s capability to show how people interact (personal friend communication). This tool is an effort to make the structure of a research structure more accessible and understandable to the international community, although a lack of international visibility of scientists webpages in Japan has been recently reproached [4]. We agree that visibility could be improved by visualizing topics, for instance extracted from the teams Web pages describing research activities, topic oriented structure being more accessible than people oriented. Figure 2 illustrates an example of team search: the 3 BSI-Team Map c RIKEN Brain Science Institute
4 team leader s name amari was entered in the search field on top of left panel. The found name is enlightened in the team list, the view centered on this point and the links to the keywords of this team were drawn. Then a single click on the team s name shows directly this team s webpage. Similarly, when entering a keyword in the search field in the top of the right panel, the links from this keyword to all the teams related to it will be shown. B. The Visiome Platform Understanding the brain as a system requires worldwide collaboration of scientists specializing in different areas of brain science. This issue confronting many areas of research and much more compounded in the fields of brain research, prompted for the development of a field called Neuroinformatics (NI). Its main goal is to help brain scientists handle the analysis, modeling, simulation, and management of the information resource before, during, and after the conduction of research. The Neuroinformatics Platforms such as Visiome ( [9] [11] aim to address these issues by providing portal sites to different fields of brain research, such as in NeuroInformatics Japan Center (NIJC). One vital component of the Neuroinformatics platform is the index tree which is used to organize the electronic materials (digital contents) submitted by the contributors. Automating the keyword index extraction is necessary to support the evolution of the platform in operation and for the establishment of new platforms. A tool for automatic extraction of the keywords from papers titles and abstracts was developed [10]. The 3D-SE viewer was used here to visualize both the indexing keywords and the documents of different types contained in the database, called contents. In a first application, a manually established list of indexing terms was used. Selection of keywords for each content was also performed by human experts in the field of vision science. The bipartite graph contains a set of 3002 contents linked to a set of 432 index terms. The sparseness of occurrence matrix is 99.23%. The density of the graph can be estimated by the number of edges per index terms, which ranges from 0 (5 index terms have no content linked to them) to 295, with a mean value of and a total of 9946 edges. Similarly, the number of edges per content ranges from 1 to 31 index terms, with an average value of In the Visiome platform, the 3002 contents are filed into 8 categories: Visual System, Visual Stimulus, Basic Neuroscience, Tools & Techniques, Models & Theory, Applications, Links, Binders, these categories are used to color the nodes on the display. Besides the selection and search possibilities that were described earlier, the 3D-SE viewer allows to access directly documents. When clicking the name just above a node in the 3-D display, the corresponding Visiome web page is opened: If an index term node name is clicked on the outer sphere, the Visiome page of this index term is shown allowing to view and access all individual documents related to this term. Similarly, if a content node is clicked on the inner sphere, then the Visiome page of this content is shown, providing access to the details of the corresponding document. Figure 3 illustrates a view zoomed into the area of searched term retina. The found node Retina is connected to a large number of keywords as indicated by the links, which shows that the position of this node on the outer sphere is reliable. Experts in this field can evaluate how close from a semantic point of view the neighboring nodes are to this keyword. C. Society for Neuroscience Annual Meeting abstracts The Society for Neuroscience (SfN) is, with more than 37, 500 members, the world s largest organization of scientists devoted to the study of the brain. Its Annual Meeting is the largest event in neuroscience, which gathered in 2005 nearly 35, 000 scientific and nonscientific attendees. Among other scientific events, 650 poster sessions allow researchers to communicate results of their last research. Since the year 2000, abstracts of all posters presented at the Annual Meeting are available online at the SfN Website ( as well as on a CD distributed to participants. The data were extracted from the XML file that is created during installation of the 2006 Neuroscience Meeting Planner software. The sessions are organized into 8 main themes in neuroscience: A-Development, B-Neural Excitability Synapses and Glia: Cellular Mechanisms, C-Sensory and Motor Systems, D-Homeostatic and Neuroendocrine Systems, E-Cognition and Behavior, F-Disorders of the Nervous System, G-Techniques in Neuroscience, H-History and Teaching of Neuroscience. Each theme is subdivided into subthemes, and each subtheme is divided into topics. A summary of some basic figures concerning the 2006 SfN Annual Meeting is presented in Table I. Although we were TABLE I SOCIETY FOR NEUROSCIENCE 2006 ANNUAL MEETING SUMMARY DATA 1 Number of themes 8 2 Number of subthemes 71 3 Number of topics Number of poster sessions Number of posters Number of slides sessions Number of slides Total (Posters + Slides) interested in extracting information from posters titles and abstracts, the figures in Table I concern both posters and slides for which a title and an abstract were available. The purpose of this last application of 3D-SE viewer is the visualization of the different sessions, in order to see how they organize on the basis of their similarities. Such a display could be usable in future SfN Meetings for attendees to help them plan an itinerary as a path connecting items on the sphere where themes, subthemes and topics would be represented. In this preliminary work, we wanted to visualize only the poster sessions on the basis of their relationships to posters abstracts and titles. In this purpose, for each session, words were extracted from titles and abstracts of the session s posters (from 15 to 30 posters per session) in the same manner as described in section I and ranked according to their TFIDF values. The top 20 words for each session were selected, and gathered into one set of all words for
5 Fig. 3. Visiome Platform index keywords: 3D-SE viewer-based visualization with focus on selected Retina keyword. all sessions (many words were common to several sessions, so the final set has 2164 words). Finally, a sessions words occurrence matrix was build (sparseness: 97.50%) and the ensuing bipartite graph connecting words to sessions was visualized using the 3D-SE viewer applet. Figure 4 represents the 650 poster sessions visualized on the basis of the terms extracted from posters abstracts and titles. IV. MAIN RESULTS The 3D-SE viewer shows an interesting capability of visualizing bipartite graphs on two spheres. From numerous experiments conducted with various datasets, it has been observed that best visual effects are obtained when the bipartite graph is balanced, that is the numbers of items in each of the 2 subsets is of the same range. We observed also several times that the nodes are more uniformly allocated on the two spheres in cases when the graph s edges are themselves more uniformly distributed over the graph s vertices. This means that the number of edges connected to each vertex should not vary too much among the different vertices, otherwise we may observe some empty areas on both spheres. This hole effect is probably related to the graph density properties, and it should be analyzed more completely. Another advantage of the 3D-SE viewer is its fairly competitive time complexity, which allows to obtain the visualization of several thousands of nodes in a few seconds. These results of the application of 3D-SE viewer to such data are very preliminary and further research in this area will be conducted in the near future. V. CONCLUSIONS The presented applications of 3D-SE viewer show that it is a useful tool for the visualization of data such as research teams of a large research institution, terms indexing documents in a neuroscience database or poster sessions from a particular knowledge domain. The results are encouraging and applications to larger datasets can be considered. The 3D-SE viewer can be used e.g. for other research institutes, showing for example a higher level of RIKEN s organizational structure. Applications to the visualization of n-partite graphs for n > 2 can also be considered. ACKNOWLEDGMENT The authors would like to thank W. Duch for helpful comments and discussions. The NeuroInformatics Committee of the Society for Neuroscience is gratefully acknowledged for providing and granting the use of the abstracts visualized in section III-C. REFERENCES [1] E. Di Giacomo, L. Grilli and G. Liotta, Drawing Bipartite Graphs on Two Curves, 14th International Symposium on Graph Drawing, Universität Karlsruhe, Sep , [2] T. Gibson Kolda, Limited-memory matrix methods with applications, PhD thesis, Dept of Computer Science, University of Maryland, [3] S. Hong, N. Nikolov, Layered Drawings of Directed Graphs in Three Dimensions, Proceedings of the Asia-Pacific Symposium on Information Visualization, CRPIT, vol. 45, pp , [4] M. Ito and T. Wiesel, Cultural differences reduce Japanese researchers visibility on the Web, Nature, 444, p. 817, Dec. 14, 2006.
6 Fig. 4. Society for Neuroscience 2006 Annual Meeting: a view of 650 poster sessions (outer sphere) and 2164 extracted terms (inner sphere). The nodes on the outer sphere are colored according to the session s dominant theme. It can be seen that the nodes cluster in accordance with their theme. [5] K. Misue, Drawing Bipartite Graphs as Anchored Maps, In Proc. Asia Pacific Symposium on Information Visualisation (APVIS2006), Tokyo, Japan. CRPIT, 60. Misue, K., Sugiyama, K. and Tanaka, J., Eds., ACS., pp , [6] M.F. Porter, An Algorithm for Suffix Stripping, Program, vol. 14, 3, pp , July [7] G. Salton and M.J. McGill, Introduction to Modern Retrieval, McGraw- Hill Book Company, [8] K. Saito(P), T. Iwata and N. Ueda, Visualization of Bipartite Graph by Spherical Embedding, JNNS 2004 (in Japanese). [9] S. Usui, Visiome: Neuroinformatics Research in Vision Project, Neural Networks, vol. 16, pp , [10] S. Usui, P. Palmes, K. Nagata, T. Taniguchi, N. Ueda, Keyword Extraction, Ranking, and Organization for the Neuroinformatics Platform, Bio Systems, vol. 88, pp , [11] S. Usui, A Neuroinformatics Platform for Vision Science: Visiome, IJCNN 2007 (companion paper at this conference). [12] L. Zheng, L. Song, P. Eades, Crossing Minimization Problems of Drawing Bipartite Graphs in Two Clusters, Proc. of the Asia-Pacific Symposium on Information Visualization, CRPIT, vol. 45, pp , 2005.
Visual Analysis Tool for Bipartite Networks
Visual Analysis Tool for Bipartite Networks Kazuo Misue Department of Computer Science, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, 305-8573 Japan [email protected] Abstract. To find hidden features
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics Zhao Wenbin 1, Zhao Zhengxu 2 1 School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu
Solving Geometric Problems with the Rotating Calipers *
Solving Geometric Problems with the Rotating Calipers * Godfried Toussaint School of Computer Science McGill University Montreal, Quebec, Canada ABSTRACT Shamos [1] recently showed that the diameter of
FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
DATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
Visualization of large data sets using MDS combined with LVQ.
Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk
A Color Placement Support System for Visualization Designs Based on Subjective Color Balance
A Color Placement Support System for Visualization Designs Based on Subjective Color Balance Eric Cooper and Katsuari Kamei College of Information Science and Engineering Ritsumeikan University Abstract:
Self-adaptive e-learning Website for Mathematics
Self-adaptive e-learning Website for Mathematics Akira Nakamura Abstract Keyword searching and browsing on learning website is ultimate self-adaptive learning. Our e-learning website KIT Mathematics Navigation
ME 24-688 Week 11 Introduction to Dynamic Simulation
The purpose of this introduction to dynamic simulation project is to explorer the dynamic simulation environment of Autodesk Inventor Professional. This environment allows you to perform rigid body dynamic
Topic Maps Visualization
Topic Maps Visualization Bénédicte Le Grand, Laboratoire d'informatique de Paris 6 Introduction Topic maps provide a bridge between the domains of knowledge representation and information management. Topics
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
Social Media Mining. Graph Essentials
Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures
Information Visualization of Attributed Relational Data
Information Visualization of Attributed Relational Data Mao Lin Huang Department of Computer Systems Faculty of Information Technology University of Technology, Sydney PO Box 123 Broadway, NSW 2007 Australia
ARTIFICIAL INTELLIGENCE METHODS IN EARLY MANUFACTURING TIME ESTIMATION
1 ARTIFICIAL INTELLIGENCE METHODS IN EARLY MANUFACTURING TIME ESTIMATION B. Mikó PhD, Z-Form Tool Manufacturing and Application Ltd H-1082. Budapest, Asztalos S. u 4. Tel: (1) 477 1016, e-mail: [email protected]
Graph/Network Visualization
Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation
Part 2: Community Detection
Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -
Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability
Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
An Open Framework for Reverse Engineering Graph Data Visualization. Alexandru C. Telea Eindhoven University of Technology The Netherlands.
An Open Framework for Reverse Engineering Graph Data Visualization Alexandru C. Telea Eindhoven University of Technology The Netherlands Overview Reverse engineering (RE) overview Limitations of current
NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES
NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES Silvija Vlah Kristina Soric Visnja Vojvodic Rosenzweig Department of Mathematics
Visualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR [email protected] Abstract An e-government
Pro/ENGINEER Wildfire 4.0 Basic Design
Introduction Datum features are non-solid features used during the construction of other features. The most common datum features include planes, axes, coordinate systems, and curves. Datum features do
W. Heath Rushing Adsurgo LLC. Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare. Session H-1 JTCC: October 23, 2015
W. Heath Rushing Adsurgo LLC Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare Session H-1 JTCC: October 23, 2015 Outline Demonstration: Recent article on cnn.com Introduction
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati [email protected], [email protected]
VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc., http://willsfamily.org/gwills
VISUALIZING HIERARCHICAL DATA Graham Wills SPSS Inc., http://willsfamily.org/gwills SYNONYMS Hierarchical Graph Layout, Visualizing Trees, Tree Drawing, Information Visualization on Hierarchies; Hierarchical
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA [email protected]
TIBCO Spotfire Network Analytics 1.1. User s Manual
TIBCO Spotfire Network Analytics 1.1 User s Manual Revision date: 26 January 2009 Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO
Scientific Graphing in Excel 2010
Scientific Graphing in Excel 2010 When you start Excel, you will see the screen below. Various parts of the display are labelled in red, with arrows, to define the terms used in the remainder of this overview.
JustClust User Manual
JustClust User Manual Contents 1. Installing JustClust 2. Running JustClust 3. Basic Usage of JustClust 3.1. Creating a Network 3.2. Clustering a Network 3.3. Applying a Layout 3.4. Saving and Loading
Visualizing an Auto-Generated Topic Map
Visualizing an Auto-Generated Topic Map Nadine Amende 1, Stefan Groschupf 2 1 University Halle-Wittenberg, information manegement technology [email protected] 2 media style labs Halle Germany [email protected]
Space-filling Techniques in Visualizing Output from Computer Based Economic Models
Space-filling Techniques in Visualizing Output from Computer Based Economic Models Richard Webber a, Ric D. Herbert b and Wei Jiang bc a National ICT Australia Limited, Locked Bag 9013, Alexandria, NSW
9. Text & Documents. Visualizing and Searching Documents. Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08
9. Text & Documents Visualizing and Searching Documents Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08 Slide 1 / 37 Outline Characteristics of text data Detecting patterns SeeSoft
Information Visualization Multivariate Data Visualization Krešimir Matković
Information Visualization Multivariate Data Visualization Krešimir Matković Vienna University of Technology, VRVis Research Center, Vienna Multivariable >3D Data Tables have so many variables that orthogonal
Understanding Data: A Comparison of Information Visualization Tools and Techniques
Understanding Data: A Comparison of Information Visualization Tools and Techniques Prashanth Vajjhala Abstract - This paper seeks to evaluate data analysis from an information visualization point of view.
SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
Interactive Information Visualization of Trend Information
Interactive Information Visualization of Trend Information Yasufumi Takama Takashi Yamada Tokyo Metropolitan University 6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan [email protected] Abstract This paper
MapReduce Approach to Collective Classification for Networks
MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty
OECD.Stat Web Browser User Guide
OECD.Stat Web Browser User Guide May 2013 May 2013 1 p.10 Search by keyword across themes and datasets p.31 View and save combined queries p.11 Customise dimensions: select variables, change table layout;
Strategic Online Advertising: Modeling Internet User Behavior with
2 Strategic Online Advertising: Modeling Internet User Behavior with Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew
DataPA OpenAnalytics End User Training
DataPA OpenAnalytics End User Training DataPA End User Training Lesson 1 Course Overview DataPA Chapter 1 Course Overview Introduction This course covers the skills required to use DataPA OpenAnalytics
Interactive Recovery of Requirements Traceability Links Using User Feedback and Configuration Management Logs
Interactive Recovery of Requirements Traceability Links Using User Feedback and Configuration Management Logs Ryosuke Tsuchiya 1, Hironori Washizaki 1, Yoshiaki Fukazawa 1, Keishi Oshima 2, and Ryota Mibe
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Piston Ring. Problem:
Problem: A cast-iron piston ring has a mean diameter of 81 mm, a radial height of h 6 mm, and a thickness b 4 mm. The ring is assembled using an expansion tool which separates the split ends a distance
Practical Graph Mining with R. 5. Link Analysis
Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities
Enterprise Organization and Communication Network
Enterprise Organization and Communication Network Hideyuki Mizuta IBM Tokyo Research Laboratory 1623-14, Shimotsuruma, Yamato-shi Kanagawa-ken 242-8502, Japan E-mail: [email protected] Fusashi Nakamura
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
An Adaptive Decoding Algorithm of LDPC Codes over the Binary Erasure Channel. Gou HOSOYA, Hideki YAGI, Toshiyasu MATSUSHIMA, and Shigeichi HIRASAWA
2007 Hawaii and SITA Joint Conference on Information Theory, HISC2007 Hawaii, USA, May 29 31, 2007 An Adaptive Decoding Algorithm of LDPC Codes over the Binary Erasure Channel Gou HOSOYA, Hideki YAGI,
How To Cluster On A Search Engine
Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
Machine Learning and Data Mining. Regression Problem. (adapted from) Prof. Alexander Ihler
Machine Learning and Data Mining Regression Problem (adapted from) Prof. Alexander Ihler Overview Regression Problem Definition and define parameters ϴ. Prediction using ϴ as parameters Measure the error
HISTORICAL DEVELOPMENTS AND THEORETICAL APPROACHES IN SOCIOLOGY Vol. I - Social Network Analysis - Wouter de Nooy
SOCIAL NETWORK ANALYSIS University of Amsterdam, Netherlands Keywords: Social networks, structuralism, cohesion, brokerage, stratification, network analysis, methods, graph theory, statistical models Contents
Constrained Tetrahedral Mesh Generation of Human Organs on Segmented Volume *
Constrained Tetrahedral Mesh Generation of Human Organs on Segmented Volume * Xiaosong Yang 1, Pheng Ann Heng 2, Zesheng Tang 3 1 Department of Computer Science and Technology, Tsinghua University, Beijing
How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm
IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode
Mining Text Data: An Introduction
Bölüm 10. Metin ve WEB Madenciliği http://ceng.gazi.edu.tr/~ozdemir Mining Text Data: An Introduction Data Mining / Knowledge Discovery Structured Data Multimedia Free Text Hypertext HomeLoan ( Frank Rizzo
Voluntary Product Accessibility Template Blackboard Learn Release 9.1 April 2014 (Published April 30, 2014)
Voluntary Product Accessibility Template Blackboard Learn Release 9.1 April 2014 (Published April 30, 2014) Contents: Introduction Key Improvements VPAT Section 1194.21: Software Applications and Operating
PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.
PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS Project Project Title Area of Abstract No Specialization 1. Software
Information Visualization on the Base of Hierarchical Graph Models
Information Visualization on the Base of Hierarchical Graph Models V.N. KASYANOV Laboratory for Program Construction and Optimization Institute of Informatics Systems Lavrentiev pr. 6, Novosibirsk, 630090
Prentice Hall Mathematics: Course 1 2008 Correlated to: Arizona Academic Standards for Mathematics (Grades 6)
PO 1. Express fractions as ratios, comparing two whole numbers (e.g., ¾ is equivalent to 3:4 and 3 to 4). Strand 1: Number Sense and Operations Every student should understand and use all concepts and
Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014
Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)
TIBCO Spotfire Business Author Essentials Quick Reference Guide. Table of contents:
Table of contents: Access Data for Analysis Data file types Format assumptions Data from Excel Information links Add multiple data tables Create & Interpret Visualizations Table Pie Chart Cross Table Treemap
Introduction to the TI-Nspire CX
Introduction to the TI-Nspire CX Activity Overview: In this activity, you will become familiar with the layout of the TI-Nspire CX. Step 1: Locate the Touchpad. The Touchpad is used to navigate the cursor
Utilizing spatial information systems for non-spatial-data analysis
Jointly published by Akadémiai Kiadó, Budapest Scientometrics, and Kluwer Academic Publishers, Dordrecht Vol. 51, No. 3 (2001) 563 571 Utilizing spatial information systems for non-spatial-data analysis
NakeDB: Database Schema Visualization
NAKEDB: DATABASE SCHEMA VISUALIZATION, APRIL 2008 1 NakeDB: Database Schema Visualization Luis Miguel Cortés-Peña, Yi Han, Neil Pradhan, Romain Rigaux Abstract Current database schema visualization tools
TOPAS: a Web-based Tool for Visualization of Mapping Algorithms
TOPAS: a Web-based Tool for Visualization of Mapping Algorithms 0. G. Monakhov, 0. J. Chunikhin, E. B. Grosbein Institute of Computational Mathematics and Mathematical Geophysics, Siberian Division of
TABLE OF CONTENTS. INTRODUCTION... 5 Advance Concrete... 5 Where to find information?... 6 INSTALLATION... 7 STARTING ADVANCE CONCRETE...
Starting Guide TABLE OF CONTENTS INTRODUCTION... 5 Advance Concrete... 5 Where to find information?... 6 INSTALLATION... 7 STARTING ADVANCE CONCRETE... 7 ADVANCE CONCRETE USER INTERFACE... 7 Other important
Self Organizing Maps for Visualization of Categories
Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, [email protected]
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
SuperViz: An Interactive Visualization of Super-Peer P2P Network
SuperViz: An Interactive Visualization of Super-Peer P2P Network Anthony (Peiqun) Yu [email protected] Abstract: The Efficient Clustered Super-Peer P2P network is a novel P2P architecture, which overcomes
ProteinQuest user guide
ProteinQuest user guide 1. Introduction... 3 1.1 With ProteinQuest you can... 3 1.2 ProteinQuest basic version 4 1.3 ProteinQuest extended version... 5 2. ProteinQuest dictionaries... 6 3. Directions for
Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28
Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object
HDDVis: An Interactive Tool for High Dimensional Data Visualization
HDDVis: An Interactive Tool for High Dimensional Data Visualization Mingyue Tan Department of Computer Science University of British Columbia [email protected] ABSTRACT Current high dimensional data visualization
TECHNOLOGY ANALYSIS FOR INTERNET OF THINGS USING BIG DATA LEARNING
TECHNOLOGY ANALYSIS FOR INTERNET OF THINGS USING BIG DATA LEARNING Sunghae Jun 1 1 Professor, Department of Statistics, Cheongju University, Chungbuk, Korea Abstract The internet of things (IoT) is an
Principles of Data Visualization for Exploratory Data Analysis. Renee M. P. Teate. SYS 6023 Cognitive Systems Engineering April 28, 2015
Principles of Data Visualization for Exploratory Data Analysis Renee M. P. Teate SYS 6023 Cognitive Systems Engineering April 28, 2015 Introduction Exploratory Data Analysis (EDA) is the phase of analysis
Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary
Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:
Reflection and Refraction
Equipment Reflection and Refraction Acrylic block set, plane-concave-convex universal mirror, cork board, cork board stand, pins, flashlight, protractor, ruler, mirror worksheet, rectangular block worksheet,
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: [email protected]
New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
RESEARCH PAPERS FACULTY OF MATERIALS SCIENCE AND TECHNOLOGY IN TRNAVA SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA
RESEARCH PAPERS FACULTY OF MATERIALS SCIENCE AND TECHNOLOGY IN TRNAVA SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA 2010 Number 29 3D MODEL GENERATION FROM THE ENGINEERING DRAWING Jozef VASKÝ, Michal ELIÁŠ,
IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*
IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti
1 o Semestre 2007/2008
Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction
1051-232 Imaging Systems Laboratory II. Laboratory 4: Basic Lens Design in OSLO April 2 & 4, 2002
05-232 Imaging Systems Laboratory II Laboratory 4: Basic Lens Design in OSLO April 2 & 4, 2002 Abstract: For designing the optics of an imaging system, one of the main types of tools used today is optical
SAP InfiniteInsight 7.0 SP1
End User Documentation Document Version: 1.0-2014-11 Getting Started with Social Table of Contents 1 About this Document... 3 1.1 Who Should Read this Document... 3 1.2 Prerequisites for the Use of this
Lecture 1: Systems of Linear Equations
MTH Elementary Matrix Algebra Professor Chao Huang Department of Mathematics and Statistics Wright State University Lecture 1 Systems of Linear Equations ² Systems of two linear equations with two variables
Fast Multipole Method for particle interactions: an open source parallel library component
Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,
Search Result Optimization using Annotators
Search Result Optimization using Annotators Vishal A. Kamble 1, Amit B. Chougule 2 1 Department of Computer Science and Engineering, D Y Patil College of engineering, Kolhapur, Maharashtra, India 2 Professor,
CS 534: Computer Vision 3D Model-based recognition
CS 534: Computer Vision 3D Model-based recognition Ahmed Elgammal Dept of Computer Science CS 534 3D Model-based Vision - 1 High Level Vision Object Recognition: What it means? Two main recognition tasks:!
A Comparison Framework of Similarity Metrics Used for Web Access Log Analysis
A Comparison Framework of Similarity Metrics Used for Web Access Log Analysis Yusuf Yaslan and Zehra Cataltepe Istanbul Technical University, Computer Engineering Department, Maslak 34469 Istanbul, Turkey
Visualization of Phylogenetic Trees and Metadata
Visualization of Phylogenetic Trees and Metadata November 27, 2015 Sample to Insight CLC bio, a QIAGEN Company Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.clcbio.com [email protected]
A Review And Evaluations Of Shortest Path Algorithms
A Review And Evaluations Of Shortest Path Algorithms Kairanbay Magzhan, Hajar Mat Jani Abstract: Nowadays, in computer networks, the routing is based on the shortest path problem. This will help in minimizing
NeL 2 : Network Drawing Tool for Handling Layered Structured Network Diagram
NeL 2 : Network Drawing Tool for Handling Layered Structured Network Diagram Nagayoshi Nakazono 1 Kazuo Misue 2 Jiro Tanaka 2 1 College of Information Sciences, 2 Department of Computer Science University
Analecta Vol. 8, No. 2 ISSN 2064-7964
EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,
Link Prediction in Social Networks
CS378 Data Mining Final Project Report Dustin Ho : dsh544 Eric Shrewsberry : eas2389 Link Prediction in Social Networks 1. Introduction Social networks are becoming increasingly more prevalent in the daily
