Map-like Wikipedia Visualization. Pang Cheong Iao. Master of Science in Software Engineering
Map-like Wikipedia Visualization by Pang Cheong Iao. Master of Science in Software Engineering, 2011. Faculty of Science and Technology, University of Macau.
Map-like Wikipedia Visualization by Pang Cheong Iao. A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. Faculty of Science and Technology, University of Macau, 2011. Approved by Supervisor Date
In presenting this thesis in partial fulfillment of the requirements for a Master's degree at the University of Macau, I agree that the Library and the Faculty of Science and Technology shall make its copies freely available for inspection. However, reproduction of this thesis for any purposes or by any means shall not be allowed without my written permission. Authorization is sought by contacting the author at [email protected] Signature Date
University of Macau
Abstract
MAP-LIKE WIKIPEDIA VISUALIZATION
by Pang Cheong Iao
Thesis Supervisor: Assistant Professor, Dr. Robert P. Biuk-Aghai
Master of Science in Software Engineering
Wikipedia, the largest online collaboratively authored encyclopedia, allows anyone to easily contribute to its content. Because it operates on a collaborative authoring model, authors are responsible not only for writing articles but also for assigning articles to categories that classify them by topic. This makes the category system, which reflects topic classifications made according to authors' preferences, worth studying. However, given its large data volume and the idiosyncrasies of its data structure, analyzing Wikipedia's category data is challenging. To make the data easier and more efficient to understand, we have designed a new visualization, a map-like visualization that represents Wikipedia's category data in a form similar to a geographic map. It allows readers to intuitively perceive large-scale patterns of categories and articles in Wikipedia. In this thesis, we describe the algorithm for creating this novel type of visualization and introduce our approaches to tackling the difficulties of handling Wikipedia data.
TABLE OF CONTENTS
List of Figures
List of Tables
Glossary
Acknowledgments
Dedication
Chapter 1: Introduction
Collaborative Authoring and Wikis
Wikipedia and MediaWiki
Research Problems
Solution Approach
Outline of this Thesis
Chapter 2: Literature Review
Information Visualization
Wikipedia Data Processing
Wikipedia Visualization
Map-like Visualization
Force-directed Layout Algorithms
Overlap Removal Algorithms
Chapter 3: Map-like Visualization
Overview of the Visualization
Data Source Requirements
Sorting Algorithm by Similarities
Preliminary Layout
Hexagonal Visualization
The Hexagon Canvas
Selection of Hexagons
Clutter Reduction
Coloring Scheme
3.5.5 Text Labeling
Chapter 4: Wikipedia data analysis
Database Schema of MediaWiki
Obtaining Wikipedia Data
Transforming Category Graph
Choosing Semantic Root
Removing Non-Content Categories
Building a Category Tree
Category Relationships
Chapter 5: Case Study on Wikipedia
Performance
Discussion on Applications
Understanding Category Composition
Displaying Category Relationship
Overview of Multiple Wikipedias
Comparison of Topic Areas in Multiple Wikipedias
Compare with Other Wikipedia Visualizations
Comparison with Semantic Coverage Visualization
Comparison with Matrix Visualization
Comparison with Radial Visualization
User Evaluation
Chapter 6: Conclusion
Contribution
Future Work
Associated Publications
References
Appendix A: Results of Wikipedia Map-like Visualization
Appendix B: Importing Wikipedia Database Dumps
VITA
LIST OF FIGURES
Figure 1: Treemap visualization of a folder in a file system (reproduced from [10])
Figure 2: A treemap with one million of items (reproduced from [11])
Figure 3: Two representations of a tree: (a) the radial layout (b) the balloon layout (reproduced from [13])
Figure 4: Connections added to: (a) a radial tree (b) a balloon tree; both visualizing function calls in a software (caller in green and callee in red) (excerpt from [14])
Figure 5: The screenshot of SpaceTree (reproduced from [17])
Figure 6: Haber and McNabb's visualization model
Figure 7: Five levels deep, centered on Politics
Figure 8: Close up of Chris Harrison's WikiViz output
Figure 9: Entity view with labeled nodes in WikiVis (reproduced from [23])
Figure 10: WikiVis category view (reproduced from [23])
Figure 11: Wikipedia category visualization (reproduced from [7])
Figure 12: The Chromograms application: showing users' activity in blocks (reproduced from [24])
Figure 13: History flow visualization (reproduced from IBM Research)
Figure 14: Author text analysis of a section of text with high depth of collaboration (reproduced from [26])
Figure 15: Edit significance bars of revisions of an article (reproduced from [27])
Figure 16: Summary of different type of edit behaviors (reproduced from [27])
Figure 17: Visualization of a 10-year period by cartographic means (reproduced from [28])
Figure 18: Comparison of overlap removal algorithms (reproduced from [32])
Figure 19: The bottom-up layout approach
Figure 20: Approaches for radial placement of sorted entities
Figure 21: Hexagons and their coordinates in the memory
Figure 22: Arithmetic of a hexagon
Figure 23: Example of hexagon selection
Figure 24: Before and after clutter reduction
Figure 25: A topographic map uses colors for showing elevation
Figure 26: Legend of a map-like visualization
Figure 27: Rotation of text labels in a region
Figure 28: Result of the text label placement
Figure 29: Eliminating edges in the graph (edge indicates parent-to-child relationship)
Figure 30: Comparison of different experimental methods for aggregating similarity
Figure 31: The current way to obtain category distribution information
Figure 32: Highlight of category Mathematics in the Wikipedia
Figure 33: Cluster of related categories Environment, Life and Geography placed in close proximity of one another
Figure 34: Category Science in Wikipedia: (a) Danish (b) Swedish (c) Chinese
Figure 35: Highlight of Holloway's Wikipedia visualization
Figure 36: A multi-level matrix visualization of category hierarchies
Figure 37: A radial visualization of Wikipedia (reproduced from [37])
Figure 38: Extract of the radial visualization near the edge of the circle (reproduced from [37])
Figure 39: Overview map of the Simple English Wikipedia
Figure 40: Overview map of the Danish Wikipedia
Figure 41: Overview map of the Chinese Wikipedia
Figure 42: Overview map of the Swedish Wikipedia
Figure 43: Overview map of the German Wikipedia
Figure 44: Overview map of the English Wikipedia
LIST OF TABLES
Table 1: Comparison of wikitext and HTML
Table 2: Examples of hierarchical data (number in the parentheses represents the size)
Table 3: Sorting algorithm with similarity pairs
Table 4: Algorithm for randomly selecting hexagons
Table 5: Namespaces in MediaWiki
Table 6: Definition of the categorylinks table
Table 7: Sizes of Wikipedias
Table 8: Top content categories in different Wikipedia language editions
Table 9: Comparison of name similarities under category Aircraft and Computer science
Table 10: Cosine similarity for categories in the English Wikipedia with data from different levels of sub-categories
Table 11: Standard deviations of similarity aggregation experimental expressions
Table 12: Aggregated similarity of category Mathematics in the Wikipedia
Table 13: Statistics of visualization output
Table 14: Statistics of the German and English Wikipedia
Table 15: Statistics of category Science in the Danish, Swedish and Chinese Wikipedia
Table 16: Responses of the user evaluation
Table 17: SQL commands for importing database dumps
GLOSSARY
Article: An encyclopedia article in Wikipedia.
Category: A classification defined in Wikipedia that groups articles with related information.
Category Assignment: The assignment of an article to a category, which makes the article part of that category's content.
Category Relationship: A measure of the relatedness of categories. The more similar the contents of one category are to the contents of another, the closer we define their relationship to be, and vice versa.
Co-assignment of Categories: If an article is assigned to multiple categories at the same time, the article is co-assigned to these categories.
Cosine Similarity: A measure of similarity between two vectors, obtained by measuring the cosine of the angle between them. It is widely used for measuring the similarity of documents and strings.
Directed Acyclic Graph: A directed graph with no directed cycles. That is, starting from any vertex v, there is no way to follow a sequence of edges that eventually loops back to v.
Namespace: Namespaces are used in Wikipedia to separate content that serves different purposes. For example, articles and categories are stored under the Main and Category namespaces respectively for public access, while the Talk namespace consists of discussions intended for the authoring community. In this way, different types of users can concentrate on the data intended for their use.
Page: The basic element in Wikipedia, which can be an encyclopedia article, a category, or another object, differentiated by its namespace. A page is uniquely identified by its page ID.
Sub-category: A category that is assigned under an upper-level (parent) category. Sub-categories usually cover more detailed topics within their parent's topic.
Top Categories: The highest level of categories for classifying articles by topic, such as Science, People, History, etc.
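The Cosine Similarity entry above can be illustrated with a minimal sketch. The vectors here are hypothetical: each position stands for one article, and a 1 means the article is assigned to the category. This encoding is an assumption made for illustration only; the glossary does not state how the thesis builds its vectors.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical article-assignment vectors (not data from the thesis).
mathematics = [1, 1, 0, 1, 0]
physics = [1, 0, 0, 1, 1]
history = [0, 0, 1, 0, 0]

print(cosine_similarity(mathematics, physics))
print(cosine_similarity(mathematics, history))
```

In this toy data, the two categories that share articles score higher than the pair with no articles in common, which matches the glossary's notion of a closer category relationship.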
ACKNOWLEDGMENTS
I would like to thank my thesis supervisor, Dr. Robert P. Biuk-Aghai, who kept guiding and encouraging me throughout this research. He inspired me with new ideas, points to improve and possible solutions to problems. Without his leadership and experience, it would have been impossible for me to complete this thesis. Moreover, he put great effort into helping me improve my academic writing skills and publish my work. I truly appreciate his effort and help during these three years of supervision. I am also grateful to my colleagues at the University of Macau, especially my unit head Steve Lai, team leaders Jack Wu and Christina Lou, and my project manager Evi Ian, who helped me in my work and coordinated the workload so that I could successfully finish this degree. I would also like to thank the former and current students of the Faculty of Science and Technology, such as Libby Tang, Henry Leong, Alex Ieong, Kin Lei and Peter Fong, for sharing their experience and knowledge related to this research. I gratefully acknowledge the support for the publications of this research from the Macau Special Administrative Region Science and Technology Development Fund (grant number 021/2011/A). Last but not least, I will never forget the support and encouragement that my parents and all my friends have given to both my research and my life.
DEDICATION
I would like to dedicate this thesis to my parents, who have continuously supported my studies and my life.
