Visualization methods for patent data


Treparel, 2013
Dr. Anton Heijs (CTO & Founder)
Treparel, Delftechpark 26, 2628 XH Delft, The Netherlands, info@treparel.com

Introduction

Treparel provides advanced visualizations for patent data. This document describes them, with a short explanation of the concepts behind each. The scientific literature contains many papers on the details of the visualization techniques mentioned here. Treparel's KMX technology uses these visualizations as part of its analysis pipeline, which also supports multiple selections of data points (patent documents) across different visualizations. We call this multiple coupled views: when a user selects one or more documents, the selection is shown in all available visualisations, and the interaction is supported from every visualization.

Visualisation is the process of constructing a visual image in the mind to understand the data better. Although this is an accurate description of the word, visualisation has increasingly become an external process rather than a purely mental one. This shift suggests that a broader definition of the term is needed, such as:

"Visualisation is a method of computing. It transforms the symbolic into the geometric, enabling researchers to observe their simulations and computations. Visualisation offers a method for seeing the unseen. It enriches the process of scientific discovery and fosters profound and unexpected insights."

This definition already hints at some of the benefits of computer visualisation. A good summary of the benefits can be found in the literature:

- Visualisation enables people to comprehend large datasets that are too large to grasp by mental imagination alone.
- Visualisation enables the discovery of previously unknown properties of the dataset that may not have been anticipated. Perceiving these properties or patterns can lead the user to new insights.
- Visualisation often reveals inherent problems in the data; errors and artefacts, for instance, may become readily visible.
- Visualisation enables examination of both the large-scale and the local features of the dataset, so that the user can see local features within a larger frame of reference.
- Visualisation allows the user to form hypotheses based on the (newly) observed phenomena or developed insights.

Ideally, visualisation should provide a means to overview, explore and navigate large multidimensional datasets. Let us first take a brief look at how we arrive at a visualisation from the original raw data.

The visualisation pipeline is the sequence of processes that creates a visual representation of data. Before the pipeline is entered, data is generated from databases or by any other means of data collection. The pipeline consists of four steps: data analysis, data selection, mapping and rendering.

Data analysis is the first step. Here the data is prepared for visualisation: a number of operations can be performed on the data to make it more suitable for visualisation. After this step the raw data has been transformed into data that can be visualised.

However, this does not mean that all of the data is of interest. Only the portions that are of interest should be visualised, so the second step is a data selection step, after which only the focus data remains in the pipeline. This step usually involves some user interaction to decide which sections are of interest.

Once the focus data has been chosen, the third step is mapping. Here the data is mapped to renderable representations: geometric primitives such as points, lines, surfaces and voxels, with attributes such as colour, position, size, transparency and texture.

After mapping, all that remains is the final rendering of the geometric data. Rendering is creating an image from a model. Operations performed here include viewing transformations, lighting calculations, hidden surface removal, scan conversion and anti-aliasing.
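The four steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not KMX code: the record fields (`id`, `ipc`, `score`), the function names, the colour threshold and the text-based "rendering" are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """A renderable primitive: a point with position, colour and size."""
    x: float
    y: float
    colour: str
    size: int


def analyse(raw):
    """Step 1, data analysis: drop incomplete records, scale scores to 0..1."""
    complete = [r for r in raw if r.get("score") is not None]
    top = max((r["score"] for r in complete), default=0) or 1
    return [{**r, "score": r["score"] / top} for r in complete]


def select(records, ipc_prefix):
    """Step 2, data selection: keep only the focus data."""
    return [r for r in records if r["ipc"].startswith(ipc_prefix)]


def map_to_glyphs(records):
    """Step 3, mapping: turn each record into a primitive with attributes."""
    return [
        Glyph(x=float(i), y=r["score"],
              colour="red" if r["score"] > 0.5 else "grey",  # assumed threshold
              size=1 + round(4 * r["score"]))
        for i, r in enumerate(records)
    ]


def render(glyphs):
    """Step 4, rendering: produce the final 'image' (here, lines of text)."""
    return [f"{g.colour:>4} " + "*" * g.size for g in glyphs]


# Hypothetical records; real input would come from a patent database.
raw = [
    {"id": "P1", "ipc": "C07K", "score": 80},
    {"id": "P2", "ipc": "A61K", "score": 40},
    {"id": "P3", "ipc": "C07D", "score": 20},
    {"id": "P4", "ipc": "C07C", "score": None},
]
image = render(map_to_glyphs(select(analyse(raw), "C07")))
print("\n".join(image))
```

Keeping each step a separate function mirrors the pipeline: the user interaction mentioned for the selection step would only change the argument passed to `select`, leaving the other steps untouched.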
The final visualisation is then either written to file or displayed on the screen.

Figure: Visualization pipeline.

The resulting visualisation should ideally be expressive, effective and appropriate. Expressive means that the visualisation displays only the relevant information of a dataset. Effective means that it complements the user's capabilities of perception and the mental image that the user has of the visualisation. Finally, an appropriate visualisation is one in which the effort of creating the visualisation does not outweigh the benefits it delivers.

An alternative way to show the steps in the visualization process is given below:

Figure: Visualization pipeline.

All these steps are part of the visualization pipeline as we use it in KMX.

Visualization of the patent text

The first step in the visualization of patent data is often searching or filtering the data to extract the patterns; text mining can contribute strongly to the visualization. Some important analysis tasks for a user are:

- the visualization of a patent collection against a known set of classes
- the visualization of a patent collection against an unknown set of classes
- the visualization of a patent collection in the context of its hierarchy
- the visualization of a patent collection over time

The first two tasks can be implemented using supervised and unsupervised machine learning techniques, which perform automatic classification and clustering of the data. This data is then processed in the visualization pipeline to provide insight into the classified and clustered patent data. Since patent data contains classification codes, the data can be ordered hierarchically, for instance in the IPC classification. To provide insight into a collection of patent data we also provide an approach to visualize hierarchical patent data using a tree map algorithm. Patent data also contains time stamps, so a collection of patents can be analysed over time. For this we implemented a visualization of the change over time in the number of patents in a collection that belong to a patent class.

Treemap visualization

Tree mapping is a method for displaying tree-structured data using nested rectangles, providing both an overview and a means of selecting data points. An example is given in the figure below: there are documents in classes A and H, and class A has three subcategories (A1, A2 and A3), one of which is selected so that all documents in that class are shown in red. Within the tree map the user has an overview of the classes and the number of patents in each class for the full collection. With a mouse-over the user can get additional information about a patent, and can add or remove one or more patents from the currently selected set. Selecting one box selects one document (such as EP in the example). The tree map visualization is very powerful: in a fixed screen space, the tree-mapping algorithm can show all hierarchical data points (patent documents), providing an overview as well as a good selection mechanism.

Figure: Tree map visualization.

One example is a tree map showing patents on chemistry. Interaction with the tree map also requires support for drilling down into the data. If one wants to see all patents in C07, one can update the visualization to show patents deeper in the C07 classification tree. This is one of the strengths of the tree map algorithm.
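To make the nested-rectangle idea concrete, here is a minimal sketch of the classic slice-and-dice tree map layout. The source does not say which layout algorithm KMX uses, so this is an assumption chosen for its simplicity; the class names follow the example above (A with A1, A2, A3, and H), while the document names and the 8 x 4 canvas are hypothetical.

```python
def size(node):
    """Number of documents under a node; a leaf counts as one document."""
    children = node.get("children")
    return sum(size(c) for c in children) if children else 1


def treemap(node, x, y, w, h, depth=0, out=None):
    """Slice-and-dice layout.

    Each node gets a rectangle; its children split that rectangle in
    proportion to their document counts, alternating between side-by-side
    (even depth) and stacked (odd depth) at each level of the hierarchy.
    """
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = sum(size(c) for c in children)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = size(child) / total
        if depth % 2 == 0:
            treemap(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            treemap(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical collection: 4 documents in class A, 4 in class H.
collection = {
    "name": "all",
    "children": [
        {"name": "A", "children": [
            {"name": "A1", "children": [{"name": "doc1"}, {"name": "doc2"}]},
            {"name": "A2", "children": [{"name": "doc3"}]},
            {"name": "A3", "children": [{"name": "doc4"}]},
        ]},
        {"name": "H", "children": [{"name": f"doc{i}"} for i in range(5, 9)]},
    ],
}

rects = {name: (x, y, w, h)
         for name, x, y, w, h in treemap(collection, 0.0, 0.0, 8.0, 4.0)}
for name, rect in rects.items():
    print(name, rect)
```

Because every document receives the same area, the size of a class rectangle directly reflects its number of patents. Drilling down into a class such as C07 amounts to calling `treemap` again with that subtree as the root and the full canvas as its rectangle.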

Figure: Tree map visualization using colour to indicate the number of patents in each class.

The example above shows how colour can be used to display a parameter such as the number of patents in a class, from green (large number) to black (small number of patents in that subclass). We can also combine two visualizations, as shown below: the tree map colouring shows the patents over three years (2005, 2006, 2007), while the cluster visualization shows the same documents arranged by their similarity, as calculated by the machine learning algorithm for clustering in KMX.

Figure: Combination of tree map and landscape (cluster) visualization with colouring over the years.

The combined use of two visualizations in KMX (tree map and clustering) shows the patent data hierarchically (tree map) and unsupervised (3D clustering, where the height is the density of the patents, especially prominent for inorganic and organic chemistry), while colour displays the pattern in the patent data over time.

The clustering of documents helps to analyse a collection of patents and gives insight into their natural grouping. In the cluster visualization the user can easily select documents by brushing, i.e. selecting them with the mouse. When brushing in a cluster or parallel coordinates visualization, the user gets feedback about the selected documents, which greatly helps in selecting documents; this is an example of the multiple coupled views support mentioned earlier. One can use multiple brushes to make both a rough and a more precise selection, which gives the user feedback on a larger selected set of documents and on a smaller one.

Multiple brushes also help the analyst to explore the documents directly in a tree map visualization and in a visualization of the documents over time. This helps to understand whether a brushed set of documents that are close together in a cluster visualization are also hierarchically close together in the tree map visualization. One can additionally analyse this over time, to see whether documents that are clustered close together are also close together in time. If one wants to check whether there is a trend in a certain technology over time, this is a logical way to analyse and explore it.

Parallel coordinates visualization

When we have a set of documents selected, for example by filtering or brushing (see the cluster image on the right), we can show the distribution of the classification scores for that set (in the example below, the documents on ebola, sars and h5n1). This is done using three parallel, vertically oriented coordinates on which the classification score, from 0.0 (bottom) to 100 (top), is shown for each document; each document is a line passing through the three axes. One can immediately see the documents that are selected in one cluster and that have a high score on one class and a low score on the other classes. In the KMX example shown below this holds for all classes, which shows the high performance of the classifiers.

Parallel coordinates is a very general visualization technique that can map multivariate data belonging to text data. Here we have explained it with an example related to clustering, classification and two types of visualizations.

Figure: Left, a parallel coordinates visualization for the selected patents from the cluster visualization on the right.
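The mapping from classification scores to a line through the axes, as described above, can be sketched as follows. This is an illustrative assumption rather than the KMX implementation: the canvas size, the 0 to 100 score range, the thresholds in `dominant_class` and the example scores are all made up.

```python
def polyline(scores, axes, width=200.0, height=100.0, lo=0.0, hi=100.0):
    """Return one (x, y) vertex per axis for a single document.

    Axes are spaced evenly across `width` (at least two axes are needed);
    y is measured upward from the bottom of the axis, so `lo` maps to 0
    and `hi` maps to `height`.
    """
    step = width / (len(axes) - 1)
    return [(i * step, height * (scores[a] - lo) / (hi - lo))
            for i, a in enumerate(axes)]


def dominant_class(scores, high=75.0, low=25.0):
    """Name the one class scoring >= high while all others score <= low.

    Returns None when the document is not that clearly separated.
    """
    strong = [c for c, s in scores.items() if s >= high]
    weak = [c for c, s in scores.items() if s <= low]
    if len(strong) == 1 and len(weak) == len(scores) - 1:
        return strong[0]
    return None


axes = ["ebola", "sars", "h5n1"]
# Hypothetical classification scores for two selected documents.
clear_doc = {"ebola": 90.0, "sars": 10.0, "h5n1": 5.0}
mixed_doc = {"ebola": 60.0, "sars": 55.0, "h5n1": 5.0}

print(polyline(clear_doc, axes))
print(dominant_class(clear_doc), dominant_class(mixed_doc))
```

A document whose line is high on one axis and low on all others is the pattern the text describes as evidence of a well-separating classifier; `dominant_class` is one simple way to test for it numerically.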

Here we show an example of how this can be integrated in an application where the KMX algorithms calculate the patterns in the data that are shown in the visual interface. The parallel coordinates are used with the scores for the patents sorted by the most important coordinate classes; the decay shows that all patents belong distinctively to the first class shown (first coordinate) and thereafter to one or two additional classes, though less dominantly. The gray cylinders indicate the number of patents in that range of the classification score, which helps in reading and interpreting the patent data.

Another example of using parallel coordinates in KMX. We have 10 classifiers and thus 10 coordinates; when we select patents from the clustering (shown below in blue), we can see which patents (the lines) score high for which classifier. This technology can be used for tagging, and thus data enrichment, where the visualizations play an important part in the analysis process.

Figure: Cluster visualization of patents.
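Selecting patents in the cluster view and seeing them highlighted in the parallel coordinates view is an instance of the multiple coupled views described earlier. One common way to build such coupling is a shared selection object that notifies every view. The sketch below assumes that design; the class names and document identifiers are hypothetical and nothing here is taken from the KMX code base.

```python
class SharedSelection:
    """Holds the currently brushed document ids and notifies all views."""

    def __init__(self):
        self._selected = frozenset()
        self._listeners = []

    def subscribe(self, callback):
        """Register a function to call whenever the selection changes."""
        self._listeners.append(callback)

    def brush(self, doc_ids):
        """Replace the selection and push it to every subscribed view."""
        self._selected = frozenset(doc_ids)
        for callback in self._listeners:
            callback(self._selected)


class View:
    """A visualization that both shows and can change the selection."""

    def __init__(self, name, selection):
        self.name = name
        self.highlighted = frozenset()
        self._selection = selection
        selection.subscribe(self._on_selection_changed)

    def _on_selection_changed(self, doc_ids):
        # A real view would redraw here; this sketch only stores the ids.
        self.highlighted = doc_ids

    def brush(self, doc_ids):
        """User brushes in this view; the change goes through the shared model."""
        self._selection.brush(doc_ids)


selection = SharedSelection()
cluster_view = View("cluster", selection)
parallel_view = View("parallel coordinates", selection)
treemap_view = View("tree map", selection)

cluster_view.brush({"P1", "P3"})
print(sorted(parallel_view.highlighted))
```

Because every view routes its brushing through the same object, interaction is supported from all visualizations, not only from the one where the selection started.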

When all patents have been classified in KMX, we can use the classification scores to calculate the correlation between all patents and visualise it. This provides valuable insights into aspects that cannot be determined with a query-based approach, as shown below. The vertical and horizontal axes of the correlation visualization (matrix) both carry the classification codes (IPC, for instance), and the visualization is therefore symmetric. There are documents in different classes (as with pesticides) that nevertheless share a strong correlation, as shown for patents in classes C07K02 and A61K05, which have a correlation coefficient of 0.75 in the visualization below. Seeing where these strongly correlated classes are is easy and valuable, and this information cannot be determined with a query-based approach. Where many correlating classes occur is also visible directly in one picture, which shows the strength of "overview first, details on demand" when using visualizations.

Figure: Correlation visualization of patents.
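A symmetric matrix of this kind can be computed from per-class classification scores. The source does not state which correlation measure KMX uses; the sketch below assumes the Pearson correlation coefficient, and the class codes and scores are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long score lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(scores):
    """Map every pair of class codes (a, b) to the correlation of their scores.

    `scores` maps a class code to one classification score per document,
    with documents in the same order for every class.
    """
    codes = sorted(scores)
    return {(a, b): pearson(scores[a], scores[b]) for a in codes for b in codes}


# Hypothetical scores of four documents for three classifiers.
scores = {
    "C07K": [1.0, 2.0, 3.0, 4.0],
    "A61K": [2.0, 4.0, 6.0, 8.0],
    "H04L": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(scores)
for (a, b), r in matrix.items():
    print(f"{a} {b} {r:+.2f}")
```

The same class codes appear on both axes, so the entry for (a, b) equals the entry for (b, a), which is why the visualization is symmetric; the diagonal is always 1.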

Figure: Combined use of search, correlation, tree map, parallel coordinates and cluster visualization with brushing and filtering (here for 4 classes indicated by 4 colours).

Figure: In more detail, the parallel coordinates for many classes of data from PubMed.

Figure: Parallel coordinates for a selected set of documents (shown in yellow) after brushing in a cluster visualisation.

Visualization of patterns over time in a document set

When one wants to understand patent data over time, it is valuable to analyse the patents as part of a class that captures documents about the same subject, classifications and concepts. This can be done using classification and/or clustering, after which we can visualise the increase or decrease in the number of patents over time, where the band width of the classes shows the trends. This is shown below for patent classification classes, but it can also be done on non-patent literature, for instance using the MeSH terms of PubMed documents.

Figure: Visualisation of the increase in the number of patents over time for different patent classes.

Figure: Trends of the patents over time for different patent classes.

Figure: Trends of the PubMed articles over time for different MeSH terms.
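The band widths in such a trend visualisation come down to counting documents per class per year. A minimal sketch, assuming each record carries a class code and a year (the field names `ipc` and `year` and the records themselves are hypothetical):

```python
from collections import Counter


def band_widths(patents):
    """Count patents per class and year.

    Returns (years, bands) where bands maps each class to a list of counts,
    one per year in `years`; years without patents get a count of 0.
    """
    counts = Counter((p["ipc"], p["year"]) for p in patents)
    years = sorted({p["year"] for p in patents})
    classes = sorted({p["ipc"] for p in patents})
    bands = {c: [counts[(c, y)] for y in years] for c in classes}
    return years, bands


# Hypothetical patent records.
patents = [
    {"ipc": "C07", "year": 2005},
    {"ipc": "C07", "year": 2006},
    {"ipc": "C07", "year": 2006},
    {"ipc": "C07", "year": 2007},
    {"ipc": "C07", "year": 2007},
    {"ipc": "C07", "year": 2007},
    {"ipc": "A61", "year": 2005},
    {"ipc": "A61", "year": 2007},
]

years, bands = band_widths(patents)
print(years)
for ipc, widths in bands.items():
    print(ipc, widths)
```

The same function works for non-patent literature if the class field holds, for example, a MeSH term instead of a patent class.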

When a group of documents is selected, one has direct access to these documents and can interact with them and analyse them further.

Visualization of relationships between patents (graph visualizations)

In a patent document set there are many metadata variables that can be analysed in relation to all other documents. For instance, on the left we show a cluster visualization of patents on optical recording, and on the right the same cluster colour-coded to a specific classifier, with relationships additionally shown by connecting lines that are also colour-mapped.

Figure: Cluster visualization showing relations between the patents in the set.

A very common approach to analysing relationships between documents is citation analysis, where we want to determine and show which patents cite other patents and vice versa (forward and backward citations). This is also very common for scientific papers, where it is even used to estimate impact indicators. These citation visualisations are in fact graph (network) visualizations; an example is shown below.
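Forward and backward citations are the two directions of the same directed graph. The sketch below builds both indexes from a list of (citing, cited) pairs and ranks documents by forward citation count as a crude impact indicator. The patent identifiers are hypothetical, and counting citations is only one simple choice of indicator, not necessarily the one used in the figures.

```python
from collections import defaultdict


def citation_index(cites):
    """Build both citation directions from (citing, cited) pairs.

    backward[p] = documents that p cites (its backward citations)
    forward[p]  = documents that cite p  (its forward citations)
    """
    backward = defaultdict(set)
    forward = defaultdict(set)
    for citing, cited in cites:
        backward[citing].add(cited)
        forward[cited].add(citing)
    return backward, forward


def most_cited(forward, n=3):
    """Top-n documents by number of forward citations (ties broken by id)."""
    ranked = sorted(((doc, len(citing)) for doc, citing in forward.items()),
                    key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical citation pairs: (citing patent, cited patent).
cites = [
    ("P3", "P1"), ("P3", "P2"),
    ("P4", "P1"),
    ("P5", "P1"), ("P5", "P3"),
]
backward, forward = citation_index(cites)
print(sorted(backward["P3"]))   # what P3 cites
print(sorted(forward["P1"]))    # who cites P1
print(most_cited(forward, 2))
```

The pairs are exactly the edges of the citation network, so the same list can be handed to a graph layout routine to draw the node-link visualisation.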

Figure: Citation network visualisation for three domains over time, with bars on the papers (shown as nodes in the graph) with the highest impact.

Visualization of meta data from the patents

Patent and non-patent documents contain a lot of metadata (inventor, assignee, year of filing etc.) that can be visualised. Normally we filter the data, calculate some aggregated data and then use simple information visualisation representations (bar charts, pie charts etc.) to show basic data about the analysed patent or document sets. This can be done in a process (workflow) like the one shown below:

Figure: Example workflow diagram for generating visualisations. Example workflow to determine and represent basic average data of document sets.

The representations that can be used are shown below. Here again it is an advantage if rich interaction is supported, especially with multiple coupled views. This is the case for KMX, but in most cases it is not supported, and then one cannot really interact with the data and learn from it, since one only has static images.

Examples of basic infographic visualizations:

- Example bar charts
- Example point and line charts (including a radar plot on the right)
- Example pie charts and ring charts (showing a level of hierarchy) and bubble charts
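The filter, aggregate and represent workflow for metadata described above can be sketched as follows. The field names (`assignee`, `year`), the assignee names and the text-based bar chart are assumptions made for illustration; a real application would hand the aggregated counts to a charting component.

```python
from collections import Counter


def aggregate(records, field, year_from=None):
    """Filter records by filing year, then count the values of one field."""
    kept = [r for r in records if year_from is None or r["year"] >= year_from]
    return Counter(r[field] for r in kept)


def bar_chart(counts, width=8):
    """Render counts as text bars, longest first.

    Assumes `counts` is not empty; the largest count fills `width` characters.
    """
    top = max(counts.values())
    return [f"{label:<10} {'#' * round(width * n / top)} {n}"
            for label, n in counts.most_common()]


# Hypothetical metadata records.
records = [
    {"assignee": "Acme", "year": 2005},
    {"assignee": "Acme", "year": 2006},
    {"assignee": "Acme", "year": 2006},
    {"assignee": "Acme", "year": 2007},
    {"assignee": "Acme", "year": 2007},
    {"assignee": "Globex", "year": 2006},
    {"assignee": "Globex", "year": 2007},
]

per_assignee = aggregate(records, "assignee", year_from=2006)
print("\n".join(bar_chart(per_assignee)))
```

Changing the `field` argument to another metadata attribute, such as inventor or filing year, gives the input for the other basic chart types without changing the workflow.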


More information

DataPA OpenAnalytics End User Training

DataPA OpenAnalytics End User Training DataPA OpenAnalytics End User Training DataPA End User Training Lesson 1 Course Overview DataPA Chapter 1 Course Overview Introduction This course covers the skills required to use DataPA OpenAnalytics

More information

Data Mining and Visualization

Data Mining and Visualization Data Mining and Visualization Jeremy Walton NAG Ltd, Oxford Overview Data mining components Functionality Example application Quality control Visualization Use of 3D Example application Market research

More information

Visual Structure Analysis of Flow Charts in Patent Images

Visual Structure Analysis of Flow Charts in Patent Images Visual Structure Analysis of Flow Charts in Patent Images Roland Mörzinger, René Schuster, András Horti, and Georg Thallinger JOANNEUM RESEARCH Forschungsgesellschaft mbh DIGITAL - Institute for Information

More information
