How To Visualize Big Data
|
|
|
- Sheila Fox
- 5 years ago
- Views:
Transcription
1 NDBI040 Big Data Management and NoSQL Databases Lecture 11. Visualization of (Big) Data RNDr. David Hoksza, Ph.D. Doc. RNDr. Irena Holubova, Ph.D.
2 Motivation for (Big) Data Visualization Data visualization = creation and studying of visual representation of data Information abstracted in some schematic form Including attributes, variables, Purpose: To communicate information clearly and effectively through graphical means To help find the information needed more effectively and intuitively Both aesthetic form and functionality are required Even when data volumes are large, the patterns can be spotted quite easily (with the right data processing and visualization) Simplification of Big Data management Picking up things with the naked eye that would otherwise be hidden
3
4 Snímek 3 WU1 Windows User;
5 Motivation Similar motivation as for statistics but visualization can reveal/distinguish data/trends/patters, which statistics can not Source: Tufte, Edward R (1983), The Visual Display of Quantitative Information, Graphics Press Four data sets with nearly identical linear model (mean, variance, linear regression line, )
6 Motivation Find an outlier.
7 Data Visualization Information visualization has two equally important aspects Structural modeling Detection, extraction and simplification of the underlying information Graphical representation Transform initial representation into a graphical one which provides visualization of the structure Different types of structures require different type of visualization e.g., time series vs. hierarchical information
8 Big Data Visualization Decision about what technique to use became more difficult with Big Data Visualization is needed to decide which portion of data to explore further Visualization algorithms (i.e., graph drawing) should scale well to billions of entities (nodes) The first application was probably the visualization of web-related data i.e., pages, relations, traffic, New techniques may be needed Trends might not be clear Noise reduction might be even more necessary
9 Visualization Types Data Relationships Scatter plot Classical statistical diagram that lets us visualize relationships between numeric variables Can carry additional information Matrix chart Summarizes a multidimensional data set in a grid Network diagram A set of objects (vertices) connected by edges Visualization of the network is optimized to keep strongly related items in close proximity to each other
10 Visualization Types Data Relationships Correlation matrix (heat map) Combines data to quickly identify which variables are related Shows how strong the relationship is between the variables
11 Visualization Types Data Relationships Heat map is often combined with a dendrogram Aggregates rows or columns based on their overall similarity into a tree structure
12 Visualization Types Comparison of a Set of Values Bar Chart Classical method for numerical comparisons Histograms Box plot (box-and-whisker plots) Five statistics (minimum, lower quartile, median, upper quartile and maximum) summarizing the distribution of a set of data Bubble chart Circles in a bubble chart represent different data values The area of a circle corresponding to the value The positions of the bubbles do not mean anything Designed to pack the circles together with relatively little wasted space
13 Visualization Types Trends over Time Line graph Classical method for visualizing continuous change Stack graph Visualizing change in a set of items The sum of the values is as important as the individual items
14 Visualization Types Parts of a Whole Pie Chart Percentages are encoded as "slices" of a pie, with the area corresponding to the percentage Treemap Visualization of hierarchical structures Effective in showing attributes of leaf nodes using size and color coding Enable to compare nodes and sub-trees at varying depth Economy of Australia
15 Visualization Types Text Analysis Tag cloud Visualization of word frequencies i.e., how frequently words appear in a given text
16 Which Visualization Technique to Use? New visualization software is capable of guessing the correct visualization based on the characteristics of the data One-dimensional data bar chart Two-dimensional data scatter plot N-dimensional data multiple scatter plots, matrix chart, Data with coordinates map-based charts Offers options Trend: to simplify the process for common users
17 Big Data Visualization The goal of visualizing Big Data is usually to make sense of a large amount of interlinked information In interconnected data the connections between objects are difficult to organize on a linear layout Circular representations Network diagrams Typical topologies one can encounter (a bit confusing term based on Manuel Lima s Visual Complexity see references) include arc diagrams, centralized burst, centralized ring, globe, circular ties or radial convergence And many more
18 Arc Diagram Vertices are placed on a line and edges are drawn as semicircles Arcs represent relationships Colors can encode, e.g., distance A map of 63,799 crossreferences found in the Bible. The bottom bars represent number of verses in the given chapter. Color of arcs represents the distance between the two chapters. ex.php/visualizations/bibleviz grey/white = book
19 Arc Diagram Visualization of IRC communication behavior: Who is talking to whom? Arcs are directional and Sorted by the amount of incoming references Sorted by user name Sorted by the amount of outgoing references Unsorted Sorted by rate of incoming/outgoing references references users drawn clockwise: In the upper half of a graph they point from left to right, in the bottom half from right to left Arc strength corresponds to the number of references from the source to the target This visualization favors strong social connections over sociability: Frequent references between the same two users feature more prominently than combined references from several sources to a single target. c_arcs/
20 Centralized Burst A map of protein-to-protein interactions of a yeast source: H. Jeong. et al. Lethality and Centrality in Protein Networks, Nature, no. 411, 2011: Visualization with strong central tendency Can reveal highly connected objects (hubs) which usually correspond to objects with high importance e.g., in a gene network, hubs are interesting points for targeting new drugs Disabling a central gene probably will not allow the organism to adapt
21 Centralized Ring Topology suitable for situations where we inspect a relation of multiple objects to one object Not very suitable for Big Data A sociogram of individual donations in Asheville, North Carolina, to Barack Obama s 2008 presidential campaign. A map of genetic overlap between migraine and about 60 other diseases. Each of the circles represents a different disease, and the size of the circle corresponds to the size of patient samples, ranging from 46 to 136,000, of those suffering from the disease.
22 Globe Globe visualizations are basically projections of other topologies on a globe The global exchange of information in real time by visualizing volumes of long distance telephone and IP data flowing between New York and cities around the world. How does the city of New York connect to other cities? With which cities does New York have the strongest ties and how do these relationships shift with time? How does the rest of the world reach into the neighborhoods of New York? The size of the glow on a particular city location corresponds to the amount of IP traffic flowing between that place and New York City. A greater glow implies a greater IP flow.
23 Radial Convergence Also known as radial chart Actually a 360 arc diagram parties donators Tracking the commercial ties between most countries across the globe. Money flow from private donators to parties in the German Bundestag (house of the parliament).
24
25 Graph Drawing Spatial layout and graph drawing play a key aspect in information visualization Good layout needs to express the key features of a complex structure Graph drawing algorithms first agree on a criterion of what makes a good graph (and what should be avoided) and then run an algorithm driven by these criteria Generally, the primary goal is to optimize the arrangement of nodes so that strongly connected nodes appear close to each other Most widely known graph drawing algorithms combine forcedirected graph drawing and spring-embedder algorithms The strength of a connection needs to be defined
26 Force-Directed Replacement Can be traced back to VLSI (very large scale integration) design = creating integrated circuits Aim: optimize the layout of a circuit to a obtain as few number of crossings as possible Generally agreed on aesthetics criteria Symmetry Even distribution of nodes Uniform edge lengths Minimization of edge crossings Some of the criteria can be mutually exclusive e.g., symmetric graph may require crossings which might be avoided
27 Force-Directed Replacement Replaces vertices in a graph by steel rings and edges by springs Attractive force is applied to a pair of connected nodes Repulsive force is applied to a pair of disconnected nodes Assigns forces among edges and nodes Attractive: spring-like forces Hook s law = the force F needed to extend or compress a spring by distance X is proportional to that distance F = k. X k = characterizes stiffness of a string Repulsive: forces of electrically charged particles Coulomb's law: F = k e. (q 1. q 2 ) / (r. r) q 1, q 2 = magnitude of charges of particles k e = Coulomb s constant r = distance of particles
28 Spring-Embedder Model (SEM) One of the most widely used models Aim: uniform edge lengths and symmetry Observation: equilibrium state for the system of forces: Edges tend to have uniform length (spring forces) Nodes that are not connected by an edge tend to be drawn further apart (electrical repulsion) Algorithm: 1. The model starts with a random initial step 2. In each step, the vertices move accordingly to the spring forces to reduce the tension 3. The optimal layout corresponds to the minimum-energy state of the system
29 SEM Optimization Fruchterman and Reingold SEM can fail on very large graphs Big Data Heuristics: Attraction forces are computed only for neighboring nodes Repulsive forces are computed for all pairs of nodes A bit differently from SEM (optimization) Optimization process is carried out iteratively Controlled by temperature parameter A similar way to simulated annealing Only about 50 iterations are used
30 Simulated Annealing Probabilistic metaheuristics Problem of locating a good approximation to the global optimum of a given function in a large search space Avoiding getting stuck in a local suboptimum Inspiration comes from annealing in metallurgy Heating and controlled cooling of a material to increase the size of its crystals Idea: The state of a system is randomly modified Moves from state s 1 to s 2 higher level heuristics s 2 does not need to be better than s 1 The probability of moving to a worse solution is given by temperature T which is gradually lowered
31 Graph Drawing Challenges Current algorithms are inefficient in incremental updating of the layout i.e., one needs to redraw the whole layout when adding/removing a single node Networks with heterogeneous link types or node types cannot be efficiently handled e.g., having users of a social network sharing various type of content (images, posts, video) and forming various types of relations (liking, tagging) Majority of algorithms focus on strong ties (heavyweight links) Weak ties can be surprisingly valuable because they are more likely to be the source of novel information e.g., hearing about a new job offering is an example of weak link with great social impact
32 Graph Drawing Challenges Scalability Big Data related challenge Problematic scaling as the size and density of the network increases i.e., Big Data are also difficult to visualize Limited screen resolution Big Data related challenge Sometimes we simply do not have enough pixels to visualize a complex large-scale network Zoomable interfaces, fish-eye views, In general solvable by interactivity
33 Graph Drawing Challenges Scalability Solution with dense or large-scale networks can be partially solved by reducing the complexity of the information to be visualized Link reduction techniques Pruning the original network Clustering Dividing the network into smaller components and treat them individually Inefficient if the graph contains large components Dimension reduction
34 Link Reduction Techniques Removing low weight links Imposing a link weight threshold only link weights above the threshold are considered Does not take into account the structure of the network Minimum spanning tree Link reduction to N 1 edges (on network of N nodes) Network scaling algorithms e.g., pathfinder network scaling Extracts paths of length at most Q Exploits Minkowski metric Generalization of Euclidean distance
35 Clustering Techniques Goal: to divide a large data set into a number of subsets according to some given similarity measures Basic methodologies: The choice is a trade-off between quality and speed Graph-theoretical Relies on a pre-computed distance matrix Based on how objects are separated e.g., single link, group average, complete link Single-pass Seed oriented clustering clusters grow from a given number of data objects (seeds) chosen Iterative Iterative optimization of the clustering structure according to a heuristic function k-means clustering Repeating re-computation of centroids (step 3 + 4) Unlike single-pass methods, the resulting number of clusters may not be known in advance Hierarchical clustering Step 1 Step 2 Step 3 Step 4
36 Dimension Reduction (DR) When dealing with data which have relations expressed as a distance matrix only Either we can use graph drawing methods Or specialized dimension reduction techniques Idea: each data point consists of multiple attributes and the goal is to visualize similar data points near to each other in 2D space = projection from a multidimensional space into a 2D space Generally difficult problems since in general distance space is not metric Unlike Euclidian space
37 Dimension Reduction Multi Dimensional Scaling (MDS) The goal of MDS is to find a representation of data points (in nd) in the target space (in 2D) which resembles the mutual distances in the original space as close as possible Algorithm 1. Generate an initial layout 2. Iteratively reposition the data points so that the value of an error function for the current projection decreases Error function = sum over all pairs of objects (distance in nd space distance in 2D space) 3. Stop after given number of iterations Very time demanding, especially for large datasets since all the distances need to be computed in every step Optimizations have been introduced involving mainly parallel processing
38 Dimension Reduction Principal Component Analysis (PCA) Finding a linear transformation which tries to keep as much variability in the data as possible Identifies new basis vectors which maximize the amount of information kept after transformation onto the new basis New basis vectors correspond to the eigenvectors of the covariance matrix The order of an eigenvalue/eigenvector specifies its informativeness two first eigenvectors define a projection into 2D space keeping most of the information present in the data
39 Smoothing Revealing patterns in large data The patterns can be partially visible but not evident Techniques Moving average Representing trend using local averages Sliding window and averaging values over the values Locally weighted scatter plot smoothing (LOWESS) Weights for the data points decline with their distance from center point according to a weight function 4 points window 20 points window 200 points window
40 Data Visualization Tools Analytics and visualization tools Standard statistical packages R, Matlab Customizability according to specific needs Specialized data analytics/visualization solutions SAS, IBM Cognos Limited by the design Ready-to-use solutions on top of a data warehouse Visualization tools Tableau, Many Eyes (IBM), Circos, Visual.ly Trend: to bring the visualization and analysis to common users (not only data scientists) Easy-to-use software Web interfaces allowing instant sharing of visualizations Drag and drop interfaces
41 Data Visualization Tools Tableau Data engine (analytics database) allowing ad-hoc analysis of massive data (millions of records) Products Tableau Desktop drag-and-drop tool, multiple views in a single dashboard, highlighting and filtering data, connection to live data Tableau Server sharing data and visualizations and work in progress Tableau Online hosted version of Tableau server allowing fast sharing of dashboards Tableau Public embedding of live visualizations into web pages
42 Data Visualization Tools Many Eyes Data visualization experiment by IBM Research and the IBM Cognos software group Social networking application Almost 175,000 visualizations as of today Allows to update a dataset and create and share a visualization based on the data Easy-to-use
43 Data Visualization Tools Circos Offline visualization tool specializing on circular visualizations Exploring relationships between objects or positions User input: tab-delimited source file Output: circular visualization of the data in a form of an image SVG, PNG
44 Programming Visualizations JavaScript Libraries D3 Data-Drive Documents Efficient manipulation of documents based on data CSS3, HTML5, SVG Comes packed with Algorithms and layouts (force layouts, trees, ) Drawing functionality (d3.svg) jquery-based selections Hundreds of visualizations with source codes at Raphael More a graphics library Every graphical object is a DOM object JavaScript event handlers can be attached to it
45 Programming Visualizations Google Charts API Set of JavaScript classes Extremely simple to use: 1. Load libraries 2. List data to be charted 3. Select chart type 4. Create object with given id 5. Create div element with the id
46 Hunk Analytics tools are not designed for the diversity and size of Big Data sets Hunk = Hadoop more accessible to common users Allows users to analyze and visualize historical data in Hadoop Enables to detect patterns and find anomalies
47 Project Neo Dashboard atop datasets Allowing untrained workers to create visualizations and discover patterns and insights from raw data IBM s effort to make data visualization available not only to data analytics or scientists To be launched in 2014 Focus on speed Built on top of Rapidly Adaptive Visualization Engine
48 Data Visualization Techniques Course Topics Lectures: Map and geographical data analysis and visualization Time series analysis and visualization Network data analysis and visualization Interactive inforgraphics 3D visualization Searching in visualizations... Labs: Creating visualizations Probably using Tableau Circos Visual.ly Programming visualizations D3.js
49 References Edward R. Tufte: The Visual Display of Quantitative Information Edward R. Tufte: Envisioning Information Chaomei Chen: Information Visualization: Beyond the Horizon Manuel Lima: Visual Complexity: Mapping Patterns of Information
Create Cool Lumira Visualization Extensions with SAP Web IDE Dong Pan SAP PM and RIG Analytics Henry Kam Senior Product Manager, Developer Ecosystem
Create Cool Lumira Visualization Extensions with SAP Web IDE Dong Pan SAP PM and RIG Analytics Henry Kam Senior Product Manager, Developer Ecosystem 2015 SAP SE or an SAP affiliate company. All rights
Business Intelligence and Process Modelling
Business Intelligence and Process Modelling F.W. Takes Universiteit Leiden Lecture 2: Business Intelligence & Visual Analytics BIPM Lecture 2: Business Intelligence & Visual Analytics 1 / 72 Business Intelligence
All Visualizations Documentation
All Visualizations Documentation All Visualizations Documentation 2 Copyright and Trademarks Licensed Materials - Property of IBM. Copyright IBM Corp. 2013 IBM, the IBM logo, and Cognos are trademarks
Data Visualization Techniques
Data Visualization Techniques From Basics to Big Data with SAS Visual Analytics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Generating the Best Visualizations for Your Data... 2 The
an introduction to VISUALIZING DATA by joel laumans
an introduction to VISUALIZING DATA by joel laumans an introduction to VISUALIZING DATA iii AN INTRODUCTION TO VISUALIZING DATA by Joel Laumans Table of Contents 1 Introduction 1 Definition Purpose 2 Data
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
Data Visualization Techniques
Data Visualization Techniques From Basics to Big Data with SAS Visual Analytics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Generating the Best Visualizations for Your Data... 2 The
Diagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
Graph/Network Visualization
Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
<no narration for this slide>
1 2 The standard narration text is : After completing this lesson, you will be able to: < > SAP Visual Intelligence is our latest innovation
Enterprise Data Visualization and BI Dashboard
Strengths Key Features and Benefits Ad-hoc Visualization and Data Discovery Prototyping Mockups Dashboards The application is web based and can be installed on any windows or linux server. There is no
TIBCO Spotfire Business Author Essentials Quick Reference Guide. Table of contents:
Table of contents: Access Data for Analysis Data file types Format assumptions Data from Excel Information links Add multiple data tables Create & Interpret Visualizations Table Pie Chart Cross Table Treemap
How To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
Understanding Data: A Comparison of Information Visualization Tools and Techniques
Understanding Data: A Comparison of Information Visualization Tools and Techniques Prashanth Vajjhala Abstract - This paper seeks to evaluate data analysis from an information visualization point of view.
Using visualization to understand big data
IBM Software Business Analytics Advanced visualization Using visualization to understand big data By T. Alan Keahey, Ph.D., IBM Visualization Science and Systems Expert 2 Using visualization to understand
Visualization Quick Guide
Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
White Paper. Data Visualization Techniques. From Basics to Big Data With SAS Visual Analytics
White Paper Data Visualization Techniques From Basics to Big Data With SAS Visual Analytics Contents Introduction... 1 Tips to Get Started... 1 The Basics: Charting 101... 2 Line Graphs...2 Bar Charts...3
OpenText Information Hub (ihub) 3.1 and 3.1.1
OpenText Information Hub (ihub) 3.1 and 3.1.1 OpenText Information Hub (ihub) 3.1.1 meets the growing demand for analytics-powered applications that deliver data and empower employees and customers to
VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc., http://willsfamily.org/gwills
VISUALIZING HIERARCHICAL DATA Graham Wills SPSS Inc., http://willsfamily.org/gwills SYNONYMS Hierarchical Graph Layout, Visualizing Trees, Tree Drawing, Information Visualization on Hierarchies; Hierarchical
Chapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
BIG DATA VISUALIZATION. Team Impossible Peter Vilim, Sruthi Mayuram Krithivasan, Matt Burrough, and Ismini Lourentzou
BIG DATA VISUALIZATION Team Impossible Peter Vilim, Sruthi Mayuram Krithivasan, Matt Burrough, and Ismini Lourentzou Let s begin with a story Let s explore Yahoo s data! Dora the Data Explorer has a new
Data Visualization Handbook
SAP Lumira Data Visualization Handbook www.saplumira.com 1 Table of Content 3 Introduction 20 Ranking 4 Know Your Purpose 23 Part-to-Whole 5 Know Your Data 25 Distribution 9 Crafting Your Message 29 Correlation
9. Text & Documents. Visualizing and Searching Documents. Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08
9. Text & Documents Visualizing and Searching Documents Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08 Slide 1 / 37 Outline Characteristics of text data Detecting patterns SeeSoft
CS 207 - Data Science and Visualization Spring 2016
CS 207 - Data Science and Visualization Spring 2016 Professor: Sorelle Friedler [email protected] An introduction to techniques for the automated and human-assisted analysis of data sets. These
Environmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
Choosing a successful structure for your visualization
IBM Software Business Analytics Visualization Choosing a successful structure for your visualization By Noah Iliinsky, IBM Visualization Expert 2 Choosing a successful structure for your visualization
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
Team Members: Christopher Copper Philip Eittreim Jeremiah Jekich Andrew Reisdorph. Client: Brian Krzys
Team Members: Christopher Copper Philip Eittreim Jeremiah Jekich Andrew Reisdorph Client: Brian Krzys June 17, 2014 Introduction Newmont Mining is a resource extraction company with a research and development
Map-like Wikipedia Visualization. Pang Cheong Iao. Master of Science in Software Engineering
Map-like Wikipedia Visualization by Pang Cheong Iao Master of Science in Software Engineering 2011 Faculty of Science and Technology University of Macau Map-like Wikipedia Visualization by Pang Cheong
Tableau Your Data! Wiley. with Tableau Software. the InterWorks Bl Team. Fast and Easy Visual Analysis. Daniel G. Murray and
Tableau Your Data! Fast and Easy Visual Analysis with Tableau Software Daniel G. Murray and the InterWorks Bl Team Wiley Contents Foreword xix Introduction xxi Part I Desktop 1 1 Creating Visual Analytics
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
Microsoft Business Intelligence Visualization Comparisons by Tool
Microsoft Business Intelligence Visualization Comparisons by Tool Version 3: 10/29/2012 Purpose: Purpose of this document is to provide a quick reference of visualization options available in each tool.
SAS VISUAL ANALYTICS AN OVERVIEW OF POWERFUL DISCOVERY, ANALYSIS AND REPORTING
SAS VISUAL ANALYTICS AN OVERVIEW OF POWERFUL DISCOVERY, ANALYSIS AND REPORTING WELCOME TO SAS VISUAL ANALYTICS SAS Visual Analytics is a high-performance, in-memory solution for exploring massive amounts
Client Overview. Engagement Situation. Key Requirements
Client Overview Our client is one of the leading providers of business intelligence systems for customers especially in BFSI space that needs intensive data analysis of huge amounts of data for their decision
Best Practices in Data Visualizations. Vihao Pham January 29, 2014
Best Practices in Data Visualizations Vihao Pham January 29, 2014 Agenda Best Practices in Data Visualizations Why We Visualize Understanding Data Visualizations Enhancing Visualizations Visualization
Best Practices in Data Visualizations. Vihao Pham 2014
Best Practices in Data Visualizations Vihao Pham 2014 Agenda Best Practices in Data Visualizations Why We Visualize Understanding Data Visualizations Enhancing Visualizations Visualization Considerations
Big Data in Pictures: Data Visualization
Big Data in Pictures: Data Visualization Huamin Qu Hong Kong University of Science and Technology What is data visualization? Data visualization is the creation and study of the visual representation of
MicroStrategy Desktop
MicroStrategy Desktop Quick Start Guide MicroStrategy Desktop is designed to enable business professionals like you to explore data, simply and without needing direct support from IT. 1 Import data from
A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data
White Paper A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data Contents Executive Summary....2 Introduction....3 Too much data, not enough information....3 Only
Data Visualization Frameworks: D3.js vs. Flot vs. Highcharts by Igor Zalutsky, JavaScript Developer at Altoros
Data Visualization Frameworks: D3.js vs. Flot vs. Highcharts by Igor Zalutsky, JavaScript Developer at Altoros 2013 Altoros, Any unauthorized republishing, rewriting or use of this material is prohibited.
By LaBRI INRIA Information Visualization Team
By LaBRI INRIA Information Visualization Team Tulip 2011 version 3.5.0 Tulip is an information visualization framework dedicated to the analysis and visualization of data. Tulip aims to provide the developer
3D Interactive Information Visualization: Guidelines from experience and analysis of applications
3D Interactive Information Visualization: Guidelines from experience and analysis of applications Richard Brath Visible Decisions Inc., 200 Front St. W. #2203, Toronto, Canada, [email protected] 1. EXPERT
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
TIBCO Spotfire Network Analytics 1.1. User s Manual
TIBCO Spotfire Network Analytics 1.1 User s Manual Revision date: 26 January 2009 Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics Zhao Wenbin 1, Zhao Zhengxu 2 1 School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu
Data Visualization. Scientific Principles, Design Choices and Implementation in LabKey. Cory Nathe Software Engineer, LabKey cnathe@labkey.
Data Visualization Scientific Principles, Design Choices and Implementation in LabKey Catherine Richards, PhD, MPH Staff Scientist, HICOR [email protected] Cory Nathe Software Engineer, LabKey [email protected]
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
Consumption of OData Services of Open Items Analytics Dashboard using SAP Predictive Analysis
Consumption of OData Services of Open Items Analytics Dashboard using SAP Predictive Analysis (Version 1.17) For validation Document version 0.1 7/7/2014 Contents What is SAP Predictive Analytics?... 3
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis
Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Abdun Mahmood, Christopher Leckie, Parampalli Udaya Department of Computer Science and Software Engineering University of
MicroStrategy Analytics Express User Guide
MicroStrategy Analytics Express User Guide Analyzing Data with MicroStrategy Analytics Express Version: 4.0 Document Number: 09770040 CONTENTS 1. Getting Started with MicroStrategy Analytics Express Introduction...
CLUSTER ANALYSIS FOR SEGMENTATION
CLUSTER ANALYSIS FOR SEGMENTATION Introduction We all understand that consumers are not all alike. This provides a challenge for the development and marketing of profitable products and services. Not every
Tableau's data visualization software is provided through the Tableau for Teaching program.
A BEGINNER S GUIDE TO VISUALIZATION Featuring REU Site Collaborative Data Visualization Applications June 10, 2014 Vetria L. Byrd, PhD Advanced Visualization, Director REU Coordinator Visualization Scientist
Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca
Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
DICON: Visual Cluster Analysis in Support of Clinical Decision Intelligence
DICON: Visual Cluster Analysis in Support of Clinical Decision Intelligence Abstract David Gotz, PhD 1, Jimeng Sun, PhD 1, Nan Cao, MS 2, Shahram Ebadollahi, PhD 1 1 IBM T.J. Watson Research Center, New
Medical Information Management & Mining. You Chen Jan,15, 2013 [email protected]
Medical Information Management & Mining You Chen Jan,15, 2013 [email protected] 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/2004 Hierarchical
Multi-Dimensional Data Visualization. Slides courtesy of Chris North
Multi-Dimensional Data Visualization Slides courtesy of Chris North What is the Cleveland s ranking for quantitative data among the visual variables: Angle, area, length, position, color Where are we?!
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major
Gephi Tutorial Quick Start
Gephi Tutorial Welcome to this introduction tutorial. It will guide you to the basic steps of network visualization and manipulation in Gephi. Gephi version 0.7alpha2 was used to do this tutorial. Get
Making confident decisions with the full spectrum of analysis capabilities
IBM Software Business Analytics Analysis Making confident decisions with the full spectrum of analysis capabilities Making confident decisions with the full spectrum of analysis capabilities Contents 2
SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
MultiExperiment Viewer Quickstart Guide
MultiExperiment Viewer Quickstart Guide Table of Contents: I. Preface - 2 II. Installing MeV - 2 III. Opening a Data Set - 2 IV. Filtering - 6 V. Clustering a. HCL - 8 b. K-means - 11 VI. Modules a. T-test
White Paper. Redefine Your Analytics Journey With Self-Service Data Discovery and Interactive Predictive Analytics
White Paper Redefine Your Analytics Journey With Self-Service Data Discovery and Interactive Predictive Analytics Contents Self-service data discovery and interactive predictive analytics... 1 What does
Today's Topics. COMP 388/441: Human-Computer Interaction. simple 2D plotting. 1D techniques. Ancient plotting techniques. Data Visualization:
COMP 388/441: Human-Computer Interaction Today's Topics Overview of visualization techniques 1D charts, 2D plots, 3D+ techniques, maps A few guidelines for scientific visualization methods, guidelines,
Survey Public Visualization Services
Survey Public Visualization Services Stefan Kölbl Markus Unterleitner Benedict Wright Graz University of Technology A-8010 Graz, Austria 4 May 2012 Abstract This survey is about public visualization services.
14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)
Overview Kyrre Glette kyrrehg@ifi INF3490 Swarm Intelligence Particle Swarm Optimization Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) 3 Swarms in nature Fish, birds,
Principles of Data Visualization for Exploratory Data Analysis. Renee M. P. Teate. SYS 6023 Cognitive Systems Engineering April 28, 2015
Principles of Data Visualization for Exploratory Data Analysis Renee M. P. Teate SYS 6023 Cognitive Systems Engineering April 28, 2015 Introduction Exploratory Data Analysis (EDA) is the phase of analysis
Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers
60 Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative
Visualizing Data: Scalable Interactivity
Visualizing Data: Scalable Interactivity The best data visualizations illustrate hidden information and structure contained in a data set. As access to large data sets has grown, so has the need for interactive
Visibility optimization for data visualization: A Survey of Issues and Techniques
Visibility optimization for data visualization: A Survey of Issues and Techniques Ch Harika, Dr.Supreethi K.P Student, M.Tech, Assistant Professor College of Engineering, Jawaharlal Nehru Technological
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Hierarchical Data Visualization
Hierarchical Data Visualization 1 Hierarchical Data Hierarchical data emphasize the subordinate or membership relations between data items. Organizational Chart Classifications / Taxonomies (Species and
Visualizing Data from Government Census and Surveys: Plans for the Future
Censuses and Surveys of Governments: A Workshop on the Research and Methodology behind the Estimates Visualizing Data from Government Census and Surveys: Plans for the Future Kerstin Edwards March 15,
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary
Chapter 7. Cluster Analysis
Chapter 7. Cluster Analysis. What is Cluster Analysis?. A Categorization of Major Clustering Methods. Partitioning Methods. Hierarchical Methods 5. Density-Based Methods 6. Grid-Based Methods 7. Model-Based
Introduction to D3.js Interactive Data Visualization in the Web Browser
Datalab Seminar Introduction to D3.js Interactive Data Visualization in the Web Browser Dr. Philipp Ackermann Sample Code: http://github.engineering.zhaw.ch/visualcomputinglab/cgdemos 2016 InIT/ZHAW Visual
Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data
Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.
Decision Trees from large Databases: SLIQ
Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values
1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining
1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining techniques are most likely to be successful, and Identify
Practical Graph Mining with R. 5. Link Analysis
Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
Spatio-Temporal Networks:
Spatio-Temporal Networks: Analyzing Change Across Time and Place WHITE PAPER By: Jeremy Peters, Principal Consultant, Digital Commerce Professional Services, Pitney Bowes ABSTRACT ORGANIZATIONS ARE GENERATING
Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode
Iris Sample Data Set Basic Visualization Techniques: Charts, Graphs and Maps CS598 Information Visualization Spring 2010 Many of the exploratory data techniques are illustrated with the Iris Plant data
Lecture 2: Descriptive Statistics and Exploratory Data Analysis
Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals
Network Metrics, Planar Graphs, and Software Tools. Based on materials by Lala Adamic, UMichigan
Network Metrics, Planar Graphs, and Software Tools Based on materials by Lala Adamic, UMichigan Network Metrics: Bowtie Model of the Web n The Web is a directed graph: n webpages link to other webpages
Clustering. Data Mining. Abraham Otero. Data Mining. Agenda
Clustering 1/46 Agenda Introduction Distance K-nearest neighbors Hierarchical clustering Quick reference 2/46 1 Introduction It seems logical that in a new situation we should act in a similar way as in
Visualizing the Top 400 Universities
Int'l Conf. e-learning, e-bus., EIS, and e-gov. EEE'15 81 Visualizing the Top 400 Universities Salwa Aljehane 1, Reem Alshahrani 1, and Maha Thafar 1 [email protected], [email protected], [email protected]
Table of Contents Find the story within your data
Visualizations 101 Table of Contents Find the story within your data Introduction 2 Types of Visualizations 3 Static vs. Animated Charts 6 Drilldowns and Drillthroughs 6 About Logi Analytics 7 1 For centuries,
INTRAFOCUS. DATA VISUALISATION An Intrafocus Guide
DATA VISUALISATION An Intrafocus Guide September 2011 Table of Contents What is Data Visualisation?... 2 Where is Data Visualisation Used?... 3 The Market View... 4 What Should You Look For?... 5 The Key
Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS
A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS Stacey Franklin Jones, D.Sc. ProTech Global Solutions Annapolis, MD Abstract The use of Social Media as a resource to characterize
DHL Data Mining Project. Customer Segmentation with Clustering
DHL Data Mining Project Customer Segmentation with Clustering Timothy TAN Chee Yong Aditya Hridaya MISRA Jeffery JI Jun Yao 3/30/2010 DHL Data Mining Project Table of Contents Introduction to DHL and the
