Visualizing Bipartite Network Data using SAS Visualization Tools. John Zheng, Columbia MD
ABSTRACT

Bipartite networks are networks created from affiliation relationships, such as patient-provider relationships in healthcare data. Analyzing such networks gives additional insight into groups that share the same member and members that belong to the same group. This paper develops methods to visualize bipartite network data in healthcare using the Annotate facility. Link analysis is used to explore the connections between healthcare providers and patients, and hence the connections between peer providers. The paper also takes advantage of the spatial information in the patient-provider data: the connections between providers are projected on a Google map using Google Map Generator.

INTRODUCTION

Prior work has been published on using SAS to visualize classical social networks and to compute their network statistics (Hoyle 2006, Hornibrook 2009, Ellis 2009). This paper focuses on the visualization of bipartite networks, which consist of two disjoint sets of nodes, with every edge running between the two sets. Bipartite networks are often used to model affiliation relationships, such as the patient-provider network or the consumer-seller network.

In this paper, special attention is paid to uncovering the hidden connections in patient-provider bipartite networks. Instead of applying social network analysis and visualization techniques directly to the original bipartite network, we first convert it into a classical (unipartite) network of providers defined by patient sharing. Once the network of providers is established, the unnaturally dense cliques within it are of particular interest. The clustered providers are displayed distinctively, and suspicious patterns are flagged for further study.
Large networks pose one of the greatest challenges in visualizing social networks. In the patient-provider network, for example, there are over 13,000 health care providers registered under NPI (National Provider Identification) in the tri-state area of Maryland, Virginia and DC, and 230,000+ FFS Medicare Part B patients in the calendar year studied. With a data set of this size, the first visualization challenge is laying out the nodes. Multidimensional scaling (MDS) is often used to produce a spatial layout of the nodes based on their similarity. However, MDS does not scale well with an increasing number of nodes.

Two methods have been used so far to overcome the difficulty of laying out a very large network with MDS. The first (Hoyle 2009) uses a variation of MDS to generate coordinates for the node dataset. The second applies when spatial information (such as an address or zip code) is available: one can skip MDS entirely by projecting all nodes on a Google map using the Google Map Generator provided by SAS/GRAPH. This paper discusses these two methods as well as a third, component analysis, which dissects the original network into many smaller, manageable sub-networks.

THE REPRESENTATION OF SOCIAL NETWORK DATA IN SAS

Social network data can be represented in one of two formats in SAS: the matrix format and the edge-list format. The matrix format stores social network data in an n-by-n square matrix with all nodes listed across the top and down the side. The value of cell (i,j) indicates whether node i and node j are connected (0 means no connection; any other value means connected). The matrix format is used in many network analysis procedures such as MDS.

The edge list, on the other hand, stores a pair of connected nodes, along with the attributes of the connection, in a single row of a table. For example, a connection between providers i and j with 5 shared patients is stored as (i,j,5). Because the edge-list format stores only the edges that exist, it uses much less space for a sparse network, which is often the case in social network datasets. In contrast, the matrix format stores every pair of nodes whether or not they are connected.

Converting Bipartite Networks to Classical Networks

A bipartite network can be conveniently represented in the matrix format. The patient-provider network can be represented by an m-by-n matrix with patients as the row nodes and providers as the column nodes. Each cell (i,j) records whether patient i is cared for by provider j in the observation period. Table 1 shows an example of a patient-provider network represented by a matrix. In this example, patient 2 is cared for by providers 1 and 2, and provider 1 cares for patients 1 and 2.

Provider1 Provider2 Provider3
Patient1 1
Patient2 1 1
Patient3 1
Patient4 1

Table 1. A Matrix Representation of the Patient-provider Network

We can obtain a classical network of providers by multiplying the transpose of the patient-provider matrix A by A. The resulting matrix, $A^{T}A$, is an n-by-n square matrix of providers, with each cell representing the number of patients shared by the row provider and the column provider. Notice that the constructed provider network matrix is symmetric, i.e., the number of patients shared by provider pair (i,j) is the same as that shared by provider pair (j,i). The diagonal values are precisely the number of patients cared for by each provider. We can similarly construct the classical network of patients, $AA^{T}$.
Each cell in $AA^{T}$ represents the number of common providers shared between the row patient and the column patient.

Although the matrix algebra can be carried out using PROC IML, PROC IML is not suitable for manipulating large matrices with thousands of nodes. In our trials, storing such a large matrix demanded a large amount of computer memory, and the matrix multiplication took impractically long in PROC IML. For these reasons, we recommend storing the bipartite network data in the edge-list format and converting the bipartite network into a classical network using PROC SQL instead. The patient-provider bipartite network in Table 1 can also be represented in the edge-list format as follows.

PatientID ProviderID

Table 2. An Edge-list Representation of the Patient-Provider Network

To construct a classical network of providers, we join the table with itself on PatientID, group the results by the pair of NPIs (used as ProviderID), and count the number of rows in each group to get the number of patients shared. The SQL query for this relational algebra operation is as follows.

    /* Self-join on patient: every pair of providers that saw the same patient */
    select tbl1.npi as NPI1,
           tbl2.npi as NPI2,
           count(*) as PatientsCnt
    from table_name as tbl1, table_name as tbl2
    where tbl1.patientid = tbl2.patientid
    group by NPI1, NPI2;

The result of the relational algebra is an edge-list representation of the classical provider network, as illustrated in Table 3.

ProviderID1 ProviderID2 Shared Patients #

Table 3. An Edge-list Representation of the Classical Provider Network

VISUALIZING THE CLASSICAL PROVIDER NETWORK

Once the bipartite network is converted to a classical unipartite network, many tools in SAS can display the network structure to suit various visualization needs. SAS has created a Java tool for network visualization called the Constellation applet, which can be used in conjunction with SAS's XML-generating macro %DS2CONST to create social network diagrams. The automatic layout feature of the Constellation applet can be very convenient, but the applet also has shortcomings. First, it can handle only a limited number of nodes. Moreover, the automatic layout changes from call to call, preventing meaningful comparison across multiple network diagrams. It is therefore recommended that the automatic layout feature be turned off when visualizing large networks such as the patient-provider network.

The Annotate facility can also be used to create social network diagrams. As a general-purpose graph drawing facility, it provides the greatest flexibility for visualizing network data. For example, a user can use graphical elements such as customized shapes.
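The self-join used earlier to convert the patient-provider edge list into a provider network can be tried on a small made-up table. The following is a minimal sketch, not the paper's production code: the data set names `pp_edges` and `provider_net` and all of the rows are hypothetical, and the extra condition `a.npi < b.npi`, which keeps each provider pair once and drops the self-pairs, is a choice made here for readability rather than part of the query shown in the paper.

```sas
/* Hypothetical toy edge list: one row per patient-provider relationship */
data pp_edges;
  input patientid $ npi $;
  datalines;
P1 A
P2 A
P2 B
P3 B
P3 C
P4 C
;
run;

proc sql;
  create table provider_net as
  select a.npi    as npi1,
         b.npi    as npi2,
         count(*) as PatientsCnt      /* patients seen by both providers */
  from pp_edges as a, pp_edges as b
  where a.patientid = b.patientid
    and a.npi < b.npi                 /* keep each pair once, drop self-pairs */
  group by a.npi, b.npi;
quit;

proc print data=provider_net noobs;
run;
```

With these rows, providers A and B share patient P2 and providers B and C share patient P3, so the result has two rows, each with one shared patient. Removing the `a.npi < b.npi` condition gives the symmetric form described in the text, in which the self-pairs carry each provider's own patient count.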
Users of the Annotate facility must, however, provide coordinates for nodes and edges. The Annotate facility is used in this paper to produce the network diagrams because of its versatility and its ability to handle large networks.

A main task in visualizing a social network is to lay out the nodes in a meaningful way. There are a number of layout algorithms for social network analysis, such as the force-directed placement algorithm and the MDS algorithm. We choose the popular MDS algorithm as our layout engine. MDS attempts to place nodes so that similar nodes are close to each other on a 2-dimensional map. The MDS algorithm takes as its input a similarity matrix that records the similarity of every pair of nodes in the network. The classical network of providers obtained in the previous section provides a basis for constructing a similarity matrix. We define our similarity measure as follows.

The Similarity Measure

Choosing a similarity measure is key to visualizing the provider network. Although we could use the number of shared patients as the similarity between two providers, it does not measure similarity consistently. For example, 10 shared patients have different implications when the two providers care for only 10 patients each than when they care for 100 patients each. To better capture the similarity between two providers, we calculate the overlapping ratio as their similarity measure:

$$\text{Overlapping Ratio} = \frac{2 \times \text{count of patients shared}}{\text{provider 1's patient count} + \text{provider 2's patient count}}$$

By definition, two providers have an overlapping ratio of 0 if they share no patients and an overlapping ratio of 1 if they share all their patients. This overlapping ratio is used as a similarity measure to lay out the network as well as to identify significant connections.

Because the MDS procedure requires a square matrix as its input, we need to convert the edge-list representation of the provider network to a similarity matrix. Here is SAS code for converting the edge-list representation to the similarity matrix.

    data SimilarityTbl(drop=npi: OverlapRatio);
      array D(1:&DimMax.);               /* one column per provider */
      do until(last.npi1);               /* read all edges of one provider */
        set DyadTbl;
        by npi1 npi2;
        if first.npi1 then call missing(of D(*));
        /* the NpiIdx. informat maps an NPI to its column index */
        D(input(npi2, NpiIdx.)) = OverlapRatio;
        Var = npi1;
        if last.npi1 then output;        /* one matrix row per provider */
      end;
    run;

Finally, the following SAS code illustrates how to use PROC MDS to lay out the network of providers based on the similarity matrix.
    proc mds data=similaritytbl out=mdstbl
             dimension=2 level=ratio similar=1 fit=1;
    quit;

It is straightforward to rescale the MDS coordinates to percentages of the graphics area and to generate an Annotate data set for the providers. In addition, we can add other graphical elements to enhance the social network visualization. For example, we can use the color of a node to represent provider specialty, the size of the node to represent the number of patients cared for by the provider, and the thickness of the lines to represent the similarity measure. The code for generating the node dataset is shown below.
    %mk_pt(annobubble, MdsTbl,
           input(npi,cordx.), input(npi,cordy.), pie,
           color = "cx"||put(npi,npitaxcolor.),
           size  = sqrt(input(npi,npibenecnt.)/1000) );

The code for generating the edge dataset is:

    %mk_ln(annolines, DyadTbl,
           input(npi1,cordx.), input(npi1,cordy.),
           input(npi2,cordx.), input(npi2,cordy.),
           color = 'gray',
           size  = OverlapRatio*4,
           ThreCond = OverlapRatio>=0.1 );

The macros %mk_pt() and %mk_ln(), which generate the node and edge annotate datasets, are given in the appendix. To facilitate the generation of the annotate datasets, a series of SAS formats is used to look up the coordinates, patient counts, and node colors of providers. We obtain the final annotate dataset by combining the node dataset and the edge dataset, and use PROC GSLIDE to plot it. Figure 1 shows a network with 26 providers. The colors of the plotted specialties include: Allopathic & Osteopathic Physicians, navy; Hospitals, sky blue; Other Service Providers, powder blue; and Physician Assistants & Advanced Practice Nursing Providers, turquoise.

Figure 1. The Visualization of the Provider Network

STRATEGIES FOR LAYING OUT LARGE NETWORKS

The size of the network poses a serious challenge in applying the MDS layout algorithm. PROC MDS cannot easily reach convergence when dealing with a matrix of more than a few hundred nodes, so it is essential to reduce the size of the network submitted to PROC MDS. We discuss a few different strategies.

Core-peripheral Layout

One strategy, as suggested by Hoyle, is to use MDS to generate coordinates for a small number of core nodes. After the core nodes are placed in a center circle by MDS, an additional algorithm places the peripheral nodes in an outer band. Figure 2 shows an example of such a core-peripheral layout. In this example, there are 1,100 providers. The 190 core providers (chosen by degree centrality) are laid out in the center by PROC MDS. The remaining 910 peripheral providers are arranged roughly in a circular outer band. Links to core nodes are shown as blue lines. Links between peripheral nodes are shown as gray lines.

Figure 2. A Core-Peripheral Network of Providers

Network Component Analysis

However, a dense network like the one in Figure 2 can barely tell an insightful story. Our second strategy is to identify the components in the original network and draw each component separately. A component is a set of connected nodes that are not connected to any node outside the component. If the original network can be divided into disjoint components, then fewer nodes are entered for each PROC MDS layout, alleviating the size problem.

To identify all the components in a network, we add a first node to a component, then all its neighbors, then the neighbors of those neighbors, and so on, until no new node is added. We label this set of nodes as the first component. We repeat the process for the remaining nodes until all nodes are assigned to a component.

In a well-connected network such as ours, the biggest component can easily contain more than 95% of all nodes. To break the network into smaller clusters, we can eliminate edges that are too weak. For example, we can ignore connections with less than 1% shared patients. By applying a minimum threshold of 1% shared patients, we obtain the following top 5 components.

Node Count   Percentage   Cumulative Percentage
             %            10.09%
             %            17.55%
             %            22.09%
             %            25.73%
             %            28.82%

Table 4. The Top 5 Components of the Provider Network
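The thresholding and component search described above can be sketched in a single DATA step. This is a sketch under stated assumptions, not the code used for the paper: the data set `dyads`, its values, and the 0.01 cut-off applied to `OverlapRatio` are made up for illustration. Instead of the neighbor-by-neighbor expansion described in the text, it uses label propagation, a swapped-in technique that finds the same components: every node starts with its own label, and the smaller label is copied across each retained edge until nothing changes.

```sas
/* Hypothetical provider pairs with their overlapping ratios */
data dyads;
  input npi1 $ npi2 $ OverlapRatio;
  datalines;
A B 0.20
B C 0.05
C D 0.004
D E 0.30
;
run;

/* Drop weak edges first (assumed cut-off of 1%) */
data strong;
  set dyads;
  where OverlapRatio >= 0.01;
run;

data components(keep=npi component);
  length npi $8 component 8;
  declare hash lab(ordered:'a');        /* node -> component label */
  lab.defineKey('npi');
  lab.defineData('npi', 'component');
  lab.defineDone();
  declare hiter it('lab');

  /* Pass 1: give every node that appears in an edge its own label */
  do i = 1 to nobs;
    set strong point=i nobs=nobs;
    npi = npi1;
    if lab.check() ne 0 then do; nextid + 1; component = nextid; lab.add(); end;
    npi = npi2;
    if lab.check() ne 0 then do; nextid + 1; component = nextid; lab.add(); end;
  end;

  /* Repeat: copy the smaller label across each edge until stable */
  do until (changed = 0);
    changed = 0;
    do j = 1 to nobs;
      set strong point=j;
      rc = lab.find(key: npi1); c1 = component;
      rc = lab.find(key: npi2); c2 = component;
      if c1 ne c2 then do;
        component = min(c1, c2);
        npi = npi1; lab.replace();
        npi = npi2; lab.replace();
        changed = 1;
      end;
    end;
  end;

  /* Write one row per node with its final component label */
  rc = it.first();
  do while (rc = 0);
    output;
    rc = it.next();
  end;
  stop;
run;
```

With these rows, the C-D edge falls below the cut-off, so providers A, B, and C end up with one label and providers D and E with another. Providers left with no retained edge do not appear in `components`; each of them would be a single-node component.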
Highlighting the Stronger Connections

Another useful way to enhance the network visualization is to apply different thresholds to the connections. With a series of increasing thresholds, the less significant connections are eliminated and the important connections emerge. The series of network diagrams in Figure 3 illustrates the effect of applying different thresholds. In the last diagram, every connection has an overlapping ratio of at least 10%. The clique of providers at the top-left corner of the diagram should be flagged for further study.

[Panels: No Minimum Overlapping Ratio; Minimum Overlapping Ratio = 2.5%; Minimum Overlapping Ratio = 5%; Minimum Overlapping Ratio = 10%]

Figure 3. An Example of Sub-component Network of Providers with Different Minimum Overlapping Ratio

VISUALIZATION USING GOOGLE MAP

Once the components are identified, we can look closer into each component by drawing it on a Google map using Google Map Generator (GMG). We extract spatial information, such as the 9-digit zip code of the practice address, from the provider data and add it to the annotate dataset. SAS GMG can convert an Annotate data set into either a KML file or a GPOLY JavaScript file, which is then used to generate the Google map. For details on how to configure SAS GMG, see Massengill (2010). The macros for generating the node and edge annotate datasets for SAS GSLIDE can be reused for SAS GMG, except that different coordinates (now generated by PROC GEOCODE) and color formats are provided. The code for generating the node and edge annotate datasets is as follows.

    %mk_pt(annogpt, dd_mds,
           input(npi,npizipx.), input(npi,npizipy.), point,
           color = put(npi,npigoogcolor.),
           size  = 1 );

    %mk_ln(annoglines, dd_sim,
           input(npi1,npizipx.), input(npi1,npizipy.),
           input(npi2,npizipx.), input(npi2,npizipy.),
           color = colors,
           size  = 0.5,
           ThreCond = 1 );

Figure 4 shows an example of a provider network component projected on a Google map. The flagged providers are represented by red icons on the map. Connections with flagged providers are drawn as red lines when the overlapping ratio is at least 10% and as blue lines when it is below 10%. Connections between unflagged providers are drawn as gray lines. Once again, the line weight is proportional to the overlapping ratio between the pair of providers. Note that the 9-digit zip codes are fictional and randomly selected for illustration purposes.

[Panels: All Connections; Connections with Flagged Providers Only]

Figure 4. A Component of the Provider Network Projected on Google Map

CONCLUSION

In this paper, we show how to convert a bipartite network into a classical (unipartite) network for visualization and social network analysis. We use the patient-provider network as an example to construct a network of providers with connections measured by the percentage of patients shared by two providers (the overlapping ratio). We then visualize the provider network with the Annotate facility. To attain manageable network sizes for visualization, we dissect the provider network into several components and draw each component separately. This approach provides a viable way of visualizing large networks. We further enhance the network visualization by applying different thresholds to the connections, thus highlighting the most significant connections and cliques. We also show how to gain spatial insights by projecting the provider network on a Google map using the Google Map Generator provided by SAS/GRAPH.

REFERENCES

Hornibrook, Shane. &Degrees. of Separation: Social Network Analysis Using The SAS System (NESUG 2006).
Hornibrook, Shane. Analyzing Large Social Networks with MP Connect, SAS/IntrNet, and %DS2CONST (SESUG 2007).
Hoyle, Larry. Implementing Stack and Queue Data Structures with SAS Hash Object (SGF 2009).
Hoyle, Larry. Implementing Stack and Queue Data Structures with SAS Hash Objects (SGF 2008).
Ellis, Alan R. Using SAS to Calculate Betweenness Centrality.
Johnston, M. Francis, Xiao Chen, Phillip Bonacich, and Silvia Swigert. Modeling Indegree Centralization in NetSAS: A SAS Macro Enabling Exponential Random Graph Models.
Massengill, Darrell. Google Maps and SAS/GRAPH (SGF 2010).
Massengill, Darrell, and Ed Odom. PROC GEOCODE: Now with Street-Level Geocoding (SGF 2010).
Newman, M. E. J. The structure and function of complex networks. SIAM Review 45 (2003).
Bavelas, A. A mathematical model for group structure. Applied Anthropology 7 (1948).
Brandes, U. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology 25(2) (2001).

ACKNOWLEDGMENTS

The author would like to thank Ahmed Al-Attar and Jia Wang for their help in preparing and testing the code, and Gary McQuown for his constructive suggestions.
CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at: John Zheng (571)

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

APPENDIX

    *nodes annotate dataset generation macro;
    %macro mk_pt(annodsn, indsn, x, y, function, color="green", size=0.7 );
    data &annodsn(keep=function x y size color xsys ysys hsys when style rotate);
      length function color $8 style $20 rotate 8.;
      retain xsys ysys '2' hsys '3' when 'a' size .7;
      set &indsn;
      function="&function";
      %if "&function"="pie" %then %do;
        rotate=360;
      %end;
      style='psolid';
      x=&x;
      y=&y;
      color=&color;
      size=&size;
      output;
    run;
    %mend;

    *edges annotate dataset generation macro;
    %macro mk_ln(annodsn, indsn, fromx, fromy, tox, toy, color="blue", size=0.5, threcond=1);
    data &annodsn(keep=function x y size color xsys ysys hsys when line);
      length function color $ 8;
      retain xsys ysys '2' hsys '3' when 'a' size .5 line 1;
      set &indsn;
      /* Move to starting point */
      function='move';
      x=&fromx;
      y=&fromy;
      output;
      /* Draw to the ending point, if threshold condition met */
      x=&tox;
      y=&toy;
      if &ThreCond. then function='draw';
      else function='move';
      line=1;
      size=&size;
      color=&color;
      output;
    run;
    %mend;
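As a usage sketch for the two appendix macros, the following builds annotate datasets for a made-up two-provider network and combines them, which is the step the paper describes before plotting with PROC GSLIDE. The input data sets `toy_nodes` and `toy_links`, their variables, and the literal colors are hypothetical; in the paper the coordinates, sizes, and colors come from format lookups instead. The sketch assumes %mk_pt and %mk_ln have already been compiled in the session, and it stops at printing the combined dataset rather than rendering it.

```sas
/* Hypothetical inputs: two node positions and one link between them */
data toy_nodes;
  input npi $ px py;
  datalines;
A 30 40
B 70 60
;
run;

data toy_links;
  input x1 y1 x2 y2 ratio;
  datalines;
30 40 70 60 0.25
;
run;

/* Nodes as solid pies; edge width scaled by the ratio */
%mk_pt(annobubble, toy_nodes, px, py, pie, color="navy", size=2);
%mk_ln(annolines, toy_links, x1, y1, x2, y2,
       color="gray", size=ratio*4, threcond=ratio>=0.1);

/* Edge observations first, node observations after them */
data annoall;
  set annolines annobubble;
run;

/* Inspect the annotate observations before plotting */
proc print data=annoall;
run;
```

Each link contributes two observations (a `move` followed by a `draw` or a second `move`, depending on the threshold condition) and each node contributes one `pie` observation, so `annoall` should hold four rows here. Placing the edge dataset first in the SET statement is a choice made for this sketch so that the node observations come after the line observations.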
Microsoft Business Intelligence Visualization Comparisons by Tool
Microsoft Business Intelligence Visualization Comparisons by Tool Version 3: 10/29/2012 Purpose: Purpose of this document is to provide a quick reference of visualization options available in each tool.
Lasse Cronqvist. Email: [email protected]. Tosmana. TOol for SMAll-N Analysis. version 1.2. User Manual
Lasse Cronqvist Email: [email protected] Tosmana TOol for SMAll-N Analysis version 1.2 User Manual Release: 24 th of January 2005 Content 1. Introduction... 3 2. Installing Tosmana... 4 Installing
W. Heath Rushing Adsurgo LLC. Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare. Session H-1 JTCC: October 23, 2015
W. Heath Rushing Adsurgo LLC Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare Session H-1 JTCC: October 23, 2015 Outline Demonstration: Recent article on cnn.com Introduction
FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
Network-Based Tools for the Visualization and Analysis of Domain Models
Network-Based Tools for the Visualization and Analysis of Domain Models Paper presented as the annual meeting of the American Educational Research Association, Philadelphia, PA Hua Wei April 2014 Visualizing
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
Gephi Tutorial Quick Start
Gephi Tutorial Welcome to this introduction tutorial. It will guide you to the basic steps of network visualization and manipulation in Gephi. Gephi version 0.7alpha2 was used to do this tutorial. Get
NakeDB: Database Schema Visualization
NAKEDB: DATABASE SCHEMA VISUALIZATION, APRIL 2008 1 NakeDB: Database Schema Visualization Luis Miguel Cortés-Peña, Yi Han, Neil Pradhan, Romain Rigaux Abstract Current database schema visualization tools
I. Create the base view with the data you want to measure
Developing Key Performance Indicators (KPIs) in Tableau The following tutorial will show you how to create KPIs in Tableau 9. To get started, you will need the following: Tableau version 9 Data: Sample
An approach of detecting structure emergence of regional complex network of entrepreneurs: simulation experiment of college student start-ups
An approach of detecting structure emergence of regional complex network of entrepreneurs: simulation experiment of college student start-ups Abstract Yan Shen 1, Bao Wu 2* 3 1 Hangzhou Normal University,
How To Find Local Affinity Patterns In Big Data
Detection of local affinity patterns in big data Andrea Marinoni, Paolo Gamba Department of Electronics, University of Pavia, Italy Abstract Mining information in Big Data requires to design a new class
DataPA OpenAnalytics End User Training
DataPA OpenAnalytics End User Training DataPA End User Training Lesson 1 Course Overview DataPA Chapter 1 Course Overview Introduction This course covers the skills required to use DataPA OpenAnalytics
Visualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR [email protected] Abstract An e-government
CLUSTER ANALYSIS FOR SEGMENTATION
CLUSTER ANALYSIS FOR SEGMENTATION Introduction We all understand that consumers are not all alike. This provides a challenge for the development and marketing of profitable products and services. Not every
Virtual Landmarks for the Internet
Virtual Landmarks for the Internet Liying Tang Mark Crovella Boston University Computer Science Internet Distance Matters! Useful for configuring Content delivery networks Peer to peer applications Multiuser
Visualization of Phylogenetic Trees and Metadata
Visualization of Phylogenetic Trees and Metadata November 27, 2015 Sample to Insight CLC bio, a QIAGEN Company Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.clcbio.com [email protected]
The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA
Paper 156-2010 The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA Abstract JMP has a rich set of visual displays that can help you see the information
DENTAL IMPRESSIONS TARGET COMMON CORE STATE STANDARD(S) IN MATHEMATICS: N-Q.1:
This task was developed by high school and postsecondary mathematics and health sciences educators, and validated by content experts in the Common Core State Standards in mathematics and the National Career
Medical Information Management & Mining. You Chen Jan,15, 2013 [email protected]
Medical Information Management & Mining You Chen Jan,15, 2013 [email protected] 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
Microsoft Consulting Services. PerformancePoint Services for Project Server 2010
Microsoft Consulting Services PerformancePoint Services for Project Server 2010 Author: Emmanuel Fadullon, Delivery Architect Microsoft Consulting Services August 2011 Information in the document, including
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: [email protected]
Big Data Text Mining and Visualization. Anton Heijs
Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark
Catalog Creator by On-site Custom Software
Catalog Creator by On-site Custom Software Thank you for purchasing or evaluating this software. If you are only evaluating Catalog Creator, the Free Trial you downloaded is fully-functional and all the
InfiniteInsight 6.5 sp4
End User Documentation Document Version: 1.0 2013-11-19 CUSTOMER InfiniteInsight 6.5 sp4 Toolkit User Guide Table of Contents Table of Contents About this Document 3 Common Steps 4 Selecting a Data Set...
Lecture 2: Descriptive Statistics and Exploratory Data Analysis
Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals
APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
Voronoi Treemaps in D3
Voronoi Treemaps in D3 Peter Henry University of Washington [email protected] Paul Vines University of Washington [email protected] ABSTRACT Voronoi treemaps are an alternative to traditional rectangular
How To Run Statistical Tests in Excel
How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting
Performance Assessment Task Which Shape? Grade 3. Common Core State Standards Math - Content Standards
Performance Assessment Task Which Shape? Grade 3 This task challenges a student to use knowledge of geometrical attributes (such as angle size, number of angles, number of sides, and parallel sides) to
Using SAS to Create Graphs with Pop-up Functions Shiqun (Stan) Li, Minimax Information Services, NJ Wei Zhou, Lilly USA LLC, IN
Paper CC12 Using SAS to Create Graphs with Pop-up Functions Shiqun (Stan) Li, Minimax Information Services, NJ Wei Zhou, Lilly USA LLC, IN ABSTRACT In addition to the static graph features, SAS provides
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
Final Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
BI Requirements Checklist
The BI Requirements Checklist is designed to provide a framework for gathering user requirements for BI technology. The framework covers, not only the obvious BI functions, but also the follow-up actions
Designing forms for auto field detection in Adobe Acrobat
Adobe Acrobat 9 Technical White Paper Designing forms for auto field detection in Adobe Acrobat Create electronic forms more easily by using the right elements in your authoring program to take advantage
Visualizing an Auto-Generated Topic Map
Visualizing an Auto-Generated Topic Map Nadine Amende 1, Stefan Groschupf 2 1 University Halle-Wittenberg, information manegement technology [email protected] 2 media style labs Halle Germany [email protected]
VISUALIZING HIERARCHICAL DATA. Graham Wills SPSS Inc., http://willsfamily.org/gwills
VISUALIZING HIERARCHICAL DATA Graham Wills SPSS Inc., http://willsfamily.org/gwills SYNONYMS Hierarchical Graph Layout, Visualizing Trees, Tree Drawing, Information Visualization on Hierarchies; Hierarchical
The Value of Visualization 2
The Value of Visualization 2 G Janacek -0.69 1.11-3.1 4.0 GJJ () Visualization 1 / 21 Parallel coordinates Parallel coordinates is a common way of visualising high-dimensional geometry and analysing multivariate
Applying Customer Attitudinal Segmentation to Improve Marketing Campaigns Wenhong Wang, Deluxe Corporation Mark Antiel, Deluxe Corporation
Applying Customer Attitudinal Segmentation to Improve Marketing Campaigns Wenhong Wang, Deluxe Corporation Mark Antiel, Deluxe Corporation ABSTRACT Customer segmentation is fundamental for successful marketing
Self-adaptive e-learning Website for Mathematics
Self-adaptive e-learning Website for Mathematics Akira Nakamura Abstract Keyword searching and browsing on learning website is ultimate self-adaptive learning. Our e-learning website KIT Mathematics Navigation
Using visualization to understand big data
IBM Software Business Analytics Advanced visualization Using visualization to understand big data By T. Alan Keahey, Ph.D., IBM Visualization Science and Systems Expert 2 Using visualization to understand
Numeracy Targets. I can count at least 20 objects
Targets 1c I can read numbers up to 10 I can count up to 10 objects I can say the number names in order up to 20 I can write at least 4 numbers up to 10. When someone gives me a small number of objects
Plotting: Customizing the Graph
Plotting: Customizing the Graph Data Plots: General Tips Making a Data Plot Active Within a graph layer, only one data plot can be active. A data plot must be set active before you can use the Data Selector
Using SAS With a SQL Server Database. M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC
Using SAS With a SQL Server Database M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC ABSTRACT Many operations now store data in relational databases. You may want to use SAS
Let SAS Write Your SAS/GRAPH Programs for You Max Cherny, GlaxoSmithKline, Collegeville, PA
Paper TT08 Let SAS Write Your SAS/GRAPH Programs for You Max Cherny, GlaxoSmithKline, Collegeville, PA ABSTRACT Creating graphics is one of the most challenging tasks for SAS users. SAS/Graph is a very
