A. Mrvar: Network Analysis using Pajek 1. Cluster




Cluster

One of the important goals of network analysis is to find clusters of units that have similar (or equal) structural characteristics, which are determined from the relation R.

A cluster C is a subset of units. In Pajek the default extension for a cluster is .cls.

If we are interested only in units 2, 4 and 5 in the test network, the cluster is described in the input file in the following way:

2
4
5

Smaller clusters can be built using Pajek too. First create an empty cluster (a cluster without units) using

Cluster/Create Empty Cluster

then select File/Cluster/Edit or the corresponding icon, and add new units to the cluster by double clicking on Add new vertex.

Partition of units

A partition of the set of units U is formed by clusters $C = \{C_1, C_2, \ldots, C_k\}$ if:

$$\bigcup_i C_i = U \quad \text{and} \quad i \neq j \Rightarrow C_i \cap C_j = \emptyset.$$

We can imagine a partition as a one-dimensional table whose dimension equals the number of units in the network. The i-th element of the table tells the cluster to which unit i belongs. Clusters are in this case represented by integer numbers. Sometimes we put uninteresting units in cluster 0.

We need a partition whenever we want to describe a property of a unit using an integer number. So far we have already used partitions for:
determining distances of vertices from a selected vertex (k-neighbours);
determining depths of vertices in an acyclic network;
determining degrees of vertices.

If vertices represent people, a partition can be used to describe their gender (1 - men, 2 - women).

The default extension for a partition is .clu (clustering).

If we have a network with 5 vertices and want to place vertices 1 and 5 into cluster 1 and all other vertices into cluster 2, the corresponding partition is described in the input file in the following way:

*Vertices 5
1
2
2
2
1

We can build a partition using Pajek too. We first use

Partition/Create Constant Partition 0

The dimension of the partition is by default set to the number of vertices in the selected network. In this way we put all vertices in cluster 0. Then we select File/Partition/Edit or press the corresponding icon, and for each unit input its cluster number.
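The table view of a partition can be sketched in a few lines of Python. This is only an illustration of the file layout shown above; the function names partition_to_clu and clusters_of are hypothetical helpers, not part of Pajek.

```python
def partition_to_clu(partition):
    """Return .clu text for a partition given as a list of cluster numbers.

    partition[i] is the cluster number of vertex i + 1, because the
    input file numbers vertices from 1.
    """
    lines = ["*Vertices {}".format(len(partition))]
    lines.extend(str(cluster) for cluster in partition)
    return "\n".join(lines) + "\n"


def clusters_of(partition):
    """Group vertex numbers (1-based) by their cluster number."""
    clusters = {}
    for vertex, cluster in enumerate(partition, start=1):
        clusters.setdefault(cluster, []).append(vertex)
    return clusters


# Vertices 1 and 5 in cluster 1, all other vertices in cluster 2.
partition = [1, 2, 2, 2, 1]
print(partition_to_clu(partition), end="")
print(clusters_of(partition))
```

Because every vertex has exactly one entry in the list, the clusters recovered by clusters_of are automatically disjoint and cover all units, which is the partition condition stated above.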

If we use Draw/Draw-Partition or press Ctrl+P, the colors of vertices show to which cluster each vertex belongs.

The second possibility to determine a partition is to select Draw/Draw-SelectAll or press Ctrl+A. This command executes three operations: a new partition whose dimension equals the number of vertices is generated; all vertices are put into cluster 0; and the network is drawn using the obtained null partition (all vertices are cyan).

In the picture of a network:

By pressing the middle mouse button we increment the cluster number of the selected vertex (if we press the Alt key on the keyboard at the same time, we decrement the cluster number).

If the mouse has only two buttons, the left mouse button and the Shift key are used to increment, and the left mouse button and the Alt key are used to decrement the cluster number.

If only a part of the network is selected, we increment/decrement the cluster number of all selected vertices by clicking somewhere in the picture (not on a vertex).

The colors used are shown in Options/Colors/Partition Colors in the Draw window, or at http://vlado.fmf.uni-lj.si/pub/networks/pajek/part col.pdf

In the layout of a network, all vertices that belong to a selected cluster can be moved at once by pressing the left mouse button close to a vertex of the selected color, holding the button down and moving the mouse.

We are also interested in the subnetwork (vertices and lines) induced by the vertices in a selected cluster. In Pajek we extract a subnetwork determined by a given cluster using:

Operations/Extract from Network/Cluster

Before we run this operation we must select the network and the cluster in the main window.

Even more often we will extract a selected cluster that is determined by a partition. In this case we use:

Operations/Extract from Network/Partition

Before we run this operation we must select the network and the partition in the main window. Additionally we must input the clusters that we want to extract.
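What the extraction produces can be illustrated outside Pajek with a minimal Python sketch. The function extract_subnetwork and the small example network are hypothetical; unlike a real extraction, the sketch keeps the original vertex numbers instead of renumbering them.

```python
def extract_subnetwork(lines, partition, wanted_clusters):
    """Return the vertices in the wanted clusters and the lines they induce.

    lines           : list of (u, v) pairs with 1-based vertex numbers
    partition       : list where partition[i] is the cluster of vertex i + 1
    wanted_clusters : cluster numbers to extract
    """
    wanted = set(wanted_clusters)
    kept = [v for v, c in enumerate(partition, start=1) if c in wanted]
    kept_set = set(kept)
    # A line is induced only if both of its endpoints were kept.
    kept_lines = [(u, v) for (u, v) in lines if u in kept_set and v in kept_set]
    return kept, kept_lines


# Hypothetical network with 5 vertices and the partition 1 2 2 2 1.
lines = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (1, 3)]
partition = [1, 2, 2, 2, 1]
print(extract_subnetwork(lines, partition, [2]))
```

Lines with only one endpoint in the selected clusters are dropped, which is why the extracted subnetwork can have far fewer lines than the vertices' degrees suggest.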

Components

Three types of components will be defined: strong, weak and biconnected.

A subset of vertices in a network is called a strongly connected component if (taking directions of lines into account) from every vertex of the subset we can reach every other vertex belonging to the same subset.

If the direction of lines is not important (we consider the network to be undirected), such a subset is called a weakly connected component.

Example: Let the vertices of a network correspond to buildings in a city, and the lines to streets that connect the buildings. Some streets are ordinary (undirected lines), some are one way only (directed lines).

All buildings that can be reached from one another by car belong to the same strongly connected component (we must take one-way streets into account).

All buildings that can be reached from one another by walking belong to the same weakly connected component (one-way streets are no restriction).

Let us take the relation "whom would you invite to the party". For all persons belonging to the same strongly connected component:
everybody will (at least indirectly) invite everybody else from the same strongly connected component;
everybody will be (at least indirectly) invited by everybody else from the same strongly connected component.

If the network is undirected, a strongly connected component is the same as a weakly connected one, so they are simply called connected components.

Connection between walks in a network and components

Between any two vertices from the same strongly connected component there always exists a walk. We also say that the two vertices are strongly connected.

Between any two vertices from the same weakly connected component there always exists a chain. We also say that the two vertices are weakly connected.

A network is called strongly/weakly connected if every pair of vertices is strongly/weakly connected.
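The two definitions can be checked on a small example with the following Python sketch. It uses a plain mutual-reachability search (two vertices are strongly connected if each can be reached from the other), which is simple but not the fastest known method, and it is not claimed to be what Pajek does internally. The function names and the four-vertex network are hypothetical.

```python
from collections import defaultdict


def _reachable(start, successors):
    """Set of vertices reachable from start by following successors."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in successors[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def strong_components(vertices, arcs):
    """Partition (vertex -> component number), directions taken into account."""
    forward, backward = defaultdict(set), defaultdict(set)
    for u, v in arcs:
        forward[u].add(v)
        backward[v].add(u)
    label, number = {}, 0
    for v in vertices:
        if v not in label:
            number += 1
            # Vertices that v can reach AND that can reach v.
            for w in _reachable(v, forward) & _reachable(v, backward):
                label[w] = number
    return label


def weak_components(vertices, arcs):
    """Partition (vertex -> component number), directions ignored."""
    both = defaultdict(set)
    for u, v in arcs:
        both[u].add(v)
        both[v].add(u)
    label, number = {}, 0
    for v in vertices:
        if v not in label:
            number += 1
            for w in _reachable(v, both):
                label[w] = number
    return label


vertices = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d")]
print(strong_components(vertices, arcs))
print(weak_components(vertices, arcs))
```

In this example the network is weakly connected (one weak component), but only b and c can reach each other, so there are three strongly connected components.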

Example

[Figure: a directed network with vertices a, b, c, d, e, f, g, h, i and j.]

The network is weakly connected. There exist 4 strongly connected components:
1. (a)
2. (b,c,e)
3. (d,f)
4. (g,h,i,j)

Vertices that belong to the same strongly connected component are drawn using the same shape. The shape of vertices in Pajek is determined in the input file: after the vertex label (and coordinates) we add box, ellipse, diamond, triangle or empty:

*Vertices 5
1 "a" box
2 "b" ellipse
3 "c" diamond
4 "d" triangle
5 "e" empty
...

In Pajek, strongly connected components are computed using Net/Components/Strong, and weakly connected components using Net/Components/Weak. The result is represented by a partition: vertices that belong to the same component have the same number in the partition.

Find the connected components in the dictionary of English words (dic28.net)!
What is the number of components? 17903
Number of vertices in the largest component? 24831
Size of the smallest component? 1 (isolated vertex)
Extract and draw one of the smallest components.

One of the smallest components (vertex labels from the figure): willful, wilful, wailful, pailful, painful, gainful, moanful, cupfuls, cupful, cupsful, panful, manful, canful, jarful, capful, carful, earful, lapful, careful, tearful, fearful, awful, lawful, dareful, direful, ireful.

Biconnected components

Let us take an undirected connected network. Vertex a of the network is an articulation point of the network if there exist two other, different vertices v and w, such that every chain between the two vertices also includes vertex a.

Simply put: vertex a is an articulation point if removing the vertex from the network causes the network to become disconnected.

A network is called biconnected if for every triple of vertices a, v and w there exists a chain between v and w which does not include vertex a.

Simply put: a network is called biconnected if, after removing any vertex, the network remains connected.

A biconnected network does not contain any articulation point.

Example

[Figure: an undirected network with vertices a, b, c, d, e, f, g, h and i.]

The network in the picture (aho1.net) consists of 4 biconnected components:
(a,b,c)
(b,d,e)
(d,f)
(f,g,h,i)

The articulation points are b, d and f.
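The decomposition above can be reproduced with a depth-first search that keeps a stack of visited lines, a standard textbook technique (not necessarily the one Pajek implements). Since the picture of aho1.net is not reproduced here, the list of lines below is an assumption chosen to be consistent with the four components listed: two triangles, one bridge and one 4-cycle. The function names are hypothetical, and the sketch assumes a simple undirected network without multiple lines.

```python
def build_adjacency(edges):
    """Adjacency lists of an undirected network given as (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def biconnected_components(adjacency):
    """Return (list of vertex sets, set of articulation points)."""
    index, low = {}, {}      # discovery order, lowest index reachable
    edge_stack, components, articulation = [], [], set()

    def visit(v, parent):
        index[v] = low[v] = len(index)
        children = 0
        for w in adjacency[v]:
            if w not in index:
                children += 1
                edge_stack.append((v, w))
                visit(w, v)
                low[v] = min(low[v], low[w])
                if low[w] >= index[v]:
                    # The subtree of w cannot bypass v: close one component.
                    if parent is not None or children > 1:
                        articulation.add(v)
                    component = set()
                    while True:
                        a, b = edge_stack.pop()
                        component.update((a, b))
                        if (a, b) == (v, w):
                            break
                    components.append(component)
            elif w != parent and index[w] < index[v]:
                # Line back to an ancestor of v.
                edge_stack.append((v, w))
                low[v] = min(low[v], index[w])

    for v in adjacency:
        if v not in index:
            visit(v, None)
    return components, articulation


# ASSUMED lines, consistent with the components listed in the text.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("b", "d"), ("d", "e"), ("b", "e"),
         ("d", "f"),
         ("f", "g"), ("g", "h"), ("h", "i"), ("i", "f")]
components, points = biconnected_components(build_adjacency(edges))
print(sorted(sorted(c) for c in components))
print(sorted(points))
```

An articulation point such as b appears in more than one of the returned vertex sets, which is why the result cannot be stored as a single partition. The recursive search is adequate for small examples; very large networks would need an iterative version.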

A network that consists of 17 biconnected components:

[Figure: drawing of the network with its biconnected components; vertex labels not reproduced.]

Definition of components using walks

For strongly connected components the direction of lines is important:

A subset of vertices is called a strongly connected component if (when taking directions of lines into account) there exists at least one walk from any vertex to any other vertex from the subset.

For weakly connected and biconnected components the direction of lines is not important:

A subset of vertices is called a weakly connected component if there exists at least one chain from any vertex to any other vertex from the subset.

A subset of vertices is called a biconnected component if there exist at least two chains from any vertex to any other vertex from the subset, with no intermediate vertices in common.

On some occasions it is beneficial that a network is well connected (large strongly connected, weakly connected and biconnected components). Examples are communication networks (streets, information flow, ...). Sometimes it is not (e.g., infections).

Biconnected components in Pajek

To compute bicomponents use:

Net/Components/Bi-components

Biconnected components are stored in a hierarchy (articulation points belong to several components, therefore bicomponents cannot be stored in a partition like strongly connected components). Nodes in the hierarchy represent biconnected components. We can navigate a hierarchy as in Windows Explorer.

Vertices that belong to a selected bicomponent are displayed by double clicking with the left mouse button (or single clicking with the right mouse button) on the node in the hierarchy.

A selected bicomponent is extracted using:

Hierarchy/Extract Cluster

and entering the number of the node in the hierarchy. The subnetwork determined by the obtained cluster is extracted using

Operations/Extract from Network/Cluster

Additionally, when computing bicomponents, Pajek returns a partition with articulation points (vertices in cluster 1 are not articulation points, all others are). The cluster number tells into how many pieces the network will fall when a given articulation point is removed from the network.
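The meaning of the articulation point partition can be illustrated by brute force: remove each vertex in turn and count the connected pieces that remain. The sketch below assumes a connected undirected network, so that vertices which are not articulation points get the value 1; the function names and the example network are hypothetical, and this is not claimed to be how Pajek computes the partition.

```python
def build_adjacency(edges):
    """Adjacency lists of an undirected network given as (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def count_pieces(adjacency, removed):
    """Number of connected pieces left after deleting vertex `removed`."""
    seen = {removed}          # treat the removed vertex as already visited
    pieces = 0
    for start in adjacency:
        if start in seen:
            continue
        pieces += 1           # a new piece begins at every unseen vertex
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return pieces


def articulation_partition(adjacency):
    """Map every vertex to the number of pieces its removal leaves."""
    return {v: count_pieces(adjacency, v) for v in adjacency}


# Hypothetical network: hub c with neighbours x, y, z, w, plus a line x-y.
edges = [("c", "x"), ("c", "y"), ("c", "z"), ("c", "w"), ("x", "y")]
print(articulation_partition(build_adjacency(edges)))
```

Here only the hub c is an articulation point: removing it leaves three pieces, (x,y), (z) and (w), while removing any other vertex leaves the rest of the network in one piece.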

Example: Airlines connections network

The airlines connections network in the USA (usair97.net) consists of 14 biconnected components of size at least 3. The largest bicomponent contains 244 airports; some of the smallest contain 3 airports. There exist 27 articulation points (if we compute bicomponents of size 2 or more).

The most important articulation point (airport) is Dallas. If we remove that vertex (the airport is closed), the airlines connections network will fall into 14 disconnected pieces.

The first 8 airports according to the articulation point criterion are (result obtained using Info/Partition):

Rank  Unit  Class  Id
----------------------------------------
   1   261     14  Dallas/Fort Worth Intl
   2    13      7  Bethel
   3     8      7  Anchorage Intl
   4   201      6  San Francisco Intl
   5   152      6  Pittsburgh Intll
   6   182      5  Lambert-St Louis Intl
   7   258      4  Phoenix Sky Harbor Intl
   8    67      4  Minneapolis-St Paul Intl

Cores

A subset of vertices is called a k-core if every vertex from the subset is connected to at least k vertices from the same subset.

Formally: Let $N = (V, L)$, $L \subseteq V \times V$ be a network. A maximal subgraph $H = (W, L \cap (W \times W))$ induced by the set $W$ is a k-core iff $\forall v \in W : \deg_H(v) \geq k$.

Special example: A subset of vertices is called a clique if every vertex from the subset is connected to every other vertex from the subset.

Computing cliques is very time consuming, while computing cores is not.

[Figure: 0-, 1-, 2- and 3-core.]

Cores can be computed according to lines going into vertices (Input), lines going out of vertices (Output) or all lines (All). In the case of undirected networks input and output cores are the same.

Properties of cores:

Cores are nested (see figure): $i < j \Rightarrow H_j \subseteq H_i$.

They are not necessarily connected subnetworks (see figure).

Cores can be generalized to networks with values on lines.

Cores in Pajek can be computed using Net/Partitions/Core and selecting Input, Output or All core. The result is a partition: for every vertex its core number is given.

In most cases we are interested in the highest core(s) only. The corresponding subnetwork can be extracted using Operations/Extract from Network/Partition and typing which clusters to extract.
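Why cores are cheap to compute can be seen from a simple peeling procedure: repeatedly delete vertices whose remaining degree is at most k, record k as their core number, and raise k when no such vertex is left. The following Python sketch does this for a simple undirected network (the All case); the function name and the example network are hypothetical, and faster bookkeeping is possible.

```python
def core_numbers(adjacency):
    """Map every vertex of a simple undirected network to its core number.

    adjacency maps each vertex to its neighbours (symmetric, no loops).
    """
    neighbours = {v: set(ws) for v, ws in adjacency.items()}
    core = {}
    k = 0
    while neighbours:
        # Vertices that cannot belong to a (k+1)-core of what remains.
        peel = [v for v, ws in neighbours.items() if len(ws) <= k]
        if not peel:
            k += 1            # everything left has degree > k
            continue
        for v in peel:
            core[v] = k
            for w in neighbours.pop(v):
                if w in neighbours:
                    neighbours[w].discard(v)
    return core


# Hypothetical network: triangle a-b-c, pendant vertex d, isolated vertex e.
adjacency = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
    "e": [],
}
cores = core_numbers(adjacency)
highest = max(cores.values())
print(cores)
print(highest, sorted(v for v, k in cores.items() if k == highest))
```

The result is a partition in the sense used above: one integer per vertex. The nesting property is visible in the example, since the 2-core (a, b, c) is contained in the 1-core (a, b, c, d).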

Example: Compute and extract the highest core in the network write.net. The network is undirected. The highest core is the 4-core.

Computing and extracting a core of a network is one of the possibilities to determine the boundary of the network: often we are interested only in the densest part of a large network. In this case we compute cores and extract only the part belonging to some core number or higher.

Take the relation "whom would you ask for advice" (directed network):

For people belonging to an input k-core, it holds that every person from the k-core would be asked for advice by at least k people from the k-core.

For people belonging to an output k-core, it holds that every person from the k-core would ask at least k persons from the k-core for advice.
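For a directed network the same peeling idea can be applied to incoming lines only, which gives input cores; output cores follow by reversing every arc. The sketch below is a minimal illustration under that reading of the definitions. The function names and the four-person advice network are hypothetical, with an arc (u, v) meaning "u would ask v for advice".

```python
def input_core_numbers(vertices, arcs):
    """Map each vertex to its input core number (incoming arcs only)."""
    incoming = {v: set() for v in vertices}
    outgoing = {v: set() for v in vertices}
    for u, v in arcs:
        if u != v:                      # loops are ignored
            incoming[v].add(u)
            outgoing[u].add(v)
    core, k, alive = {}, 0, set(vertices)
    while alive:
        peel = [v for v in alive if len(incoming[v]) <= k]
        if not peel:
            k += 1
            continue
        for v in peel:
            core[v] = k
            alive.discard(v)
            # v no longer counts as a source of incoming arcs.
            for w in outgoing[v]:
                incoming[w].discard(v)
    return core


def output_core_numbers(vertices, arcs):
    """Output cores are the input cores of the network with reversed arcs."""
    return input_core_numbers(vertices, [(v, u) for u, v in arcs])


people = ["p", "q", "r", "s"]
asks = [("p", "q"), ("q", "p"), ("r", "p"), ("r", "q"), ("s", "r")]
print(input_core_numbers(people, asks))
print(output_core_numbers(people, asks))
```

In this example p and q form the input 1-core because they ask each other, while every person asks at least one other person, so all four belong to the output 1-core.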

Take the network advice.net: "whom would you ask for advice". The network is directed.

The frequency distribution of input k-cores (the table is obtained using the Pajek command Info/Partition):

Class  Freq  Class*Freq  Represent.
-----------------------------------------
    5     2          10  I15
    8     1           8  I2
   11    26         286  I1
-----------------------------------------

Explanation

In the network an 11-core exists with 26 students in it: there exist 26 students, each of whom would be asked for advice by at least 11 others from these 26. One of these 26 students is student I1.
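A frequency table of this shape can be derived from any partition with a short Python sketch. The function frequency_table is a hypothetical helper, the partition below is made-up illustration data rather than advice.net, and taking the first vertex met in each class as its representative is an assumption about how the Represent. column is chosen.

```python
from collections import Counter


def frequency_table(partition):
    """Rows (class, freq, class*freq, representative), sorted by class.

    partition maps vertex labels to class numbers; the representative is
    ASSUMED to be the first vertex encountered in each class.
    """
    freq = Counter(partition.values())
    representative = {}
    for label, cls in partition.items():
        representative.setdefault(cls, label)
    return [(c, freq[c], c * freq[c], representative[c]) for c in sorted(freq)]


# Made-up partition for illustration only.
partition = {"I1": 2, "I2": 1, "I3": 2, "I4": 0}
print("Class  Freq  Class*Freq  Represent.")
for cls, f, cf, rep in frequency_table(partition):
    print("{:5d} {:5d} {:11d}  {}".format(cls, f, cf, rep))
```

The highest core is then simply the last row of the table, and its Freq column gives the number of vertices in that core.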

The frequency distribution of output k-cores:

Class  Freq  Class*Freq  Represent.
-----------------------------------------
    1     2           2  I2
    2     2           4  I3
    4     1           4  I7
    7     2          14  I11
    8     2          16  I6
    9    20         180  I1
-----------------------------------------

Explanation

In the network a 9-core exists with 20 students in it: there exist 20 students, each of whom would ask at least 9 others from these 20 for advice. One of these 20 students is student I1.

Example

What is the highest core in the network of airline connections in the USA, usair97.net (the network is undirected)? Which airports belong to the highest core?

Examples

1. Find strongly connected components in stropic.net. Extract the largest component.
2. Find connected components in dic28.net! What is the number of components? 17903. The largest component has 24831 vertices. The smallest component(s) have 1 vertex. Extract and draw one of the smaller components (a component with 15-40 vertices).
3. Find biconnected components in aho1.net. Extract the largest component as a new network.
4. Find biconnected components in write.net. Extract the largest component as a new network.
5. Find biconnected components in the airlines connections network (usair97.net). Extract the component with 6 airports. Find the first 8 airports which are the most crucial articulation points. Draw the picture of the network so that the size of each vertex represents into how many pieces the network will fall if the given airport is closed.
6. Compute and extract the highest core in write.net.
7. Compute and extract the highest input and output core in advice.net.
8. What is the highest core in the airlines connections network (usair97.net)? Which airports belong to that core?