Introduction to social network analysis




Introduction to social network analysis Paola Tubaro University of Greenwich, London 26 March 2012

Introduction Introducing SNA Rise of online social networking services: social networks to the fore. New interest in social network analysis (SNA). Yet networks have always existed! Likewise, SNA now has a long history.

Introduction Today Understand what SNA is. Understand how you could use it. Learn basic principles and measures.

Introduction Outline 1. Introduction 2. What is SNA 3. Data 4. Network metrics 5. Further readings

Introduction Motivation What can SNA be used for? Improvements in organisational performance; policy interventions for behaviour change.

Introduction Motivation The organisational chain of a company

Introduction Motivation Formal chart vs. network With whom do you discuss issues important to your work? Senior people are relatively peripheral (Barry): removed from the day-to-day activities of the group. Nick plays a very central role (what if he moves to another job?). The Product 1 division is relatively separate from the overall network.

Introduction Motivation Interventions Using network data to improve flows of communication and coordination in the organisation.

Introduction Motivation Networks for behaviour change: smoking prevention Network of friendships among sixth grade pupils. Squares = girls, circles = boys; blue = smokers, red = non-smokers. Valente et al. 2003.

Introduction Motivation Use popular pupils ("opinion leaders") to reduce smoking in adolescents: identify the most popular pupils in class; recruit and train them; use them to spread the message. Valente et al. 2003: the network method was effective in reducing adolescent smoking.

What is SNA Defining SNA An approach to human behaviours and social interactions. A set of specific analytical and statistical methods. A special type of data (and techniques of data collection). A set of visualisation tools.

What is SNA What is a network A formal definition: a set of units (nodes) connected by one or more relations (ties). What is a node? Depends on the setting: person, group/organisation, object. What is a tie? A relation or a shared trait: friendship, advice, exchange, co-work.

What is SNA What is a network Graphs and networks Circles (A, B) represent nodes. Lines (e.g. between A and B) represent ties/edges. Graph visualizes the whole structure of ties of a defined group. Graphical conventions (colours, size of nodes and/or ties) can be added to show attributes. For example: if this is a network of friendship, blue = boys, red = girls.


What is SNA What is a network Isolates, dyads and triads Figure: an isolate (one unconnected node), a dyad (two nodes) and a triad (three nodes).

What is SNA The network perspective A new perspective SNA requires a change of mindset with respect to other social science approaches. Emphasis is on relationships, not attributes. Not just dyadic relationships (just A and B), but dyadic relationships as embedded in a whole set of relationships.


What is SNA The network perspective Embedded relationships Figure: Suppose the relationship represented here is friendship. How may friendship between A and B vary in these three different contexts?

What is SNA The network perspective Triads Figure: three triads on nodes a, b and c (intransitive, transitive, 3-cycle). Intransitive: only bilateral ties. Transitive: a friend of my friend is my friend. Three-cycles: a form of generalized exchange.

What is SNA The network perspective Network effects, more globally Figure: a network on nodes a, b, c, d and e. For example, those who attract many choices will attract even more in future (reputation effect, Matthew effect). Does a high (and growing) number of friends have advantages / disadvantages?

What is SNA Summary Now you know: What a network is; Correspondence between a network and a graph; Difference between triadic and dyadic structures; Global effects of network structure.

Data Network data Data format: what network data look like; how they differ from other social science data; from data to graph. Data collection: name generators/interpreters; archives; web crawlers.

Data Data format Data type I: Ego networks The whole set of contacts (alters) of one person or entity (ego). Usually includes attributes of alters and ties between them. Usually collected for a sample of egos (e.g. in a survey). Typically represented graphically with ego at its centre (star-shaped).

Data Data format Example: Ego networks to discover hidden populations Enrolling HIV+ persons to participate in vaccine preparedness study through their networks. Valente, 2010.

Data Data format Data type II: Whole networks Mapping the whole set of ties of a particular group, setting or population. Not focused on one particular person or entity. Network boundaries must be well-defined. Examples: network of friends in a classroom; network of knowledge-sharing between employees of an organisation.

Data Data format Data storage: traditional social science Social science data are usually represented in the form of a rectangular table, where each row is an observation and each column is a variable. For example:

name  age  gender  married
Jane  25   0       0
Mary  31   0       0
Bob   29   1       1
Sue   28   0       1
Alan  32   1       0
Tom   29   1       1

Data Data format Network data storage I: matrix Network data can be stored as an n-by-n square matrix with all nodes listed in both columns and rows. The value of cell (i, j) in the matrix indicates whether node i and node j are connected (1) or not (0). The diagonal is meaningless. For example, for a friendship network:

      Jane  Mary  Bob  Sue  Alan  Tom
Jane  -     1     1    0    0     0
Mary  1     -     0    1    0     0
Bob   1     0     -    0    1     0
Sue   0     1     0    -    1     0
Alan  0     0     1    1    -     1
Tom   0     0     0    0    1     -

Data Data format Data storage II: Edge list The edge list stores each pair of connected nodes in a single row of a table. For example, for the same friendship network:

ego   alter
Jane  Mary
Jane  Bob
Mary  Sue
Bob   Alan
Alan  Tom
Alan  Sue

Data Data format Which format to choose Most network analysis packages support both formats. Some provide conversion facilities (e.g. UCINET: edge list to matrix). It is usually possible to combine network data (in matrix or edge list format) and attributes. A rectangular table is usually needed for attribute data as in traditional social science.

Data Data format Some general rules Matrix visually appealing when nodeset is small, but difficult to handle when it is large (because all possible pairs must be explicitly included). With large node sets, edge list is more convenient (because only existing ties need to be listed).
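The two storage formats carry the same information, so one can be converted into the other. The following is a minimal sketch in plain Python, assuming the friendship ties are undirected and that the diagonal is filled with 0; the function names are illustrative and are not taken from UCINET or any other package.

```python
# Friendship edge list from the example (each undirected tie listed once).
nodes = ["Jane", "Mary", "Bob", "Sue", "Alan", "Tom"]
edges = [("Jane", "Mary"), ("Jane", "Bob"), ("Mary", "Sue"),
         ("Bob", "Alan"), ("Alan", "Tom"), ("Alan", "Sue")]


def edge_list_to_matrix(nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]  # diagonal stays 0 (assumption)
    for ego, alter in edges:
        i, j = index[ego], index[alter]
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected: mirror the tie
    return matrix


def matrix_to_edge_list(nodes, matrix):
    """Recover each undirected tie once by reading only the upper triangle."""
    return [(nodes[i], nodes[j])
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
            if matrix[i][j]]


adjacency = edge_list_to_matrix(nodes, edges)
for name, row in zip(nodes, adjacency):
    print(f"{name:5}", row)
```

The matrix holds n × n = 36 cells for only 6 ties, which illustrates why the edge list scales better for large node sets.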

Data Data format Tie data I Directed ties: a tie goes from one node to another, but not necessarily back, e.g. advice-giving, money-lending. Usual graphical representation: arrow. When directed ties go in both directions, they are reciprocal ties. Usual graphical representation: double arrow.

Data Data format Tie data II Undirected ties: ties are mutual by definition, e.g. siblings, co-workers. Usual graphical representation: line.

Data Data format Undirected ties: matrix is symmetric

      Jane  Mary  Bob  Sue  Alan  Tom
Jane  -     1     1    0    0     0
Mary  1     -     0    1    0     0
Bob   1     0     -    0    1     0
Sue   0     1     0    -    1     0
Alan  0     0     1    1    -     1
Tom   0     0     0    0    1     -

Data Data format Directed ties: matrix is NOT symmetric

      Jane  Mary  Bob  Sue  Alan  Tom
Jane  -     1     1    0    0     0
Mary  0     -     0    1    0     0
Bob   0     0     -    0    1     0
Sue   0     0     0    -    0     0
Alan  0     0     0    1    -     1
Tom   0     0     0    0    0     -
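Whether a matrix describes directed or undirected ties can be checked mechanically by comparing each cell (i, j) with cell (j, i). The sketch below assumes rows are senders and columns are receivers, and fills the meaningless diagonal with 0; the helper names are illustrative.

```python
nodes = ["Jane", "Mary", "Bob", "Sue", "Alan", "Tom"]

# Directed friendship matrix from the example (row = sender, column = receiver).
directed = [
    [0, 1, 1, 0, 0, 0],  # Jane
    [0, 0, 0, 1, 0, 0],  # Mary
    [0, 0, 0, 0, 1, 0],  # Bob
    [0, 0, 0, 0, 0, 0],  # Sue
    [0, 0, 0, 1, 0, 1],  # Alan
    [0, 0, 0, 0, 0, 0],  # Tom
]


def is_symmetric(matrix):
    """True when every tie i -> j is matched by a tie j -> i."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


def symmetrize(matrix):
    """Treat a tie in either direction as an undirected tie."""
    n = len(matrix)
    return [[max(matrix[i][j], matrix[j][i]) for j in range(n)] for i in range(n)]


def reciprocal_pairs(nodes, matrix):
    """List the pairs whose directed ties go in both directions."""
    n = len(matrix)
    return [(nodes[i], nodes[j])
            for i in range(n) for j in range(i + 1, n)
            if matrix[i][j] and matrix[j][i]]


undirected = symmetrize(directed)
print(is_symmetric(directed), is_symmetric(undirected))
print(reciprocal_pairs(nodes, directed))
```

Symmetrizing with `max` is only one possible rule; keeping a tie only when it is reciprocated (using `min`) is an equally simple alternative.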

Data Data format Binary and valued ties Binary ties indicate presence or absence of tie Valued ties can be stronger or weaker, under some definition of strength: Emotional closeness; Frequency of contact; Duration of Relationships. Graphically: line (arrow) thickness often represents strength of tie.

Data Data format Storing valued ties in an edge list The edge list can include a third column with attributes of each tie. In our friendship example, we can include duration of friendship:

ego   alter  duration (years)
Jane  Mary   5
Jane  Bob    2
Mary  Sue    3
Bob   Alan   1
Alan  Tom    2
Alan  Sue    2

Data Data format Storing valued ties in a matrix Instead of 0-1 values, the matrix has different values depending on duration of the relationship:

      Jane  Mary  Bob  Sue  Alan  Tom
Jane  -     5     2    0    0     0
Mary  0     -     0    3    0     0
Bob   0     0     -    0    1     0
Sue   0     0     0    -    0     0
Alan  0     0     0    2    -     2
Tom   0     0     0    0    0     -
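As a rough illustration, the valued edge list and the valued matrix can be linked with a few lines of Python. The sketch assumes directed ties (ego to alter), a diagonal of 0, and a hypothetical `dichotomize` helper that turns valued ties back into binary ones at a chosen cut-off; none of these names come from the slides.

```python
nodes = ["Jane", "Mary", "Bob", "Sue", "Alan", "Tom"]

# (ego, alter, duration of friendship in years)
valued_edges = [("Jane", "Mary", 5), ("Jane", "Bob", 2), ("Mary", "Sue", 3),
                ("Bob", "Alan", 1), ("Alan", "Tom", 2), ("Alan", "Sue", 2)]


def valued_matrix(nodes, edges):
    """Store the tie value in cell (ego, alter); absent ties stay 0."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for ego, alter, value in edges:
        matrix[index[ego]][index[alter]] = value
    return matrix


def dichotomize(matrix, threshold):
    """Binary version: 1 where the tie value reaches the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in matrix]


durations = valued_matrix(nodes, valued_edges)
long_friendships = dichotomize(durations, threshold=2)  # ties of 2+ years

for name, row in zip(nodes, durations):
    print(f"{name:5}", row)
```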

Data Data format Graphs Basic principles of graph representation are simple (nodes and edges). But graph visualisation is a complex problem in computer science. Which representation is most suitable for detecting network structure and properties? Layouts shown for the same network: circle, Fruchterman-Reingold, Kamada-Kawai, spring, MDS.

Data Data format Now you know: Format for network data: square matrix, rectangular matrix, edge list. Difference between Ego and whole networks. Directed and undirected ties. Binary and valued ties. Graphical conventions to represent these different data.

Data Data collection Collecting network data Networks are built from nodes and the ties between them. Who are the nodes? What are the ties? How to elicit information?

Data Data collection How to identify nodes Ego-network data collections are often included in larger surveys. Whole network data collection requires defining network boundaries, for example: members of an organisation; students of one school; attendees of one particular event. N.B. collection of whole network data needs to be exhaustive, so it is sensitive to response rate.

Data Data collection Collecting network data through surveys: name generators and interpreters Name generators are questions to elicit respondents' alters, for example: "From time to time, most people discuss important matters with other people. Looking back over the last six months, who are the people with whom you discussed matters important to you? Just tell me their names or initials." (General Social Survey, 1985) Can be accompanied by name interpreters to report alter characteristics and identify ties between alters.

Figure: A name generator with a graphical interface in a web-based survey; research project ANAMIA.

Data Data collection Collecting network data through surveys: rosters Provide respondents with a list of potential network members and ask them to choose from the list those to whom they are tied, for example: Here is the list of all the members of your Firm. Would you go through this list, and check the names of those you socialize with outside work. You know their family, they know yours, for instance. I do not mean all the people you are simply on a friendly level with, or people you happen to meet at Firm functions. (Lazega, 2001)

Data Data collection Collecting network data through surveys: rosters (cont.) Used for whole network studies. Also useful as a memory-aid. Requires the researcher to have a complete list of nodes from start. Only feasible for relatively small networks (e.g. schools, companies).

Data Data collection Collecting network data from archives For example: contract data from companies' financial statements; citation data from publishers' portals. Depends on the quality of the archive and the actual availability of network information. Need to ensure the definition of ties is consistent and data are reported uniformly across all nodes. Need to ensure completeness (for whole networks).

Figure: A citations network. From a study of the literature on pro-anorexic websites over ten years, with a corpus of 60 scientific articles. Casilli, Tubaro and Araya (2012), ANAMIA.

Data Data collection Webcrawling Using dedicated software to retrieve websites and the links between them. Increasingly popular with the rise of web-based networks, online social networking services, the study of the Internet as a network. Defining network boundaries may be difficult. Frequent need for manual verification of data quality. Privacy protection issues.

Figure: A map of the pro-anorexic web sphere in France. F. Pailler, D. Pereira, ANAMIA.

Data Data collection Now you know: Different ways of collecting network data: surveys, archives, webcrawling. All have advantages and disadvantages. Choice depends on research questions, context, and expected outcomes.

Network metrics Measuring properties of networks Focus is on properties of patterns of relationships, independently of node attributes. Based on the mathematics of graph theory, refined with social science concepts. A variety of algorithms, measures and software applications are available.

Network metrics Size Size Network size = number of nodes (= number of contacts in a personal network). Dunbar's number: cognitive limitations restrict the size of personal networks to about 150 contacts. An open question: have social media increased the human capacity to maintain relationships? Median network size on Facebook = 99, average about 150-200 (though with large variation).

Network metrics Density Density The ratio of the ties that actually exist to the ties that could exist in principle: $\text{Density} = \frac{L}{n(n-1)/2}$ for undirected ties; $\text{Density} = \frac{L}{n(n-1)}$ for directed ties, where $L$ = number of edges and $n$ = number of nodes.
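To make the definition concrete, here is a small sketch that computes density from the number of nodes and edges. The function name and the `directed` flag are illustrative choices; the worked numbers use the six-person friendship example (n = 6, L = 6).

```python
def density(n_nodes, n_edges, directed=False):
    """Share of possible ties that actually exist.

    Undirected: L / (n(n-1)/2).  Directed: L / (n(n-1)).
    """
    if n_nodes < 2:
        raise ValueError("density needs at least two nodes")
    possible = n_nodes * (n_nodes - 1)  # ordered pairs of distinct nodes
    if not directed:
        possible = possible / 2         # each unordered pair counted once
    return n_edges / possible


# Friendship example: 6 nodes, 6 ties.
print(density(6, 6))                 # ties read as undirected: 6 of 15 pairs
print(density(6, 6, directed=True))  # ties read as directed: 6 of 30 ordered pairs
```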

Network metrics Density Application: Dense networks and behaviours Denser online networks spread behaviours faster: Centola 2010.

Network metrics Density Why is this so? When adoption of a new behavior requires social reinforcement (threshold effect), a denser network favours change.

Network metrics Centrality Degree centrality Who are the most important nodes? Diane has the highest number of direct connections (degree); a connector, or hub. Krackhardt's kite network.
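Degree centrality is simply a count of each node's direct ties. The sketch below uses the edge list of Krackhardt's kite as it is commonly reproduced; the slides name only some of the ten members, so the remaining names (Andre, Beverly, Carol, Ed) and the exact tie list are assumptions taken from the standard version of the example.

```python
# Krackhardt's kite, standard 10-node version (assumed; not listed in the slides).
KITE_EDGES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"), ("Andre", "Fernando"),
    ("Beverly", "Diane"), ("Beverly", "Ed"), ("Beverly", "Garth"),
    ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"),
    ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"),
    ("Heather", "Ike"), ("Ike", "Jane"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected ties)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degree_centrality(adjacency):
    """Raw degree: number of direct connections per node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


kite = build_adjacency(KITE_EDGES)
degrees = degree_centrality(kite)
for name, value in sorted(degrees.items(), key=lambda item: -item[1]):
    print(f"{name:9}{value}")
```

Dividing each count by n - 1 gives the normalised version, which is convenient when comparing networks of different sizes.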

Network metrics Centrality Betweenness centrality Heather has fewer connections than Diane; yet she occupies a strategic position, between different parts of the network; she controls what flows in the network. Krackhardt's kite network.
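Betweenness counts how often a node lies on the shortest paths between other pairs of nodes. One common way to compute it is Brandes' algorithm, sketched here for an undirected, unweighted network; the slides do not say which algorithm was used, and the kite edge list (including the names Andre, Beverly, Carol and Ed) is assumed from the standard version of the example.

```python
from collections import deque

# Krackhardt's kite, standard 10-node version (assumed; not listed in the slides).
KITE_EDGES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"), ("Andre", "Fernando"),
    ("Beverly", "Diane"), ("Beverly", "Ed"), ("Beverly", "Garth"),
    ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def build_adjacency(edges):
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes)."""
    score = {node: 0.0 for node in adjacency}
    for source in adjacency:
        order = []                                # nodes in order of discovery
        preds = {node: [] for node in adjacency}  # predecessors on shortest paths
        paths = {node: 0 for node in adjacency}   # number of shortest paths
        dist = {node: -1 for node in adjacency}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                              # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):                 # accumulate from the farthest node
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {node: value / 2 for node, value in score.items()}


betweenness = betweenness_centrality(build_adjacency(KITE_EDGES))
for name, value in sorted(betweenness.items(), key=lambda item: -item[1]):
    print(f"{name:9}{value:.2f}")
```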

Network metrics Centrality Closeness centrality Fernando and Garth have fewer connections than Diane; but they are at a shorter distance from all other network members; they can monitor the information flow in the network. Krackhardt's kite network.
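Closeness can be sketched as the inverse of a node's average distance to everyone else, with distances obtained by breadth-first search. This is one common definition, valid for a connected network; the kite edge list (including the names Andre, Beverly, Carol and Ed) is again assumed from the standard version of the example.

```python
from collections import deque

# Krackhardt's kite, standard 10-node version (assumed; not listed in the slides).
KITE_EDGES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"), ("Andre", "Fernando"),
    ("Beverly", "Diane"), ("Beverly", "Ed"), ("Beverly", "Garth"),
    ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def build_adjacency(edges):
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def distances_from(adjacency, source):
    """Number of steps from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness_centrality(adjacency):
    """(n - 1) divided by the sum of distances to all other nodes."""
    n = len(adjacency)
    return {node: (n - 1) / sum(distances_from(adjacency, node).values())
            for node in adjacency}


closeness = closeness_centrality(build_adjacency(KITE_EDGES))
for name, value in sorted(closeness.items(), key=lambda item: -item[1]):
    print(f"{name:9}{value:.3f}")
```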

Network metrics Centrality Core-periphery structures Ike and Jane have low centrality scores; e.g. they may be external contractors for a company; may be sources of fresh information! Krackhardt's kite network.

Network metrics Centrality Network centralisation Measures the extent to which a network is dominated by a single central node (or a few nodes). Compares the centrality of the most central node with the centrality of the other nodes. Normalised by dividing by the maximum centralisation possible for a network of the given size. Ranges from 0 to 1 (star network).
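One widely used instance of this idea is Freeman's degree centralisation, sketched below for undirected networks. The slides do not name a specific formula, so treat the choice of degree (rather than betweenness or closeness) as an assumption; the star and ring networks are hypothetical test cases.

```python
def degree_centralisation(adjacency):
    """Freeman degree centralisation for an undirected network.

    Sum of (largest degree - each degree), divided by (n - 1)(n - 2),
    which is the value that sum takes in a star network of the same size.
    """
    n = len(adjacency)
    if n < 3:
        raise ValueError("centralisation needs at least three nodes")
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    largest = max(degrees)
    return sum(largest - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical examples: a 5-node star (one hub) and a 5-node ring (no hub).
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
ring = {"a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d", "a"}}

print(degree_centralisation(star))  # fully centralised
print(degree_centralisation(ring))  # no node dominates
```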

Network metrics Centrality Centralisation may vary over time Figure: The advice network of judges in a Parisian court. Correlation between degrees, first to second observation (left panel) and second to third (right panel).

Network metrics Distance Distance Distance: the number of steps from one member to another. Shorter paths in a network are the most important: the shorter the path from one network member to another, the quicker and more efficient the flow of information, advice and knowledge. Left: longer paths; right: shorter paths.
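Distances of this kind can be computed with a breadth-first search, which visits nodes in order of increasing number of steps. The sketch below reuses the six-person friendship edge list from the data section, read as undirected; the function name is illustrative.

```python
from collections import deque

# Friendship ties from the data section, read as undirected.
edges = [("Jane", "Mary"), ("Jane", "Bob"), ("Mary", "Sue"),
         ("Bob", "Alan"), ("Alan", "Tom"), ("Alan", "Sue")]

adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)


def distances_from(adjacency, source):
    """Shortest number of steps from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:      # first visit is the shortest route
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


from_jane = distances_from(adjacency, "Jane")
print(from_jane)

# Longest shortest path in the network (its diameter).
diameter = max(max(distances_from(adjacency, node).values()) for node in adjacency)
print(diameter)
```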

Network metrics Cliques Cliques 3-member clique 4-member clique 5-member clique A clique is a sub-set of nodes where all possible pairs of nodes are directly connected. Scott (2000).
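The definition translates directly into a check that every pair in a subset is tied. The sketch below adds a basic Bron-Kerbosch search for maximal cliques (cliques that cannot be extended); this algorithm is a standard choice but is not mentioned in the slides, and the four-node network is a hypothetical example.

```python
from itertools import combinations

# Hypothetical network: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)


def is_clique(members, adjacency):
    """True when all possible pairs of members are directly connected."""
    return all(v in adjacency[u] for u, v in combinations(members, 2))


def maximal_cliques(adjacency):
    """Basic Bron-Kerbosch enumeration (no pivoting) of maximal cliques."""
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(current)          # nothing left that could extend it
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return found


cliques = maximal_cliques(adjacency)
print(sorted(sorted(clique) for clique in cliques))
print(is_clique({"a", "b", "c"}, adjacency), is_clique({"a", "b", "d"}, adjacency))
```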

Network metrics Cliques Real-world cliques Figure: a 1-clique, a 2-clique and a 3-clique. Completely connected groups are uncommon. n-clique: a set of points in which every pair is connected by a path of length n or less. n-cliques with n greater than 2 are empirically infrequent. Scott (2000).

Network metrics Cliques Application: Small Worlds A small world network is sparse, but with dense neighbourhoods and short paths; and there are few steps from one member to any other.

Network metrics Cliques Now you know: Key metrics to measure properties of networks: Size; Density; Centrality / Centralisation; Distance; Cliques.

Further readings Books on social network analysis: general Thomas W. Valente. Social Networks and Health: Models, Methods, and Applications, Oxford UP, 2010. Christina Prell. Social Network Analysis: History, Theory and Methodology, Sage, 2011 (October). John P. Scott. Social Network Analysis: A Handbook, Sage, 2000.

Further readings Books on social network analysis: general (cont.) Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications, Cambridge UP, 1994. Peter J. Carrington, John Scott, Stanley Wasserman (Eds.) Models and Methods in Social Network Analysis, Cambridge UP, 2005. David Knoke. Social Network Analysis, Sage 2008.

Further readings Books on social network analysis: Theory Ronald S. Burt. Brokerage and Closure: An Introduction to Social Capital, Oxford UP, 2005. Ronald S. Burt. Neighbor Networks: Competitive Advantage Local and Personal, Oxford UP, 2010. Nan Lin. Social Capital: A Theory of Social Structure and Action, Cambridge UP, 2002.

Further readings Books on social network analysis: Economics Matthew O. Jackson. Social and Economic Networks, Princeton UP, 2010. Sanjeev Goyal. Connections: An Introduction to the Economics of Networks, Princeton UP, 2009. Fernando Vega-Redondo. Complex Social Networks, Cambridge UP, 2007.

Further readings Journals Social Networks (Elsevier); Connections; Journal of Social Structure; Redes (Spanish).

Further readings Associations and conferences INSNA: Sunbelt XXXIII conference, May 2013, Hamburg (www.insna.org); AFS - RT26, École d'été, September 2012; UKSNA: 8th annual conference, Bristol, June 2012; ASNA: 9th annual conference, Zurich, September 2012.

Further readings Thank you! Paola Tubaro, p.tubaro@gre.ac.uk