Dynamics of information spread on networks. Kristina Lerman USC Information Sciences Institute
1 Dynamics of information spread on networks Kristina Lerman USC Information Sciences Institute
2 Information spread in online social networks Diffusion of activation on a graph, where each infected (activated) node infects neighbors with some probability
3 Why now? Data driven network science Availability of large-scale, time-resolved data on the Social Web allows us to ask new questions about social behavior How does information spread on networks? How far and how fast does information flow? How does the network structure affect information flow? What does this tell us about the quality of information? What is the structure of the network? Who are the influential users and communities? How does network structure affect the flow of information? What is the collective behavior of users? How does individual behavior affect collective behavior?
4 Outline Online social networks (OSN) Empirical study of information cascades on networks Statistical properties of OSN Quantitative analysis of information cascades on networks What limits the size of information cascades in OSN? Network structure and dynamics Classes of diffusion processes on networks Diffusion and centrality equivalence What is the appropriate centrality metric for a given network? Compare performance of centrality metrics on OSN Alpha Centrality Parametrized centrality for network analysis Node Ranking and Community Detection
5 Social News: Digg Users submit or vote for (digg) news stories Online social networks Users follow friends to see Stories friends submit Stories friends vote for Shown in My News Trending stories Digg promotes most popular stories to its Top News page
6 Microblogging: Twitter Users tweet short text posts Retweet posts of others 140 characters long Online social networks Users follow friends to see Tweets by friends Retweets by friends Trending topics Twitter analyzes activity to identify popular trends
7 Comparative empirical analysis Digg Voting on a single story Twitter Use URLs as markers for tracking the flow of information one month ~3K stories ~280K users ~ 1M links three weeks ~70K stories ~700K users ~36M links [Lerman and Ghosh, ICWSM 10]
8 Dynamics of story popularity Digg Twitter 1: U.S. Government Asks Twitter to Stay Up for #IranElection 2: Western Corporations Helped Censor Iranian Internet 3: Iranian clerics defy ayatollah, join protests 1: U.S. Government Asks Twitter to Stay Up for #IranElection 2: Western Corporations Helped Censor Iranian Internet 3: Iranian clerics defy ayatollah, join protests
9 Distribution of popularity of stories Digg (promoted stories) Twitter (all stories) lognormal fit Aggregate over all stories to factor out influence of submitter and story quality Inequality of popularity some stories much more popular than others cf social influence study of [Salganik, Dodds & Watts, 2006]
10 Network structure: Distribution of followers follower friend Digg Twitter
11 Cascades on networks Information spreads through cascades on networks Underlying network Two cascades spreading on the network* *Nodes are labeled in the temporal order of activation
12 Analysis of information cascades [Leskovec, McGlohon, & Faloutsos, Cascading Behavior in Large Blog Graphs, in SDM (2007)]
13 Cascade generating function Quantitative framework for measuring the structure of evolving cascades Microscopic Macroscopic Efficient compression Time sequence of real numbers carries information about cascade structure Allows reconstruction of the cascade Fast and scalable: O(kN) space, O(dkN) runtime complexity (k = # of seeds, d = max. degree, N = # of nodes) [Ghosh and Lerman, A Framework for Quantitative Analysis of Cascades on Networks. WSDM11]
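14 Calculating cascade generating function $\phi(j,\alpha) = \alpha \sum_{i \in \mathrm{friend}(j)} \phi(i,\alpha)$; seeds: $\phi(1,\alpha) = c_1$, $\phi(2,\alpha) = c_2$; then $\phi(3,\alpha) = \alpha\,\phi(1,\alpha) = \alpha c_1$, $\phi(4,\alpha) = \alpha\,[\phi(1,\alpha) + \phi(2,\alpha)] = \alpha\,(c_1 + c_2)$, $\phi(6,\alpha) = \alpha\,[\phi(3,\alpha) + \phi(1,\alpha)] = (\alpha^2 + \alpha)\,c_1$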
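The recursion on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes each seed k carries a constant c_k, that every other node's value is alpha times the sum of the values of the friends activated before it, and that the example wiring (3 follows 1; 4 follows 1 and 2; 6 follows 3 and 1) matches the slide. The function name and the seed constants 1.0 and 10.0 are hypothetical.

```python
def cascade_generating_function(activation_order, friends, seeds, alpha):
    """Return {node: phi(node, alpha)} for nodes given in activation order.

    activation_order: nodes in the temporal order in which they were activated.
    friends: dict mapping a node to the nodes it follows.
    seeds: dict mapping each seed node to its constant c_k (assumed values).
    """
    phi = {}
    for j in activation_order:
        if j in seeds:
            phi[j] = seeds[j]
        else:
            # Only friends that were activated earlier contribute.
            phi[j] = alpha * sum(phi[i] for i in friends.get(j, ()) if i in phi)
    return phi


# Example wiring assumed from the slide; seed constants are illustrative.
friends = {3: [1], 4: [1, 2], 6: [3, 1]}
seeds = {1: 1.0, 2: 10.0}
phi = cascade_generating_function([1, 2, 3, 4, 6], friends, seeds, alpha=0.5)
print(phi)
```

With c_1 = 1 and alpha = 0.5, node 6 gets (alpha^2 + alpha) c_1 = 0.75, which matches the last expression on the slide.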
15 Some common cascades branching chaining community
16 Collision of cascades
17 Cascade reconstruction: degeneracy
18 Digg case study: microscopic properties Evolution of three largest cascades for two non popular stories community effect branching effect chaining effect Play Doctor On Yourself: 16 Things To Do Between Checkups APOD: 2009 July 1 Three Galaxies in Draco
19 Digg case study: microscopic properties Evolution of three largest cascades for two popular stories community effect 'Infomercial King' Billy Mays Dead at 50 Bender's back
20 Digg: macroscopic properties No. of cascades Cascade size Spread Diameter
21 Why are Digg cascades so small? Cascade size
22 What limits cascade size? Network structure Clustering Degree heterogeneity Dynamics Social contagion mechanism Change in transmissibility (e.g., novelty decay) [Ver Steeg, Ghosh & Lerman, What stops social epidemics? in ICWSM 11]
23 Effect of network structure Study the effect of clustering by simulating cascades on synthetic and real graphs. Digg graph: power law degree distribution with exponent 2. Synthetic graph: constructed using the directed configuration model [Newman et al.]; preserves degree distribution but destroys clustering and degree correlation. Cascade simulations: Independent Cascade Model (ICM), widely used to model epidemics, viral marketing campaigns, etc. Start with an infected seed (submitter). Susceptible fans are infected with probability λ (transmissibility). If a user has n contagious friends, she has n chances to be infected. Users can infect friends during a single round, then they are removed. When the cascade stops, measure its size (# of infected nodes)
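The ICM steps listed on this slide can be sketched as follows. This is a hedged illustration rather than the simulation code used in the study: the function name `simulate_icm`, the toy `fans` graph and the transmissibility value 0.3 are assumptions made for the example.

```python
import random


def simulate_icm(fans, seed, transmissibility, rng):
    """Run one Independent Cascade Model simulation and return the cascade size.

    fans: dict mapping a user to the list of users who follow (can be infected by) them.
    seed: the initially infected user (the submitter).
    transmissibility: probability that an infected user infects a given fan.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    infected = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for user in frontier:
            # Each infected user gets exactly one chance per susceptible fan.
            for fan in fans.get(user, ()):
                if fan not in infected and rng.random() < transmissibility:
                    infected.add(fan)
                    next_frontier.append(fan)
        # Users in the old frontier are now removed: they never infect again.
        frontier = next_frontier
    return len(infected)


# Toy graph (hypothetical): "c" follows both "a" and "b", so it has two chances.
fans = {"submitter": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
rng = random.Random(0)
sizes = [simulate_icm(fans, "submitter", 0.3, rng) for _ in range(1000)]
mean_size = sum(sizes) / len(sizes)
print("mean cascade size:", mean_size)
```

Sweeping `transmissibility` over a range and plotting the mean size would give a cascade-size curve of the kind the next slide discusses.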
24 Cascade size vs transmissibility [Figure: cascade size vs. transmissibility, with the epidemic threshold marked] Clustering reduces epidemic threshold and cascade size, but not enough!
25 Friend Saturation Model Perhaps we got the contagion mechanism wrong? ICM: each friend has probability λ to infect the node; therefore, the probability that a user votes given that n of his friends have voted is $p_{ICM}(\text{vote} \mid n \text{ friends voted}) = 1 - (1 - \lambda)^n$. On Digg, empirically, p(vote | n friends voted) depends only weakly on n. Friend Saturation Model (FSM): repeated exposure to a story does not make the user much more likely to vote for it. Cf. market saturation effect, decreasing cascade model [Kempe et al, 2005]
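To make the contrast concrete, the sketch below compares the ICM exposure response, 1 - (1 - lam)^n, with a saturating response. The flat form used for FSM here (the same probability for any n >= 1) is an assumption chosen only to illustrate "repeated exposure does not help much"; the slide does not preserve the exact empirical form, and `lam = 0.1` is an arbitrary value.

```python
def p_vote_icm(n, lam):
    """ICM: each of n voting friends independently infects with probability lam."""
    return 1.0 - (1.0 - lam) ** n


def p_vote_fsm(n, lam):
    """Assumed saturating form: one exposure counts, further exposures add nothing."""
    return lam if n >= 1 else 0.0


lam = 0.1  # illustrative transmissibility
for n in (1, 2, 5, 10):
    print(n, round(p_vote_icm(n, lam), 4), p_vote_fsm(n, lam))
```

Under ICM the probability keeps climbing with every additional voting friend, while the saturating form stays put, which is why swapping the mechanism shrinks simulated cascades.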
26 Simulations of FSM Model on Digg graph Actual and simulated (FSM) cascades on the Digg graph FSM mechanism drastically reduces cascade size
27 Outline Online social networks (OSN) Empirical study of information cascades on networks Online social networks (OSN) and their properties Quantitative analysis of information cascades on networks What limits the size of information cascades in OSN? Network structure and dynamics Classes of diffusion processes on networks Diffusion and centrality equivalence What is the appropriate centrality metric for a given network? Compare performance of centrality metrics on OSN Alpha Centrality Parametrized centrality for network analysis Node Ranking and Community Detection
28 Two classes of diffusion on networks [Figure: the same five-node example network under conservative diffusion (money) and non-conservative diffusion (post)] Conservative diffusion: models money transfer, conversations, etc., where some quantity (money, attention) is conserved. Non-conservative diffusion: models epidemics, spread of information, etc., where quantity (viruses) is not conserved. Mathematical formulation of two types of diffusion. Equivalence of steady state solutions and centrality metrics. Unifies social network analysis and epidemic models. Existence of threshold for epidemic processes. Location of the epidemic threshold [Ghosh and Lerman, Predicting Influential Users in Online Social Networks. SNAKDD10]
29 Conservative diffusion [Figure: example network; each node's weight is split among its out-links, e.g. w_2/2, w_3/2]
30 Conservative diffusion [Figure: example network; each node's weight is split among its out-links, e.g. w_2/2, w_3/2]
31 Matrix formulation Adjacency matrix of the network A; outdegree matrix D (diagonal, with D_ii the out-degree of node i)
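32 Matrix formulation of conservative diffusion $w^{C}_{t} = (1-\alpha)\,s + \alpha\,w^{C}_{t-1}\,T$ with starting (row) vector s and transfer matrix $T = D^{-1}A$. Steady state solution as $t \to \infty$: $w^{C} = (1-\alpha)\,s + (1-\alpha)\,\alpha\,sT + \dots = (1-\alpha)\,s\,(I - \alpha T)^{-1}$. Weight vector is conserved (conservative diffusion): $\|w^{C}_{t}\|_1 = \|w^{C}_{t-1}\|_1$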
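A small numerical check of the conserved-weight property, written as a sketch under stated assumptions: weights are treated as a row vector, every node has at least one out-link, and the three-node graph and alpha = 0.85 are made up for the example. The function name `conservative_step` is hypothetical.

```python
def conservative_step(w, s, out_links, alpha):
    """One update w_t = (1 - alpha) * s + alpha * w_{t-1} T, with T = D^{-1} A.

    out_links: dict mapping each node index to the list of nodes it links to.
    Every node is assumed to have at least one out-link.
    """
    new_w = [(1.0 - alpha) * x for x in s]
    for i, targets in out_links.items():
        # Node i splits its weight equally among its out-links.
        share = alpha * w[i] / len(targets)
        for j in targets:
            new_w[j] += share
    return new_w


out_links = {0: [1, 2], 1: [2], 2: [0]}  # toy directed graph
s = [1.0, 0.0, 0.0]
w = list(s)
for _ in range(200):
    w = conservative_step(w, s, out_links, alpha=0.85)
total_weight = sum(w)
print(w, total_weight)
```

The total stays at 1 on every step because each node hands on exactly what it holds, which is the sense in which the process is conservative.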
33 Non-conservative diffusion [Figure: example network; each node passes its full weight w along every out-link]
34 Non-conservative diffusion [Figure: example network; each node passes its full weight w along every out-link]
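35 Matrix formulation of non-conservative diffusion $w^{NC}_{t} = s + \alpha\,w^{NC}_{t-1}\,A$ with starting (row) vector s. Steady state solution as $t \to \infty$: $w^{NC} = \sum_{k=0}^{\infty} \alpha^k\,s\,A^k = s\,(I - \alpha A)^{-1}$. Holds for $\alpha < 1/\lambda_1$. Weight vector is not conserved (non-conservative diffusion): $\|w^{NC}_{t}\|_1 \neq \|w^{NC}_{t-1}\|_1$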
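The same kind of check for the non-conservative update, again only a sketch: weights are a row vector, the toy graph and alpha = 0.5 are assumptions, and alpha is chosen below 1/lambda_1 (about 0.75 for this graph) so the iteration converges. The function name `non_conservative_step` is hypothetical.

```python
def non_conservative_step(w, s, out_links, alpha):
    """One update w_t = s + alpha * w_{t-1} A.

    out_links: dict mapping each node index to the list of nodes it links to.
    """
    new_w = list(s)
    for i, targets in out_links.items():
        # Node i passes alpha * w[i] along every out-link without dividing it.
        for j in targets:
            new_w[j] += alpha * w[i]
    return new_w


out_links = {0: [1, 2], 1: [2], 2: [0]}  # toy directed graph
s = [1.0, 0.0, 0.0]
w = list(s)
for _ in range(200):
    w = non_conservative_step(w, s, out_links, alpha=0.5)
print(w, sum(w))
```

Here the total weight ends up well above the starting total of 1: copies are created at every hop, as with a virus or a forwarded post.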
36 Epidemics as non conservative diffusion time = t-1
37 Epidemics as non-conservative diffusion time = t Transmissibility [Wang et al., 2003]
38 Epidemics as non-conservative diffusion time = t Expected # of virus up to time t, with transmissibility λ: $P^{cum}_t = \sum_{k=0}^{t} P_k = \sum_{k=0}^{t} \lambda^k\,P_0\,A^k$
39 Epidemics as non-conservative diffusion time = t Expected # of virus up to time t: $P^{cum}_t = \sum_{k=0}^{t} P_k = \sum_{k=0}^{t} \lambda^k\,P_0\,A^k$; non-conservative diffusion: $w^{NC} = \sum_{k=0}^{\infty} \alpha^k\,s\,A^k$
40 Implications: Epidemic threshold In spreading epidemics there exists a threshold below which epidemics die out and above which they reach a significant fraction of nodes. Epidemic threshold is given by $1/\lambda_1$, the inverse of the largest eigenvalue of A [Wang et al., 2003]. For transmissibility above $1/\lambda_1$, epidemics reach many nodes. Agrees with empirically observed threshold on Digg. Threshold also exists in non-conservative diffusion
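The threshold 1/lambda_1 can be estimated without a linear algebra library by power iteration. The sketch below is an illustration under assumptions: the graph is connected and undirected (or strongly connected), the iteration is run on A + I so that it does not oscillate on bipartite graphs, and the star graph is a made-up example. The function name `largest_eigenvalue` is hypothetical.

```python
def largest_eigenvalue(adj, iterations=500):
    """Estimate the largest eigenvalue of a non-negative adjacency matrix.

    adj: square list-of-lists adjacency matrix.
    Power iteration is applied to A + I, then the shift of 1 is subtracted.
    """
    n = len(adj)
    v = [1.0] * n
    estimate = 1.0
    for _ in range(iterations):
        # (A + I) v: the identity term keeps the dominant eigenvalue unique.
        nxt = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        estimate = max(nxt)
        v = [x / estimate for x in nxt]
    return estimate - 1.0


# Star graph: node 0 is connected to nodes 1..4 (largest eigenvalue is 2).
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]
threshold = 1.0 / largest_eigenvalue(star)
print("epidemic threshold:", threshold)
```

For the star the estimate is 0.5: a well-connected hub lowers the threshold compared with, say, a long chain of the same size.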
41 [Figure: threshold values 0.006 and 0.009 marked on the transmissibility axis]
42 Implications for network analysis Who is important in a network? Geodesic path based ranking measures: Betweenness centrality [Freeman, 1979]. Topological ranking measures: PageRank [Brin et al., 1998], Degree Centrality. Path based ranking measures: Alpha centrality [Bonacich, 1987], Katz score [Katz, 1953], Eigenvector centrality [Bonacich, 2001]. [Figure: the same network ranked by Betweenness Centrality, Degree Centrality, PageRank and Alpha-Centrality; node size = importance]
43 Diffusion and centrality Current approaches take into account topology only and may lead to conflicting answers. Random walk: conservative diffusion, PageRank. Information spread: non-conservative diffusion, Alpha Centrality. What is the appropriate metric for the given network?
44 PageRank [Page & Brin, 98] Web surfer: with probability α navigates to a neighboring node; with probability (1 − α) jumps to a random node. [Figure: example network; node 5 passes w_5/5 along each of its links] $r^{PR}_t = (1-\alpha)\,r^{PR}_0 + (1-\alpha)\,\alpha\,r^{PR}_0\,D^{-1}A + \dots + \alpha^t\,r^{PR}_0\,(D^{-1}A)^t$
45 PageRank [Page & Brin, 98] Web surfer: with probability α navigates to a neighboring node; with probability (1 − α) jumps to a random node. [Figure: example network; node 5 passes w_5/5 along each of its links] As $t \to \infty$: $r^{PR}_t = (1-\alpha)\,r^{PR}_0 + \dots + \alpha^t\,r^{PR}_0\,(D^{-1}A)^t \to (1-\alpha)\,r^{PR}_0\,(I - \alpha D^{-1}A)^{-1}$
46 PageRank [Page & Brin, 98] Web surfer: with probability α navigates to a neighboring node; with probability (1 − α) jumps to a random node. [Figure: example network; node 5 passes w_5/5 along each of its links] $r^{PR} = (1-\alpha)\,r^{PR}_0\,(I - \alpha D^{-1}A)^{-1}$, which is conservative diffusion: $w^{C} = (1-\alpha)\,s\,(I - \alpha T)^{-1}$
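The step from the finite sum to the closed form is a geometric series. A short derivation, assuming row vectors, $0 \le \alpha < 1$, and that every node has at least one out-link so that $T = D^{-1}A$ is row-stochastic:

```latex
\begin{align}
r^{PR}_t &= (1-\alpha)\, r^{PR}_0 \sum_{k=0}^{t-1} \alpha^k T^k \;+\; \alpha^t\, r^{PR}_0\, T^t,
  \qquad T = D^{-1}A \\
\lim_{t\to\infty} r^{PR}_t &= (1-\alpha)\, r^{PR}_0 \sum_{k=0}^{\infty} (\alpha T)^k
  \;=\; (1-\alpha)\, r^{PR}_0\, (I - \alpha T)^{-1}
\end{align}
```

Because $T$ is row-stochastic its spectral radius is 1, so $\alpha T$ has spectral radius $\alpha < 1$: the series converges and the leftover term $\alpha^t r^{PR}_0 T^t$ vanishes. Setting $s = r^{PR}_0$ gives the conservative diffusion steady state.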
47 Alpha Centrality [Bonacich, 87] Number of paths of any length, attenuated by their length with parameter α: $r^{Alpha}_t = eA + \alpha\,eA^2 + \dots + \alpha^t\,eA^{t+1}$
48 Alpha Centrality [Bonacich, 87] Number of paths of any length, attenuated by their length with parameter α: $r^{Alpha}_t = eA + \alpha\,eA^2 + \dots + \alpha^t\,eA^{t+1}$
49 Alpha Centrality [Bonacich, 87] Number of paths of any length, attenuated by their length with parameter α. As $t \to \infty$: $r^{Alpha} = eA\,(I - \alpha A)^{-1}$. Holds for $\alpha < 1/\lambda_1$
50 Alpha Centrality [Bonacich, 87] Number of paths of any length, attenuated by their length with parameter α: $r^{Alpha} = eA\,(I - \alpha A)^{-1}$, holds for $\alpha < 1/\lambda_1$; this is non-conservative diffusion: $w^{NC} = s\,(I - \alpha A)^{-1}$
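A minimal sketch of the Alpha Centrality series, assuming e is the all-ones row vector, the series is truncated after a fixed number of terms, and alpha is below 1/lambda_1 so that truncation is harmless. The function name `alpha_centrality` and the three-node path graph are illustrative choices.

```python
def alpha_centrality(adj, alpha, terms=200):
    """Truncated series eA + alpha*eA^2 + alpha^2*eA^3 + ... (row-vector form).

    adj: square list-of-lists adjacency matrix.
    Only meaningful as an approximation when alpha < 1 / (largest eigenvalue).
    """
    n = len(adj)
    # First term eA: with e all ones, this is each node's in-degree.
    term = [float(sum(adj[i][j] for i in range(n))) for j in range(n)]
    total = list(term)
    for _ in range(terms):
        # Next term: multiply the previous one by alpha * A on the right.
        term = [alpha * sum(term[i] * adj[i][j] for i in range(n)) for j in range(n)]
        total = [t + x for t, x in zip(total, term)]
    return total


path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]  # path graph 0 - 1 - 2
print(alpha_centrality(path, alpha=0.0))  # degree of each node
print(alpha_centrality(path, alpha=0.5))  # longer paths now count too
```

At alpha = 0 only the first term survives and the scores reduce to node degree; raising alpha lets longer paths contribute.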
51 Ranking nodes by centrality PageRank Alpha Centrality
52 Ranking nodes by centrality PageRank Alpha Centrality
53 Which metric is right? How can we evaluate centrality metrics? User activity in social media provides an independent measure of importance/influence Serves as ground truth for evaluating centrality metrics Evaluation methodology Define an empirical measure of influence (ground truth) Compare centrality metrics with the ground truth
54 Information flow on Digg [Figure: a submitter's post reaches the submitter's fans; when a follower votes, the post reaches that follower's own fans] Information spread on Digg is non-conservative. Non-conservative metric will best predict influential users
55 Empirical estimate of influence 1. Average follower votes Likelihood a follower votes for the story Influence of submitter Quality of the story Story quality Random variable Average out by aggregating fan votes over all stories submitted by the same submitter 289 users submitting at least 2 stories 2. Average Cascade size How far does the story spread into the network Ground truth(s): Rank users according to each estimate
56 Statistical significance estimate 1 [Figure: submitter's post and followers] Urn model: users in OSN (N) = balls in the urn (N); fans of submitter in OSN (K) = white balls in the urn (K); no. of users who voted (n) = no. of balls picked (n); no. of fans who voted (k) = no. of white balls picked (k). $P(X = k \mid K, N, n) = \binom{K}{k}\binom{N-K}{n-k} \big/ \binom{N}{n}$ (hypergeometric distribution)
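The urn model translates directly into code. The sketch below uses only `math.comb`; the function names and the example numbers (1000 users, 50 fans, 100 voters, 20 fan votes) are hypothetical and chosen only to show the call pattern.

```python
from math import comb


def hypergeom_pmf(k, K, N, n):
    """P(X = k | K, N, n): exactly k of the n voters are fans of the submitter."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def prob_at_least(k, K, N, n):
    """Probability of seeing k or more fan votes purely by chance."""
    return sum(hypergeom_pmf(x, K, N, n) for x in range(k, min(K, n) + 1))


# Hypothetical numbers: N users, K fans of the submitter, n voters, k fan votes.
p_chance = prob_at_least(20, K=50, N=1000, n=100)
print(p_chance)
```

A very small value means that many fan votes are unlikely to arise by chance, which is the sense in which follower votes are a statistically significant signal of the submitter's influence.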
57 Statistical significance results Avg # follower votes received by stories within the first 100 votes vs # of submitter's followers. Probability of the expected number of fan votes being generated purely by chance.
58 Evaluation of importance prediction Correlation between the rankings produced by the empirical measures of influence and those predicted by Alpha-Centrality and PageRank (1) Avg. # of follower votes (2) Avg. cascade size Non conservative Alpha Centrality best predicts influence rankings
59 Time check?
60 Alpha Centrality [Bonacich, 87] $r^{Alpha}(\alpha) = eA \sum_{k=0}^{\infty} \alpha^k A^k = eA\,(I - \alpha A)^{-1}$ Measures the number of paths between nodes, each path attenuated by its length with parameter α. α sets the length scale of interactions: α = 0, local interactions only (degree centrality); as α grows, increasingly longer distances become important. But the condition $\alpha < 1/\lambda_1$ must hold for convergence, where $\lambda_1$ is the largest eigenvalue of A
61 Normalized Alpha Centrality $r^{Alpha}(\alpha) = \dfrac{eA \sum_{k=0}^{\infty} \alpha^k A^k}{\sum_{i,j}^{n} \left( A \sum_{k=0}^{\infty} \alpha^k A^k \right)_{ij}}$ No longer bounded by eigenvalue: holds for $0 \le \alpha \le 1$. Parameterized centrality metric: α sets the length scale of interactions. Local: α = 0 leads to the same rankings as degree centrality. Meso: as α grows, increasingly longer distances become important. Global: as $\alpha \to 1/\lambda_1$, leads to the same rankings as eigenvector centrality. Relation to Alpha Centrality: for $0 \le \alpha < 1/\lambda_1$, leads to the same rankings as Alpha Centrality; for $\alpha > 1/\lambda_1$, rankings are independent of α [Ghosh and Lerman, Parameterized Metric for Network Analysis, Physical Review E, 2011]
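A sketch of the normalization for the convergent regime only. It assumes alpha < 1/lambda_1, a truncated series, and a graph with at least one edge; the limiting behaviour for alpha above 1/lambda_1 described on the slide is not implemented here. The function name and the path graph are hypothetical.

```python
def normalized_alpha_centrality(adj, alpha, terms=200):
    """Alpha Centrality scores divided by their total, so that they sum to 1.

    adj: square list-of-lists adjacency matrix with at least one edge.
    Valid as written only for alpha < 1 / (largest eigenvalue of adj).
    """
    n = len(adj)
    # First term eA (e is the all-ones row vector).
    term = [float(sum(adj[i][j] for i in range(n))) for j in range(n)]
    total = list(term)
    for _ in range(terms):
        term = [alpha * sum(term[i] * adj[i][j] for i in range(n)) for j in range(n)]
        total = [t + x for t, x in zip(total, term)]
    # Summing the score vector equals summing all entries of A * sum_k alpha^k A^k.
    norm = sum(total)
    return [x / norm for x in total]


path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]  # path graph 0 - 1 - 2
local_scores = normalized_alpha_centrality(path, alpha=0.0)
meso_scores = normalized_alpha_centrality(path, alpha=0.5)
print(local_scores, meso_scores)
```

Dividing by the total leaves the ranking unchanged in this regime, consistent with the slide's statement that the normalized metric ranks nodes as Alpha Centrality does for alpha below 1/lambda_1.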
62 Applications to network analysis Ranking: change in ranking with α. Leaders: individuals with high importance score. Bridges: individuals whose importance score grows with α; mediate communication between communities. Peripherals: low importance score. Community detection: modularity maximization [Newman & Girvan, 04]. Modularity Q = (actual connectivity) − (expected connectivity). Connectivity measured using normalized Alpha Centrality
63 Advantages Multi scale analysis of networks Parameter sets the length scale of interaction Connect the rankings produced by well known local and global centrality metrics Degree Centrality Eigenvector Centrality Can differentiate between locally and globally connected nodes and structures Leaders and bridges Local and global communities
64 Zachary's karate club [Zachary, 77] Centrality scores of nodes vs. α
65 Communities in Zachary's karate club network [Figure panels: α = 0; α < 0.14; α ≥ 0.14; Newman's modularity]
66 Related Work Empirical Investigation of Online Social Networks Network inferred from the observed links [Wu,04; Gruhl,04; Leskovec,07;Rodriguez,10] Forwarding chains long narrow rather than bushy and wide trees Networks are extracted independent of the spread of data. Diffusion terminates in few steps [Wu,04; Leskovec,08;] versus Influence spreads easily [Bakshy,09] decay of similarity Reach of spread does not depend on edge based similarity Left open the question whether OSN effective for spreading information rather than purchasing product Information such as news reach many individuals Enumeration of shapes on local cascades [Leskovec,05; Leskovec,07] As cascades grow in size, the number of possible shapes increases exponentially and such enumeration becomes infeasible
67 Related work: Modeling social epidemics Epidemic Models Homogeneous SIS, SIR Models [Bailey, 1975], segregate heterogeneous population in homogeneous subgroups [Hethcote, 1978], Heterogeneous Mean Field (HMF) models [Moreno, 2002] Structure of the underlying network [Wang et al, 2003], regardless of virus propagation mechanisms [Prakash et al., 2010] Effect of adding 'stiflers' [Barrat, 08] Spreading mechanism Decreasing cascade model [Kempe 03; Leskovec, 07; Kossinets, 06] Complex contagion on Twitter [Romero et al. 2011] Viral cascades as branching processes [Iribarren, 09]
68 Related work: Centrality and diffusion Network structure and diffusion processes Epidemic models and spectral radius [Wang,03] Random walk and Laplacian [Chung,97] Most centrality metrics make implicit assumptions about network flow [Borgatti, 05] We advocate a simpler classification scheme Ranking of Twitter users Comparison with centrality metrics [Cha,10; Lee,10] Community detection We extend the edge based modularity maximization [Newman,04] to take path based connectivity into account
69 Conclusion Empirical Investigations of Networks Leverage the power of OSN Studied Digg and Twitter Structure Degree distribution Dynamics Information Propagation How stories evolve? Metric to Quantify Cascades Macroscopic and Microscopic Properties Puzzle: Why are cascades so small? Analysis and Simulations Clustering Contagion Mechanism FSM
70 Conclusion Modeling Networks Structure and Functionality Dynamic Processes Conservative and Non Conservative Prediction of dynamics uses structure (Spectral radius) Epidemic Models Non Conservative Prediction of structure uses dynamics Importance and Centrality Metrics Conservative and non conservative How to choose centrality metrics? Evaluation of centrality metrics Alpha centrality better models information diffusion Parameterized and useful for network analysis Normalized Alpha Centrality Communities, leaders and bridges
71 Thanks Collaborators Rumi Ghosh (USC) Greg Ver Steeg (USC) Tad Hogg (IMM) Tawan Surachawala (USC) Funding agencies NSF AFOSR AFRL
Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks
Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks Imre Varga Abstract In this paper I propose a novel method to model real online social networks where the growing
Distances, Clustering, and Classification. Heatmaps
Distances, Clustering, and Classification Heatmaps 1 Distance Clustering organizes things that are close into groups What does it mean for two genes to be close? What does it mean for two samples to be
In-Network Coding for Resilient Sensor Data Storage and Efficient Data Mule Collection
In-Network Coding for Resilient Sensor Data Storage and Efficient Data Mule Collection Michele Albano Jie Gao Instituto de telecomunicacoes, Aveiro, Portugal Stony Brook University, Stony Brook, USA Data
Detecting Network Anomalies. Anant Shah
Detecting Network Anomalies using Traffic Modeling Anant Shah Anomaly Detection Anomalies are deviations from established behavior In most cases anomalies are indications of problems The science of extracting
Analysis of Internet Topologies
Analysis of Internet Topologies Ljiljana Trajković [email protected] Communication Networks Laboratory http://www.ensc.sfu.ca/cnl School of Engineering Science Simon Fraser University, Vancouver, British
Introduction To Genetic Algorithms
1 Introduction To Genetic Algorithms Dr. Rajib Kumar Bhattacharjya Department of Civil Engineering IIT Guwahati Email: [email protected] References 2 D. E. Goldberg, Genetic Algorithm In Search, Optimization
Graph Theory and Complex Networks: An Introduction. Chapter 08: Computer networks
Graph Theory and Complex Networks: An Introduction Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.20, [email protected] Chapter 08: Computer networks Version: March 3, 2011 2 / 53 Contents
Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center
Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II
SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS
SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS Carlos Andre Reis Pinheiro 1 and Markus Helfert 2 1 School of Computing, Dublin City University, Dublin, Ireland
Perron vector Optimization applied to search engines
Perron vector Optimization applied to search engines Olivier Fercoq INRIA Saclay and CMAP Ecole Polytechnique May 18, 2011 Web page ranking The core of search engines Semantic rankings (keywords) Hyperlink
Network Architectures & Services
Network Architectures & Services Fernando Kuipers ([email protected]) Multi-dimensional analysis Network peopleware Network software Network hardware Individual: Quality of Experience Friends: Recommendation
CSV886: Social, Economics and Business Networks. Lecture 2: Affiliation and Balance. R Ravi [email protected]
CSV886: Social, Economics and Business Networks Lecture 2: Affiliation and Balance R Ravi [email protected] Granovetter s Puzzle Resolved Strong Triadic Closure holds in most nodes in social networks
Analysis of Internet Topologies: A Historical View
Analysis of Internet Topologies: A Historical View Mohamadreza Najiminaini, Laxmi Subedi, and Ljiljana Trajković Communication Networks Laboratory http://www.ensc.sfu.ca/cnl Simon Fraser University Vancouver,
SGL: Stata graph library for network analysis
SGL: Stata graph library for network analysis Hirotaka Miura Federal Reserve Bank of San Francisco Stata Conference Chicago 2011 The views presented here are my own and do not necessarily represent the
Distributed Dynamic Load Balancing for Iterative-Stencil Applications
Distributed Dynamic Load Balancing for Iterative-Stencil Applications G. Dethier 1, P. Marchot 2 and P.A. de Marneffe 1 1 EECS Department, University of Liege, Belgium 2 Chemical Engineering Department,
A scalable multilevel algorithm for graph clustering and community structure detection
A scalable multilevel algorithm for graph clustering and community structure detection Hristo N. Djidjev 1 Los Alamos National Laboratory, Los Alamos, NM 87545 Abstract. One of the most useful measures
Social and Technological Network Analysis. Lecture 3: Centrality Measures. Dr. Cecilia Mascolo (some material from Lada Adamic s lectures)
Social and Technological Network Analysis Lecture 3: Centrality Measures Dr. Cecilia Mascolo (some material from Lada Adamic s lectures) In This Lecture We will introduce the concept of centrality and
Distance Degree Sequences for Network Analysis
Universität Konstanz Computer & Information Science Algorithmics Group 15 Mar 2005 based on Palmer, Gibbons, and Faloutsos: ANF A Fast and Scalable Tool for Data Mining in Massive Graphs, SIGKDD 02. Motivation
Crawling and Detecting Community Structure in Online Social Networks using Local Information
Crawling and Detecting Community Structure in Online Social Networks using Local Information TU Delft - Network Architectures and Services (NAS) 1/12 Outline In order to find communities in a graph one
