MINFS544: Business Network Data Analytics and Applications

March 30th, 2015
Daning Hu, Ph.D., Department of Informatics, University of Zurich
F. Schweitzer et al., Science 2009

Stop Contagious Failures in Banking Systems
During the 2008 financial tsunami, which bank(s) should have received capital injections first to stop contagious failures in bank networks?

Utilize Peer Influence in Online Social Networks
Intelligent Advertising, Product Recommendation
- Who are the most influential people?
- What are the patterns of product diffusion?

Develop Strategies to Attack Terrorist Networks
A Global Salafi Jihad Terrorist Network (Hu et al., JHSEM 2009)
How to effectively break down a terrorist network?

Network-based Business Intelligence
Network-based (Modeling and Analysis)
- Modeling and analyzing various real-world social and organizational networks to understand:
  - the cognitive and economic behaviors of the network actors; and
  - the dynamic processes behind the network evolution
Based on the above: Business Intelligence (BI)
- Design network-based BI algorithms and information systems to provide decision support in various application domains
- Financial Risk Management, Security Informatics, Knowledge Management, etc.
- Network Analysis, Simulation of Network Evolution, Data Mining, etc.

Summary
Lecturer: Dr. Daning Hu (hdaning@ifi.uzh.ch)
Teaching Assistant: Dr. Jiaqi Yan (jackiejqyan@gmail.com)
Credits: 3 ECTS credits
Language: English
Audience: Master and doctoral students
Office Hours: Tue 13:00-14:00, Room 2.A.12. Please send an email to make an appointment.
Grading: Course report (term paper) 80% and interaction 20%

Grading
1. A full research paper (80%). The format of this paper can be found at: http://icis2015.aisnet.org/en/paper-submissions/papersubmission
   * If possible, get it published in ICIS 2015 and get it cited.
This paper should include answers to the following questions:
- What is the research problem?
- Why is it interesting and important?
- Why is it hard? Why have previous approaches failed?
- What are the key components of your approach?
- What 1) models, 2) data sets and 3) metrics will be used to validate the approach?

A Brief History of Network Science
1736: Mathematical foundation: Graph Theory
1930: Social Network Analysis and Theories
- Sociogram: network visualization
- Six degrees of separation
- Structural hole: source of innovation
1990, 2000, 2012 (later points on the timeline):
- (Physicists) Complex Network Topologies
  - Small-world model (e.g., WWW)
  - Scale-free model ("rich get richer")
- Network Science
  - Economic networks (agent modeling & simulation)
  - Dynamic network analysis
- BI applications: product diffusion in social media, recommendation systems?

Outline
- Introduction
- Dynamic Analysis of Dark Networks
  - A Global Salafi Jihad (GSJ) Terrorist Network
  - A Narcotic Criminal Network
- A Network Approach to Managing Bank Systemic Risk
- Ongoing Work
- Conclusion

Dynamic Network Analysis (DNA)
Studying the dynamic link formation processes behind network evolution (nodes forming links -> network evolution).
What: Model the changes in network evolution
- Temporal changes in network topological measures
- Dynamic network recovery on longitudinal data
Why: Statistical analysis of determinants behind link formation
- Homophily
- Preferential attachment
- Shared affiliations
How: Simulate the evolution of networks
- Agent-based Modeling and Simulation
- Examine network robustness
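As an illustration of the "preferential attachment" determinant listed above, here is a minimal growth sketch in Python. It is not the lecture's simulation model: the function name `grow_preferential`, the parameters `n_nodes`, `m` and `seed`, and the fully connected starting core are all assumptions made for this example. Each new node links to `m` existing nodes chosen with probability proportional to their current degree.

```python
import random

def grow_preferential(n_nodes, m, seed=0):
    """Grow an undirected network by degree-proportional attachment.

    Hypothetical helper for illustration: starts from a fully connected
    core of m + 1 nodes, then adds nodes one at a time.
    """
    rng = random.Random(seed)
    # Fully connected core of m + 1 nodes
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `stubs` once per link end, so a uniform draw
    # from `stubs` picks a node with probability proportional to its degree.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = grow_preferential(200, 2, seed=1)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print("edges:", len(edges), "max degree:", max(degree.values()))
```

Early nodes tend to accumulate many links ("rich get richer"), which is the mechanism usually associated with scale-free degree distributions.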

Research Testbed: A Global Terrorist Network
The Global Salafi Jihad (GSJ) network data was compiled by a former CIA operations officer, Dr. Marc Sageman:
- 366 terrorists
- Relationships: friendship, kinship, same religious leader, operational interactions, etc.
- Attributes: geographical origins, socio-economic status, education, etc.
- Timing: when they joined and left the GSJ
The goals of the dynamic analysis:
- gain insights about the evolution of the GSJ network
- develop effective attack strategies to break down the GSJ network
[Figure: sample data of GSJ terrorists]

Dynamic Network Analysis
Studying the dynamic processes (i.e., link formation) behind network evolution (node behaviors -> network evolution).
What: Model the changes in network evolution
- Temporal changes in network topological measures
- Dynamic network recovery on longitudinal data
Why: Statistical analysis of determinants behind link formation
- Homophily
- Preferential attachment
- Shared affiliations
How: Simulate the evolution of networks
- Agent-based Modeling and Simulation
- Examine network robustness

Temporal Changes in Network-level Measures
Fig. 1. The temporal changes in (a) the average degree <k>, 1989-2003, and (b), (c) the degree distribution (probability of degree versus degree). Panel (b) shows 1990, 1991 and 1993 together with a Poisson curve; panel (c) shows 1995, 1997 and 1999.
Degree = number of links a node has.
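The two network-level measures in Fig. 1 can be computed from an edge list as in the sketch below. The toy network and the function names are hypothetical and are not taken from the GSJ data; the Poisson probability is included only as the random-network reference curve against which an observed distribution could be compared.

```python
import math
from collections import Counter

def average_degree(nodes, edges):
    """Average degree <k> = 2 * (number of links) / (number of nodes)."""
    return 2 * len(edges) / len(nodes)

def degree_distribution(nodes, edges):
    """Map each degree k to the fraction of nodes that have degree k."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    return {k: c / len(nodes) for k, c in sorted(counts.items())}

def poisson_pmf(k, mean):
    """Probability of degree k in a random network with the given mean degree."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Hypothetical toy network (undirected)
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]

k_avg = average_degree(nodes, edges)
dist = degree_distribution(nodes, edges)
for k, p in dist.items():
    print(k, p, round(poisson_pmf(k, k_avg), 3))
```

Repeating this for one snapshot of the network per year gives the kind of time series and per-year distributions that Fig. 1 summarizes.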

Findings
There are three stages in the evolution of the GSJ network:
- 1989-1993, the emerging stage:
  - The network grows in size
  - Accelerated growth: the number of edges increases faster than the number of nodes
  - Random network topology (Poisson degree distribution)
- 1994-2000, the mature stage:
  - The size of the network reached its peak in 2000
  - Scale-free topology (power-law degree distribution)
- 2001-2003, the disintegration stage:
  - Falling into small disconnected components after 9/11

Temporal Changes in Node Centrality Measures
Fig. 2. Temporal changes in the degree and betweenness centrality of Osama Bin Laden, 1989-2002.
- Degree: the number of links a node has
- Betweenness of a node i: the number of shortest paths from all nodes to all others that pass through node i
  - Measures i's influence on the traffic (information, resources) flowing through it
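To make the two centrality definitions concrete, the sketch below computes both on a small hypothetical tree. Betweenness is computed with Brandes' algorithm for unweighted, undirected graphs, which is a standard choice but not necessarily the one used for Fig. 2; the graph and all names are assumptions for illustration.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths and predecessors
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints
    return {v: b / 2 for v, b in score.items()}

# Hypothetical network: "a" is a hub, "d" and "e" act as bridges to "f"
adj = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
degree = {v: len(nbrs) for v, nbrs in adj.items()}
bc = betweenness(adj)
print(degree)
print(bc)
```

In this toy graph node "d" has a lower degree than "a" but nearly the same betweenness, which is the hub versus bridge distinction the later attack slides rely on.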

Findings and Possible Explanations
1994-1996: A sharp decrease in Bin Laden's betweenness
- 1994: Saudi Arabia revoked his citizenship and expelled him
- 1995: Went to Sudan and was expelled again under U.S. pressure
- 1996: Went to Afghanistan and established camps there
1998-1999: Another sharp decrease in his betweenness
- After the 1998 bombings of U.S. embassies, Bill Clinton ordered a freeze on assets linked to bin Laden (top 10 most wanted)
- August 1998: A failed U.S. assassination attempt on him
- 1999: The UN imposed sanctions against Afghanistan to force the Taliban to extradite him

Research Testbed: A Narcotic Criminal Network
The COPLINK dataset contains police incident reports from the Tucson Police Department (1990 to 2006):
- 3 million incident reports and 1.44 million individuals
- Their personal and sociological information (age, ethnicity, etc.)
- Time information: when two individuals co-offend
AZ inmate affiliation data: when and where an inmate was housed
A Narcotic Criminal Network:
- 19,608 individuals involved in organized narcotic crimes
- 29,704 co-offending pairs (links)

Table 1. Summary of the COPLINK dataset and the Arizona inmate dataset

                    COPLINK Narcotic Data   Arizona Inmate Data   Overlapped*
Number of People    36,548                  165,540               19,608
Time Span           1990-2006               1985-2006             17 years
* Identified by first name, last name and DOB

Statistical Analysis of Determinants for Link Formation
Proportional hazards model (Cox regression analysis):
$h(t, x_1, x_2, x_3, \ldots) = h_0(t)\exp(b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots)$
- Homophily in age (group) and race
- Shared affiliations:
  - Mutual acquaintances (through crimes)
  - Vehicle affiliation (same vehicle used by two individuals in different crimes)
Fig. 3. Results of multivariate survival (Cox regression) analysis of triadic closure (link formation).
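The exponential term of the proportional hazards model can be evaluated directly once coefficients are known. The sketch below does only that: it does not fit a Cox model, and the coefficient values and covariate encoding are invented for illustration (the slides do not report them here).

```python
import math

def relative_hazard(coefficients, covariates):
    """exp(b1*x1 + b2*x2 + ...): the hazard relative to the baseline h0(t)."""
    return math.exp(sum(b * x for b, x in zip(coefficients, covariates)))

# Hypothetical coefficients, in this order:
# same age group, same race, number of mutual acquaintances, shared vehicle
b = [0.5, 0.3, 0.8, 1.2]

pair_baseline = [0, 0, 0, 0]   # no homophily, no shared affiliations
pair_with_ties = [1, 1, 2, 0]  # same age group and race, two mutual acquaintances

print(relative_hazard(b, pair_baseline))
print(relative_hazard(b, pair_with_ties))
```

Because the baseline hazard $h_0(t)$ is common to every pair, the ratio of two such values is a hazard ratio that does not depend on $t$, which is what makes the model "proportional".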

BI Application: Co-offending Prediction in COPLINK
- IBM's COPLINK is an intelligent police information system that aims to help speed up the crime detection process.
- COPLINK calculates a co-offending likelihood score based on the proportional hazards model.
- It produces a ranked list of individuals based on their predicted likelihood of co-offending with the suspect under investigation.
Fig. 4. Screenshots of the COPLINK system

Simulate Attacks on Dark Networks
Three attack (i.e., node removal) strategies:
- Attack on hubs (highest degree)
- Attack on bridges (highest betweenness)
- Real-world attack (attack order based on real-world data)
Simulate two types of attacks to examine the robustness of the dark networks:
- Simultaneous attacks (static): the degree/betweenness of nodes is NOT updated after each removal
- Progressive attacks (dynamic): the degree/betweenness of nodes is updated after each removal
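The difference between simultaneous and progressive attacks can be sketched as follows for hub attacks. Everything here is a hypothetical illustration: the toy network, the function names, the alphabetical tie-breaking, and the robustness measure, which is assumed to be the fraction of all original nodes that remain in the largest connected component. A bridge attack would follow the same pattern with betweenness in place of degree.

```python
from collections import deque

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def largest_component_fraction(adj, removed):
    """Largest connected component after removals, over the original node count."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        size = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best / len(adj)

def remaining_degree(adj, removed):
    return {v: sum(1 for w in adj[v] if w not in removed)
            for v in adj if v not in removed}

def simultaneous_hub_attack(adj, k):
    """Rank nodes by degree once, on the intact network, and remove the top k."""
    deg = remaining_degree(adj, set())
    return sorted(deg, key=lambda v: (-deg[v], v))[:k]

def progressive_hub_attack(adj, k):
    """Recompute degrees after every removal before choosing the next target."""
    removed = []
    for _ in range(k):
        deg = remaining_degree(adj, set(removed))
        if not deg:
            break
        removed.append(min(deg, key=lambda v: (-deg[v], v)))
    return removed

# Hypothetical toy network
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("b", "f"),
         ("b", "g"), ("g", "h"), ("h", "i"), ("h", "j")]
adj = build_adjacency(edges)

static_targets = simultaneous_hub_attack(adj, 2)
dynamic_targets = progressive_hub_attack(adj, 2)
s_static = largest_component_fraction(adj, static_targets)
s_dynamic = largest_component_fraction(adj, dynamic_targets)
print(static_targets, s_static)
print(dynamic_targets, s_dynamic)
```

In this toy case the progressive attack picks a different second target, because removing the first hub lowers the degree of its neighbor, and it leaves a smaller largest component.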

Hub vs. Bridge Attacks
- Both hub and bridge attacks are far more effective than real-world arrests. Policy implications?
- Both dark networks are more vulnerable to bridge attacks than to hub attacks.
  - Bridge (highest betweenness): field lieutenants, operational leaders, etc.
  - Hub (highest degree): e.g., Bin Laden
[Figure: GSJ network, S and <s> versus the fraction of nodes removed, for hub attacks and bridge attacks]

Summary and Contributions
- We developed a set of Dynamic Network Analysis (DNA) methods that are effective in:
  - Linking network topological changes to analytical insights
  - Systematically capturing the link formation processes
  - Examining the determinants of link formation
- Dark networks are robust against real-world attacks but vulnerable to targeted bridge attacks.
- COPLINK provides real-time decision support for fighting crime.

Research Readings and Resources
1. Networks Overview:
   * Statistical mechanics of complex networks, Sections III, VI: http://rmp.aps.org/abstract/rmp/v74/i1/p47_1
   * Networks, Crowds, and Markets: http://www.cs.cornell.edu/home/kleinber/networks-book/
2. Networks in Finance:
   * Financial Networks blog and research databases (WRDS database):
     http://www.financialnetworkanalysis.com/research-database/
     http://www.stern.nyu.edu/networks/electron.html
   * Company Board Social Networks

Research Readings and Resources (cont.)
3. Networks in Marketing:
   * Sinan Aral's research in networks and marketing (peer influence): http://web.mit.edu/sinana/www/
   * Social Media based Marketing: http://searchengineland.com/guide/what-is-social-media-marketing
4. Recommender Systems: http://www-cs-students.stanford.edu/~adityagp/recom.html
5. Word-of-Mouth Effects in Social Networks: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=393042&