Agnieszka Stawinoga University of Naples Federico II. Simona Balbi University of Naples Federico II


1 SLDS, May 2012, Florence
The use of Network Analysis tools for dimensionality reduction in Text Mining
Agnieszka Stawinoga (agnieszka.stawinoga@unina.it), University of Naples Federico II
Simona Balbi (simona.balbi@unina.it), University of Naples Federico II

2 Challenges in Text Mining
Text Mining (TM) is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in texts.
HIGH DIMENSIONALITY: CHALLENGES IN TEXT MINING
- Development of new tools and strategies for detecting meaningful communities of terms in order to construct higher order data
- Implementation of tools developed in other fields that are novel for text mining

3 NETWORK ANALYSIS
Networks are a form of "relational data": data whose properties cannot be reduced to the attributes of the individuals (nodes) involved. The objects of analysis are the relations. The nodes represent individuals, and the links (ties or edges) represent specific relationships between individuals.
Mode:
- One-mode network: adjacency matrix
- Two-mode network: affiliation matrix
Relational information:
- Single relation or multiple relation
- Dichotomous or weighted
- Nondirectional or directional
Actor attributes: in addition to relational information, social network data sets contain measurements on the characteristics of the actors.

4 Network Text Analysis
Network Text Analysis is a method for encoding the relationships between words in a text and constructing a network of the linked words (Popping, 2000).
Examples:
- The Reuters terror news network (Batagelj and Mrvar, 2002)
- Network of dictionary terms: the graph of all geodesics from "black" to "white" (Batagelj et al., 2002)

5 Coding system in TM
Bag of words: a coding system in which a text (such as a document) is represented as an unordered collection of words, disregarding grammar and even word order.
BAG OF WORDS → LEXICAL TABLE

6 Lexical Table (DOCUMENTS × TERMS)
THE LEXICAL TABLE CAN BE READ AS A VALUED AFFILIATION MATRIX
The value of each cell indicates how many times each word appears in each document.
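The bag-of-words coding and the resulting documents × terms table can be illustrated with a minimal sketch. The three documents are invented for illustration (they are not the corpus analysed later), the whitespace tokenizer is a simplifying assumption, and `T` is used here as a name for the lexical table.

```python
from collections import Counter

# Toy corpus (hypothetical documents, not the data used in the slides).
documents = [
    "net sales increased and net income increased",
    "retail sales increased in north america",
    "net income and retail sales",
]

# Bag of words: each document becomes an unordered multiset of tokens.
# Splitting on whitespace is a simplifying assumption.
bags = [Counter(doc.split()) for doc in documents]

# Vocabulary: the p distinct terms, sorted for a stable column order.
terms = sorted(set(word for bag in bags for word in bag))

# Lexical table T (n x p): T[i][j] = occurrences of term j in document i.
T = [[bag[term] for term in terms] for bag in bags]

print(terms)
for row in T:
    print(row)
```

Each row sum equals the number of tokens in the corresponding document, which is what makes the table a valued (rather than binary) affiliation matrix.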

7 From Lexical Table to one-mode network (1)
LEXICAL TABLE $T$ ($n \times p$), DOCUMENTS × TERMS
→ DICHOTOMIZATION (when interest is in presence / absence, NOT frequency)
→ LEXICAL TABLE $A$ ($n \times p$), DOCUMENTS × TERMS
→ MATRIX TERMS × TERMS: $W = A^{\top} A$ ($p \times p$)
Interest is in term co-occurrences.
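A minimal sketch of these two steps, dichotomization followed by the product of the transposed binary table with itself, is given below. The table values are hypothetical, and the names `T`, `A` and `W` are chosen to mirror the slide.

```python
# Toy lexical table (n = 3 documents, p = 4 terms); values are hypothetical.
T = [
    [2, 1, 0, 0],
    [0, 3, 1, 0],
    [1, 1, 0, 4],
]

# Dichotomization: keep presence/absence only, discard frequencies.
A = [[1 if count > 0 else 0 for count in row] for row in T]

n, p = len(A), len(A[0])

# W = A^T A (p x p): W[k][j] is the number of documents that contain
# both term k and term j; W[k][k] is the number of documents containing k.
W = [
    [sum(A[i][k] * A[i][j] for i in range(n)) for j in range(p)]
    for k in range(p)
]

for row in W:
    print(row)
```

Because `A` is binary, `W` is symmetric and its diagonal holds document frequencies, which is what the similarity measures later in the deck rely on.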

8 From Lexical Table to one-mode network (2)
MATRIX TERMS × TERMS $W$ ($p \times p$) → GRAPHICAL REPRESENTATION OF THE MATRIX
$G = (V, E)$
- $V$: the set of vertices, which consists of the $p$ words
- $E$: the set of edges existing among the vertices
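As a small illustration of reading a terms × terms matrix as a graph, the sketch below lists one vertex per term and one weighted edge per pair of terms with a non-zero co-occurrence. The term labels and matrix values are hypothetical, and treating every non-zero cell as an edge is an assumption made for this example.

```python
# Hypothetical terms and co-occurrence matrix W (symmetric, p = 4).
terms = ["income", "net", "retail", "sales"]
W = [
    [2, 2, 0, 1],
    [2, 3, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]

# V: one vertex per term.
V = list(terms)

# E: one undirected edge per pair (k < j) with at least one co-occurrence,
# stored with its weight. The diagonal is ignored.
E = [
    (terms[k], terms[j], W[k][j])
    for k in range(len(terms))
    for j in range(k + 1, len(terms))
    if W[k][j] > 0
]

print(V)
print(E)
```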

9 NETWORK ANALYSIS TOOLS: COMPONENTS
Components of a graph: sub-graphs that are connected within, but disconnected from one another.
Example network with 14 components:
- 4 isolates (components of 1 node)
- 3 components of 2 nodes
- 3 components of 3 nodes
- 1 component of 6 nodes
- 2 components of 9 nodes
- 1 component of 28 nodes
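One common way to extract components is a breadth-first search from every node that has not been visited yet. The sketch below is an illustration on a hypothetical six-node graph; the function name `connected_components` is ours and is not taken from any particular library.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return the components of an undirected graph as sorted node lists."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical graph: one triad, one dyad, one isolate.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
components = connected_components(nodes, edges)
sizes = sorted(len(c) for c in components)
print(components, sizes)
```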

10 NETWORK ANALYSIS TOOLS: BETWEENNESS CENTRALITY
Betweenness centrality measures how often a node appears on the shortest paths between pairs of other nodes in the network:
$$C_B(i) = \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}, \qquad i \neq j \neq k$$
- $g_{jk}(i)$: the number of shortest paths between nodes $j$ and $k$ that contain node $i$
- $g_{jk}$: the number of shortest paths between nodes $j$ and $k$
(Freeman, 1979).
A high centrality index indicates a node that reaches other nodes on relatively short paths. A node with high betweenness centrality lies on a considerable fraction of the shortest paths connecting pairs of other nodes. This measure is often considered an indicator of power and influence.
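The definition can be evaluated directly on a small graph by counting shortest paths with a breadth-first search from every node. The sketch below is a brute-force illustration of the unnormalized formula, not an efficient algorithm and not necessarily the procedure used by the authors; the five-node graph and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adjacency, source):
    """BFS from `source`: distance and number of shortest paths to each node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                count[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                count[neighbour] += count[node]
    return dist, count


def betweenness(adjacency):
    """Unnormalized betweenness: sum over pairs j < k of g_jk(i) / g_jk."""
    info = {v: shortest_path_counts(adjacency, v) for v in adjacency}
    scores = {v: 0.0 for v in adjacency}
    for j, k in combinations(sorted(adjacency), 2):
        dist_j, count_j = info[j]
        if k not in dist_j:
            continue  # j and k lie in different components
        g_jk = count_j[k]
        for i in adjacency:
            if i == j or i == k:
                continue
            dist_i, count_i = info[i]
            # i is on a shortest j-k path iff d(j,i) + d(i,k) = d(j,k).
            if i in dist_j and k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                scores[i] += count_j[i] * count_i[k] / g_jk
    return scores


# Hypothetical graph: a square a-b-c-d with a pendant node e attached to c.
adjacency = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d", "e"},
    "d": {"a", "c"},
    "e": {"c"},
}
scores = betweenness(adjacency)
print(scores)
```

In this toy graph node `c` obtains the highest score because every shortest path to the pendant node `e` passes through it.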

11 NETWORK ANALYSIS TOOLS: BETWEENNESS CENTRALITY
BETWEENNESS CENTRALITY → IDENTIFICATION OF THE MOST RELEVANT TERMS, WHICH CONNECT DIFFERENT COMMUNITIES WITHIN THE NETWORK

12 NETWORK ANALYSIS TOOLS: EGO-NETWORK
An ego-network is a network formed by selecting a focal node ("ego"), including all actors ("alters") that are connected to that node, and all the connections among those other actors. Ego-networks are obtained by "extracting" them from regular whole-network data, and they illustrate local areas of larger networks. The N-step neighborhood expands the definition of ego's neighborhood by including all nodes to which ego is connected at a path length of N, together with all the connections among these actors.
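Extracting an ego-network from whole-network data can be sketched as follows. The function `ego_network` and the toy graph are hypothetical; the N-step neighborhood is read here as all nodes within N steps of ego, which is one possible interpretation of the definition above.

```python
def ego_network(adjacency, ego, steps=1):
    """Return (members, edges) of the `steps`-step ego-network of `ego`."""
    members = {ego}
    frontier = {ego}
    for _ in range(steps):
        # Nodes one step further out that are not yet in the neighborhood.
        frontier = {nb for node in frontier for nb in adjacency[node]} - members
        members |= frontier
    # Keep every edge of the whole network whose endpoints are both members.
    edges = {
        tuple(sorted((u, v)))
        for u in members
        for v in adjacency[u]
        if v in members
    }
    return members, edges


# Hypothetical graph: a square a-b-c-d with a pendant node e attached to c.
adjacency = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d", "e"},
    "d": {"a", "c"},
    "e": {"c"},
}
members, edges = ego_network(adjacency, "c", steps=1)
print(sorted(members), sorted(edges))
```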

13 Our aim and strategy
Goal: to propose a new strategy for dimensionality reduction using Network Analysis tools. Based on the relational structure among terms, we propose to create higher order data (contexts, topics).
Strategy:
1. Preprocessing of the corpus and construction of the lexical table $T$ ($n \times p$)
2. Construction of the co-occurrence matrix $W$ ($p \times p$)
3. Construction of the similarity matrix $S$ ($p \times p$) and choice of a threshold
4. Representation of $S$ in the form of a binary network
5. Extraction of network components
6. Identification of central (betweenness) nodes and building of their ego-networks
7. Creation of a new lexical table $T_1$ ($n \times p_1$), where $p_1 \ll p$

14 Results
Data: the management commentary of a company listed on the Italian and American markets: Luxottica Group, the world leader in eyewear. The reference year is [...].
Preprocessing:
Corpus: 44 sections and 21,266 tokens
STEP 1. The corpus is normalized and cleaned of stop words, leaving 4538 types.
STEP 2. The corpus is grammatically tagged, and the lexical parts of speech (nouns, adjectives) are selected.
STEP 3. Among the graphical forms tagged NOUN or ADJECTIVE, the words peculiar to economic language are selected (using the linguistic resource included in TaLTaC).
STEP 4. Lemmatization of the selected graphical forms.

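15 Results
LEXICAL TABLE $T$ ($44 \times 332$), DOCUMENTS × TERMS
→ 1. DICHOTOMIZATION, 2. $A^{\top} A$
→ MATRIX TERMS × TERMS $W$ ($332 \times 332$)
→ NORMALIZATION OF CO-OCCURRENCES, giving three TERMS × TERMS matrices:
- EQUIVALENCE INDEX: $E$
- JACCARD INDEX: $J$
- ASSOCIATION STRENGTH: $A$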

16 Co-Occurrence Data Normalization: Similarity Measures
Similarity measures:
- Equivalence index (a form of cosine index): $E_{kj} = \dfrac{w_{kj}^{2}}{w_{kk}\, w_{jj}}$
- Jaccard index: $J_{kj} = \dfrac{w_{kj}}{w_{kk} + w_{jj} - w_{kj}}$
- Association strength: $A_{kj} = \dfrac{w_{kj}}{w_{kk}\, w_{jj}}$
where $w_{kk}$ is the total number of occurrences of object $k$ and $w_{kj}$ is the number of co-occurrences of $k$ and $j$.
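The three measures can be computed cell by cell from a co-occurrence matrix. The sketch below uses a hypothetical 4 × 4 matrix `W`; leaving the diagonal at zero and skipping terms that never occur are choices made for this example, and `E` is used as the label for the equivalence index.

```python
def similarity_matrices(W):
    """Return (E, J, A): equivalence index, Jaccard index, association strength."""
    p = len(W)
    E = [[0.0] * p for _ in range(p)]
    J = [[0.0] * p for _ in range(p)]
    A = [[0.0] * p for _ in range(p)]
    for k in range(p):
        for j in range(p):
            if k == j:
                continue  # self-similarity is not needed for the network
            w_kj, w_kk, w_jj = W[k][j], W[k][k], W[j][j]
            if w_kk == 0 or w_jj == 0:
                continue  # a term that never occurs has no similarities
            E[k][j] = w_kj ** 2 / (w_kk * w_jj)
            J[k][j] = w_kj / (w_kk + w_jj - w_kj)
            A[k][j] = w_kj / (w_kk * w_jj)
    return E, J, A


# Hypothetical co-occurrence matrix (diagonal = occurrences of each term).
W = [
    [2, 2, 0, 1],
    [2, 3, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]
E, J, A = similarity_matrices(W)
print(J[0][1], E[0][1], A[0][1])
```

For the first two terms, which co-occur twice and occur 2 and 3 times respectively, the Jaccard index is 2 / (2 + 3 - 2) = 2/3, the equivalence index is 4 / 6 = 2/3, and the association strength is 2 / 6 = 1/3.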

17 Results
Figure: threshold values and network density for the three similarity matrices (equivalence index $E$, Jaccard index $J$, association strength $A$). Horizontal axis: threshold value, from 0.05 to 1.00; vertical axis: network density, from 0.000 to 0.600.
Chosen threshold value = 0.55

18 Results
JACCARD INDEX matrix $J$ → DICHOTOMIZATION (THRESHOLD VALUE = 0.55) → ADJACENCY MATRIX $A$
332 nodes and 455 edges
The network consists of 156 components:
- 137 isolates
- 7 components of 2 nodes
- 8 components of 3 nodes
- 1 component of 5 nodes
- 1 component of 7 nodes
- 1 component of 9 nodes
- 1 component of 136 nodes
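Thresholding a similarity matrix and measuring the density of the resulting binary network can be sketched as follows. The 4 × 4 similarity matrix is hypothetical, and treating a similarity equal to the threshold as an edge (`>=`) is an assumption, since the slides do not say whether the comparison is strict. The last lines apply the density formula to the node and edge counts reported above.

```python
def dichotomize(S, threshold):
    """Binary adjacency matrix: 1 where similarity >= threshold, off-diagonal."""
    p = len(S)
    return [
        [1 if k != j and S[k][j] >= threshold else 0 for j in range(p)]
        for k in range(p)
    ]


def density(adjacency_matrix):
    """Share of the p(p-1)/2 possible undirected edges that are present."""
    p = len(adjacency_matrix)
    edges = sum(adjacency_matrix[k][j] for k in range(p) for j in range(k + 1, p))
    return edges / (p * (p - 1) / 2)


# Hypothetical symmetric similarity matrix for four terms.
J = [
    [0.0, 0.7, 0.2, 0.6],
    [0.7, 0.0, 0.55, 0.1],
    [0.2, 0.55, 0.0, 0.3],
    [0.6, 0.1, 0.3, 0.0],
]

# Density falls as the threshold rises.
for threshold in (0.1, 0.3, 0.55, 0.7):
    print(threshold, density(dichotomize(J, threshold)))

# Density implied by the reported network: 455 edges among 332 nodes.
reported_density = 455 / (332 * 331 / 2)
print(round(reported_density, 4))
```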

19 Results
To describe the different contexts in the corpus and to identify different topics, we extract the network components with 3 or more nodes (8 components of 3 nodes, 1 component of 5 nodes, 1 component of 7 nodes, 1 component of 9 nodes, 1 component of 136 nodes).

20 Results
Figure: components with 3, 5, 7 and 9 nodes

21 Results
The main component of the network (136 nodes and 416 edges).
Nodes with the highest betweenness centrality (99th percentile of the whole distribution).
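Selecting the nodes at or above a given percentile of the betweenness distribution can be sketched as below. The nearest-rank definition of the percentile is an assumption (the slides do not state which definition was used), and the scores are synthetic values for 200 hypothetical terms.

```python
def percentile_cutoff(values, q):
    """Nearest-rank percentile: the value at rank ceil(q/100 * n) in sorted order."""
    ordered = sorted(values)
    # Integer ceiling division avoids floating-point rounding of the rank.
    rank = max(1, -(-q * len(ordered) // 100))
    return ordered[rank - 1]


def top_nodes(scores, q=99):
    """Nodes whose score is at or above the q-th percentile."""
    cutoff = percentile_cutoff(scores.values(), q)
    return sorted(node for node, score in scores.items() if score >= cutoff)


# Synthetic betweenness scores for 200 hypothetical terms.
scores = {f"term{i}": float(i) for i in range(200)}
selected = top_nodes(scores, q=99)
print(selected)
```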

22 Results
Ego-networks of the nodes with the highest betweenness centrality.
Based on the proposed strategy, we select the 82 most relevant terms.
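Once the relevant terms are selected, a reduced lexical table can be obtained by keeping only their columns. The sketch below assumes that the reduction is a plain column selection, which the slides suggest but do not spell out; the table, the term labels and the function name are hypothetical.

```python
def reduce_lexical_table(T, terms, selected_terms):
    """Keep only the columns of T whose term is in `selected_terms`."""
    keep = [j for j, term in enumerate(terms) if term in selected_terms]
    reduced_terms = [terms[j] for j in keep]
    T1 = [[row[j] for j in keep] for row in T]
    return T1, reduced_terms


# Hypothetical lexical table with n = 3 documents and p = 4 terms.
terms = ["income", "net", "retail", "sales"]
T = [
    [1, 2, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
]

# Hypothetical selection, e.g. the members of the chosen ego-networks.
selected_terms = {"net", "sales"}
T1, reduced_terms = reduce_lexical_table(T, terms, selected_terms)
print(reduced_terms)
for row in T1:
    print(row)
```

The number of documents is unchanged; only the number of columns drops from p to the number of selected terms.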

23 Results
Lexical Correspondence Analysis is a principal axes method usually applied to identify the principal components of the association structure in a lexical table (Lebart et al., 1998).
Figure: Correspondence Analysis, first factorial plane (332 terms)

24 Results
Figure: Correspondence Analysis, first factorial plane: the selected terms (82)

25 Remarks
Contribution:
- description of the different contexts in the corpus and identification of different topics
- dimensionality reduction by passing from elementary data (terms) to higher order data (contexts, topics)
- selection of the most relevant terms, viewed as new features, prior to further analyses
Future developments:
- to place the strategy within the theoretical framework of feature selection
- to use the strategy to deal with the problem of disambiguation, an important question in Natural Language Processing

26 References
1. Balbi S., Stawinoga A., Triunfo N. (2012). Text Mining tools for extracting knowledge from Firms Annual Reports. JADT 2012: 11es Journées internationales d'Analyse statistique des Données Textuelles (accepted for publication).
2. Batagelj V., Mrvar A. and Zaveršnik M. (2002). Network analysis of texts. Paper online: [...]
3. Batagelj V. and Mrvar A. Reuters terror news network analysis with Pajek. To appear in Jo[...]
4. Batagelj V., Mrvar A. and Zaveršnik M. (2002). Network analysis of dictionaries. Jezikovne tehnologije / Language Technologies, T. Erjavec, J. Gros eds., Ljubljana, p. [...]
5. Freeman L.C. (1979). Centrality in Social Networks: Conceptual Clarification. Social Networks, 1: [...]
6. Hanneman R. and Riddle M. (2005). Introduction to social network methods. Riverside, CA: University of California, Riverside.
7. Lebart L., Salem A. and Berry L. (1998). Exploring Textual Data. Kluwer Academic Publishers, The Netherlands.
8. Popping R. (2000). Computer-Assisted Text Analysis. Sage, London.
9. van Eck N. J., Waltman L. (2009). How to Normalize Co-Occurrence Data? An Analysis of Some Well-Known Similarity Measures. [...]

27 Thank You!
Acknowledgements: This work is financially supported by the European Project BLUE-ETS.

Textual Data Analysis tools for Word Sense Disambiguation

Textual Data Analysis tools for Word Sense Disambiguation Textual Data Analysis tools for Word Sense Disambiguation Simona Balbi 1, Agnieszka Stawinoga 2 Università FedericoII di Napoli 1 simona.balbi@unina.it, 2 agnieszka.stawinoga@unina.it Abstract The ambiguity

More information

HISTORICAL DEVELOPMENTS AND THEORETICAL APPROACHES IN SOCIOLOGY Vol. I - Social Network Analysis - Wouter de Nooy

HISTORICAL DEVELOPMENTS AND THEORETICAL APPROACHES IN SOCIOLOGY Vol. I - Social Network Analysis - Wouter de Nooy SOCIAL NETWORK ANALYSIS University of Amsterdam, Netherlands Keywords: Social networks, structuralism, cohesion, brokerage, stratification, network analysis, methods, graph theory, statistical models Contents

More information

Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R. 5. Link Analysis Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

More information

A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION

A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS Kyoungjin Park Alper Yilmaz Photogrammetric and Computer Vision Lab Ohio State University park.764@osu.edu yilmaz.15@osu.edu ABSTRACT Depending

More information

What is SNA? A sociogram showing ties

What is SNA? A sociogram showing ties Case Western Reserve University School of Medicine Social Network Analysis: Nuts & Bolts Papp KK 1, Zhang GQ 2 1 Director, Program Evaluation, CTSC, 2 Professor, Electrical Engineering and Computer Science,

More information

Week 3. Network Data; Introduction to Graph Theory and Sociometric Notation

Week 3. Network Data; Introduction to Graph Theory and Sociometric Notation Wasserman, Stanley, and Katherine Faust. 2009. Social Network Analysis: Methods and Applications, Structural Analysis in the Social Sciences. New York, NY: Cambridge University Press. Chapter III: Notation

More information

Social Media Mining. Graph Essentials

Social Media Mining. Graph Essentials Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures

More information

Course on Social Network Analysis Graphs and Networks

Course on Social Network Analysis Graphs and Networks Course on Social Network Analysis Graphs and Networks Vladimir Batagelj University of Ljubljana Slovenia V. Batagelj: Social Network Analysis / Graphs and Networks 1 Outline 1 Graph...............................

More information

Social Media Mining. Network Measures

Social Media Mining. Network Measures Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users

More information

An overview of Software Applications for Social Network Analysis

An overview of Software Applications for Social Network Analysis IRSR INTERNATIONAL REVIEW of SOCIAL RESEARCH Volume 3, Issue 3, October 2013, 71-77 International Review of Social Research An overview of Software Applications for Social Network Analysis Ioana-Alexandra

More information

A comparative study of social network analysis tools

A comparative study of social network analysis tools Membre de Membre de A comparative study of social network analysis tools David Combe, Christine Largeron, Előd Egyed-Zsigmond and Mathias Géry International Workshop on Web Intelligence and Virtual Enterprises

More information

Intelligent Analysis of User Interactions in a Collaborative Software Engineering Context

Intelligent Analysis of User Interactions in a Collaborative Software Engineering Context Intelligent Analysis of User Interactions in a Collaborative Software Engineering Context Alejandro Corbellini 1,2, Silvia Schiaffino 1,2, Daniela Godoy 1,2 1 ISISTAN Research Institute, UNICEN University,

More information

Part 2: Community Detection

Part 2: Community Detection Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

More information

Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words

Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words , pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan

More information

Equivalence Concepts for Social Networks

Equivalence Concepts for Social Networks Equivalence Concepts for Social Networks Tom A.B. Snijders University of Oxford March 26, 2009 c Tom A.B. Snijders (University of Oxford) Equivalences in networks March 26, 2009 1 / 40 Outline Structural

More information

Copyright 2008, Lada Adamic. School of Information University of Michigan

Copyright 2008, Lada Adamic. School of Information University of Michigan School of Information University of Michigan Unless otherwise noted, the content of this course material is licensed under a Creative Commons Attribution 3.0 License. http://creativecommons.org/licenses/by/3.0/

More information

Social Network Analysis: Visualization Tools

Social Network Analysis: Visualization Tools Social Network Analysis: Visualization Tools Dr. oec. Ines Mergel The Program on Networked Governance Kennedy School of Government Harvard University ines_mergel@harvard.edu Content Assembling network

More information

How to do a Business Network Analysis

How to do a Business Network Analysis How to do a Business Network Analysis by Graham Durant-Law Copyright HolisTech 2006-2007 Information and Knowledge Management Society 1 Format for the Evening Presentation (7:00 pm to 7:40 pm) Essential

More information

Strength of Weak Ties, Structural Holes, Closure and Small Worlds. Steve Borgatti MGT 780, Spring 2010 LINKS Center, U of Kentucky

Strength of Weak Ties, Structural Holes, Closure and Small Worlds. Steve Borgatti MGT 780, Spring 2010 LINKS Center, U of Kentucky Strength of Weak Ties, Structural Holes, Closure and Small Worlds Steve orgatti MGT 780, Spring 2010 LINKS Center, U of Kentucky Strength of Weak Ties theory Granovetter 1973 Overall idea Weak ties are

More information

2006-352: RICH NETWORKS: EVALUATING UNIVERSITY-HIGH SCHOOLS PARTNERSHIPS USING GRAPH ANALYSIS

2006-352: RICH NETWORKS: EVALUATING UNIVERSITY-HIGH SCHOOLS PARTNERSHIPS USING GRAPH ANALYSIS 2006-352: RICH NETWORKS: EVALUATING UNIVERSITY-HIGH SCHOOLS PARTNERSHIPS USING GRAPH ANALYSIS Donna Llewellyn, Georgia Institute of Technology Dr. Donna C. Llewellyn is the Director of the Center for the

More information

Visualization of textual data: unfolding the Kohonen maps.

Visualization of textual data: unfolding the Kohonen maps. Visualization of textual data: unfolding the Kohonen maps. CNRS - GET - ENST 46 rue Barrault, 75013, Paris, France (e-mail: ludovic.lebart@enst.fr) Ludovic Lebart Abstract. The Kohonen self organizing

More information

Workshop in Applied Analysis Software MY591. Introduction to Social Network Analysis with UCINET

Workshop in Applied Analysis Software MY591. Introduction to Social Network Analysis with UCINET Workshop in Applied Analysis Software MY591 Introduction to Social Network Analysis with UCINET Instructor: Prof. Ahmet K. Suerdem (Istanbul Bilgi University and London School of Economics) Contact: A.K.Suerdem@lse.ac.uk

More information

Network Metrics, Planar Graphs, and Software Tools. Based on materials by Lala Adamic, UMichigan

Network Metrics, Planar Graphs, and Software Tools. Based on materials by Lala Adamic, UMichigan Network Metrics, Planar Graphs, and Software Tools Based on materials by Lala Adamic, UMichigan Network Metrics: Bowtie Model of the Web n The Web is a directed graph: n webpages link to other webpages

More information

5. Binary objects labeling

5. Binary objects labeling Image Processing - Laboratory 5: Binary objects labeling 1 5. Binary objects labeling 5.1. Introduction In this laboratory an object labeling algorithm which allows you to label distinct objects from a

More information

How To Find Local Affinity Patterns In Big Data

How To Find Local Affinity Patterns In Big Data Detection of local affinity patterns in big data Andrea Marinoni, Paolo Gamba Department of Electronics, University of Pavia, Italy Abstract Mining information in Big Data requires to design a new class

More information

Social and Economic Networks: Lecture 1, Networks?

Social and Economic Networks: Lecture 1, Networks? Social and Economic Networks: Lecture 1, Networks? Alper Duman Izmir University Economics, February 26, 2013 Conventional economics assume that all agents are either completely connected or totally isolated.

More information

Visualizing Complexity in Networks: Seeing Both the Forest and the Trees

Visualizing Complexity in Networks: Seeing Both the Forest and the Trees CONNECTIONS 25(1): 37-47 2003 INSNA Visualizing Complexity in Networks: Seeing Both the Forest and the Trees Cathleen McGrath Loyola Marymount University, USA David Krackhardt The Heinz School, Carnegie

More information

A Comparison Framework of Similarity Metrics Used for Web Access Log Analysis

A Comparison Framework of Similarity Metrics Used for Web Access Log Analysis A Comparison Framework of Similarity Metrics Used for Web Access Log Analysis Yusuf Yaslan and Zehra Cataltepe Istanbul Technical University, Computer Engineering Department, Maslak 34469 Istanbul, Turkey

More information

Terminology Extraction from Log Files

Terminology Extraction from Log Files Terminology Extraction from Log Files Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet, Mathieu Roche To cite this version: Hassan Saneifar, Stéphane Bonniol, Anne Laurent, Pascal Poncelet,

More information

Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D.

Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D. Data Mining on Social Networks Dionysios Sotiropoulos Ph.D. 1 Contents What are Social Media? Mathematical Representation of Social Networks Fundamental Data Mining Concepts Data Mining Tasks on Digital

More information

Using Networks to Visualize and Understand Participation on SourceForge.net

Using Networks to Visualize and Understand Participation on SourceForge.net Nathan Oostendorp; Mailbox #200 SI708 Networks Theory and Application Final Project Report Using Networks to Visualize and Understand Participation on SourceForge.net SourceForge.net is an online repository

More information

Statistical Analysis of Complete Social Networks

Statistical Analysis of Complete Social Networks Statistical Analysis of Complete Social Networks Introduction to networks Christian Steglich c.e.g.steglich@rug.nl median geodesic distance between groups 1.8 1.2 0.6 transitivity 0.0 0.0 0.5 1.0 1.5 2.0

More information

V. Adamchik 1. Graph Theory. Victor Adamchik. Fall of 2005

V. Adamchik 1. Graph Theory. Victor Adamchik. Fall of 2005 V. Adamchik 1 Graph Theory Victor Adamchik Fall of 2005 Plan 1. Basic Vocabulary 2. Regular graph 3. Connectivity 4. Representing Graphs Introduction A.Aho and J.Ulman acknowledge that Fundamentally, computer

More information

Examining graduate committee faculty compositions- A social network analysis example. Kathryn Shirley and Kelly D. Bradley. University of Kentucky

Examining graduate committee faculty compositions- A social network analysis example. Kathryn Shirley and Kelly D. Bradley. University of Kentucky Examining graduate committee faculty compositions- A social network analysis example Kathryn Shirley and Kelly D. Bradley University of Kentucky Graduate committee social network analysis 1 Abstract Social

More information

Combining statistical data analysis techniques to. extract topical keyword classes from corpora

Combining statistical data analysis techniques to. extract topical keyword classes from corpora Combining statistical data analysis techniques to extract topical keyword classes from corpora Mathias Rossignol Pascale Sébillot Irisa, Campus de Beaulieu, 35042 Rennes Cedex, France (mrossign sebillot)@irisa.fr

More information

Mining Social-Network Graphs

Mining Social-Network Graphs 342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is

More information

Parsing Software Requirements with an Ontology-based Semantic Role Labeler

Parsing Software Requirements with an Ontology-based Semantic Role Labeler Parsing Software Requirements with an Ontology-based Semantic Role Labeler Michael Roth University of Edinburgh mroth@inf.ed.ac.uk Ewan Klein University of Edinburgh ewan@inf.ed.ac.uk Abstract Software

More information

Network Analysis. Antonio M. Chiesi Department of Social and Political Studies, Università degli Studi di Milano Antonio.chiesi@unimi.

Network Analysis. Antonio M. Chiesi Department of Social and Political Studies, Università degli Studi di Milano Antonio.chiesi@unimi. Network Analysis Antonio M. Chiesi Department of Social and Political Studies, Università degli Studi di Milano Antonio.chiesi@unimi.it Essential references: Chiesi, A. M., Network Analysis, general, in

More information

W. Heath Rushing Adsurgo LLC. Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare. Session H-1 JTCC: October 23, 2015

W. Heath Rushing Adsurgo LLC. Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare. Session H-1 JTCC: October 23, 2015 W. Heath Rushing Adsurgo LLC Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare Session H-1 JTCC: October 23, 2015 Outline Demonstration: Recent article on cnn.com Introduction

More information

Big Data Text Mining and Visualization. Anton Heijs

Big Data Text Mining and Visualization. Anton Heijs Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark

More information

Open Source Software Developer and Project Networks

Open Source Software Developer and Project Networks Open Source Software Developer and Project Networks Matthew Van Antwerp and Greg Madey University of Notre Dame {mvanantw,gmadey}@cse.nd.edu Abstract. This paper outlines complex network concepts and how

More information

Visual Mining of Multi-Modal Social Networks at Different Abstraction Levels

Visual Mining of Multi-Modal Social Networks at Different Abstraction Levels Visual Mining of Multi-Modal Social Networks at Different Abstraction Levels Lisa Singh Georgetown University Washington, DC singh@cs.georgetown.edu Mitchell Beard Georgetown University Washington, DC

More information

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti

More information

Terminology Extraction from Log Files

Terminology Extraction from Log Files Terminology Extraction from Log Files Hassan Saneifar 1,2, Stéphane Bonniol 2, Anne Laurent 1, Pascal Poncelet 1, and Mathieu Roche 1 1 LIRMM - Université Montpellier 2 - CNRS 161 rue Ada, 34392 Montpellier

More information

Complex Networks Analysis: Clustering Methods

Complex Networks Analysis: Clustering Methods Complex Networks Analysis: Clustering Methods Nikolai Nefedov Spring 2013 ISI ETH Zurich nefedov@isi.ee.ethz.ch 1 Outline Purpose to give an overview of modern graph-clustering methods and their applications

More information

1. Introduction. Suzanna Lamria Siregar 1, D. L. Crispina Pardede 2 and Rossi Septy Wahyuni 3

1. Introduction. Suzanna Lamria Siregar 1, D. L. Crispina Pardede 2 and Rossi Septy Wahyuni 3 International Proceedings of Management and Economy IPEDR vol. 84 (2015) (2015) IACSIT Press, Singapore Structural Positions and Financial Performances of Rural Banks in Central Java Network (CJ-Net):

More information

ProteinQuest user guide

ProteinQuest user guide ProteinQuest user guide 1. Introduction... 3 1.1 With ProteinQuest you can... 3 1.2 ProteinQuest basic version 4 1.3 ProteinQuest extended version... 5 2. ProteinQuest dictionaries... 6 3. Directions for

More information

Stock Market Prediction Using Data Mining

Stock Market Prediction Using Data Mining Stock Market Prediction Using Data Mining 1 Ruchi Desai, 2 Prof.Snehal Gandhi 1 M.E., 2 M.Tech. 1 Computer Department 1 Sarvajanik College of Engineering and Technology, Surat, Gujarat, India Abstract

More information

Randomized flow model and centrality measure for electrical power transmission network analysis

Randomized flow model and centrality measure for electrical power transmission network analysis Randomized flow model and centrality measure for electrical power transmission network analysis Enrico Zio, Roberta Piccinelli To cite this version: Enrico Zio, Roberta Piccinelli. Randomized flow model

More information

THE ROLE OF SOCIOGRAMS IN SOCIAL NETWORK ANALYSIS. Maryann Durland Ph.D. EERS Conference 2012 Monday April 20, 10:30-12:00

THE ROLE OF SOCIOGRAMS IN SOCIAL NETWORK ANALYSIS. Maryann Durland Ph.D. EERS Conference 2012 Monday April 20, 10:30-12:00 THE ROLE OF SOCIOGRAMS IN SOCIAL NETWORK ANALYSIS Maryann Durland Ph.D. EERS Conference 2012 Monday April 20, 10:30-12:00 FORMAT OF PRESENTATION Part I SNA overview 10 minutes Part II Sociograms Example

More information

Social Network Analysis

Social Network Analysis A Brief Introduction to Social Network Analysis Jennifer Roberts Outline Description of Social Network Analysis Sociocentric vs. Egocentric networks Estimating a social network TRANSIMS A case study What

More information

Chapter 8. Final Results on Dutch Senseval-2 Test Data

Chapter 8. Final Results on Dutch Senseval-2 Test Data Chapter 8 Final Results on Dutch Senseval-2 Test Data The general idea of testing is to assess how well a given model works and that can only be done properly on data that has not been seen before. Supervised

More information

Applying Social Network Analysis to the Information in CVS Repositories

Applying Social Network Analysis to the Information in CVS Repositories Applying Social Network Analysis to the Information in CVS Repositories Luis Lopez-Fernandez, Gregorio Robles, Jesus M. Gonzalez-Barahona GSyC, Universidad Rey Juan Carlos {llopez,grex,jgb}@gsyc.escet.urjc.es

More information

A Network Approach to Spatial Data Infrastructure Applying Social Network Analysis in SDI research

A Network Approach to Spatial Data Infrastructure Applying Social Network Analysis in SDI research A Network Approach to Spatial Data Infrastructure Applying Social Network Analysis in SDI research Glenn Vancauwenberghe, Geert Bouckaert and Joep Crompvoets K.U. Leuven, Public Management Institute, glenn.vancauwenberghe@soc.kuleuven.be

More information

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

More information

SECTIONS 1.5-1.6 NOTES ON GRAPH THEORY NOTATION AND ITS USE IN THE STUDY OF SPARSE SYMMETRIC MATRICES
