Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer
Content
- What is Community Detection?
- Motivation
- Defining a community
- Methods to find communities
- Overlapping communities
- Clique percolation method
- Finding a community with query nodes
- Conclusion
What is Community Detection?
- Different from traditional clustering
- Algorithms exploit properties of the graph
- Graphs with a natural origin have a structure that is not random
- We try to find these structures by analyzing the graph
- A perfect solution has yet to be found
Motivation
- Communities can represent parts of a larger system (like organs in the human body)
- Communities can be considered a summary of the graph
- Communities make it easy to visualize and understand complex systems
- Communities on the web might represent pages on related topics
- Communities can reveal properties of the network without releasing private information about individuals
Defining a Community
- There is no exact definition of a community in a graph
- It depends on the application
- A general definition:
  - Separation between nodes in different communities
  - Cohesion between nodes in a community
- The differences between algorithms come down to the precise definition
Basics
- For a graph G = (V, E) and a subgraph C ⊆ G with |G| = |V| = n and |C| = n_c: φ_int(C), the density of edges inside C, should be higher than the density of the whole graph, and φ_ext(C), the density of edges between C and the rest of G, should be much lower
- Local definitions see a community as an autonomous entity within a larger system
- Global definitions see the communities as essential parts of a larger system
- Vertex similarity: compare individual nodes and group them based on a similarity measure
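The two density measures can be illustrated with a small sketch. The slide does not spell out the formulas, so this assumes a common definition: internal edges divided by n_c(n_c - 1)/2 for φ_int, and boundary edges divided by n_c(n - n_c) for φ_ext. The function name `densities` and the example graph are hypothetical.

```python
def densities(edges, nodes, community):
    """Return (phi_int, phi_ext, phi_graph) for an undirected simple graph.

    Assumed definitions (not taken from the slide):
      phi_int   = internal edges / (n_c * (n_c - 1) / 2)
      phi_ext   = boundary edges / (n_c * (n - n_c))
      phi_graph = all edges / (n * (n - 1) / 2)
    Requires 2 <= n_c < n.
    """
    community = set(community)
    n, n_c = len(nodes), len(community)
    # Edges with both endpoints inside the community.
    internal = sum(1 for u, v in edges if u in community and v in community)
    # Edges with exactly one endpoint inside the community.
    boundary = sum(1 for u, v in edges if (u in community) != (v in community))
    phi_int = internal / (n_c * (n_c - 1) / 2)
    phi_ext = boundary / (n_c * (n - n_c))
    phi_graph = len(edges) / (n * (n - 1) / 2)
    return phi_int, phi_ext, phi_graph


# Two triangles joined by a single edge (3, 4).
example_nodes = [1, 2, 3, 4, 5, 6]
example_edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
phi_int, phi_ext, phi_graph = densities(example_edges, example_nodes, {1, 2, 3})
```

For the triangle {1, 2, 3}, φ_int exceeds the density of the whole graph while φ_ext stays well below it, which is the pattern the slide asks for.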
Methods
- Finding overlapping communities
- Clique percolation method (CPM)
- Finding communities with query nodes
Clique Percolation Method
- CPM is based on the idea that communities are likely to consist of cliques
- Assumption: every node in a community is connected to nearly every other node in that community
- A community is built up from a chain of adjacent k-cliques
- Two k-cliques are adjacent if they share k-1 nodes
- The largest possible chain is defined as a community
- This is a local definition
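The definition above can be sketched directly in code. This is a brute-force illustration for tiny graphs only, not how practical tools work: it enumerates every k-clique and merges adjacent ones with a union-find structure. The helper names (`build_adj`, `k_cliques`, `clique_percolation`) and the example graph are hypothetical.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_cliques(adj, k):
    """Enumerate all k-cliques by brute force (fine for tiny graphs only)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def clique_percolation(adj, k):
    """Return communities as unions of chains of adjacent k-cliques."""
    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        # Two k-cliques are adjacent if they share k-1 nodes.
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=sorted)


# Two pairs of edge-sharing triangles that meet only in node 4.
cpm_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
             (4, 5), (4, 6), (5, 6), (5, 7), (6, 7)]
cpm_result = clique_percolation(build_adj(cpm_edges), 3)
```

With k = 3 the example yields two communities that overlap in node 4, since the triangles {2, 3, 4} and {4, 5, 6} share only one node and are therefore not adjacent.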
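Implementation of CPM
- The number of possible k-cliques in a graph is quite high
- Implementations therefore search for maximal cliques instead (still a computationally hard problem; finding a maximum clique is NP-hard)
- We build a clique-clique overlap matrix O, whose entries count the nodes shared by two cliques
- All entries smaller than k-1 are removed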
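A minimal sketch of this matrix-based variant follows. It uses a plain Bron-Kerbosch enumeration for the maximal cliques, which is an assumption (the slide does not name an enumeration method). It also discards maximal cliques with fewer than k nodes, a step the slide does not mention but which the thresholding needs in order to match the k-clique definition. All function names are hypothetical.

```python
def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting; returns all maximal cliques."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def cpm_overlap_matrix(adj, k):
    """Communities from the thresholded clique-clique overlap matrix."""
    # Assumption: maximal cliques smaller than k cannot contain a k-clique.
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    m = len(cliques)
    # overlap[i][j] = number of nodes shared by cliques i and j.
    overlap = [[len(cliques[i] & cliques[j]) for j in range(m)] for i in range(m)]
    # Remove off-diagonal entries smaller than k-1; the rest become links.
    linked = [[i != j and overlap[i][j] >= k - 1 for j in range(m)] for i in range(m)]

    seen, communities = set(), []
    for start in range(m):
        if start in seen:
            continue
        seen.add(start)
        stack, nodes = [start], set()
        while stack:  # depth-first search over linked cliques
            i = stack.pop()
            nodes |= cliques[i]
            for j in range(m):
                if linked[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        communities.append(nodes)
    return sorted(communities, key=sorted)


# A 4-clique {1, 2, 3, 4} and a triangle {4, 5, 6} sharing node 4.
matrix_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4),
                (4, 5), (4, 6), (5, 6)]
matrix_adj = build_adj(matrix_edges)
```

Working on maximal cliques keeps the matrix small: the 4-clique in the example is a single row, whereas the brute-force approach would have to handle its four 3-cliques separately.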
Figure: results of processing the example graph with the CFinder software, for parameter k = 3 and k = 4
Drawbacks
- Even though the underlying problem is NP-hard, this algorithm is reasonably fast for large sparse graphs
- Some cases lead to useless results:
  - It looks for cliques, not dense subgraphs
  - It requires a large number of cliques, but not too many
Finding a community with query nodes
- The goal is to find a subgraph H that contains a given set Q of query nodes and is densely connected
- Density is measured by a function f, which is maximized over all possible choices of H
- In this case we choose the minimum degree for f
- Additionally, we add a distance constraint d
Without size restriction - Greedy algorithm
- Choose f(H) = minimum degree of a node in H
- Set G_0 = G, then repeat:
  - Obtain G_{t+1} from G_t by removing a node that violates the distance constraint or has the minimum degree
  - Terminate if one of the query nodes has the minimum degree or the query nodes are no longer connected
- Among the intermediate graphs G_t, choose the component containing the query nodes for which the minimum degree f(H) is maximized
- This can be implemented in O(n + m), where m is the number of edges
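The peeling procedure can be sketched as below. This is a simplified illustration under stated assumptions: the distance constraint is omitted, only the component containing the query nodes is kept at each step, and degrees are recomputed naively, so it does not reach the O(n + m) bound. The names `component_of` and `greedy_community` are hypothetical, and node labels are assumed to be orderable so that ties are broken deterministically.

```python
from collections import deque


def component_of(adj, alive, start):
    """Nodes reachable from start when only nodes in `alive` may be used."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def greedy_community(adj, query):
    """Return (nodes, min_degree) of the best subgraph found by peeling."""
    query = set(query)
    alive = set(adj)
    best, best_score = set(), -1
    while True:
        comp = component_of(adj, alive, next(iter(query)))
        if not query <= comp:
            break  # the query nodes are no longer connected
        alive = comp  # nodes outside this component cannot help
        deg = {u: len(adj[u] & alive) for u in alive}
        score = min(deg.values())  # f(H) = minimum degree
        if score > best_score:
            best, best_score = set(alive), score
        if any(deg[q] == score for q in query):
            break  # a query node has the minimum degree
        victim = min(alive, key=lambda u: (deg[u], u))
        alive = alive - {victim}
    return best, best_score


# A 4-clique {1, 2, 3, 4} with a tail 4 - 5 - 6 attached.
greedy_adj = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4},
    4: {1, 2, 3, 5}, 5: {4, 6}, 6: {5},
}
greedy_result = greedy_community(greedy_adj, {1, 2})
```

On the example, the tail nodes 6 and 5 are peeled off first, after which the query nodes themselves have the minimum degree and the 4-clique is returned with minimum degree 3.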
Figure: the greedy algorithm, without size constraint, applied to the example graph with Q = {1, 2, 3}
Communities with size restriction
- A size constraint k makes the problem NP-hard (this can be shown via a reduction from the Steiner tree problem)
- But it can be assumed that the size of the result set is correlated with the distance constraint
- The paper proposes two heuristics:
  - GreedyDist repeatedly executes Greedy and decreases d until the resulting graph has at most k nodes
  - GreedyFast restricts the graph to the k nodes closest to the query nodes, then invokes Greedy
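The preprocessing step of GreedyFast might look roughly like the sketch below. The slide does not say how closeness to the query nodes is measured, so this assumes the sum of hop distances to all query nodes; the paper may use a different measure. The function names are hypothetical, and the query nodes are always kept, so the result can be slightly larger than k.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def restrict_to_k_closest(adj, query, k):
    """Induced subgraph on the k nodes closest to the query nodes."""
    query = set(query)
    per_query = [bfs_distances(adj, q) for q in query]
    # Assumed closeness: total hop distance to all query nodes.
    # Nodes unreachable from some query node are skipped.
    total = {u: sum(d[u] for d in per_query)
             for u in adj if all(u in d for d in per_query)}
    ranked = sorted(total, key=lambda u: (total[u], u))
    keep = set(ranked[:k]) | query  # query nodes are always kept
    return {u: adj[u] & keep for u in keep}


# A 4-clique {1, 2, 3, 4} with a tail 4 - 5 - 6 attached.
fast_adj = {
    1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4},
    4: {1, 2, 3, 5}, 5: {4, 6}, 6: {5},
}
fast_sub = restrict_to_k_closest(fast_adj, {1, 2}, 4)
```

The restricted adjacency map would then be passed to the Greedy procedure in place of the full graph.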
Evaluation with the DBLP dataset
- The goal was to find a network of scientific collaboration around Christos Papadimitriou
Conclusion
- A really broad topic with lots of applications
- Each algorithm is built with different problems in mind
- Algorithms are difficult to compare, since there is no standard way of testing