Vers une Analyse Conceptuelle des Réseaux Sociaux


1 Vers une Analyse Conceptuelle des Réseaux Sociaux (Towards a Conceptual Analysis of Social Networks). Erick Stattner, Martine Collard. Laboratory of Mathematics and Computer Science (LAMIA), University of the French West Indies and Guiana, France. MARAMI 2012. Erick Stattner, Martine Collard MARAMI / 27

2 Motivation. Issues: The new science of networks focuses on interactions between entities and investigates new methods and techniques. Knowledge is extracted from data on real-world phenomena studied through interactions among individuals. New data mining techniques: link mining (node classification, link-based clustering, link prediction, frequent patterns, ...) and attributed graph mining (cohesive subgraphs, summarization, ...).

3 Data Mining Task. Context: search for frequent patterns to answer questions such as: Which groups of nodes are the most connected? Which node properties are most frequently found in connection? Contribution: search for frequent links in social networks between groups of nodes sharing internal common properties, by combining network structure and node attribute values. [Figure: example network whose nodes are labeled b or r, highlighting the frequent link (b, r).]

4 Outline. Section 1: Frequent pattern discovery; Node clustering.

5 Pattern Mining in Social Networks: Current Methods. Main methods: link prediction, frequent pattern discovery, node clustering, formal concept analysis.

6 Pattern Mining in Social Networks: Frequent pattern discovery. A pattern is a subgraph. The task is to search for subgraphs occurring frequently in a large network or in a set of networks. [Figure: example graph with nodes numbered 1 to 11 and labeled X, Y or Z, together with the frequent subgraph patterns found in it.]

7 Pattern Mining in Social Networks: Node clustering. Node clustering is based on links and detects subgraphs or "communities". Objective: identify groups of nodes that are densely connected in the network by maximizing intra-cluster links while minimizing inter-cluster links.

8 Pattern Mining in Social Networks: Hybrid node clustering. Hybrid node clustering is based on links and on node attribute values. Objective: identify groups of nodes that share common contacts.

9 Formal concept analysis. Formal concept of links: based on links and on nodes. Objective: identify groups of nodes that share common contacts.

10 Pattern Mining in Social Networks: Observation. Current methods mainly use the network structure and often ignore node properties. The concept of frequent link combines information from both links and node attribute values, and represents a regularity involving two groups of nodes that share internal common characteristics.

11 Outline. Section 2: Knowledge extracted; Analogy with lattices of itemsets.

12 Conceptual link. $G = (V, E)$ is a (directed) network. $V$ is defined as a relation $R(A_1, \dots, A_p)$, where $A_1, \dots, A_p$ are the node attributes. Each node $v \in V$ is defined by the itemset $A_1 = a_1$ and ... and $A_p = a_p$, written $a_1 \dots a_p$. For an itemset $m$, $V_m$ is the set of nodes satisfying $m$. If $sm$ is a sub-itemset of $m$, then $V_m \subseteq V_{sm}$; for example, $V_{abc} \subseteq V_{ab}$.
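
The definition of $V_m$ can be illustrated with a minimal Python sketch. The representation (a node's attributes stored as a frozenset of `(attribute, value)` pairs), the function name `nodes_satisfying` and the toy data are assumptions made for illustration; they are not taken from the talk.

```python
from typing import Dict, FrozenSet, Hashable, Set, Tuple

Item = Tuple[str, str]        # (attribute name, value), e.g. ("gender", "M")
Itemset = FrozenSet[Item]

def nodes_satisfying(attrs: Dict[Hashable, Itemset], m: Itemset) -> Set[Hashable]:
    """Return V_m: the nodes whose attribute items include every item of m."""
    return {v for v, items in attrs.items() if m <= items}

# Hypothetical toy data (not from the talk).
attrs = {
    1: frozenset({("gender", "M"), ("job", "yes")}),
    2: frozenset({("gender", "F"), ("job", "yes")}),
    3: frozenset({("gender", "M"), ("job", "no")}),
}
sm = frozenset({("gender", "M")})                   # sub-itemset of m
m = frozenset({("gender", "M"), ("job", "yes")})

v_sm = nodes_satisfying(attrs, sm)
v_m = nodes_satisfying(attrs, m)
# Adding items to an itemset can only shrink the set of matching nodes.
inclusion_holds = v_m <= v_sm
```

Here `v_m` is contained in `v_sm`, which is the inclusion $V_m \subseteq V_{sm}$ stated on the slide.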

13 Conceptual link. $G = (V, E)$ is a network and $I_V$ is the set of all possible itemsets on $G$. Left-hand side link set: $LE_m = \{e \in E \; ; \; e = (a, b), \; a \in V_m\}$. Right-hand side link set: $RE_m = \{e \in E \; ; \; e = (a, b), \; b \in V_m\}$. Conceptual link: $(m_1, m_2) = LE_{m_1} \cap RE_{m_2}$ (1) $= \{e \in E \; ; \; e = (a, b), \; a \in V_{m_1} \text{ and } b \in V_{m_2}\}$ (2).
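
As a rough illustration of equations (1) and (2), the sketch below computes the two link sets and their intersection. The function names, the frozenset-of-pairs representation and the toy network are hypothetical choices for this example only.

```python
from typing import Dict, FrozenSet, Hashable, Set, Tuple

Item = Tuple[str, str]
Itemset = FrozenSet[Item]
Edge = Tuple[Hashable, Hashable]

def left_links(edges: Set[Edge], attrs: Dict[Hashable, Itemset], m: Itemset) -> Set[Edge]:
    """LE_m: links whose source node satisfies m."""
    return {(a, b) for a, b in edges if m <= attrs[a]}

def right_links(edges: Set[Edge], attrs: Dict[Hashable, Itemset], m: Itemset) -> Set[Edge]:
    """RE_m: links whose target node satisfies m."""
    return {(a, b) for a, b in edges if m <= attrs[b]}

def conceptual_link(edges, attrs, m1: Itemset, m2: Itemset) -> Set[Edge]:
    """(m1, m2) = LE_m1 intersected with RE_m2, as in equation (1)."""
    return left_links(edges, attrs, m1) & right_links(edges, attrs, m2)

# Hypothetical toy network (not from the talk).
attrs = {
    1: frozenset({("gender", "M"), ("job", "yes")}),
    2: frozenset({("gender", "F"), ("job", "yes")}),
    3: frozenset({("gender", "M"), ("job", "no")}),
}
edges = {(1, 2), (1, 3), (3, 2), (2, 1)}
m1 = frozenset({("gender", "M")})
m2 = frozenset({("job", "yes")})

link_edges = conceptual_link(edges, attrs, m1, m2)
# Equation (2): the same set, written directly as a filter on both endpoints.
direct = {(a, b) for a, b in edges if m1 <= attrs[a] and m2 <= attrs[b]}
```

Both `link_edges` and `direct` contain the links from men to people who have a job in the toy data, so the two forms of the definition agree.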

14 Frequent conceptual link. Support of $l = (m_1, m_2)$: $supp[(m_1, m_2)] = \frac{|(m_1, m_2)|}{|E|}$. $\beta$: link support threshold. $(m_1, m_2)$ is a frequent conceptual link iff $supp[(m_1, m_2)] > \beta$.
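
The support can be sketched as the fraction of links whose two endpoints satisfy $m_1$ and $m_2$ respectively. The names `support` and `is_frequent` and the toy data below are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Hashable, List, Tuple

Item = Tuple[str, str]
Itemset = FrozenSet[Item]
Edge = Tuple[Hashable, Hashable]

def support(edges: List[Edge], attrs: Dict[Hashable, Itemset],
            m1: Itemset, m2: Itemset) -> float:
    """Fraction of links (a, b) with a in V_m1 and b in V_m2."""
    hits = sum(1 for a, b in edges if m1 <= attrs[a] and m2 <= attrs[b])
    return hits / len(edges)

def is_frequent(edges, attrs, m1, m2, beta: float) -> bool:
    """A conceptual link is frequent iff its support is strictly above beta."""
    return support(edges, attrs, m1, m2) > beta

# Hypothetical toy network (not from the talk).
attrs = {
    1: frozenset({("gender", "M"), ("job", "yes")}),
    2: frozenset({("gender", "F"), ("job", "yes")}),
    3: frozenset({("gender", "M"), ("job", "no")}),
}
edges = [(1, 2), (1, 3), (3, 2), (2, 1)]
m1 = frozenset({("gender", "M")})
m2 = frozenset({("job", "yes")})

s = support(edges, attrs, m1, m2)   # 2 of the 4 links match
```

Because the comparison is strict, a link whose support equals the threshold exactly is not counted as frequent.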

15 Frequent Links: Knowledge provided. Frequent links provide knowledge about the groups of nodes that are the most connected in the social network, i.e. knowledge about the properties that are most often connected. Example, on a bipartite customer-product network: $m_1$: Gender = M and Interest = computer science; $m_2$: Category = Science Fiction and Product = book; $supp[(m_1, m_2)] = 14\%$.

16 Frequent conceptual link: Downward-closure property. Sub and super conceptual links: if $sm_1$ and $sm_2$ are sub-itemsets of $m_1$ and $m_2$, then $(sm_1, sm_2)$ is a sub conceptual link of $(m_1, m_2)$, written $(sm_1, sm_2) \subseteq (m_1, m_2)$, and $(m_1, m_2)$ is a super conceptual link of $(sm_1, sm_2)$. Downward-closure property: if $l$ is frequent, then all its sub-links are also frequent; if $l$ is infrequent, then all its super-links are also infrequent. This holds because a sub-link imposes fewer constraints on the endpoints, so it contains at least as many links.
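
The downward-closure property can be checked empirically on a small example: every sub-link should have a support at least as large as the link itself. This is only a sketch on hypothetical toy data; `sub_itemsets` and `sub_links` are illustrative helper names.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Iterator, List, Tuple

Item = Tuple[str, str]
Itemset = FrozenSet[Item]
Link = Tuple[Itemset, Itemset]
Edge = Tuple[Hashable, Hashable]

def support(edges: List[Edge], attrs: Dict[Hashable, Itemset], link: Link) -> float:
    """Fraction of links whose source satisfies m1 and whose target satisfies m2."""
    m1, m2 = link
    hits = sum(1 for a, b in edges if m1 <= attrs[a] and m2 <= attrs[b])
    return hits / len(edges)

def sub_itemsets(m: Itemset) -> Iterator[Itemset]:
    """Yield every subset of m, including the empty itemset and m itself."""
    items = sorted(m)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def sub_links(link: Link) -> Iterator[Link]:
    """Yield every (sm1, sm2) with sm1 a subset of m1 and sm2 a subset of m2."""
    m1, m2 = link
    for sm1 in sub_itemsets(m1):
        for sm2 in sub_itemsets(m2):
            yield (sm1, sm2)

# Hypothetical toy network (not from the talk).
attrs = {
    1: frozenset({("gender", "M"), ("job", "yes")}),
    2: frozenset({("gender", "F"), ("job", "yes")}),
    3: frozenset({("gender", "M"), ("job", "no")}),
}
edges = [(1, 2), (1, 3), (3, 2), (2, 1)]
link = (frozenset({("gender", "M"), ("job", "yes")}), frozenset({("job", "yes")}))

base = support(edges, attrs, link)
# Removing constraints can only keep or increase the number of matching links.
closure_holds = all(support(edges, attrs, sl) >= base for sl in sub_links(link))
```

The check enumerates all sub-links exhaustively, which is only practical for very small itemsets; it is meant to illustrate the property, not to mine links.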

17 Maximal frequent conceptual link. $l = (m_1, m_2)$ is a maximal frequent conceptual link iff there is no frequent conceptual link $l'$ such that $l \subset l'$.
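
Given a set of frequent conceptual links, the maximal ones can be picked out by discarding every link that has a strict super-link in the set. The sketch below assumes links are pairs of frozensets and uses a hypothetical, hand-written set of frequent links.

```python
from typing import FrozenSet, Set, Tuple

Item = Tuple[str, str]
Itemset = FrozenSet[Item]
Link = Tuple[Itemset, Itemset]

def is_strict_sub_link(l: Link, other: Link) -> bool:
    """True if l is a sub-link of other and differs from it."""
    return l != other and l[0] <= other[0] and l[1] <= other[1]

def maximal_links(frequent: Set[Link]) -> Set[Link]:
    """Keep only the frequent links that have no frequent strict super-link."""
    return {l for l in frequent
            if not any(is_strict_sub_link(l, other) for other in frequent)}

# Hypothetical set of frequent links (not from the talk).
empty: Itemset = frozenset()
man: Itemset = frozenset({("gender", "M")})
job: Itemset = frozenset({("job", "yes")})
frequent = {(empty, empty), (man, empty), (empty, job), (man, job)}

maximal = maximal_links(frequent)
```

In this example only `(man, job)` is maximal, since the three other links are all sub-links of it. The pairwise comparison is quadratic in the number of frequent links, which is acceptable for a sketch but may need refinement at scale.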

18 Conceptual view: Lattice. Extraction of maximal frequent conceptual links on $G$. Concept lattice and search space reduction. [Figure: two copies, (a) and (b), of the lattice of conceptual links over the items a and b, ordered from (ab, ab) at the top, through (ab, a), (ab, b), (a, ab), (b, ab) and (a, a), (a, b), (b, a), (b, b), down to (Φ, Φ).]
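
One way to exploit the lattice for search space reduction is a level-wise traversal that starts from (Φ, Φ) and only extends links that are frequent, relying on the downward-closure property. The slides do not give the authors' algorithm, so the following is a generic Apriori-style sketch under that assumption, with hypothetical names and toy data.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple

Item = Tuple[str, str]
Itemset = FrozenSet[Item]
Link = Tuple[Itemset, Itemset]
Edge = Tuple[Hashable, Hashable]

def support(edges: List[Edge], attrs: Dict[Hashable, Itemset], link: Link) -> float:
    m1, m2 = link
    hits = sum(1 for a, b in edges if m1 <= attrs[a] and m2 <= attrs[b])
    return hits / len(edges)

def frequent_links(edges: List[Edge], attrs: Dict[Hashable, Itemset],
                   beta: float) -> Set[Link]:
    """Level-wise search: extend a link by one item only if it is frequent."""
    items = sorted({it for node_items in attrs.values() for it in node_items})
    root: Link = (frozenset(), frozenset())
    seen: Set[Link] = {root}
    frontier: Set[Link] = {root} if support(edges, attrs, root) > beta else set()
    frequent: Set[Link] = set()
    while frontier:
        frequent |= frontier
        next_level: Set[Link] = set()
        for m1, m2 in frontier:
            for it in items:
                # Add the item on the left side, then on the right side.
                for cand in ((m1 | {it}, m2), (m1, m2 | {it})):
                    if cand in seen:
                        continue
                    seen.add(cand)
                    # Infrequent candidates are never extended (pruning).
                    if support(edges, attrs, cand) > beta:
                        next_level.add(cand)
        frontier = next_level
    return frequent

# Hypothetical toy network (not from the talk).
attrs = {
    1: frozenset({("gender", "M"), ("job", "yes")}),
    2: frozenset({("gender", "F"), ("job", "yes")}),
    3: frozenset({("gender", "M"), ("job", "no")}),
}
edges = [(1, 2), (1, 3), (3, 2), (2, 1)]
found = frequent_links(edges, attrs, beta=0.5)
```

Every frequent link is reachable from (Φ, Φ) by adding one item at a time through frequent sub-links, so skipping the extensions of infrequent links does not lose any result.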

19 Conceptual view. $\beta$: link support threshold. $FL_{Vmax}$: set of all maximal frequent conceptual links on $G$. $FL_{Vmax}$ is the conceptual view of the social network $G$. [Figure: from the social network and the support threshold $\beta$, the frequent conceptual links are extracted and assembled into the conceptual view; the example view shows links with supports of 31%, 22% and 13%.]

20 Outline. Section 3: Testbed; Extracted patterns.

21 Testbed. Sub-network of the proximity contact network of the City of Portland, simulated with Episim [Eubank, 2005]. Each node has: an age class (i.e. age/10), a gender (1 = male, 2 = female), a worker status, a type of relationship with the householder, a contact class (i.e. degree/2) and a sociability. Network characteristics: origin: Portland; type: undirected; #nodes: 3000; #links: 4683; #comp: 1; maximum degree: 15. The table also lists the density, the average degree and the average clustering coefficient (cc), whose values are not legible in the transcript. [Figure: degree distribution.]

22 Extracted patterns. Some examples of extracted patterns (* marks an unconstrained attribute; the support values are not legible in the transcript).
β = 0.1, maximal cfl: ((4;*;1;*,*,*),(*;*;2;*,*,*)); ((2;*;*;2,*,*),(*;*;2;2,*,*)); ((*;1;1;*,*,*),(*;*;1;*,*,*)). Reading of the first pattern: a share of the links of the network connect 40-year-old people who have a job to people who do not have a job.
β = 0.2, maximal cfl: ((*;2;*;*,*,*),(*;*;1;*,*,*)); ((*;1;*;*,*,*),(*;*;2;*,*,*)); ((*;2;*;*,*,*),(*;1;*;*,*,*)). Reading given on the slide: a share of the links of the network connect men to people who have a job.

23 Conceptual view. Summarization. [Figure.]

24 Results. Network measures versus support threshold: number of nodes and links (c), density and clustering coefficient (d), and degree distribution (e). [Figure: panel (c) plots # nodes and # links against the support threshold; panel (d) plots the clustering coefficient and the density against the support threshold; panel (e) plots the degree distribution P(k). Support thresholds range from 0.11 to 0.2.]

25 Outline

26 Conclusion. A new approach for extracting frequent patterns from social data, which combines information from both attribute values and links. Two interests: it extracts novel patterns (the groups of nodes that are most connected) and it provides a kind of summarized representation of the network. Perspectives: optimization, scalability.

27 Thanks for your attention!

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will

More information

Part 2: Community Detection

Part 2: Community Detection Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

More information

Cluster analysis and Association analysis for the same data

Cluster analysis and Association analysis for the same data Cluster analysis and Association analysis for the same data Huaiguo Fu Telecommunications Software & Systems Group Waterford Institute of Technology Waterford, Ireland hfu@tssg.org Abstract: Both cluster

More information

Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

More information

Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R. 5. Link Analysis Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

More information

Graph Mining and Social Network Analysis

Graph Mining and Social Network Analysis Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

More information

Clustering UE 141 Spring 2013

Clustering UE 141 Spring 2013 Clustering UE 141 Spring 013 Jing Gao SUNY Buffalo 1 Definition of Clustering Finding groups of obects such that the obects in a group will be similar (or related) to one another and different from (or

More information

CIS 700: algorithms for Big Data

CIS 700: algorithms for Big Data CIS 700: algorithms for Big Data Lecture 6: Graph Sketching Slides at http://grigory.us/big-data-class.html Grigory Yaroslavtsev http://grigory.us Sketching Graphs? We know how to sketch vectors: v Mv

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH

MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, rajalakshmi@cit.edu.in

More information

A discussion of Statistical Mechanics of Complex Networks P. Part I

A discussion of Statistical Mechanics of Complex Networks P. Part I A discussion of Statistical Mechanics of Complex Networks Part I Review of Modern Physics, Vol. 74, 2002 Small Word Networks Clustering Coefficient Scale-Free Networks Erdös-Rényi model cover only parts

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Data Mining Clustering (2) Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining

Data Mining Clustering (2) Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining Data Mining Clustering (2) Toon Calders Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining Outline Partitional Clustering Distance-based K-means, K-medoids,

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

How To Find Local Affinity Patterns In Big Data

How To Find Local Affinity Patterns In Big Data Detection of local affinity patterns in big data Andrea Marinoni, Paolo Gamba Department of Electronics, University of Pavia, Italy Abstract Mining information in Big Data requires to design a new class

More information

How To Monitor User System Interactions Through Graph Based Dynamics Analysis

How To Monitor User System Interactions Through Graph Based Dynamics Analysis Monitoring User-System Interactions through Graph-Based Intrinsic Dynamics Analysis Sébastien Heymann, Bénédicte Le Grand Emails: Sebastien.Heymann@lip6.fr, Benedicte.Le-Grand@univ-paris1.fr May 30, 2013

More information

The Theory of Concept Analysis and Customer Relationship Mining

The Theory of Concept Analysis and Customer Relationship Mining The Application of Association Rule Mining in CRM Based on Formal Concept Analysis HongSheng Xu * and Lan Wang College of Information Technology, Luoyang Normal University, Luoyang, 471022, China xhs_ls@sina.com

More information

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

More information

PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS. PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS Project Project Title Area of Abstract No Specialization 1. Software

More information

Mining Social-Network Graphs

Mining Social-Network Graphs 342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is

More information

Subgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro

Subgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro Subgraph Patterns: Network Motifs and Graphlets Pedro Ribeiro Analyzing Complex Networks We have been talking about extracting information from networks Some possible tasks: General Patterns Ex: scale-free,

More information

Mining Large Datasets: Case of Mining Graph Data in the Cloud

Mining Large Datasets: Case of Mining Graph Data in the Cloud Mining Large Datasets: Case of Mining Graph Data in the Cloud Sabeur Aridhi PhD in Computer Science with Laurent d Orazio, Mondher Maddouri and Engelbert Mephu Nguifo 16/05/2014 Sabeur Aridhi Mining Large

More information

SAP InfiniteInsight 7.0 SP1

SAP InfiniteInsight 7.0 SP1 End User Documentation Document Version: 1.0-2014-11 Getting Started with Social Table of Contents 1 About this Document... 3 1.1 Who Should Read this Document... 3 1.2 Prerequisites for the Use of this

More information

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar

More information

Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.

Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut. Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,

More information

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining Data Mining Cluster Analsis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining b Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining /8/ What is Cluster

More information

K-Means Cluster Analysis. Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1

K-Means Cluster Analysis. Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 K-Means Cluster Analsis Chapter 3 PPDM Class Tan,Steinbach, Kumar Introduction to Data Mining 4/18/4 1 What is Cluster Analsis? Finding groups of objects such that the objects in a group will be similar

More information

Forschungskolleg Data Analytics Methods and Techniques

Forschungskolleg Data Analytics Methods and Techniques Forschungskolleg Data Analytics Methods and Techniques Martin Hahmann, Gunnar Schröder, Phillip Grosse Prof. Dr.-Ing. Wolfgang Lehner Why do we need it? We are drowning in data, but starving for knowledge!

More information

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Ernst van Waning Senior Sales Engineer May 28, 2010 Agenda SPSS, an IBM Company SPSS Statistics User-driven product

More information

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/2004 Hierarchical

More information

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?

More information

Network Algorithms for Homeland Security

Network Algorithms for Homeland Security Network Algorithms for Homeland Security Mark Goldberg and Malik Magdon-Ismail Rensselaer Polytechnic Institute September 27, 2004. Collaborators J. Baumes, M. Krishmamoorthy, N. Preston, W. Wallace. Partially

More information

Why do statisticians "hate" us?

Why do statisticians hate us? Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data

More information

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Chapter 5. Warehousing, Data Acquisition, Data. Visualization Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Clustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is

Clustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is Clustering 15-381 Artificial Intelligence Henry Lin Modified from excellent slides of Eamonn Keogh, Ziv Bar-Joseph, and Andrew Moore What is Clustering? Organizing data into clusters such that there is

More information

3. The Junction Tree Algorithms

3. The Junction Tree Algorithms A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin mark@paskin.org 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )

More information

Simple Graphs Degrees, Isomorphism, Paths

Simple Graphs Degrees, Isomorphism, Paths Mathematics for Computer Science MIT 6.042J/18.062J Simple Graphs Degrees, Isomorphism, Types of Graphs Simple Graph this week Multi-Graph Directed Graph next week Albert R Meyer, March 10, 2010 lec 6W.1

More information

雲 端 運 算 願 景 與 實 現 馬 維 英 博 士 微 軟 亞 洲 研 究 院 常 務 副 院 長

雲 端 運 算 願 景 與 實 現 馬 維 英 博 士 微 軟 亞 洲 研 究 院 常 務 副 院 長 雲 端 運 算 願 景 與 實 現 馬 維 英 博 士 微 軟 亞 洲 研 究 院 常 務 副 院 長 Important Aspects of the Cloud Software as a Service (SaaS) Platform as a Service (PaaS) Infrastructure as a Service (IaaS) Information and Knowledge

More information

High-dimensional labeled data analysis with Gabriel graphs

High-dimensional labeled data analysis with Gabriel graphs High-dimensional labeled data analysis with Gabriel graphs Michaël Aupetit CEA - DAM Département Analyse Surveillance Environnement BP 12-91680 - Bruyères-Le-Châtel, France Abstract. We propose the use

More information

{ Mining, Sets, of, Patterns }

{ Mining, Sets, of, Patterns } { Mining, Sets, of, Patterns } A tutorial at ECMLPKDD2010 September 20, 2010, Barcelona, Spain by B. Bringmann, S. Nijssen, N. Tatti, J. Vreeken, A. Zimmermann 1 Overview Tutorial 00:00 00:45 Introduction

More information

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network , pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and

More information

A NOVEL RESOURCE EFFICIENT DMMS APPROACH

A NOVEL RESOURCE EFFICIENT DMMS APPROACH A NOVEL RESOURCE EFFICIENT DMMS APPROACH FOR NETWORK MONITORING AND CONTROLLING FUNCTIONS Golam R. Khan 1, Sharmistha Khan 2, Dhadesugoor R. Vaman 3, and Suxia Cui 4 Department of Electrical and Computer

More information

Intrusion Detection: Game Theory, Stochastic Processes and Data Mining

Intrusion Detection: Game Theory, Stochastic Processes and Data Mining Intrusion Detection: Game Theory, Stochastic Processes and Data Mining Joseph Spring 7COM1028 Secure Systems Programming 1 Discussion Points Introduction Firewalls Intrusion Detection Schemes Models Stochastic

More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

Network Analytics in Marketing

Network Analytics in Marketing Network Analytics in Marketing Prof. Dr. Daning Hu Department of Informatics University of Zurich Nov 13th, 2014 Introduction: Network Analytics in Marketing Marketing channels and business networks have

More information

Cloud Monitoring. A challenging Application for Complex Event Processing. Bastian Hoßbach, Bernhard Seeger. ETH Zürich October 7, 2011

Cloud Monitoring. A challenging Application for Complex Event Processing. Bastian Hoßbach, Bernhard Seeger. ETH Zürich October 7, 2011 A challenging Application for Complex Event Processing ETH Zürich October 7, 2011 Agenda - Introduction and motivation - Design and implementation - Examples - Conclusion and research issues The NIST-Definition

More information

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

Graph Analysis of Student Model Networks

Graph Analysis of Student Model Networks Graph Analysis of Student Model Networks Julio Guerra School of Information Sciences jdg60@pitt.edu Roya Hosseini Intelligent Systems Program roh38@pitt.edu Yun Huang Intelligent Systems Program yuh43@pitt.edu

More information

Innovative Data Mining based approaches for life course analysis

Innovative Data Mining based approaches for life course analysis IPUC, Neuchâtel, February 23-24, 2007 Innovative Data Mining based approaches for life course analysis Gilbert Ritschard Alexis Gabadinho, Nicolas Müller, Matthias Studer University of Geneva, Switzerland

More information

Analyzing User Patterns to Derive Design Guidelines for Job Seeking and Recruiting Website

Analyzing User Patterns to Derive Design Guidelines for Job Seeking and Recruiting Website Analyzing User Patterns to Derive Design Guidelines for Job Seeking and Recruiting Website Yao Lu École Polytechnique Fédérale de Lausanne (EPFL) Lausanne, Switzerland e-mail: yao.lu@epfl.ch Sandy El Helou

More information

Standardization of Components, Products and Processes with Data Mining

Standardization of Components, Products and Processes with Data Mining B. Agard and A. Kusiak, Standardization of Components, Products and Processes with Data Mining, International Conference on Production Research Americas 2004, Santiago, Chile, August 1-4, 2004. Standardization

More information

Topic 13 Predictive Modeling. Topic 13. Predictive Modeling

Topic 13 Predictive Modeling. Topic 13. Predictive Modeling Topic 13 Predictive Modeling Topic 13 Predictive Modeling 13.1 Predicting Yield Maps Talk about the future of Precision Ag how about maps of things yet to come? Sounds a bit far fetched but Spatial Data

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table

More information

Exploring Big Data in Social Networks

Exploring Big Data in Social Networks Exploring Big Data in Social Networks virgilio@dcc.ufmg.br (meira@dcc.ufmg.br) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about

More information

How To Cluster Of Complex Systems

How To Cluster Of Complex Systems Entropy based Graph Clustering: Application to Biological and Social Networks Edward C Kenley Young-Rae Cho Department of Computer Science Baylor University Complex Systems Definition Dynamically evolving

More information

DATA ANALYSIS IN PUBLIC SOCIAL NETWORKS

DATA ANALYSIS IN PUBLIC SOCIAL NETWORKS International Scientific Conference & International Workshop Present Day Trends of Innovations 2012 28 th 29 th May 2012 Łomża, Poland DATA ANALYSIS IN PUBLIC SOCIAL NETWORKS Lubos Takac 1 Michal Zabovsky

More information

A comparative study of social network analysis tools

A comparative study of social network analysis tools Membre de Membre de A comparative study of social network analysis tools David Combe, Christine Largeron, Előd Egyed-Zsigmond and Mathias Géry International Workshop on Web Intelligence and Virtual Enterprises

More information
