Graph Mining and Social Network Analysis. Data Mining; EECS 4412 Darren Rolfe + Vince Chu


1 Graph Mining and Social Network Analysis. Data Mining; EECS 4412. Darren Rolfe + Vince Chu
2 Agenda. Graph Mining: Methods for Mining Frequent Subgraphs; Apriori-based Approach: AGM, FSG; Pattern-Growth Approach: gSpan. Social Network Analysis: Properties and Features of Real Social Graphs; Models of Graphs we can use; Using those models to predict/other things.
3 Graph Mining Methods for Mining Frequent Subgraphs
4 Why Mine Graphs? A lot of data today can be represented in the form of a graph. Social: friendship networks, social media networks, instant messaging networks, document citation networks, blogs. Technological: power grid, the internet. Biological: spread of virus/disease, protein/gene regulatory networks.
5 What Do We Need To Do? Identify various kinds of graph patterns. Frequent substructures are the very basic patterns that can be discovered in a collection of graphs. They are useful for characterizing graph sets, discriminating different groups of graphs, classifying and clustering graphs, building graph indices, and facilitating similarity search in graph databases.
6 Mining Frequent Subgraphs. Performed on a collection of graphs. Notation: the vertex set of a graph $g$ is denoted by $V(g)$ and the edge set by $E(g)$. A label function, $L$, maps a vertex or an edge to a label. A graph $g$ is a subgraph of another graph $g'$ if there exists a subgraph isomorphism from $g$ to $g'$. Given a labeled graph data set, $D = \{G_1, G_2, \ldots, G_n\}$, we define $support(g)$ (or $frequency(g)$) as the percentage (or number) of graphs in $D$ where $g$ is a subgraph. A frequent graph is a graph whose support is no less than a minimum support threshold, $min\_sup$.
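The support definition above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular mining algorithm: the graph representation (a dict of vertex labels plus a dict of labeled edges), the helper names `make_graph`, `is_subgraph` and `support`, and the toy data are all assumptions made for the example, and the containment test is a brute-force search that is only practical for very small graphs.

```python
from itertools import permutations

def make_graph(labels, edges):
    """Hypothetical representation: vertex labels plus {frozenset({u, v}): edge label}."""
    return labels, {frozenset((u, v)): lab for u, v, lab in edges}

def is_subgraph(pattern, graph):
    """Brute-force labeled subgraph isomorphism test (exponential time)."""
    p_labels, p_edges = pattern
    g_labels, g_edges = graph
    p_nodes = list(p_labels)
    # Try every injective assignment of pattern vertices to graph vertices.
    for image in permutations(g_labels, len(p_nodes)):
        m = dict(zip(p_nodes, image))
        if any(p_labels[v] != g_labels[m[v]] for v in p_nodes):
            continue
        # Every pattern edge must map onto a graph edge with the same label.
        if all(g_edges.get(frozenset(m[v] for v in edge)) == lab
               for edge, lab in p_edges.items()):
            return True
    return False

def support(pattern, database):
    """Fraction of graphs in the database that contain the pattern."""
    return sum(is_subgraph(pattern, g) for g in database) / len(database)

# Toy data set of three labeled graphs (made up for illustration).
triangle = make_graph({1: "X", 2: "X", 3: "Z"},
                      [(1, 2, "a"), (2, 3, "a"), (3, 1, "b")])
path = make_graph({1: "X", 2: "Z"}, [(1, 2, "a")])
other = make_graph({1: "Y", 2: "Y"}, [(1, 2, "a")])
database = [triangle, path, other]

# Pattern: an X vertex joined to a Z vertex by an edge labeled "a".
pattern = make_graph({"u": "X", "v": "Z"}, [("u", "v", "a")])
print(support(pattern, database))  # contained in 2 of the 3 graphs
```

With a minimum support of 50%, this pattern would count as frequent in the toy data set, because two of the three graphs contain it.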
7 Discovering Frequent Substructures. Usually consists of two steps: 1. Generate frequent substructure candidates. 2. Check the frequency of each candidate. Most studies on frequent substructure discovery focus on the optimization of the first step, because the second step involves a subgraph isomorphism test whose computational complexity is excessively high (i.e., NP-complete).
8 Graph Isomorphism. An isomorphism of graphs $G$ and $H$ is a bijection between the vertex sets of $G$ and $H$, $f: V(G) \to V(H)$, such that any two vertices $u$ and $v$ of $G$ are adjacent in $G$ if and only if $f(u)$ and $f(v)$ are adjacent in $H$. [Figure: Graph G and Graph H with labeled vertices]
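The definition can be checked directly by trying every bijection between the two vertex sets. The sketch below assumes unlabeled graphs stored as adjacency dicts; the function name `find_isomorphism` and the two small graphs are hypothetical, and the search is factorial in the number of vertices, so it only illustrates the definition.

```python
from itertools import permutations

def find_isomorphism(adj_g, adj_h):
    """Return a bijection f: V(G) -> V(H) that preserves adjacency, or None."""
    vg, vh = sorted(adj_g), sorted(adj_h)
    if len(vg) != len(vh):
        return None
    for image in permutations(vh):
        f = dict(zip(vg, image))
        # u, v adjacent in G  <=>  f(u), f(v) adjacent in H
        if all((f[v] in adj_h[f[u]]) == (v in adj_g[u])
               for u in vg for v in vg if u != v):
            return f
    return None

# G is the path a-b-c; H is the path 1-3-2 (same shape, different names).
G = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
H = {1: {3}, 2: {3}, 3: {1, 2}}
print(find_isomorphism(G, H))   # the middle vertex b must map to 3

triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(find_isomorphism(G, triangle))  # None: a path is not a triangle
```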
9 Frequent Subgraphs: An Example. 1. Start with a labelled graph data set. 2. Set a minimum support threshold for frequent graphs. 3. Generate frequent substructure candidates. 4. Check the frequency of each candidate. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
10 Frequent Subgraphs: An Example (continued). Let the minimum support for this example be 50%.
11 Frequent Subgraphs: An Example (continued). [Figure: candidate substructures for k = 1, k = 2, k = 3 and k = 4]
12 Frequent Subgraphs: An Example (continued). [Figure: candidate substructures for k = 1, k = 2, k = 3 and k = 4]
13 Frequent Subgraphs: An Example (continued). k = 3, frequency: 3, support: 75%. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
14 Frequent Subgraphs: An Example (continued). k = 4, frequency: 2, support: 50%. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
15 Apriori-based Approach. Apriori-based frequent substructure mining algorithms share similar characteristics with Apriori-based frequent itemset mining algorithms. The search for frequent graphs starts with graphs of small "size" (the definition of graph size depends on the algorithm used) and proceeds in a bottom-up manner by generating candidates having an extra vertex, edge, or path. The main design complexity of Apriori-based substructure mining algorithms is the candidate generation step. The candidate generation problem in frequent substructure mining is harder than that in frequent itemset mining, because there are many ways to join two substructures.
16 Apriori-based Approach. 1. Generate size-$k$ frequent subgraph candidates by joining two similar but slightly different frequent subgraphs that were discovered in the previous call of the algorithm. 2. Check the frequency of each candidate. 3. Generate the size-$(k+1)$ candidates. 4. Continue until the candidate set is empty.
17 Algorithm: AprioriGraph. Apriori-based Frequent Substructure Mining.
Input: $D$, a graph data set; $min\_sup$, the minimum support threshold.
Output: $S_k$, the frequent substructure set.
Method: $S_1 \leftarrow$ frequent single-elements in $D$; call AprioriGraph($D$, $min\_sup$, $S_1$).
procedure AprioriGraph($D$, $min\_sup$, $S_k$)
  $S_{k+1} \leftarrow \emptyset$;
  for each frequent $g_i \in S_k$ do
    for each frequent $g_j \in S_k$ do
      for each size-$(k+1)$ graph $g$ formed by merge($g_i$, $g_j$) do
        if $g$ is frequent in $D$ and $g \notin S_{k+1}$ then
          insert $g$ into $S_{k+1}$;
  if $S_{k+1} \neq \emptyset$ then
    AprioriGraph($D$, $min\_sup$, $S_{k+1}$);
  return;
18 AGM: Apriori-based Graph Mining. Vertex-based candidate generation method that increases the substructure size by one vertex at each iteration of AprioriGraph. The graph size $k$ is the number of vertices in the graph. Two size-$k$ frequent graphs are joined only if they have the same size-$(k-1)$ subgraph. The newly formed candidate includes the size-$(k-1)$ subgraph in common and the additional two vertices from the two size-$k$ patterns. Because it is undetermined whether there is an edge connecting the additional two vertices, we can actually form two substructures.
19 AGM: An Example. Two substructures joined by two chains. The graph size $k$ is the number of vertices in the graph. [Figure: two substructures with k = 4 joined into candidates with k = 5]
20 FSG: Frequent Subgraph Discovery. Edge-based candidate generation strategy that increases the substructure size by one edge in each call of AprioriGraph. The graph size $k$ is the number of edges in the graph. Two size-$k$ patterns are merged if and only if they share the same subgraph having $k-1$ edges, which is called the core. The newly formed candidate includes the core and the additional two edges from the size-$k$ patterns.
21 FSG: An Example. Two substructure patterns and their potential candidates. The graph size $k$ is the number of edges in the graph. [Figure: two patterns with k = 4 and their candidates with k = 5]
22 FSG: Another Example. Two substructure patterns and their potential candidates. The graph size $k$ is the number of edges in the graph. [Figure: two patterns with k = 5 and their candidates with k = 6]
23 Pitfall: Apriori-based Approach. Generation of subgraph candidates is complicated and expensive. Level-wise candidate generation is a breadth-first search: to determine whether a size-$(k+1)$ graph is frequent, all of its corresponding size-$k$ subgraphs must be checked to obtain an upper bound on its frequency. Before mining any size-$(k+1)$ subgraph, the size-$k$ subgraphs must be mined completely. Subgraph isomorphism is an NP-complete problem, so pruning is expensive.
24 Pattern-Growth Approach. 1. Initially, start with the frequent vertices as frequent graphs. 2. Extend these graphs by adding a new edge such that the newly formed graphs are frequent graphs. A graph $g$ can be extended by adding a new edge $e$; the newly formed graph is denoted by $g \diamond_x e$. If $e$ introduces a new vertex, we denote the new graph by $g \diamond_{xf} e$, otherwise $g \diamond_{xb} e$, where $f$ or $b$ indicates that the extension is in a forward or backward direction. 3. For each discovered graph $g$, perform extensions recursively until all the frequent graphs with $g$ embedded are discovered. 4. The recursion stops once no frequent graph can be generated.
25 Algorithm: PatternGrowthGraph. Simplistic Pattern Growth-based Frequent Substructure Mining.
Input: $g$, a frequent graph; $D$, a graph data set; $min\_sup$, the minimum support threshold.
Output: $S$, the frequent graph set.
Method: $S \leftarrow \emptyset$; call PatternGrowthGraph($g$, $D$, $min\_sup$, $S$).
procedure PatternGrowthGraph($g$, $D$, $min\_sup$, $S$)
  if $g \in S$ then return;
  else insert $g$ into $S$;
  scan $D$ once, find all edges $e$ such that $g$ can be extended to $g \diamond_x e$;
  for each frequent $g \diamond_x e$ do
    PatternGrowthGraph($g \diamond_x e$, $D$, $min\_sup$, $S$);
  return;
26 Pattern-Growth: An Example. 1. Start with the frequent vertices as frequent graphs. 2. Extend these graphs by adding a new edge, forming new frequent graphs. 3. For each discovered graph $g$, recursively extend. 4. Stop once no frequent graph can be generated. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
27 Pattern-Growth: An Example (continued). Let the minimum support for this example be 50%.
28 Pattern-Growth: An Example (continued). Let's arbitrarily start with this frequent vertex. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
29 Pattern-Growth: An Example (continued). Extend the graph (forward); add a frequent edge. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
30 Pattern-Growth: An Example (continued). Extend the frequent graph (forward) again. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
31 Pattern-Growth: An Example (continued). Extend the graph (backward); the new edge reaches a previously seen node. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
32 Pattern-Growth: An Example (continued). Extend the frequent graph (forward) again. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
33 Pattern-Growth: An Example (continued). Stop the recursion and try a different start vertex. [Figure: Graph 1, Graph 2, Graph 3, Graph 4]
34 Pitfall: PatternGrowthGraph. Simple, but not efficient. The same graph can be discovered many times (duplicate graphs). Generation and detection of duplicate graphs increases the workload.
35 gSpan (Graph-Based Substructure Pattern Mining). Designed to reduce the generation of duplicate graphs. Explores via depth-first search (DFS). DFS lexicographic order and minimum DFS code form a canonical labeling system to support the DFS search. Discovers all the frequent subgraphs without candidate generation and false-positive pruning. It combines the growing and checking of frequent subgraphs into one procedure, thus accelerating the mining process.
36 gSpan (Graph-Based Substructure Pattern Mining): DFS Subscripting. When performing a DFS in a graph, we construct a DFS tree. One graph can have several different DFS trees. Depth-first discovery of the vertices forms a linear order. Use subscripts to label this order according to discovery time: $i < j$ means $v_i$ is discovered before $v_j$. $v_0$ is the root and $v_n$ is the rightmost vertex. The straight path from $v_0$ to $v_n$ is the rightmost path.
37 gSpan (Graph-Based Substructure Pattern Mining): DFS Code. We transform each subscripted graph to an edge sequence, called a DFS code, so that we can build an order among these sequences. The goal is to select the subscripting that generates the minimum sequence as its base subscripting. There are two kinds of orders in this transformation process: 1. Edge order, which maps edges in a subscripted graph into a sequence; and 2. Sequence order, which builds an order among edge sequences. An edge is represented by a 5-tuple, $(i, j, l_i, l_{(i,j)}, l_j)$; $l_i$ and $l_j$ are the labels of $v_i$ and $v_j$, respectively, and $l_{(i,j)}$ is the label of the edge connecting them.
38 gSpan (Graph-Based Substructure Pattern Mining): DFS Lexicographic Order. For each DFS tree, we sort the tuples of the DFS code into a sequence. Based on the DFS lexicographic ordering, the minimum DFS code of a given graph $G$, written as $dfs(G)$, is the minimal one among all the DFS codes. The subscripting that generates the minimum DFS code is called the base subscripting. Given two graphs $G$ and $G'$, $G$ is isomorphic to $G'$ if and only if $dfs(G) = dfs(G')$. Based on this property, what we need to do for mining frequent subgraphs is to perform only the rightmost extensions on the minimum DFS codes, since such an extension will guarantee the completeness of mining results.
39 DFS Code: An Example. DFS Subscripting: when performing a DFS in a graph, we construct a DFS tree. One graph can have several different DFS trees. [Figure: a graph with vertex labels X, X, Z, Y and edge labels a, a, b, b, shown with three different DFS subscriptings $v_0, v_1, v_2, v_3$]
40 DFS Lexicographic Order: An Example. For each DFS tree, we sort the tuples of the DFS code into a sequence. Based on the DFS lexicographic ordering, the minimum DFS code of a given graph $G$, written as $dfs(G)$, is the minimal one among all the DFS codes. The three DFS codes of the example graph are:
edge   $\gamma_0$       $\gamma_1$       $\gamma_2$
$e_0$  (0, 1, X, a, X)  (0, 1, X, a, X)  (0, 1, Y, b, X)
$e_1$  (1, 2, X, a, Z)  (1, 2, X, b, Y)  (1, 2, X, a, X)
$e_2$  (2, 0, Z, b, X)  (1, 3, X, a, Z)  (2, 3, X, b, Z)
$e_3$  (1, 3, X, b, Y)  (3, 0, Z, b, X)  (3, 1, Z, a, X)
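As a rough sketch of how one DFS code is produced, the Python below walks a labeled graph depth-first and emits the 5-tuples in edge order: after a vertex is discovered, its backward edges to earlier vertices are emitted before its forward edges. The vertex names `p`, `q`, `r`, `s`, the function `dfs_code`, and the rule of visiting neighbours in sorted name order are assumptions made for illustration; the graph is meant to match the example above. It produces a single DFS code, not the minimum one, and assumes a connected graph with at least one edge.

```python
def dfs_code(labels, edges, start):
    """One DFS code of a connected labeled graph, starting at `start`."""
    adj = {}
    for (u, v), lab in edges.items():
        adj.setdefault(u, {})[v] = lab
        adj.setdefault(v, {})[u] = lab
    index = {start: 0}   # discovery time (subscript) of each vertex
    used = set()         # edges already written to the code
    code = []

    def emit(u, v):
        used.add(frozenset((u, v)))
        code.append((index[u], index[v], labels[u], adj[u][v], labels[v]))

    def visit(u):
        # Backward edges first: from u to already-discovered vertices.
        for v in sorted((w for w in adj[u] if w in index), key=index.get):
            if frozenset((u, v)) not in used:
                emit(u, v)
        # Then forward edges: each one discovers a new vertex.
        for v in sorted(adj[u]):
            if v not in index:
                index[v] = len(index)
                emit(u, v)
                visit(v)

    visit(start)
    return code

# The example graph: two X vertices, one Z and one Y (vertex names are made up).
labels = {"p": "X", "q": "X", "r": "Z", "s": "Y"}
edges = {("p", "q"): "a", ("q", "r"): "a", ("r", "p"): "b", ("q", "s"): "b"}
for edge_tuple in dfs_code(labels, edges, "p"):
    print(edge_tuple)
```

Starting at `p`, this traversal yields the same four tuples as the first code in the example, $\gamma_0$. Other start vertices or neighbour orders give other DFS codes of the same graph.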
41 gSpan (Graph-Based Substructure Pattern Mining). 1. Initially, a starting vertex is randomly chosen. 2. Vertices in a graph are marked so that we can tell which vertices have been visited. 3. The visited vertex set is expanded repeatedly until a full DFS tree is built. 4. Given a graph $G$ and a DFS tree $T$ in $G$, a new edge $e$ can be added between the rightmost vertex and another vertex on the rightmost path (backward extension), or it can introduce a new vertex and connect to a vertex on the rightmost path (forward extension). Because both kinds of extensions take place on the rightmost path, we call them rightmost extensions, denoted by $g \diamond_r e$.
42 Algorithm: gSpan. Pattern growth-based frequent substructure mining that reduces duplicate graph generation.
Input: $s$, a DFS code; $D$, a graph data set; $min\_sup$, the minimum support threshold.
Output: $S$, the frequent graph set.
Method: $S \leftarrow \emptyset$; call gSpan($s$, $D$, $min\_sup$, $S$).
procedure gSpan($s$, $D$, $min\_sup$, $S$)
  if $s \neq dfs(s)$ then return;
  insert $s$ into $S$;
  set $C$ to $\emptyset$;
  scan $D$ once, find all edges $e$ such that $s$ can be rightmost extended to $s \diamond_r e$;
  insert $s \diamond_r e$ into $C$ and count its frequency;
  for each frequent $s \diamond_r e$ in $C$ do
    gSpan($s \diamond_r e$, $D$, $min\_sup$, $S$);
  return;
43 Other Graph Mining. The techniques discussed so far can handle only one special kind of graph: labeled, undirected, connected simple graphs without any specific constraints. They assume that the database to be mined contains a set of graphs, each consisting of a set of labeled vertices and labeled but undirected edges, with no other constraints.
44 Other Graph Mining. Mining Variant and Constrained Substructure Patterns. Closed frequent substructure: a frequent graph $G$ is closed if and only if there is no proper supergraph $G'$ that has the same support as $G$. Maximal frequent substructure: a frequent pattern $G$ is maximal if and only if there is no frequent super-pattern of $G$. Constraint-based substructure mining: element, set, or subgraph containment constraint; geometric constraint; value-sum constraint.
45 Application: Classification. We mine frequent graph patterns in the training set. The features that are frequent in one class but rather infrequent in the other class(es) should be considered highly discriminative features and are used for model construction. To achieve high-quality classification, we can adjust the thresholds on frequency, discriminativeness, and graph connectivity, based on the data, the number and quality of the features generated, and the classification accuracy.
46 Application: Cluster Analysis. We mine frequent graph patterns in the training set. The set of graphs that share a large set of similar graph patterns should be considered highly similar and should be grouped into similar clusters. The minimal support threshold can be used as a way to adjust the number of frequent clusters or generate hierarchical clusters.
47 Social Network Analysis
48 Examples of Social Networks. Twitter network; Network; Air Transportation Network.
49 Social Network Analysis. Nodes often represent an object or entity such as a person, computer/server, power generator, airport, etc. Links represent relationships, e.g. likes, follows, flies to, etc.
50 Why are we interested? It turns out that the structure of real-world graphs often has special characteristics. This is important because structure always affects function, e.g. the structure of a social network affects how a rumour, or an infectious disease, spreads, and the structure of a power grid determines how robust the network is to power failures. Goal: 1. Identify the characteristics / properties of graphs, both structural and dynamic / behavioural. 2. Generate models of graphs that exhibit these characteristics. 3. Use these tools to make predictions about the behaviour of graphs.
51 Properties of Real-World Social Graphs. 1. Degree Distribution. Plot the fraction of nodes with degree $k$ (denoted $p_k$) vs. $k$. Our intuition: Poisson/Normal distribution. WRONG! Correct: highly skewed. [Figure source: mathworld.wolfram.com]
52 Properties of Real-World Social Graphs. 1. (continued) Real-world social networks tend to have a highly skewed distribution that follows the power law $p_k \sim k^{-\alpha}$ for some exponent $\alpha > 0$. A small percentage of nodes have very high degree and are highly connected. Example: spread of a virus. [Figure legend: black squares = infected; pink = infected but not contagious; green = exposed but not infected]
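The quantity being plotted, $p_k$, is easy to compute from an edge list. The sketch below is a minimal illustration: the function name `degree_distribution` and the star-shaped toy graph are assumptions, and nodes without any edge are ignored because they never appear in the edge list.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree k to p_k, the fraction of nodes having that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}

# A star: one hub connected to four leaves, an extreme case of a skewed
# distribution (most nodes have degree 1, one node has degree 4).
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(degree_distribution(star))
```

Plotting $p_k$ against $k$ on log-log axes is the usual way to eyeball a power law, since $p_k \sim k^{-\alpha}$ appears there as a roughly straight line with slope $-\alpha$.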
53 Properties of Real-World Social Graphs. 2. Small World Effect: for most real graphs, the number of hops it takes to reach any node from any other node is about 6 (Six Degrees of Separation). Milgram did an experiment in which people in Nebraska were asked to send letters to people in Boston. Constraint: letters could only be delivered to people known on a first-name basis. Only 25% of letters made it to their target, but the ones that did made it in 6 hops.
54 Properties of Real-World Social Graphs. 2. (continued) The distribution of the shortest path lengths. Example: MSN Messenger Network. If we pick a random node in the network and then count how many hops it is from every other node, we get this graph. Most nodes are at a distance of 7 hops away from any other node.
55 Properties of Real-World Social Graphs. 3. Network Resilience. If a node is removed, how is the network affected? For real-world graphs, you must remove the highly connected nodes in order to reduce the connectivity of the graph. Removing a node that is sparsely connected does not have a significant effect on connectivity. Since the proportion of highly connected nodes in a real-world graph is small, the probability of choosing and removing such a node at random is small: real-world graphs are resilient to random attacks! Conversely, targeted attacks on highly connected nodes are very effective!
56 Properties of Real-World Social Graphs. 4. Densification. How does the number of edges in the graph grow as the number of nodes grows? Previous belief: the number of edges grows linearly with the number of nodes, i.e. $E(t) \sim N(t)$. Actually, the number of edges grows superlinearly with the number of nodes, i.e. faster than the number of nodes: $E(t) \sim N(t)^{a}$ with $a > 1$. The graph gets denser over time.
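One simple way to estimate the densification exponent $a$ is a least-squares line through the points $(\log N(t), \log E(t))$, whose slope is $a$. The sketch below assumes a list of (nodes, edges) snapshots; the function name `densification_exponent` and the synthetic snapshots (generated with $E = N^{1.5}$) are made up for illustration and are not measurements of any real network.

```python
import math

def densification_exponent(snapshots):
    """Least-squares slope of log E(t) against log N(t)."""
    xs = [math.log(n) for n, _ in snapshots]
    ys = [math.log(e) for _, e in snapshots]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic (nodes, edges) snapshots that follow E = N^1.5 exactly.
snapshots = [(n, round(n ** 1.5)) for n in (100, 400, 1600, 6400)]
a = densification_exponent(snapshots)
print(a)  # close to 1.5, i.e. superlinear growth of edges
```

An estimate near 1 would match the older linear-growth belief; a value clearly above 1 indicates densification.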
57 Properties of Real-World Social Graphs. 5. Shrinking Diameter. The diameter is taken to be the longest shortest path in the graph. As a network grows, the diameter actually gets smaller, i.e. the distance between nodes slowly decreases.
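The "longest shortest path" can be computed by running a breadth-first search from every node. The sketch below assumes an unweighted, connected graph stored as an adjacency dict; the names `eccentricity` and `diameter` and the two four-node graphs are illustrative choices.

```python
from collections import deque

def eccentricity(adj, source):
    """Longest shortest-path distance from `source`, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(adj):
    """Longest shortest path over all sources (assumes a connected graph)."""
    return max(eccentricity(adj, s) for s in adj)

# A path 0-1-2-3, and the same nodes after one more edge (0-3) is added.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(diameter(path), diameter(cycle))
```

Adding the single edge 0-3 reduces the diameter from 3 to 2, a tiny instance of how extra edges can shrink distances.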
58 Features/Properties of Graphs. Community structure. Densification. Shrinking diameter.
59 Generators: How do we model graphs? Try: generating a random graph. Given $n$ vertices, connect each pair i.i.d. with probability $p$. The degree distribution follows a Poisson distribution, which follows from our intuition. Not useful: no community structure; does not mirror real-world graphs.
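The random-graph generator described above takes only a few lines. This is a minimal sketch: the function name `erdos_renyi` and the `seed` parameter are assumptions added for reproducibility, and the quadratic loop over all pairs is fine only for small $n$.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Edge list of a random graph: each of the n*(n-1)/2 pairs is
    connected independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = erdos_renyi(100, 0.05, seed=1)
mean_degree = 2 * len(edges) / 100
print(len(edges), mean_degree)  # the mean degree should be near p * (n - 1) = 4.95
```

Because every pair is treated identically, degrees concentrate around the mean, which is why this model does not reproduce the skewed degree distributions of real-world graphs.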
60 Generators: How do we model graphs? (Erdős-Rényi) random graphs (1960s); exponential random graphs; small-world model; preferential attachment; edge copying model; community guided attachment; Forest Fire; Kronecker graphs (today).
61 Kronecker Graphs. For Kronecker graphs, all the properties of real-world graphs can actually be proven. Best model we have today. Adjacency matrix, recursive generation.
62 Kronecker Graphs. 1. Construct the adjacency matrix for a graph $G$: $A(G) = (a_{ij})$, where $a_{ij} = 1$ if $i$ and $j$ are adjacent and $a_{ij} = 0$ otherwise. Side note: an eigenvalue of a matrix $A$ is a scalar value $\lambda$ for which the following is true: $Av = \lambda v$ (where $v$ is an eigenvector of the matrix $A$).
63 Kronecker Graphs. 2. Generate the 2nd Kronecker graph by taking the Kronecker product of the 1st graph with itself. The Kronecker product of 2 graphs, taken through their adjacency matrices $A$ (of size $n \times m$) and $B$, is defined as: $A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1m}B \\ \vdots & \ddots & \vdots \\ a_{n1}B & \cdots & a_{nm}B \end{pmatrix}$
64 Kronecker Graphs. Visually, this is just taking the first matrix and replacing the entries that were equal to 1 with the second matrix. [Figure: a 3 x 3 matrix and the resulting 9 x 9 matrix]
65 Kronecker Graphs. We define the Kronecker product of two graphs as the Kronecker product of their adjacency matrices. Therefore, we can compute the $k$-th Kronecker graph by iteratively taking the Kronecker product of an initial graph $G_1$ with itself: $G_k = G_1 \otimes G_1 \otimes \cdots \otimes G_1$ ($k$ factors).
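The recursive generation can be sketched with plain nested lists. The function names `kronecker_product` and `kronecker_power` and the 3 x 3 initiator matrix `g1` are assumptions chosen for illustration, not the initiator used in the slides.

```python
def kronecker_product(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    n, m = len(b), len(b[0])
    # Entry (i, j) is a[i // n][j // m] * b[i % n][j % m]: each entry of `a`
    # is replaced by a scaled copy of `b`.
    return [[a[i // n][j // m] * b[i % n][j % m]
             for j in range(len(a[0]) * m)]
            for i in range(len(a) * n)]

def kronecker_power(g1, k):
    """Adjacency matrix of the k-th Kronecker graph (k factors of g1)."""
    result = g1
    for _ in range(k - 1):
        result = kronecker_product(result, g1)
    return result

# Hypothetical 3-node initiator: a path with self-loops.
g1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]
g2 = kronecker_power(g1, 2)
for row in g2:
    print(row)
```

The result is a 9 x 9 matrix in which every 1 of `g1` has been replaced by a copy of `g1` and every 0 by a 3 x 3 block of zeros, so the number of ones is squared (7 becomes 49).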
66 Applying Models to Real-World Graphs. Can then predict and understand the structure.
67 Virus Propagation. A form of diffusion; a fundamental process in social networks. Can also refer to the spread of rumours, news, etc.
68 Virus Propagation. SIS Model: Susceptible-Infected-Susceptible. Virus birth rate $\beta$ = the probability that an infected node attacks a neighbour. Virus death rate $\delta$ = the probability that an infected node becomes cured. [Figure: a healthy node, infected nodes and an at-risk node; an infected node heals with probability $\delta$ and infects a neighbour with probability $\beta$]
69 Virus Propagation. The virus strength of a graph: $s = \beta/\delta$. So we can ask the question: will the virus become epidemic? Will the rumours/news become viral? The epidemic threshold $\tau$ of a graph is a value such that if $s = \beta/\delta < \tau$, then an epidemic cannot happen. How to find the threshold $\tau$? Theorem: $\tau = 1/\lambda_1$, where $\lambda_1$ is the largest eigenvalue of the adjacency matrix of the graph. So if $s < \tau$, then there is no epidemic.
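A small sketch of the threshold test, assuming an undirected graph with at least one edge, stored as a 0/1 adjacency matrix. The largest eigenvalue is approximated by power iteration on $A + I$ (the shift by the identity is a choice made here so that the iteration does not oscillate on bipartite graphs such as stars). The function name `largest_eigenvalue`, the star graph, and the values of `beta` and `delta` are all illustrative assumptions.

```python
def largest_eigenvalue(adj, iterations=1000):
    """Approximate the largest eigenvalue of a symmetric 0/1 adjacency matrix
    by power iteration on A + I."""
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = max(abs(val) for val in y)
        x = [val / norm for val in y]
        lam = norm - 1.0   # undo the shift by the identity
    return lam

# Star graph: node 0 is connected to nodes 1..4.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]

lambda_1 = largest_eigenvalue(star)
tau = 1.0 / lambda_1            # epidemic threshold
beta, delta = 0.1, 0.4          # made-up birth and death rates
s = beta / delta                # virus strength
print(lambda_1, tau, s, "no epidemic" if s < tau else "epidemic possible")
```

For a star with four leaves the largest eigenvalue is 2, so the threshold is 0.5 and a virus of strength 0.25 stays below it.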
70 Link Prediction. Given a social network at time $t_1$, predict the edges that will be added at time $t_2$. Assign a connection score$(x, y)$ to each pair of nodes. It is usually taken to be the shortest path between the nodes $x$ and $y$; other measures use the number of neighbours in common, and the Katz measure. Produce a list of scores in decreasing order. The pairs at the top of the list are most likely to have a link created between them in the future. Can also use this measure for clustering.
71 Link Prediction. Score$(x, y)$ = number of neighbours in common. In the example, the top score is 5, so a new link is likely between the two nodes that share those 5 neighbours. [Figure: example graph with lettered nodes]
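The common-neighbours score can be sketched as follows. The function name `common_neighbour_scores` and the small graph are hypothetical (the graph is not the one from the slide's figure); only pairs that are not already linked are scored.

```python
from itertools import combinations

def common_neighbour_scores(adj):
    """Score every non-adjacent pair by its number of common neighbours,
    returned in decreasing order of score."""
    scores = []
    for x, y in combinations(sorted(adj), 2):
        if y not in adj[x]:
            scores.append((len(adj[x] & adj[y]), x, y))
    return sorted(scores, key=lambda item: -item[0])

# Made-up network: A and B are not linked but share the neighbours C, D and E.
adj = {
    "A": {"C", "D", "E"},
    "B": {"C", "D", "E"},
    "C": {"A", "B"},
    "D": {"A", "B"},
    "E": {"A", "B"},
}
ranking = common_neighbour_scores(adj)
print(ranking[0])  # the pair most likely to be linked next
```

Here the top-ranked pair is A and B with three common neighbours; the other unlinked pairs share only two.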
72 Viral Marketing. A customer may increase the sales of some product if they interact positively with their peers in the social network. Assign a network value to a customer.
73 Diffusion in Networks: Influential Nodes. Some nodes in the network can be active: they can spread their influence to other nodes, e.g. news, opinions, etc. that propagate through a network of friends. Two models: Threshold model, Independent Contagion model.
74 Thanks. Any questions?
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE FREE NETWORKS AND SMALLWORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu
More informationHomework 15 Solutions
PROBLEM ONE (Trees) Homework 15 Solutions 1. Recall the definition of a tree: a tree is a connected, undirected graph which has no cycles. Which of the following definitions are equivalent to this definition
More informationCOT5405 Analysis of Algorithms Homework 3 Solutions
COT0 Analysis of Algorithms Homework 3 Solutions. Prove or give a counter example: (a) In the textbook, we have two routines for graph traversal  DFS(G) and BFS(G,s)  where G is a graph and s is any
More informationApproximation Algorithms
Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NPCompleteness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms
More informationComputer Algorithms. NPComplete Problems. CISC 4080 Yanjun Li
Computer Algorithms NPComplete Problems NPcompleteness The quest for efficient algorithms is about finding clever ways to bypass the process of exhaustive search, using clues from the input in order
More informationSCAN: A Structural Clustering Algorithm for Networks
SCAN: A Structural Clustering Algorithm for Networks Xiaowei Xu, Nurcan Yuruk, Zhidan Feng (University of Arkansas at Little Rock) Thomas A. J. Schweiger (Acxiom Corporation) Networks scaling: #edges connected
More informationTree isomorphism. Alexander Smal. Joint Advanced Student School St.Petersburg State University of Information Technologies, Mechanics and Optics
Tree isomorphism Alexander Smal St.Petersburg State University of Information Technologies, Mechanics and Optics Joint Advanced Student School 2008 1 / 22 Motivation In some applications the chemical structures
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationKEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS
ABSTRACT KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS In many real applications, RDF (Resource Description Framework) has been widely used as a W3C standard to describe data in the Semantic Web. In practice,
More informationAn Introduction to APGL
An Introduction to APGL Charanpal Dhanjal February 2012 Abstract Another Python Graph Library (APGL) is a graph library written using pure Python, NumPy and SciPy. Users new to the library can gain an
More informationAny two nodes which are connected by an edge in a graph are called adjacent node.
. iscuss following. Graph graph G consist of a non empty set V called the set of nodes (points, vertices) of the graph, a set which is the set of edges and a mapping from the set of edges to a set of pairs
More informationPlanar Tree Transformation: Results and Counterexample
Planar Tree Transformation: Results and Counterexample Selim G Akl, Kamrul Islam, and Henk Meijer School of Computing, Queen s University Kingston, Ontario, Canada K7L 3N6 Abstract We consider the problem
More informationSocial Network Mining
Social Network Mining Data Mining November 11, 2013 Frank Takes (ftakes@liacs.nl) LIACS, Universiteit Leiden Overview Social Network Analysis Graph Mining Online Social Networks Friendship Graph Semantics
More informationGraph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis. Contents. Introduction. Maarten van Steen. Version: April 28, 2014
Graph Theory and Complex Networks: An Introduction Maarten van Steen VU Amsterdam, Dept. Computer Science Room R.0, steen@cs.vu.nl Chapter 0: Version: April 8, 0 / Contents Chapter Description 0: Introduction
More informationData Clustering. Dec 2nd, 2013 Kyrylo Bessonov
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms kmeans Hierarchical Main
More informationAlgorithm Design and Analysis Homework #6 Due: 1pm, Monday, January 9, === Homework submission instructions ===
Algorithm Design and Analysis Homework #6 Due: 1pm, Monday, January 9, 2012 === Homework submission instructions === Submit the answers for writing problems (including your programming report) through
More informationLecture Notes on Spanning Trees
Lecture Notes on Spanning Trees 15122: Principles of Imperative Computation Frank Pfenning Lecture 26 April 26, 2011 1 Introduction In this lecture we introduce graphs. Graphs provide a uniform model
More informationThank you! NetMine Data mining on networks IIS 0209107 AWSOM. Outline. Proposed method. Goals
NetMine Data mining on networks IIS 0209107 Christos Faloutsos (CMU) Michalis Faloutsos (UCR) Peggy Agouris George Kollios Fillia Makedon Betty Salzberg Anthony Stefanidis Thank you! NSFIDM 04 C. Faloutsos
More informationUnit 4: Layout Compaction
Unit 4: Layout Compaction Course contents Design rules Symbolic layout Constraintgraph compaction Readings: Chapter 6 Unit 4 1 Design rules: restrictions on the mask patterns to increase the probability
More informationSociology and CS. Small World. Sociology Problems. Degree of Separation. Milgram s Experiment. How close are people connected? (Problem Understanding)
Sociology Problems Sociology and CS Problem 1 How close are people connected? Small World Philip Chan Problem 2 Connector How close are people connected? (Problem Understanding) Small World Are people
More informationDmitri Krioukov CAIDA/UCSD
Hyperbolic geometry of complex networks Dmitri Krioukov CAIDA/UCSD dima@caida.org F. Papadopoulos, M. Boguñá, A. Vahdat, and kc claffy Complex networks Technological Internet Transportation Power grid
More informationA New Marketing Channel Management Strategy Based on Frequent Subtree Mining
A New Marketing Channel Management Strategy Based on Frequent Subtree Mining Daoping Wang Peng Gao School of Economics and Management University of Science and Technology Beijing ABSTRACT For most manufacturers,
More informationKrishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA213 : DATA STRUCTURES USING C
Tutorial#1 Q 1: Explain the terms data, elementary item, entity, primary key, domain, attribute and information? Also give examples in support of your answer? Q 2: What is a Data Type? Differentiate
More informationData Mining Cluster Analysis: Advanced Concepts and Algorithms. ref. Chapter 9. Introduction to Data Mining
Data Mining Cluster Analysis: Advanced Concepts and Algorithms ref. Chapter 9 Introduction to Data Mining by Tan, Steinbach, Kumar 1 Outline Prototypebased Fuzzy cmeans Mixture Model Clustering Densitybased
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationSocial Networks and Social Media
Social Networks and Social Media Social Media: ManytoMany Social Networking Content Sharing Social Media Blogs Microblogging Wiki Forum 2 Characteristics of Social Media Consumers become Producers Rich
More informationNetwork (Tree) Topology Inference Based on Prüfer Sequence
Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,
More informationIE 680 Special Topics in Production Systems: Networks, Routing and Logistics*
IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti
More information4 Basics of Trees. Petr Hliněný, FI MU Brno 1 FI: MA010: Trees and Forests
4 Basics of Trees Trees, actually acyclic connected simple graphs, are among the simplest graph classes. Despite their simplicity, they still have rich structure and many useful application, such as in
More informationSolutions to Exercises 8
Discrete Mathematics Lent 2009 MA210 Solutions to Exercises 8 (1) Suppose that G is a graph in which every vertex has degree at least k, where k 1, and in which every cycle contains at least 4 vertices.
More informationGeneral Network Analysis: Graphtheoretic. COMP572 Fall 2009
General Network Analysis: Graphtheoretic Techniques COMP572 Fall 2009 Networks (aka Graphs) A network is a set of vertices, or nodes, and edges that connect pairs of vertices Example: a network with 5
More informationHierarchical Clustering. Clustering Overview
lustering Overview Last lecture What is clustering Partitional algorithms: Kmeans Today s lecture Hierarchical algorithms ensitybased algorithms: SN Techniques for clustering large databases IRH UR ata
More informationA Fast Algorithm For Finding Hamilton Cycles
A Fast Algorithm For Finding Hamilton Cycles by Andrew Chalaturnyk A thesis presented to the University of Manitoba in partial fulfillment of the requirements for the degree of Masters of Science in Computer
More informationAttacking Anonymized Social Network
Attacking Anonymized Social Network From: Wherefore Art Thou RX3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography Presented By: Machigar Ongtang (Ongtang@cse.psu.edu ) Social
More informationDynamic Programming. Applies when the following Principle of Optimality
Dynamic Programming Applies when the following Principle of Optimality holds: In an optimal sequence of decisions or choices, each subsequence must be optimal. Translation: There s a recursive solution.
More informationTrees and Fundamental Circuits
Trees and Fundamental Circuits Tree A connected graph without any circuits. o must have at least one vertex. o definition implies that it must be a simple graph. o only finite trees are being considered
More informationFUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 3448 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
More informationOPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION
OPTIMAL DESIGN OF DISTRIBUTED SENSOR NETWORKS FOR FIELD RECONSTRUCTION Sérgio Pequito, Stephen Kruzick, Soummya Kar, José M. F. Moura, A. Pedro Aguiar Department of Electrical and Computer Engineering
More informationLecture 9. 1 Introduction. 2 Random Walks in Graphs. 1.1 How To Explore a Graph? CS621 Theory Gems October 17, 2012
CS62 Theory Gems October 7, 202 Lecture 9 Lecturer: Aleksander Mądry Scribes: Dorina Thanou, Xiaowen Dong Introduction Over the next couple of lectures, our focus will be on graphs. Graphs are one of
More informationLecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs
CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like
More informationHome Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit
Data Structures Page 1 of 24 A.1. Arrays (Vectors) nelement vector start address + ielementsize 0 +1 +2 +3 +4... +n1 start address continuous memory block static, if size is known at compile time dynamic,
More informationData Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
More informationA Brief Survey on Anonymization Techniques for Privacy Preserving Publishing of Social Network Data
A Brief Survey on Anonymization Techniques for Privacy Preserving Publishing of Social Network Data Bin Zhou School of Computing Science Simon Fraser University, Canada bzhou@cs.sfu.ca Jian Pei School
More informationThree Effective TopDown Clustering Algorithms for Location Database Systems
Three Effective TopDown Clustering Algorithms for Location Database Systems KwangJo Lee and SungBong Yang Department of Computer Science, Yonsei University, Seoul, Republic of Korea {kjlee5435, yang}@cs.yonsei.ac.kr
More informationMinimum Spanning Trees
Minimum Spanning Trees Algorithms and 18.304 Presentation Outline 1 Graph Terminology Minimum Spanning Trees 2 3 Outline Graph Terminology Minimum Spanning Trees 1 Graph Terminology Minimum Spanning Trees
More information12 Abstract Data Types
12 Abstract Data Types 12.1 Source: Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: Define the concept of an abstract data type (ADT).
More informationSPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
More informationTheorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path.
Chapter 3 Trees Section 3. Fundamental Properties of Trees Suppose your city is planning to construct a rapid rail system. They want to construct the most economical system possible that will meet the
More informationGraph Theory. Directed and undirected graphs make useful mental models for many situations. These objects are loosely defined as follows:
Graph Theory Directed and undirected graphs make useful mental models for many situations. These objects are loosely defined as follows: Definition An undirected graph is a (finite) set of nodes, some
More informationGraphs and Network Flows IE411 Lecture 1
Graphs and Network Flows IE411 Lecture 1 Dr. Ted Ralphs IE411 Lecture 1 1 References for Today s Lecture Required reading Sections 17.1, 19.1 References AMO Chapter 1 and Section 2.1 and 2.2 IE411 Lecture
More informationRanking on Data Manifolds
Ranking on Data Manifolds Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany {firstname.secondname
More informationNetwork Analysis and Visualization of Staphylococcus aureus. by Russ Gibson
Network Analysis and Visualization of Staphylococcus aureus by Russ Gibson Network analysis Based on graph theory Probabilistic models (random graphs) developed by Erdős and Rényi in 1959 Theory and tools
More informationDetection of local affinity patterns in big data
Detection of local affinity patterns in big data Andrea Marinoni, Paolo Gamba Department of Electronics, University of Pavia, Italy Abstract Mining information in Big Data requires to design a new class
More informationTHE UNIVERSITY OF AUCKLAND
COMPSCI 369 THE UNIVERSITY OF AUCKLAND FIRST SEMESTER, 2011 MIDSEMESTER TEST Campus: City COMPUTER SCIENCE Computational Science (Time allowed: 50 minutes) NOTE: Attempt all questions Use of calculators
More informationLABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING. Changsheng Liu 10302014
LABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING Changsheng Liu 10302014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
More informationGraph Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Graph Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 3. Topic Overview Definitions and Representation Minimum
More informationCOLORED GRAPHS AND THEIR PROPERTIES
COLORED GRAPHS AND THEIR PROPERTIES BEN STEVENS 1. Introduction This paper is concerned with the upper bound on the chromatic number for graphs of maximum vertex degree under three different sets of coloring
More informationCanonical Forms for Labeled Trees and Their Applications in Frequent Subtree Mining
Under consideration for publication in Knowledge and Information Systems anonical Forms for Labeled Trees and Their pplications in Frequent Subtree Mining Yun hi, Yirong Yang, and Richard R. Muntz epartment
More informationWhy graph clustering is useful?
Graph Clustering Why graph clustering is useful? Distance matrices are graphs as useful as any other clustering Identification of communities in social networks Webpage clustering for better data management
More information! Solve problem to optimality. ! Solve problem in polytime. ! Solve arbitrary instances of the problem. #approximation algorithm.
Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NPhard problem What should I do? A Theory says you're unlikely to find a polytime algorithm Must sacrifice one of three
More informationDiameter and Treewidth in MinorClosed Graph Families, Revisited
Algorithmica manuscript No. (will be inserted by the editor) Diameter and Treewidth in MinorClosed Graph Families, Revisited Erik D. Demaine, MohammadTaghi Hajiaghayi MIT Computer Science and Artificial
More informationADIMinebio: A Graph Mining Algorithm for Biomedical Data
ADIMinebio: A Graph Mining Algorithm for Biomedical Data Rodrigo de Sousa Gomide 1, Cristina Dutra de Aguiar Ciferri 2, Ricardo Rodrigues Ciferri 3, Marina Teresa Pires Vieira 4 1 Goiano Federal Institute
More informationB490 Mining the Big Data. 2 Clustering
B490 Mining the Big Data 2 Clustering Qin Zhang 11 Motivations Group together similar documents/webpages/images/people/proteins/products One of the most important problems in machine learning, pattern
More informationLecture 10: Regression Trees
Lecture 10: Regression Trees 36350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,
More informationGraph Theory. Introduction. Distance in Graphs. Trees. Isabela Drămnesc UVT. Computer Science Department, West University of Timişoara, Romania
Graph Theory Introduction. Distance in Graphs. Trees Isabela Drămnesc UVT Computer Science Department, West University of Timişoara, Romania November 2016 Isabela Drămnesc UVT Graph Theory and Combinatorics
More information