A mixture model for random graphs
1 A mixture model for random graphs
J-J Daudin, F. Picard, S. Robin
UMR INA-PG / ENGREF / INRA, Paris, Mathématique et Informatique Appliquées
Examples of networks.
Social: who knows who?
Biological: which protein interacts with which?
Internet: connection between servers or web pages.

2 Random graphs
Notation and definition. Given a set of n vertices (i = 1..n), $X_{ij}$ indicates the presence or absence of a (non-oriented) edge between vertices i and j: $X_{ij} = X_{ji} = \mathbb{1}\{i \leftrightarrow j\}$, $X_{ii} = 0$. The random graph is defined by the joint distribution of all the $\{X_{ij}\}_{i,j}$.
Typical characteristics.
Degree (connectivity) of the vertices: $K_i = \sum_{j \neq i} X_{ij}$.
Clustering coefficient: $c = \Pr\{X_{jk} = 1 \mid X_{ij} = X_{ik} = 1\}$.
Diameter: length of the longest shortest path between two vertices.

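The characteristics above can be computed directly from an adjacency matrix. The following is a minimal sketch in plain Python, assuming the graph is stored as a symmetric 0/1 list of lists with a zero diagonal; the function names and the toy graph are illustrative and not part of the original slides.

```python
from itertools import combinations

def degrees(adj):
    """K_i = sum_{j != i} X_ij (the diagonal is assumed to be zero)."""
    return [sum(row) for row in adj]

def clustering_coefficient(adj):
    """Empirical c: share of pairs of edges i-j, i-k that are closed by an edge j-k."""
    closed = total = 0
    for row in adj:
        neighbours = [j for j, x in enumerate(row) if x]
        for j, k in combinations(neighbours, 2):
            total += 1
            closed += adj[j][k]
    return closed / total if total else float("nan")

# Toy graph (illustrative): a triangle 0-1-2 plus a pendant vertex 3 attached to 0.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(degrees(adj))                 # [3, 2, 2, 1]
print(clustering_coefficient(adj))  # 0.6 (3 closed pairs out of 5)
```

The clustering coefficient here is the global ratio of closed to total pairs of adjacent edges, which matches the conditional probability in the definition; other conventions (for example averaging per-vertex ratios) exist and give different values.
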
3 Erdős-Rényi (ER) model
Definition. The $\{X_{ij}\}_{i,j}$ are i.i.d.: $X_{ij} \sim \mathcal{B}(p)$.
Characteristics.
Degree: $K_i \sim \mathcal{B}(n-1, p) \approx \mathcal{P}(\lambda)$, with $\lambda = (n-1)p$.
Clustering coefficient: $c = p$.
Drawback. The ER model fits many real-world networks poorly. Empirical degree distributions often differ markedly from the Poisson distribution because a few vertices have very high degrees. Empirical clustering coefficients are generally higher than expected under ER.

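The two ER characteristics can be checked by simulation. The sketch below draws one ER graph and compares the mean degree with (n-1)p and the empirical clustering coefficient with p; the values of n and p, the seed, and the function names are illustrative assumptions.

```python
import random
from itertools import combinations

def simulate_er(n, p, rng):
    """Draw a symmetric 0/1 adjacency matrix with i.i.d. Bernoulli(p) edges."""
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i][j] = adj[j][i] = 1
    return adj

def clustering_coefficient(adj):
    """Share of pairs of edges sharing a vertex that are closed into a triangle."""
    closed = total = 0
    for row in adj:
        neighbours = [j for j, x in enumerate(row) if x]
        for j, k in combinations(neighbours, 2):
            total += 1
            closed += adj[j][k]
    return closed / total

rng = random.Random(0)   # fixed seed, for reproducibility only
n, p = 200, 0.05         # illustrative values
adj = simulate_er(n, p, rng)
mean_degree = sum(map(sum, adj)) / n
c_hat = clustering_coefficient(adj)
print(f"mean degree {mean_degree:.2f} vs (n-1)p = {(n - 1) * p:.2f}")
print(f"clustering  {c_hat:.3f} vs p = {p}")
```

Both quantities should land close to their theoretical values, up to sampling noise.
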
4 Erdős-Rényi mixture for graphs (ERMG)
An explicit random graph model.
Mixture population of vertices. We suppose that the vertices belong to Q groups: $\alpha_q = \Pr\{i \in q\}$, $Z_{iq} = \mathbb{1}\{i \in q\}$.
Conditional distribution of the edges. The edges $\{X_{ij}\}$ are conditionally independent given the groups of the vertices: $X_{ij} \mid \{i \in q, j \in l\} \sim \mathcal{B}(\pi_{ql})$.
$\pi_{ql} = \pi_{lq}$ is the connection probability between groups q and l. A high value of $\pi_{ql}$ reveals preferential connectivity between groups q and l.

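The two-stage definition above translates into a short generative procedure: draw a group for each vertex, then draw each edge given the two groups. The sketch below is one way to write it; the parameter values (two groups with strong within-group connectivity) are hypothetical and chosen only for illustration.

```python
import random

def simulate_ermg(n, alpha, pi, rng):
    """Draw group labels with proportions alpha, then edges X_ij ~ B(pi[q][l])."""
    groups = rng.choices(range(len(alpha)), weights=alpha, k=n)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < pi[groups[i]][groups[j]]:
                adj[i][j] = adj[j][i] = 1   # non-oriented edge
    return groups, adj

# Hypothetical parameters: two equally likely groups, dense within, sparse between.
alpha = [0.5, 0.5]
pi = [[0.8, 0.05],
      [0.05, 0.8]]
groups, adj = simulate_ermg(60, alpha, pi, random.Random(1))
print(sum(map(sum, adj)) // 2, "edges")
```

Setting Q = 1 (a single entry in alpha and pi) recovers the ER model.
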
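5 Some properties of the ERMG model
Conditional distribution of the degrees: $K_i \mid \{i \in q\} \sim \mathcal{B}(n-1, \bar{\pi}_q) \approx \mathcal{P}(\lambda_q)$, where $\bar{\pi}_q = \sum_l \alpha_l \pi_{ql}$ and $\lambda_q = (n-1)\bar{\pi}_q$.
Marginal distribution of the degrees: we get a Poisson mixture, $K_i \sim \sum_q \alpha_q \mathcal{B}(n-1, \bar{\pi}_q) \approx \sum_q \alpha_q \mathcal{P}(\lambda_q)$.
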
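As a worked illustration of the Poisson mixture, the sketch below evaluates the approximate degree distribution for given proportions and connection probabilities. The parameter values are hypothetical; the function assumes every group has a strictly positive mean connection probability so that the logarithm is defined.

```python
import math

def degree_mixture_pmf(k, n, alpha, pi):
    """Pr{K_i = k} under the Poisson-mixture approximation of the ERMG degrees."""
    total = 0.0
    for q, a_q in enumerate(alpha):
        # Mean connection probability of group q: sum_l alpha_l * pi_ql
        pi_bar_q = sum(a_l * pi[q][l] for l, a_l in enumerate(alpha))
        lam_q = (n - 1) * pi_bar_q
        # Poisson pmf computed on the log scale to avoid overflow for large k
        total += a_q * math.exp(-lam_q + k * math.log(lam_q) - math.lgamma(k + 1))
    return total

# Hypothetical parameters, for illustration only.
n = 50
alpha = [0.3, 0.7]
pi = [[0.5, 0.1],
      [0.1, 0.2]]
pmf = [degree_mixture_pmf(k, n, alpha, pi) for k in range(n)]
mean = sum(k * p_k for k, p_k in enumerate(pmf))
print(round(sum(pmf), 6), round(mean, 3))
```

With these values the group means are 49 x 0.22 = 10.78 and 49 x 0.17 = 8.33, so the mixture mean is 0.3 x 10.78 + 0.7 x 8.33 = 9.065.
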
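6 Between-group connectivity. $A_{ql}$ denotes the connectivity between groups q and l: $A_{ql} = \sum_{i<j} Z_{iq} Z_{jl} X_{ij}$. In the ERMG model, its expectation is $\mathbb{E}(A_{ql}) = \frac{n(n-1)}{2} \alpha_q \alpha_l \pi_{ql}$.
Clustering coefficient: $c = \Pr\{\nabla \cap V\} / \Pr\{V\} = \Pr\{\nabla\} / \Pr\{V\}$, where $V$ denotes two edges sharing a vertex ($X_{ij} = X_{ik} = 1$) and $\nabla$ a triangle ($X_{ij} = X_{ik} = X_{jk} = 1$). In the ERMG model, we get
$c = \frac{\sum_{q,l,m} \alpha_q \alpha_l \alpha_m \pi_{ql} \pi_{qm} \pi_{lm}}{\sum_{q,l,m} \alpha_q \alpha_l \alpha_m \pi_{ql} \pi_{qm}}$.
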
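The ratio of triple sums for the clustering coefficient is easy to evaluate numerically. The sketch below implements it as written and checks two special cases: a single group, where it reduces to p, and a two-cluster model with equal proportions; the function name and the numerical values are illustrative.

```python
def ermg_clustering(alpha, pi):
    """c = sum alpha_q alpha_l alpha_m pi_ql pi_qm pi_lm / sum alpha_q alpha_l alpha_m pi_ql pi_qm."""
    groups = range(len(alpha))
    num = den = 0.0
    for q in groups:
        for l in groups:
            for m in groups:
                weight = alpha[q] * alpha[l] * alpha[m]
                v_shape = pi[q][l] * pi[q][m]      # vertex in q linked to vertices in l and m
                den += weight * v_shape
                num += weight * v_shape * pi[l][m]  # ... and the third edge closes the triangle
    return num / den

# Single group: the ER model, c = p.
print(ermg_clustering([1.0], [[0.3]]))

# Two clusters with equal proportions (illustrative eps).
eps = 0.1
print(ermg_clustering([0.5, 0.5], [[1.0, eps], [eps, 1.0]]))
```

For the two-cluster case the sums can be done by hand and give (1 + 3 eps^2) / (1 + eps)^2, which the code reproduces.
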
7 Independent model
The absence of preferential connection between groups corresponds to the case where $\pi_{ql} = \eta_q \eta_l$.
Distribution of degrees: $K_i \mid \{i \in q\} \approx \mathcal{P}(\lambda_q)$, where $\lambda_q = (n-1)\eta_q \bar{\eta}$ and $\bar{\eta} = \sum_l \alpha_l \eta_l$.
Between-group connectivity: $\mathbb{E}(A_{ql}) = n(n-1)(\alpha_q \eta_q)(\alpha_l \eta_l)/2$.
Clustering coefficient: $c = \left(\sum_q \alpha_q \eta_q^2\right)^2 / \bar{\eta}^2$.
The ER model corresponds to $Q = 1$, $\alpha_1 = 1$, $\bar{\eta} = \eta_1 = \sqrt{p}$, so we get the known result: $c = \eta_1^4 / \eta_1^2 = p$.

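The clustering coefficient of the independent model follows from the general ERMG ratio of triple sums by substituting the product form of the connection probabilities. The short derivation below spells out that step; it writes $\bar{\eta}$ for $\sum_l \alpha_l \eta_l$ and introduces no other assumption.

```latex
\begin{align*}
\sum_{q,l,m} \alpha_q \alpha_l \alpha_m \, \pi_{ql}\pi_{qm}\pi_{lm}
  &= \sum_{q,l,m} \alpha_q \alpha_l \alpha_m \, \eta_q^2 \eta_l^2 \eta_m^2
   = \Big(\sum_q \alpha_q \eta_q^2\Big)^{3}, \\
\sum_{q,l,m} \alpha_q \alpha_l \alpha_m \, \pi_{ql}\pi_{qm}
  &= \sum_q \alpha_q \eta_q^2 \, \Big(\sum_l \alpha_l \eta_l\Big)\Big(\sum_m \alpha_m \eta_m\Big)
   = \Big(\sum_q \alpha_q \eta_q^2\Big)\, \bar{\eta}^{2}, \\
c &= \frac{\big(\sum_q \alpha_q \eta_q^2\big)^{3}}{\big(\sum_q \alpha_q \eta_q^2\big)\,\bar{\eta}^{2}}
   = \frac{\big(\sum_q \alpha_q \eta_q^2\big)^{2}}{\bar{\eta}^{2}}.
\end{align*}
```

With a single group, $\pi_{11} = \eta_1^2 = p$, and the last line gives $c = \eta_1^4 / \eta_1^2 = p$.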
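8 Examples
For each network: number of groups Q, connection matrix π, clustering coefficient c (the closed forms for c correspond to equal group proportions).
Random: Q = 1, $\pi = p$, $c = p$.
Independent model (product connectivity): Q = 2, $\pi = \begin{pmatrix} a^2 & ab \\ ab & b^2 \end{pmatrix}$, $c = \frac{(a^2 + b^2)^2}{(a + b)^2}$.
Stars: Q = 4, π and c [values missing].
Clusters (affiliation networks): Q = 2, $\pi = \begin{pmatrix} 1 & \varepsilon \\ \varepsilon & 1 \end{pmatrix}$, $c = \frac{1 + 3\varepsilon^2}{(1 + \varepsilon)^2}$.
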
9 Scale-free network model (Barabasi & Albert, 99)
The network is built iteratively: the i-th vertex joining the network connects to one of the (i-1) preceding ones with probability proportional to their current degree (busy gets busier): $\forall j < i$, $\Pr\{i \leftrightarrow j\} \propto K_j^{i}$, where $K_j^{i}$ is the degree of vertex j when vertex i joins.
The limit marginal distribution of the degrees is then scale-free: $p(k) \propto k^{-3}$.
Analogous modeling with the independent ERMG. At time q, $n_q = n\alpha_q$ vertices join the network. They preferentially connect to the oldest vertices: $\pi_{ql} = \eta_q \eta_l$, with $\eta_1 \geq \eta_2 \geq \dots \geq \eta_Q$.
The rate at which the $\{\eta_q\}$ decrease determines the tail of the degree distribution.

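The iterative construction can be sketched in a few lines. The version below attaches each new vertex to exactly one earlier vertex, chosen with probability proportional to its current degree; starting from a single edge between vertices 0 and 1 is an assumption of this sketch, since the slide does not specify the initial graph.

```python
import random

def preferential_attachment(n, rng):
    """Grow a graph where vertex i links to one earlier vertex j with prob. proportional to K_j."""
    degree = [1, 1]        # assumed seed graph: one edge between vertices 0 and 1
    edges = [(0, 1)]
    for i in range(2, n):
        j = rng.choices(range(i), weights=degree)[0]   # busy gets busier
        edges.append((j, i))
        degree[j] += 1
        degree.append(1)   # the newcomer starts with its single edge
    return edges, degree

edges, degree = preferential_attachment(2000, random.Random(0))
print("mean degree:", sum(degree) / len(degree))
print("max degree: ", max(degree))
```

The mean degree stays close to 2 while a few early vertices collect many edges, which is the heavy tail that the Poisson distribution of the ER model cannot reproduce.
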
10 Maximum likelihood estimation via E-M
We denote $X = \{X_{ij}\}_{i,j=1..n}$, $Z = \{Z_{iq}\}_{i=1..n,\,q=1..Q}$.
Likelihood. The conditional expectation of the complete-data log-likelihood is
$\mathcal{Q}(X) = \mathbb{E}\{\mathcal{L}(X, Z) \mid X\} = \sum_i \sum_q \tau_{iq} \log \alpha_q + \sum_i \sum_q \sum_{j>i} \sum_l \theta_{ijql} \log b(X_{ij}; \pi_{ql})$,
where $b(\cdot\,; \pi)$ is the Bernoulli probability function and $\tau_{iq}$ and $\theta_{ijql}$ are posterior probabilities: $\tau_{iq} = \Pr\{Z_{iq} = 1 \mid X\}$, $\theta_{ijql} = \Pr\{Z_{iq} Z_{jl} = 1 \mid X\}$.
Evaluating these probabilities is not straightforward because the $\{Z_{iq}\}$ are all dependent conditionally on $X$.

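11 E step. We approximate the conditional joint distribution of the $\{Z_{iq}\}$: $\Pr\{Z \mid X\} \approx \prod_i \Pr\{Z_i \mid X, Z^{i}\}$, where $Z^{i}$ denotes the group memberships of all vertices other than i and
$\Pr\{Z_{iq} = 1 \mid X, Z^{i}\} \propto \alpha_q \prod_m b(C_{im}; N_m^{i}, \pi_{qm})$,
with $C_{im}$ the number of edges between vertex i and group m, $N_m^{i}$ the number of vertices of group m other than i, and $b(\cdot\,; N, \pi)$ the binomial probability function.
The elements of $Z^{i}$ are estimated by their conditional expectation: $\hat{Z}_{jl} = \tau_{jl}$. The posterior probabilities $\tau_{iq}$ must therefore satisfy $\tau_{iq} = \Pr\{Z_{iq} = 1 \mid X, \hat{Z}^{i}\}$, which is a fixed-point relation. The $\tau_{iq}$ are obtained by iterating it.
M step. Maximizing $\mathcal{Q}(X)$ subject to $\sum_q \alpha_q = 1$ gives
$\alpha_q = \sum_i \tau_{iq} / n$, $\pi_{ql} = \sum_i \sum_{j \neq i} \theta_{ijql} X_{ij} \big/ \sum_i \sum_{j \neq i} \theta_{ijql}$.
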
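The alternation of the two steps can be sketched as follows. This is not the authors' implementation: it assumes the mean-field form of the fixed-point relation, $\tau_{iq} \propto \alpha_q \prod_{j \neq i} \prod_l b(X_{ij}; \pi_{ql})^{\tau_{jl}}$, and the factorisation $\theta_{ijql} \approx \tau_{iq}\tau_{jl}$ in the M step, both of which stand in for the binomial approximation on the slide. The function names, the random initialisation and the toy graph are hypothetical.

```python
import math
import random

EPS = 1e-10  # numerical guard against log(0)

def e_step(adj, alpha, pi, tau, n_iter=10):
    """Iterate the fixed-point relation on tau (mean-field form, an assumption)."""
    n, Q = len(adj), len(alpha)
    def clip(p):
        return min(max(p, EPS), 1 - EPS)
    log1 = [[math.log(clip(pi[q][l])) for l in range(Q)] for q in range(Q)]      # log pi_ql
    log0 = [[math.log(1 - clip(pi[q][l])) for l in range(Q)] for q in range(Q)]  # log(1 - pi_ql)
    for _ in range(n_iter):
        new_tau = []
        for i in range(n):
            logw = []
            for q in range(Q):
                s = math.log(max(alpha[q], EPS))
                for j in range(n):
                    if j != i:
                        logs = log1[q] if adj[i][j] else log0[q]
                        s += sum(tau[j][l] * logs[l] for l in range(Q))
                logw.append(s)
            top = max(logw)                       # subtract the max before exponentiating
            w = [math.exp(v - top) for v in logw]
            new_tau.append([v / sum(w) for v in w])
        tau = new_tau
    return tau

def m_step(adj, tau):
    """alpha_q = sum_i tau_iq / n; pi_ql = weighted edge frequency with theta ~ tau_iq * tau_jl."""
    n, Q = len(adj), len(tau[0])
    alpha = [sum(t[q] for t in tau) / n for q in range(Q)]
    pi = [[0.0] * Q for _ in range(Q)]
    for q in range(Q):
        for l in range(Q):
            num = den = 0.0
            for i in range(n):
                for j in range(n):
                    if i != j:
                        theta = tau[i][q] * tau[j][l]
                        num += theta * adj[i][j]
                        den += theta
            pi[q][l] = num / den if den > 0 else 0.0
    return alpha, pi

def fit_ermg(adj, Q, n_em=20, seed=0):
    """Alternate M and E steps from a random soft initialisation (hypothetical driver)."""
    rng = random.Random(seed)
    tau = []
    for _ in adj:
        w = [rng.random() + 0.1 for _ in range(Q)]
        tau.append([v / sum(w) for v in w])
    for _ in range(n_em):
        alpha, pi = m_step(adj, tau)
        tau = e_step(adj, alpha, pi, tau)
    return alpha, pi, tau

# Toy graph (illustrative): two disjoint cliques of 5 vertices each.
n = 10
adj = [[int(i != j and (i < 5) == (j < 5)) for j in range(n)] for i in range(n)]
alpha, pi, tau = fit_ermg(adj, Q=2)
print([round(a, 2) for a in alpha])
```

Like any E-M scheme, this can stop at a local optimum, so in practice several initialisations would be compared; the synchronous update of all $\tau_{iq}$ within one pass is a design choice of this sketch.
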
12 Choice of the number of groups
We propose a heuristic penalized likelihood criterion inspired by BIC. $\mathcal{Q}(X)$ is the sum of
$\sum_i \sum_q \tau_{iq} \log \alpha_q$, which deals with $(Q-1)$ independent proportions $\alpha_q$ and involves n terms, and
$\sum_i \sum_q \sum_{j>i} \sum_l \theta_{ijql} \log b(X_{ij}; \pi_{ql})$, which deals with $Q(Q+1)/2$ probabilities $\pi_{ql}$ and involves $n(n-1)/2$ terms.
We therefore propose the following heuristic criterion:
$-2\mathcal{Q}(X) + (Q-1) \log n + \frac{Q(Q+1)}{2} \log\left[\frac{n(n-1)}{2}\right]$.

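The criterion is a one-line computation once the value of the expected complete-data log-likelihood is available for each candidate number of groups. The sketch below assumes the criterion is to be minimised, which is how its sign convention reads; the log-likelihood values in the example are hypothetical placeholders, not results from the slides.

```python
import math

def pseudo_bic(q_value, n, Q):
    """-2 Q(X) + (Q - 1) log n + Q(Q + 1)/2 * log[n(n - 1)/2]."""
    return (-2.0 * q_value
            + (Q - 1) * math.log(n)                          # Q - 1 free proportions, n terms
            + Q * (Q + 1) / 2 * math.log(n * (n - 1) / 2))   # Q(Q+1)/2 probabilities, n(n-1)/2 terms

# Hypothetical values of Q(X) for Q = 1..4 on a graph with n = 100 vertices.
q_values = {1: -310.0, 2: -270.0, 3: -250.0, 4: -248.0}
scores = {Q: pseudo_bic(v, 100, Q) for Q, v in q_values.items()}
best_Q = min(scores, key=scores.get)
print(best_Q, {Q: round(s, 1) for Q, s in scores.items()})
```

With these placeholder values the small gain from 3 to 4 groups does not offset the extra penalty, so the criterion settles on 3 groups.
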
13 Application to Karate Club Data
n = 34 members (vertices) of a karate club.
Two members are connected if they have social interactions (apart from their sports activity): 156 edges.
This dataset (Zachary, 77) has been intensively studied in the literature, generally with Q = 4 groups.
Parameter estimates. α (%), π (%), λ [values missing]
Clustering coefficient. The ERMG model gives [value missing] while the empirical c is [value missing].

14 Dot-plot representation of the graph.
A dot means $X_{ij} = 1$. The vertices are re-ordered according to their mean group number: $\bar{q}_i = \sum_q q\,\tau_{iq}$.
[Figure: posterior probabilities $\tau_{iq}$]

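The re-ordering used for the dot plot can be sketched as below, assuming the posterior probabilities are stored as one row per vertex with groups numbered from 1; the function names and the small example are illustrative.

```python
def mean_group_order(tau):
    """Return vertex indices sorted by mean group number sum_q q * tau_iq (q = 1..Q)."""
    mean_group = [sum((q + 1) * t for q, t in enumerate(row)) for row in tau]
    order = sorted(range(len(tau)), key=lambda i: mean_group[i])
    return order, mean_group

def reorder(adj, order):
    """Apply the same permutation to the rows and columns of the adjacency matrix."""
    return [[adj[i][j] for j in order] for i in order]

# Illustrative posterior probabilities for 3 vertices and 2 groups.
tau = [[0.1, 0.9],
       [0.8, 0.2],
       [0.5, 0.5]]
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
order, mean_group = mean_group_order(tau)
print(order)                # [1, 2, 0]
print(reorder(adj, order))  # [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
```

Vertices whose posterior mass sits on the same group end up adjacent, so blocks of high connectivity appear as dense squares in the plot.
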
15 Interpretation of the groups
Group 1: 2 persons, including the administrator, strongly connected with group 4, but not with groups 2 and 3;
Group 2: 3 persons, including the instructor, strongly connected with group 3, but not with groups 1 and 4;
Group 3: 13 ordinary members, connected with the instructor;
Group 4: 16 ordinary members, connected with the administrator.
End of the story. The instructor (group 2) eventually left the club and started another one with about half of the members (corresponding to group 3?).

16 Selection of the number of groups.
The pseudo-BIC actually selects Q = 6 groups.
Comparison with the 4-group model. Former groups 1 and 4 are conserved. Former groups 2 and 3 are each divided into two new groups.
We do not know whether the new club lasted very long.
[Figure: posterior probabilities $\tau_{iq}$]

17 Application to E. coli reaction network
n = 605 vertices (reactions) and [number missing] edges.
Two reactions i and j are connected if the product of i is the substrate of j (or conversely).
Data provided by V. Lacroix and M.-F. Sagot (INRIA Hélix).
Number of groups. Pseudo-BIC selects Q = 21.
Group proportions. $\alpha_q$ (%) [values missing]
Many small groups actually correspond to cliques or pseudo-cliques.

18 Dot-plot representation of the graph.
Biological interpretation: each of groups 1 to 20 gathers reactions that all involve the same compound, either as a substrate or as a product. A compound (pyruvate, ATP, etc.) can be associated with each group.
[Figure: posterior probabilities $\tau_{iq}$]

19 Zoom (bottom left).
Submatrix of π, indexed by q and l [values missing].
Vertex degrees $K_i$. Mean degree in the last group: $\bar{K}_{21}$ = [value missing]

20 Distribution of the degree.
According to the ERMG, the degrees have a Poisson mixture distribution.
[Figure: histogram with mixture distribution; P-P plot]
Clustering coefficient. Empirical, ERMG (Q = 6), ERMG (Q = 21), ER (Q = 1) [values missing]

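21 Reaction graph.
[Figure: graph of the 21 groups; each node is labelled with the group number and, in parentheses, the group size]
1 (4), 2 (6), 3 (7), 4 (8), 5 (8), 6 (9), 7 (9), 8 (10), 9 (11), 10 (11), 11 (12), 12 (13), 13 (14), 14 (15), 15 (16), 16 (17), 17 (18), 18 (18), 19 (19), 20 (35), 21 (345)
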
22 Conclusions
Past.
The ERMG model is a flexible generalization of the ER model and a promising alternative to the scale-free model.
It seems to fit several real-world networks well.
It is properly defined, so its properties can be properly studied.
Future.
Study the probabilistic properties of the ERMG model (diameter, probability for a subgraph to be connected, etc.).
Derive a relevant criterion to select the number of groups.
Extension to valued graphs: $X_{ij}$ not only 0/1, but some measure of the connection intensity.
More informationData Mining. Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based
More informationPUBLIC TRANSPORT SYSTEMS IN POLAND: FROM BIAŁYSTOK TO ZIELONA GÓRA BY BUS AND TRAM USING UNIVERSAL STATISTICS OF COMPLEX NETWORKS
Vol. 36 (2005) ACTA PHYSICA POLONICA B No 5 PUBLIC TRANSPORT SYSTEMS IN POLAND: FROM BIAŁYSTOK TO ZIELONA GÓRA BY BUS AND TRAM USING UNIVERSAL STATISTICS OF COMPLEX NETWORKS Julian Sienkiewicz and Janusz
More informationImproving Experiments by Optimal Blocking: Minimizing the Maximum Within-block Distance
Improving Experiments by Optimal Blocking: Minimizing the Maximum Within-block Distance Michael J. Higgins Jasjeet Sekhon April 12, 2014 EGAP XI A New Blocking Method A new blocking method with nice theoretical
More informationFinding and counting given length cycles
Finding and counting given length cycles Noga Alon Raphael Yuster Uri Zwick Abstract We present an assortment of methods for finding and counting simple cycles of a given length in directed and undirected
More informationOnline Model-Based Clustering for Crisis Identification in Distributed Computing
Online Model-Based Clustering for Crisis Identification in Distributed Computing Dawn Woodard School of Operations Research and Information Engineering & Dept. of Statistical Science, Cornell University
More informationU = x 1 2. 1 x 1 4. 2 x 1 4. What are the equilibrium relative prices of the three goods? traders has members who are best off?
Chapter 7 General Equilibrium Exercise 7. Suppose there are 00 traders in a market all of whom behave as price takers. Suppose there are three goods and the traders own initially the following quantities:
More informationConstrained Bayes and Empirical Bayes Estimator Applications in Insurance Pricing
Communications for Statistical Applications and Methods 2013, Vol 20, No 4, 321 327 DOI: http://dxdoiorg/105351/csam2013204321 Constrained Bayes and Empirical Bayes Estimator Applications in Insurance
More information