Messagepassing sequential detection of multiple change points in networks


 Vincent Todd
 2 years ago
 Views:
Transcription
1 Messagepassing sequential detection of multiple change points in networks Long Nguyen, Arash Amini Ram Rajagopal University of Michigan Stanford University ISIT, Boston, July 2012 Nguyen/Amini/Rajagopal 1
2 Sequential changepoint detection Quickest detection of change in distribution of a sequence of data data collected sequentially over time tradeoff between false alarm rate and detection delay time extensions to decentralized network with a fusion center classical setting involves only one single change point variable Nguyen/Amini/Rajagopal 2
3 Sequential changepoint detection Quickest detection of change in distribution of a sequence of data data collected sequentially over time tradeoff between false alarm rate and detection delay time extensions to decentralized network with a fusion center classical setting involves only one single change point variable We study problems requiring detection of multiple change points in multiple sequences across network sites multiple change points are statistically dependent need to borrow information across network sites no fusion center needs messagepassing type algorithm new elements of modeling and asymptotic theory Nguyen/Amini/Rajagopal 2a
4 Example Simultaneous traffic monitoring Problem: detecting in realtime potential hotspots in a traffic network data are sequences of measurements of traffic volume at multiple sites sequential change point detection for each site Nguyen/Amini/Rajagopal 3
5 Sequential detection for single change point network site j collects sequence of data X n j for n = 1, 2,... time λ j N is change point variable for site j data are i.i.d. according to density g before the change point; and i.i.d. according to f after Nguyen/Amini/Rajagopal 4
6 Sequential detection for single change point network site j collects sequence of data X n j for n = 1, 2,... time λ j N is change point variable for site j data are i.i.d. according to density g before the change point; and i.i.d. according to f after a sequential change point detection procedure is a stopping time τ j, i.e., {τ j n} σ(x [n] j ) Nguyen/Amini/Rajagopal 4a
7 Sequential detection for single change point network site j collects sequence of data X n j for n = 1, 2,... time λ j N is change point variable for site j data are i.i.d. according to density g before the change point; and i.i.d. according to f after a sequential change point detection procedure is a stopping time τ j, i.e., {τ j n} σ(x [n] j ) NeymanPearson criterion: constraint on false alarm error PFA(τ j ) = P(τ j < λ j ) α for some small α minimum detection delay E[(τ j λ j ) τ j λ j ]. Nguyen/Amini/Rajagopal 4b
8 Optimal rule for single change point detection taking a Bayesian approach, λ j is endowed with a prior under some conditions, optimal sequential rule obtained by thresholding the posterior of λ j : (Shiryaev, 1978) where τ j = inf{n : Λ n 1 α}, Λ n = P(λ j n X [n] j ). wellestablished asymptotic properties (Tartakovsky & Veeravalli, 2006): false alarm: PFA(τ j ) α. detection delay: D(τ j ) = «log α 1 + o(1) I j + d as α 0. here I j = KL(f j g j ), the KullbackLeibler information, constant d depends on the prior Nguyen/Amini/Rajagopal 5
9 Extensions to network setting. survey paper by Tsitsiklis (1993) decentralized sequential detection: Veeravalli, Basar and Poor (1993), Mei (2008), Nguyen, Wainwright and Jordan (2008) sequential change diagnosis: Dayanik, Goulding and Poor (2008) multiple sequence change point detection: Xie and Siegmund (2010) sequential detection of a markov process: Raghavan and Veeravalli (2010)... Nguyen/Amini/Rajagopal 6
10 Talk outline statistical formulation for sequential detection of multiple change points in a network setting probabilistic graphical models extension of sequential analysis to multiple change point variables sequential and realtime messagepassing detection algorithms decision procedures with limited data and computation asymptotic theory characterizing detection delay and algorithm convergence roles of graphical models in asymptotic analysis Nguyen/Amini/Rajagopal 7
11 Graphical models for multiple change points m network sites labeled by U = {1,...,m} given a graph G = (U, E) that specifies the the connections among u U each site j experiences a change at time λ j N λ j is endowed with (independent) prior distribution π j Nguyen/Amini/Rajagopal 8
12 Graphical models for multiple change points m network sites labeled by U = {1,...,m} given a graph G = (U, E) that specifies the the connections among u U each site j experiences a change at time λ j N λ j is endowed with (independent) prior distribution π j there may be private data sequence (X n j ) n 1 for site j private data sequence changes its distribution after λ j Nguyen/Amini/Rajagopal 8a
13 Graphical models for multiple change points m network sites labeled by U = {1,...,m} given a graph G = (U, E) that specifies the the connections among u U each site j experiences a change at time λ j N λ j is endowed with (independent) prior distribution π j there may be private data sequence (X n j ) n 1 for site j private data sequence changes its distribution after λ j there is shared data sequence (X n ij ) n 1 for each edge e = (i, j) connecting neighboring pair of sites j and i: X n ij iid g ij ( ), for n < λ ij := min(λ i, λ j ) iid f ij ( ), for n λ ij = min(λ i, λ j ) Nguyen/Amini/Rajagopal 8b
14 Graphical model of change points and data sequences X 23 λ 3 λ 1 λ 3 λ 1 λ 2 λ 2 X 24 X 45 λ 4 λ 5 X 12 λ 4 λ 5 (a) Topology of sensor network (b) Graphical model of random variables Joint distribution of change points and observed network data at time n: P(λ, X n ) = π j (λ j ) P(X n j λ j ) P(X n ij λ i λ j ) j V j V (ij) E Star notations: λ := (λ 1,...,λ m ), X n = (X n 1,...,X n m). Change point variables are statistically dependent a posteriori! Nguyen/Amini/Rajagopal 9
15 Minfunctional of change points λ 3 X 23 λ 1 λ 2 λ 2 X 24 X 45 λ 1 λ 3 λ 4 λ 5 X 12 λ 4 λ 5 Let S be a subset of network sites. Define the earliest change point among any sites in S: λ S := min u S λ j. Question: what is the optimal stopping rule τ S for estimating λ S? τ S σ(x [n] ). Nguyen/Amini/Rajagopal 10
16 A natural rule is by thresholding the posterior probability: τ S = inf{n : P(min λ j n X [n] ) 1 α}, u S for small α > 0. Nguyen/Amini/Rajagopal 11
17 A natural rule is by thresholding the posterior probability: for small α > 0. τ S = inf{n : P(min λ j n X [n] ) 1 α}, u S This rule is suboptimal (unlike the single change point case, which is optimal under some conditions on the prior). But it will be shown to be asymptotically optimal and computationally tractable. Nguyen/Amini/Rajagopal 11a
18 Messagepassing distributed computation via sumproduct algorithm: the issue to compute posterior probabilities, assuming that data and statistical messages can be only be passed through the graphical structure: P(λ S n X [n] ) 1 α}. X 23 λ 3 λ 1 λ 2 λ 3 m n 12 λ 1 λ 2 m n 32 X 24 X 45 m n 24 λ 4 λ 5 (a) Topology of sensor network X 12 λ 4 λ 5 m n 45 (b) Messagepassing in network Simple to implement via an adaptation of the sumproduct algorithm Computational complexity. When G is a tree, the computational complexity of the message passing algorithm at each time step n is O(( V + E )n), but linearity in n is not desirable. Nguyen/Amini/Rajagopal 12
19 Meanfield approximation. Define latent binary variables Z n j = I(λ j n). Compute P(Z n X [n] ) in terms of P(Z n X [n 1] ) by Bayes rule. Decoupling approximation: As n gets large, due to concentration, the variables Zj n become decoupled across the graph. So, approximate: P(Z n X [n 1] ) j V P(Z n j X [n 1] ) In effect, we have avoided marginalization over time at every time step, resulting in O(1) computational complexity in n. Theorem 1. Both exact messagepassing algorithm and meanfield approximation algorithm construct a Markov sequence of posterior probabilities that obey a contraction map. This entails that both sequences converge to 1 almost surely. Nguyen/Amini/Rajagopal 13
20 Approximation of posterior paths, n P(λ j n X [n] ) MP APPROX MP APPROX MP APPROX MP APPROX MP APPROX MP APPROX Nguyen/Amini/Rajagopal 14
21 Main Theorem (optimal delay theorem). Assume that (a) The change points λ j are endowed with independent geometric priors. (b) The likelihood ratio functions are bounded from above. Then the proposed stopping rule τ S satisfies: (i) False alarm rate: P(τ S λ S ) α. (ii) The expected delay is asymptotically optimal, and takes the form: E[(τ S min u S λ j) τ S min u S λ j] = log α (1 + o(1)) d + I j +. I ij j S (ij) E S } {{ } I λs Here, I j = f i log(f j /g j ), and I ij = f ij log(f ij /g ij ). Nguyen/Amini/Rajagopal 15
22 Graphbased KullbackLeibler information... X 23 λ 3 λ 1 λ 3 λ 1 λ 2 λ 2 X 24 X 45 λ 4 λ 5 X 12 λ 4 λ 5 If S = {1}, then I λs = I 1 If S = {1, 2}, then I λs = I 1 + I 2 + I 12 If S = {1, 2, 3}, then I λs = I 1 + I 2 + I 3 + I 12 + I 23. Nguyen/Amini/Rajagopal 16
23 Concentration inequalities for marginal LRs For φ = min u S λ j, define marginal likelihood ratio D k,n φ := D k φ(x n ) := Pk φ (Xn ) P φ (Xn ), where P k φ denotes P( φ = k). Define conditional prior probability π k φ (m ) := P(λ = m φ = k). By a general result of Tartakovski & Veeravalli (2006), if 1 ] Pφ[ k N max log 1 n N Dk φ(x k+n N ) (1 + ε)i φ 0 (1) for all (small) ε > 0 and all k N, then the lower bound follows, inf eτ φ (α) E [ τ φ τ φ ] log α ( ) q φ +I φ 1 + o(1). Nguyen/Amini/Rajagopal 17
24 Furthermore, let T k ε := sup { n N : By TartakovskiVeeravalli (2006), if one has 1 } n log Dk φ(x k+n 1 ) < I φ ε. E T φ ε := P(φ = k) E k φ(tε k ) <, (2) k=1 for all (small) ε > 0, then the upper bound follows, that is, E[τ S φ log α τ S φ] q φ +I φ (1 + o(1)). Both conditions (1) and (2) can be deduced from an elaborate form of concentration inequality for the marginal likelihood ratio. Nguyen/Amini/Rajagopal 18
25 Key concentration lemma. Denote by P m λ the conditional probability P( λ = m ). Assume that for all m N d in the support of πφ k( ), { P m 1 } λ n log Dk φ(x n ) I φ > ε q(n) exp( c 1 nε 2 ) (3) for all n N and ε (0, ε 0 ) such that n 1 ε 2 p2 (m, k), where p( ) and q( ) are polynomials with nonnegative coefficients, both P(φ = ) and P(λ j φ = k) have finite polynomial moments. Then the optimal delay Theorem holds. Nguyen/Amini/Rajagopal 19
26 Probabilistic calculus of ǫequivalence Definition. Consider two sequences {a n } and {b n } of random variables, where a n = a n (k) and b n = b n (k) could depend on a common parameter k N. The two sequences are called asymptotically εequivalent as n, under {P m λ : m supp(πφ k )}, and denoted a n ε bn, if there exist polynomials p( ) and q( ) (with constant nonnegative coefficients), and ε 0 > 0, such that for all m supp(πφ k ), we have P m λ ( a n b n ε) 1 q(n)e c 1nε 2 for all n N and ε (0, ε 0 ) satisfying nε p(m, k). Nguyen/Amini/Rajagopal 20
27 By union bound and algebraic manipulation, we obtain the following rules: 1. a n ε bn implies a n Cε b n for C > 0 and αa n ε αbn for α R. 2. a n ε bn and b n ε cn implies a n ε cn. (Transitivity) 3. a n ε bn and c n ε dn implies a n ± c n ε bn ± d n. 4. a n ε bn implies max{a n, c n } ε max{b n, c n }. 5. a n ε bn, c n ε 1 and {bn } bounded implies a n c n ε b n. 6. a n ε a > 0 and bn ε b < 0 implies max{an, b n } ε a. 7. log summax inequality for positive sequences {a n } and {b n }: n 1 log(a n + b n ) ε max{n 1 log a n, n 1 log b n }. Based on this calculus we can deduce the ǫequivalence of the marginal likelihood ratio from the ǫequivalence of the likelihood ratios defined on individual sites and edges of neighboring sites. Nguyen/Amini/Rajagopal 21
28 Plots of the slope 1 log α E[τ S φ S τ S φ S ] for star network of (1,2,3,4) centering at j = 1 SINGLE MP APPROX LIMIT j = 2 SINGLE MP APPROX LIMIT e = (1,2) SINGLE MP APPROX LIMIT e = (3,2) SINGLE MP APPROX LIMIT Nguyen/Amini/Rajagopal 22
29 Summary decentralized sequential detection of multiple change points model, algorithm and asymptotic theory needed to go beyond single change point setting new statistical formulation drawing from: classical sequential analysis probabilistic graphical models (Bayes nets) introduced a messagepassing sequential detection algorithm, exploiting the benefit of network information asymptotic theory for analyzing false alarm rates and detection delay Nguyen/Amini/Rajagopal 23
30 for more detail, see A. Amini and X. Nguyen. Sequential detection of multiple change points: A graphical models approach. Technical report, Department of Statistics, Univ of Michigan, See also: R. Rajagopal, X. Nguyen, S.C. Ergen and P. Varaiya. Simultaneous sequential detection of multiple interacting faults. Nguyen/Amini/Rajagopal 24
INFORMATION from a single node (entity) can reach other
1 Network Infusion to Infer Information Sources in Networks Soheil Feizi, Muriel Médard, Gerald Quon, Manolis Kellis, and Ken Duffy arxiv:166.7383v1 [cs.si] 23 Jun 216 Abstract Several significant models
More informationThe Basics of Graphical Models
The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures
More informationThe CUSUM algorithm a small review. Pierre Granjon
The CUSUM algorithm a small review Pierre Granjon June, 1 Contents 1 The CUSUM algorithm 1.1 Algorithm............................... 1.1.1 The problem......................... 1.1. The different steps......................
More informationx if x 0, x if x < 0.
Chapter 3 Sequences In this chapter, we discuss sequences. We say what it means for a sequence to converge, and define the limit of a convergent sequence. We begin with some preliminary results about the
More informationStationary random graphs on Z with prescribed iid degrees and finite mean connections
Stationary random graphs on Z with prescribed iid degrees and finite mean connections Maria Deijfen Johan Jonasson February 2006 Abstract Let F be a probability distribution with support on the nonnegative
More informationDistributed Detection Systems. Hamidreza Ahmadi
Channel Estimation Error in Distributed Detection Systems Hamidreza Ahmadi Outline Detection Theory NeymanPearson Method Classical l Distributed ib t Detection ti Fusion and local sensor rules Channel
More informationModern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem
More information5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1
5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1 General Integer Linear Program: (ILP) min c T x Ax b x 0 integer Assumption: A, b integer The integrality condition
More informationIntroduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multiclass classification.
More informationBasics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More informationIntroduction to Detection Theory
Introduction to Detection Theory Reading: Ch. 3 in KayII. Notes by Prof. Don Johnson on detection theory, see http://www.ece.rice.edu/~dhj/courses/elec531/notes5.pdf. Ch. 10 in Wasserman. EE 527, Detection
More informationCourse: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.
More informationA Practical Scheme for Wireless Network Operation
A Practical Scheme for Wireless Network Operation Radhika Gowaikar, Amir F. Dana, Babak Hassibi, Michelle Effros June 21, 2004 Abstract In many problems in wireline networks, it is known that achieving
More informationLecture 4: BK inequality 27th August and 6th September, 2007
CSL866: Percolation and Random Graphs IIT Delhi Amitabha Bagchi Scribe: Arindam Pal Lecture 4: BK inequality 27th August and 6th September, 2007 4. Preliminaries The FKG inequality allows us to lower bound
More informationLecture 13: Martingales
Lecture 13: Martingales 1. Definition of a Martingale 1.1 Filtrations 1.2 Definition of a martingale and its basic properties 1.3 Sums of independent random variables and related models 1.4 Products of
More informationL4: Bayesian Decision Theory
L4: Bayesian Decision Theory Likelihood ratio test Probability of error Bayes risk Bayes, MAP and ML criteria Multiclass problems Discriminant functions CSCE 666 Pattern Analysis Ricardo GutierrezOsuna
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationHypothesis Testing. 1 Introduction. 2 Hypotheses. 2.1 Null and Alternative Hypotheses. 2.2 Simple vs. Composite. 2.3 OneSided and TwoSided Tests
Hypothesis Testing 1 Introduction This document is a simple tutorial on hypothesis testing. It presents the basic concepts and definitions as well as some frequently asked questions associated with hypothesis
More information1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let
Copyright c 2009 by Karl Sigman 1 Stopping Times 1.1 Stopping Times: Definition Given a stochastic process X = {X n : n 0}, a random time τ is a discrete random variable on the same probability space as
More informationA mixture model for random graphs
A mixture model for random graphs JJ Daudin, F. Picard, S. Robin robin@inapg.inra.fr UMR INAPG / ENGREF / INRA, Paris Mathématique et Informatique Appliquées Examples of networks. Social: Biological:
More informationChristfried Webers. Canberra February June 2015
c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic
More informationRandomization Approaches for Network Revenue Management with Customer Choice Behavior
Randomization Approaches for Network Revenue Management with Customer Choice Behavior Sumit Kunnumkal Indian School of Business, Gachibowli, Hyderabad, 500032, India sumit kunnumkal@isb.edu March 9, 2011
More informationModeling and Performance Evaluation of Computer Systems Security Operation 1
Modeling and Performance Evaluation of Computer Systems Security Operation 1 D. Guster 2 St.Cloud State University 3 N.K. Krivulin 4 St.Petersburg State University 5 Abstract A model of computer system
More informationProbabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
More informationRandom graphs with a given degree sequence
Sourav Chatterjee (NYU) Persi Diaconis (Stanford) Allan Sly (Microsoft) Let G be an undirected simple graph on n vertices. Let d 1,..., d n be the degrees of the vertices of G arranged in descending order.
More informationMarkov random fields and Gibbs measures
Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).
More informationFuzzy Probability Distributions in Bayesian Analysis
Fuzzy Probability Distributions in Bayesian Analysis Reinhard Viertl and Owat Sunanta Department of Statistics and Probability Theory Vienna University of Technology, Vienna, Austria Corresponding author:
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
More informationNo: 10 04. Bilkent University. Monotonic Extension. Farhad Husseinov. Discussion Papers. Department of Economics
No: 10 04 Bilkent University Monotonic Extension Farhad Husseinov Discussion Papers Department of Economics The Discussion Papers of the Department of Economics are intended to make the initial results
More information61. REARRANGEMENTS 119
61. REARRANGEMENTS 119 61. Rearrangements Here the difference between conditionally and absolutely convergent series is further refined through the concept of rearrangement. Definition 15. (Rearrangement)
More informationQuickest Detection in Multiple ON OFF Processes Qing Zhao, Senior Member, IEEE, and Jia Ye
5994 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 12, DECEMBER 2010 Quickest Detection in Multiple ON OFF Processes Qing Zhao, Senior Member, IEEE, and Jia Ye Abstract We consider the quickest
More informationTHE DYING FIBONACCI TREE. 1. Introduction. Consider a tree with two types of nodes, say A and B, and the following properties:
THE DYING FIBONACCI TREE BERNHARD GITTENBERGER 1. Introduction Consider a tree with two types of nodes, say A and B, and the following properties: 1. Let the root be of type A.. Each node of type A produces
More informationSection 3 Sequences and Limits, Continued.
Section 3 Sequences and Limits, Continued. Lemma 3.6 Let {a n } n N be a convergent sequence for which a n 0 for all n N and it α 0. Then there exists N N such that for all n N. α a n 3 α In particular
More informationLinear Models for Classification
Linear Models for Classification Sumeet Agarwal, EEL709 (Most figures from Bishop, PRML) Approaches to classification Discriminant function: Directly assigns each data point x to a particular class Ci
More informationNotes on metric spaces
Notes on metric spaces 1 Introduction The purpose of these notes is to quickly review some of the basic concepts from Real Analysis, Metric Spaces and some related results that will be used in this course.
More informationOn Decentralized Detection with Partial Information Sharing among Sensors
On Decentralized Detection with Partial Information Sharing among Sensors The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation
More informationI. Pointwise convergence
MATH 40  NOTES Sequences of functions Pointwise and Uniform Convergence Fall 2005 Previously, we have studied sequences of real numbers. Now we discuss the topic of sequences of real valued functions.
More informationA crash course in probability and Naïve Bayes classification
Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s
More informationProximal mapping via network optimization
L. Vandenberghe EE236C (Spring 234) Proximal mapping via network optimization minimum cut and maximum flow problems parametric minimum cut problem application to proximal mapping Introduction this lecture:
More informationSome stability results of parameter identification in a jump diffusion model
Some stability results of parameter identification in a jump diffusion model D. Düvelmeyer Technische Universität Chemnitz, Fakultät für Mathematik, 09107 Chemnitz, Germany Abstract In this paper we discuss
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,
More informationDiscrete Optimization
Discrete Optimization [Chen, Batson, Dang: Applied integer Programming] Chapter 3 and 4.14.3 by Johan Högdahl and Victoria Svedberg Seminar 2, 20150331 Todays presentation Chapter 3 Transforms using
More informationCHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e.
CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e. This chapter contains the beginnings of the most important, and probably the most subtle, notion in mathematical analysis, i.e.,
More informationP (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i )
Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =
More informationMATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 20092016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
More informationRandom access protocols for channel access. Markov chains and their stability. Laurent Massoulié.
Random access protocols for channel access Markov chains and their stability laurent.massoulie@inria.fr Aloha: the first random access protocol for channel access [Abramson, Hawaii 70] Goal: allow machines
More informationLECTURE 4. Last time: Lecture outline
LECTURE 4 Last time: Types of convergence Weak Law of Large Numbers Strong Law of Large Numbers Asymptotic Equipartition Property Lecture outline Stochastic processes Markov chains Entropy rate Random
More information2.3 Convex Constrained Optimization Problems
42 CHAPTER 2. FUNDAMENTAL CONCEPTS IN CONVEX OPTIMIZATION Theorem 15 Let f : R n R and h : R R. Consider g(x) = h(f(x)) for all x R n. The function g is convex if either of the following two conditions
More informationALMOST COMMON PRIORS 1. INTRODUCTION
ALMOST COMMON PRIORS ZIV HELLMAN ABSTRACT. What happens when priors are not common? We introduce a measure for how far a type space is from having a common prior, which we term prior distance. If a type
More informationSupplement to Call Centers with Delay Information: Models and Insights
Supplement to Call Centers with Delay Information: Models and Insights Oualid Jouini 1 Zeynep Akşin 2 Yves Dallery 1 1 Laboratoire Genie Industriel, Ecole Centrale Paris, Grande Voie des Vignes, 92290
More informationNonlinear Optimization: Algorithms 3: Interiorpoint methods
Nonlinear Optimization: Algorithms 3: Interiorpoint methods INSEAD, Spring 2006 JeanPhilippe Vert Ecole des Mines de Paris JeanPhilippe.Vert@mines.org Nonlinear optimization c 2006 JeanPhilippe Vert,
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationA Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails
12th International Congress on Insurance: Mathematics and Economics July 1618, 2008 A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails XUEMIAO HAO (Based on a joint
More informationWeek 1: Introduction to Online Learning
Week 1: Introduction to Online Learning 1 Introduction This is written based on Prediction, Learning, and Games (ISBN: 2184189 / 2184189 CesaBianchi, Nicolo; Lugosi, Gabor 1.1 A Gentle Start Consider
More informationarxiv:1408.3610v1 [math.pr] 15 Aug 2014
PageRank in scalefree random graphs Ningyuan Chen, Nelly Litvak 2, Mariana OlveraCravioto Columbia University, 500 W. 20th Street, 3rd floor, New York, NY 0027 2 University of Twente, P.O.Box 27, 7500AE,
More informationThis article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE/ACM TRANSACTIONS ON NETWORKING 1 A Greedy Link Scheduler for Wireless Networks With Gaussian MultipleAccess and Broadcast Channels Arun Sridharan, Student Member, IEEE, C Emre Koksal, Member, IEEE,
More informationApproximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques. My T. Thai
Approximation Algorithms: LP Relaxation, Rounding, and Randomized Rounding Techniques My T. Thai 1 Overview An overview of LP relaxation and rounding method is as follows: 1. Formulate an optimization
More informationINDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS
INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationLet H and J be as in the above lemma. The result of the lemma shows that the integral
Let and be as in the above lemma. The result of the lemma shows that the integral ( f(x, y)dy) dx is well defined; we denote it by f(x, y)dydx. By symmetry, also the integral ( f(x, y)dx) dy is well defined;
More informationInterpretation of Somers D under four simple models
Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms
More informationMonte Carlo Simulation
1 Monte Carlo Simulation Stefan Weber Leibniz Universität Hannover email: sweber@stochastik.unihannover.de web: www.stochastik.unihannover.de/ sweber Monte Carlo Simulation 2 Quantifying and Hedging
More informationLecture 7: Approximation via Randomized Rounding
Lecture 7: Approximation via Randomized Rounding Often LPs return a fractional solution where the solution x, which is supposed to be in {0, } n, is in [0, ] n instead. There is a generic way of obtaining
More informationA Network Flow Approach in Cloud Computing
1 A Network Flow Approach in Cloud Computing Soheil Feizi, Amy Zhang, Muriel Médard RLE at MIT Abstract In this paper, by using network flow principles, we propose algorithms to address various challenges
More informationMath 115 Spring 2011 Written Homework 5 Solutions
. Evaluate each series. a) 4 7 0... 55 Math 5 Spring 0 Written Homework 5 Solutions Solution: We note that the associated sequence, 4, 7, 0,..., 55 appears to be an arithmetic sequence. If the sequence
More informationWellSeparated Pair Decomposition for the Unitdisk Graph Metric and its Applications
WellSeparated Pair Decomposition for the Unitdisk Graph Metric and its Applications Jie Gao Department of Computer Science Stanford University Joint work with Li Zhang Systems Research Center HewlettPackard
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationClassification  Examples
Lecture 2 Scheduling 1 Classification  Examples 1 r j C max given: n jobs with processing times p 1,...,p n and release dates r 1,...,r n jobs have to be scheduled without preemption on one machine taking
More informationLECTURE 15: AMERICAN OPTIONS
LECTURE 15: AMERICAN OPTIONS 1. Introduction All of the options that we have considered thus far have been of the European variety: exercise is permitted only at the termination of the contract. These
More informationTutorial on variational approximation methods. Tommi S. Jaakkola MIT AI Lab
Tutorial on variational approximation methods Tommi S. Jaakkola MIT AI Lab tommi@ai.mit.edu Tutorial topics A bit of history Examples of variational methods A brief intro to graphical models Variational
More informationSHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH
31 Kragujevac J. Math. 25 (2003) 31 49. SHARP BOUNDS FOR THE SUM OF THE SQUARES OF THE DEGREES OF A GRAPH Kinkar Ch. Das Department of Mathematics, Indian Institute of Technology, Kharagpur 721302, W.B.,
More informationThe Chinese Restaurant Process
COS 597C: Bayesian nonparametrics Lecturer: David Blei Lecture # 1 Scribes: Peter Frazier, Indraneel Mukherjee September 21, 2007 In this first lecture, we begin by introducing the Chinese Restaurant Process.
More informationImperfect Debugging in Software Reliability
Imperfect Debugging in Software Reliability Tevfik Aktekin and Toros Caglar University of New Hampshire Peter T. Paul College of Business and Economics Department of Decision Sciences and United Health
More informationClass #6: Nonlinear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Nonlinear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Nonlinear classification Linear Support Vector Machines
More informationMonte Carlobased statistical methods (MASM11/FMS091)
Monte Carlobased statistical methods (MASM11/FMS091) Jimmy Olsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February 5, 2013 J. Olsson Monte Carlobased
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 4, 2011 Raquel Urtasun and Tamir Hazan (TTIC) Graphical Models April 4, 2011 1 / 22 Bayesian Networks and independences
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationRECURSIVE TREES. Michael Drmota. Institue of Discrete Mathematics and Geometry Vienna University of Technology, A 1040 Wien, Austria
RECURSIVE TREES Michael Drmota Institue of Discrete Mathematics and Geometry Vienna University of Technology, A 040 Wien, Austria michael.drmota@tuwien.ac.at www.dmg.tuwien.ac.at/drmota/ 2008 Enrage Topical
More informationFalse Discovery Rates
False Discovery Rates John D. Storey Princeton University, Princeton, USA January 2010 Multiple Hypothesis Testing In hypothesis testing, statistical significance is typically based on calculations involving
More informationarxiv:1112.0829v1 [math.pr] 5 Dec 2011
How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly
More informationSingle machine parallel batch scheduling with unbounded capacity
Workshop on Combinatorics and Graph Theory 21th, April, 2006 Nankai University Single machine parallel batch scheduling with unbounded capacity Yuan Jinjiang Department of mathematics, Zhengzhou University
More informationSet theory, and set operations
1 Set theory, and set operations Sayan Mukherjee Motivation It goes without saying that a Bayesian statistician should know probability theory in depth to model. Measure theory is not needed unless we
More informationQUICKEST MULTIDECISION ABRUPT CHANGE DETECTION WITH SOME APPLICATIONS TO NETWORK MONITORING
QUICKEST MULTIDECISION ABRUPT CHANGE DETECTION WITH SOME APPLICATIONS TO NETWORK MONITORING I. Nikiforov Université de Technologie de Troyes, UTT/ICD/LM2S, UMR 6281, CNRS 12, rue Marie Curie, CS 42060
More informationA HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA: A FIRST REPORT
New Mathematics and Natural Computation Vol. 1, No. 2 (2005) 295 303 c World Scientific Publishing Company A HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA:
More informationTo discuss this topic fully, let us define some terms used in this and the following sets of supplemental notes.
INFINITE SERIES SERIES AND PARTIAL SUMS What if we wanted to sum up the terms of this sequence, how many terms would I have to use? 1, 2, 3,... 10,...? Well, we could start creating sums of a finite number
More informationDiscuss the size of the instance for the minimum spanning tree problem.
3.1 Algorithm complexity The algorithms A, B are given. The former has complexity O(n 2 ), the latter O(2 n ), where n is the size of the instance. Let n A 0 be the size of the largest instance that can
More informationNear Optimal Solutions
Near Optimal Solutions Many important optimization problems are lacking efficient solutions. NPComplete problems unlikely to have polynomial time solutions. Good heuristics important for such problems.
More informationONLINE DEGREEBOUNDED STEINER NETWORK DESIGN. Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015
ONLINE DEGREEBOUNDED STEINER NETWORK DESIGN Sina Dehghani Saeed Seddighin Ali Shafahi Fall 2015 ONLINE STEINER FOREST PROBLEM An initially given graph G. s 1 s 2 A sequence of demands (s i, t i ) arriving
More informationNetwork Security A Decision and GameTheoretic Approach
Network Security A Decision and GameTheoretic Approach Tansu Alpcan Deutsche Telekom Laboratories, Technical University of Berlin, Germany and Tamer Ba ar University of Illinois at UrbanaChampaign, USA
More informationTransfer Learning! and Prior Estimation! for VC Classes!
15859(B) Machine Learning Theory! Transfer Learning! and Prior Estimation! for VC Classes! Liu Yang! 04/25/2012! Liu Yang 2012 1 Notation! Instance space X = R n! Concept space C of classifiers h: X >
More informationIntroduction to Markov Chain Monte Carlo
Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem
More informationVERTICES OF GIVEN DEGREE IN SERIESPARALLEL GRAPHS
VERTICES OF GIVEN DEGREE IN SERIESPARALLEL GRAPHS MICHAEL DRMOTA, OMER GIMENEZ, AND MARC NOY Abstract. We show that the number of vertices of a given degree k in several kinds of seriesparallel labelled
More informationCompression algorithm for Bayesian network modeling of binary systems
Compression algorithm for Bayesian network modeling of binary systems I. Tien & A. Der Kiureghian University of California, Berkeley ABSTRACT: A Bayesian network (BN) is a useful tool for analyzing the
More informationWE consider a decentralized detection problem and study
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 59, NO. 4, APRIL 2011 1759 On Decentralized Detection With Partial Information Sharing Among Sensors O. Patrick Kreidl, John N. Tsitsiklis, Fellow, IEEE, and
More informationOffline sorting buffers on Line
Offline sorting buffers on Line Rohit Khandekar 1 and Vinayaka Pandit 2 1 University of Waterloo, ON, Canada. email: rkhandekar@gmail.com 2 IBM India Research Lab, New Delhi. email: pvinayak@in.ibm.com
More informationMonte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)
Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February
More informationBayesian logistic betting strategy against probability forecasting. Akimichi Takemura, Univ. Tokyo. November 12, 2012
Bayesian logistic betting strategy against probability forecasting Akimichi Takemura, Univ. Tokyo (joint with Masayuki Kumon, Jing Li and Kei Takeuchi) November 12, 2012 arxiv:1204.3496. To appear in Stochastic
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 14)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 14) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationInfluences in lowdegree polynomials
Influences in lowdegree polynomials Artūrs Bačkurs December 12, 2012 1 Introduction In 3] it is conjectured that every bounded real polynomial has a highly influential variable The conjecture is known
More information