Set theory, and set operations

Similar documents
Extension of measure

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

How To Find Out How To Calculate A Premeasure On A Set Of Two-Dimensional Algebra

INTRODUCTORY SET THEORY

Mathematics for Econometrics, Fourth Edition

1. Prove that the empty set is a subset of every set.

Elements of probability theory

MEASURE AND INTEGRATION. Dietmar A. Salamon ETH Zürich

The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1].

Lecture Note 1 Set and Probability Theory. MIT Spring 2006 Herman Bennett

Lecture 7: Continuous Random Variables

DISINTEGRATION OF MEASURES

Notes on metric spaces

Probability Theory. Florian Herzog. A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T..

Mathematical Methods of Engineering Analysis

Math/Stats 425 Introduction to Probability. 1. Uncertainty and the axioms of probability

Introduction to Probability

Lecture Notes on Measure Theory and Functional Analysis

The Chinese Restaurant Process

Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011

Basic Probability Concepts

MA651 Topology. Lecture 6. Separation Axioms.

What is Statistics? Lecture 1. Introduction and probability review. Idea of parametric inference

1 if 1 x 0 1 if 0 x 1

E3: PROBABILITY AND STATISTICS lecture notes

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Basic Measure Theory. 1.1 Classes of Sets

9 More on differentiation

SOLUTIONS TO EXERCISES FOR. MATHEMATICS 205A Part 3. Spaces with special properties

x a x 2 (1 + x 2 ) n.

M2S1 Lecture Notes. G. A. Young ayoung

God created the integers and the rest is the work of man. (Leopold Kronecker, in an after-dinner speech at a conference, Berlin, 1886)

Lecture 13: Martingales

Metric Spaces. Chapter 1

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

and s n (x) f(x) for all x and s.t. s n is measurable if f is. REAL ANALYSIS Measures. A (positive) measure on a measurable space

The Ergodic Theorem and randomness

SOLUTIONS TO ASSIGNMENT 1 MATH 576

RANDOM INTERVAL HOMEOMORPHISMS. MICHA L MISIUREWICZ Indiana University Purdue University Indianapolis

THE CENTRAL LIMIT THEOREM TORONTO

CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e.

Basics of Statistical Machine Learning

Omega Automata: Minimization and Learning 1

Stationary random graphs on Z with prescribed iid degrees and finite mean connections

Probability: Theory and Examples. Rick Durrett. Edition 4.1, April 21, 2013

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

The Basics of Graphical Models

FUNCTIONAL ANALYSIS LECTURE NOTES: QUOTIENT SPACES

An example of a computable

Metric Spaces Joseph Muscat 2003 (Last revised May 2009)

Lecture 13 - Basic Number Theory.

The Henstock-Kurzweil-Stieltjes type integral for real functions on a fractal subset of the real line

Probability and statistics; Rehearsal for pattern recognition

Chapter 4 Lecture Notes

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

GENERIC COMPUTABILITY, TURING DEGREES, AND ASYMPTOTIC DENSITY

1 Norms and Vector Spaces

Random variables, probability distributions, binomial random variable

IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS INTRODUCTION

I. Pointwise convergence

Lecture 2: Universality

Separation Properties for Locally Convex Cones

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

MAS108 Probability I

A Tutorial on Probability Theory

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan

Lecture 1 Introduction Properties of Probability Methods of Enumeration Asrat Temesgen Stockholm University

STAT 315: HOW TO CHOOSE A DISTRIBUTION FOR A RANDOM VARIABLE

Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics

INTRODUCTION TO TOPOLOGY

Statistics in Geophysics: Introduction and Probability Theory

Gambling Systems and Multiplication-Invariant Measures

Lebesgue Measure on R n

Probability: Terminology and Examples Class 2, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

CHAPTER 2 Estimating Probabilities

University of Miskolc

Notes on Continuous Random Variables

THE BANACH CONTRACTION PRINCIPLE. Contents

No: Bilkent University. Monotonic Extension. Farhad Husseinov. Discussion Papers. Department of Economics

CHAPTER 1 BASIC TOPOLOGY

Columbia University in the City of New York New York, N.Y

CS 3719 (Theory of Computation and Algorithms) Lecture 4

Chapter 4. Probability and Probability Distributions

Introduction to Markov Chain Monte Carlo

For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )

MATHEMATICAL METHODS OF STATISTICS

DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS

Introduction to Topology

TOPOLOGY: THE JOURNEY INTO SEPARATION AXIOMS

Follow links for Class Use and other Permissions. For more information send to:

Georg Cantor ( ):

Duality of linear conic problems

How To Prove The Dirichlet Unit Theorem

1 = (a 0 + b 0 α) (a m 1 + b m 1 α) 2. for certain elements a 0,..., a m 1, b 0,..., b m 1 of F. Multiplying out, we obtain

SOME APPLICATIONS OF MARTINGALES TO PROBABILITY THEORY

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

Point Set Topology. A. Topological Spaces and Continuous Maps

So let us begin our quest to find the holy grail of real analysis.

Transcription:

1 Set theory, and set operations Sayan Mukherjee Motivation It goes without saying that a Bayesian statistician should know probability theory in depth to model. Measure theory is not needed unless we discuss the probability of two types of events: (1 infinitely repeated operations such as infinite sequences of coin tosses (2 infinitely fine operations such as drawing uniformly from the unit interval. In both these cases we have to worry about infinite events so issues of countability, limits, and measure occur. Measure theoretic probability is needed to understand when and how our intuition about discrete events carries over to both these cases. It also gives us a common framework for both of the above events. Why need for care? One example is order of integration which for Bayesian statistics is very important since we condition and marginalize extensively If we first integrate y and then x: 1 5. If we first integrate x and then y: 1 20. R = {0 x 2, 0 y 1} f(x, y = xy(x2 y 2 (x 2 + y 2, for(x, y (0, 0 3? = f(x, ydxdy R Another example is the Dirichlet process. The following is a simple way of constructing a Dirichlet process: 1. Draw X 1,..., X N iid Poi(λ 2. Normalize Z i = Xi Pi Xi. What is the distribution underlying this random process or algorithm? Is there a sense of convergence of the outputs of this algorithm? Important ideas Law of large numbers: Given X 1,..., X n of iid random variables we state that what does convergence ( mean? 1 n n X i µ, Central limit theorem: 1 n n i=1 Z n = X i µ No(0, 1, σ what does convergence ( mean? What is convergence in distribution? Law of the iterated logarithm: what is lim sup? Notation lim sup n i=1 Ω: set of possible outcomes of some experiment. ( n i=1 X i nµ 2nσ2 log log n = 1,

2 ω Ω: one of the outcomes of the experiment. A Ω A is a subset of Ω; A is true if ω A. 2 Ω : All subsets of Ω called th power set P(Ω Operations Complement; A c = not A = {w : w / A} Union over arbitrary index set Intersection over arbitrary index set Set difference; symmetric difference Relations Containment: A B : ω A ω B Disjoint: A B = Equality: A = B : ω A iff ω B De Morgan s Law: A = {w : w A for at least one }, A B = A or B (or perhaps both. A = {w : w A for all }, A B = AB = both A and B. A \ B = A B c A B = (A B c (A c B. ( c A = (A c ( c A = (A c Montone sequence of sets: A n A means that A 1 A 2... and lim A n = n=1 A n. A n A means that A 1 A 2... and lim A n = A n A means that A 1 A 2... and lim A n = n=1 A n. n=1 A n. A n A means that A 1 A 2... and lim A n = n=1 A n. Limits of sets Infinitely often (io: lim sup A n m 1 n m A n. Also called infinitely often as A n occurs infinitely often as n. The event lim sup A n occurs if and only if infinitely many of the A n occur. Almost all (aa: lim inf A n m 1 n m A n. Also called almost all since A n occurs with at most finitely many exceptions. {lim inf A n } {lim sup A n } A n = {n, n + 1,...} A n = {1,..., n} A 2n = {2, 4, 6,...} lim sup A n = lim inf A n = lim sup A n = lim inf A n = N lim sup A n = N, lim inf A n = Convergence: A n A lim inf A n = lim sup A n = A. Indicator functions Indicator functions allow for Boolean algebra to be turned into ordinary algebra.

3 I A (x = { 1 if x A 0 otherwise. The operations of unions and intersections correspond to pointwise maxima ( or minima ( I Ai (x = i I Ai (x I Ai (x = i I Ai (x. I A c(x = 1 I A (x I A\B (x = max(0, I A (x I B (x I A B (x = I A (x I B (x. ( n ( n n A i B i (A i B i I Ai i=1 i=1 i=1 i i I Bi max I Ai I Bi. i Revisit lim sup: lim sup of a set is really much like lim sup of a function I lim supn A n lim sup = lim sup I An. n lim lim inf lim ( sup ( m n x m inf x m m n lim inf x n lim sup x n. x n x lim inf x n = lim sup x n = x. Some sets are bigger than others The cardinality of a set Ω is the number of elements in the set. Theorem 0.0.1 (Cantor For any set Ω and power set P(Ω, Ω < P(Ω. Example 0.0.1 Ω = N an infinite but countable set P(Ω uncountable R uncountable Q the rationals are countable. Fields and algebras Definition 0.0.1 (algebra Let F be a collection of subsets of Ω. F is called a field (algebra if Ω F and F is closed under complementation and finite union: 1. (i Ω F 2. (ii A F A c F 3. (iii A 1,..., A n F n A j F. We could replace closure under finite unions with closure under finite intersections (due to de Morgan s law n A 1,..., A n F A j F. Definition 0.0.2 (σ-algebra Let F be a collection of subsets of Ω. F is called a field (algebra if Ω F and F is closed under complementation and countable unions:

4 1. (i Ω F 2. (ii A F A c F 3. (iii A 1,..., A n F n A j F 4. (iv A 1,..., A n,... F A j F An algebra is a σ-algebra but not vice versa. Example 0.0.2 Ω = R, A n = (a, b]. F is the collection of all finite disjoint unions of (a, b]. F is an algebra. F is not a σ-algebra. (0, n 1 ] = (0, 1. n=1 Definition 0.0.3 (Borel set Let Ω = R and B 0 be the field of right-semiclosed intervals. Then σ(b 0 is the Borel σ-algebra of R. This idea can be extended to metric spaces such as R d or the space of continuous functions. Probability spaces The idea of a probability space is to assign a measure or nonnegative number to subsets of of Ω. Definition 0.0.4 (Probability space A probability space is a triple (Ω, F, P where P is a function (probability measure P : F [0, 1] such that 1. (i P(A 0, A F 2. ii P(Ω = 1 3. iii if A j F are disjoint P = A j P(A j. Properties of probability spaces 1. (i monotonicity: if A B then P(A < P(B 2. (ii σ-subadditive: for A n, n 1 3. (iii continuity: if A n A and A n F then P(A n P(A if A n A and A n F then P(A n P(A ( P A n P(A n n=1 4. (iv inclusion-exclusion: for A 1,..., A n (related to Euler characteristic in topology n P n = P(A j P(A i A j + P(A i A j A k ( 1 n+1 P(A 1 A n A j 1 i<j n n=1 1 i<j<k n Example 0.0.3 Two events that are disjoint but the sum of the probability of the union is equal to the probability of the sum. A = (0,.5] and B = [.5, 1 with P(A =.5 and P(B =.5. P(A B = 1 = P(A + P(B and A B = {.5}.

5 Lemma 0.0.1 (Fatou For A n F for n 1 1. (i 2. (ii If A n A then P(A n P(A. P(lim inf A n lim inf P(A n lim sup P(A n P(lim sup A n For a probability measure µ and a sequence of functions f n (think f n = I An (i is sometimes written with functional notation lim inf f n dµ lim inf f n dµ. A typical use of Fatous lemma is the following. Suppose we have f n f and sup n 1 f n K <. Then f n f, and by Fatou s lemma f K. (What does the above have to do with any problem of inference? Example 0.0.4 Discrete probability measure on the real numbers whose support is the real line. The support of a discrete probability mesure is not necessarily just the set of points at which it places mass, two points really really close may not be individually distinguished. Let r 1, r 2,.. be an enumeration of the rational numbers (how would one enumerate them? and let P place mass 1/2 n at r n. P is discrete but has support on the line since for any x and any ε > 0 P({x ε, x + ε} > 0, since the interval contains at leasnt one rational number. Does this example have any uses in non-parametric statistical models? Definition 0.0.5 (Distribution function A function F : R [0, 1] is a probability distribution function (df with F (x = P((, x], x R, if F 1. is continuous 2. is monotone and non-decreasing 3. has limits at ± F ( := lim F (x = 1 x F ( := lim F (x = 0. x Example 0.0.5 Distributions that are neither discrete nor continuous. 1. Define a distribution F that is χ-squared with two degrees of freedom. 1. Observe K from a Poisson distribution with mean λ. 2. Sum K observations from F and report this value, if K = 0 report 0. 2. Define a distribution F that is standard normal. 1. Observe X from a Bernoulli distribution with paramater p. 2. If X = 0 report 0 otherwise report a draw from F. Are these distributions interesting priors? Dynkin s π λ theorem We will soon need to define probability measures on infinite and possible uncountable sets, like the power set of the naturals. This is hard. It is easier to define the measure on a much smaller collection of the power set and state consistency conditions that allow us to extend the measure to the the power set. This is what Dynkin s π λ theorem was designed to do.

6 Definition 0.0.6 (π-system Given a set Ω a π system is a collection of subsets P that are closed under finite intersections. 1 P is non-empty; 2 A B P whenever A, B P. Definition 0.0.7 (λ-system Given a set Ω a λ system is a collection of subsets L that contains Ω and is closed under complementation and disjoint countable unions. 1 Ω L ; 2 A L A c L ; 3 A n L, n 1 with A i A j = i j n=1 A n L. Definition 0.0.8 (σ-algebra Let F be a collection of subsets of Ω. F is called a field (algebra if Ω F and F is closed under complementation and countable unions, 1 Ω F; 2 A F A c F; 3 A 1,..., A n F n A j F; 4 A 1,..., A n,... F A j F. A σ-algebra is a π system. A σ-algebra is a λ system but a λ system need not be a σ algebra, a λ system is a weaker system. Example 0.0.6 Ω = {a, b, c, d} and L = {Ω,, {a, b}, {b, c}, {c, d}, {a, d}, {b, d}, {a, c}} L is closed under disjoint unions but not unions. However: Lemma 0.0.2 A class that is both a π system and a λ system is a σ-algebra. Next lecture we will start constructing probability measures on infinite sets and we will need to push the idea of extension of measure. In this the following theorem is central. Theorem 0.0.2 (Dynkin If P is a π system and L is a λ system, then P L implies σ(p L.