3. Joint and Conditional Distributions, Stochastic Independence

Aim of this section:
Multidimensional random variables (random vectors): joint and marginal distributions
Stochastic (in)dependence and conditional distributions
The multivariate normal distribution (definition, properties)

Literature:
Mood, Graybill, Boes (1974), Chapter IV, pp. 129-174
Wilfling (2011), Chapter 4

3.1 Joint and Marginal Distribution

Now: Consider several random variables simultaneously.

Applications:
Several economic applications
Statistical inference

Definition 3.1: (Random vector)
Let $X_1, \ldots, X_n$ be a set of $n$ random variables defined on the same random experiment, i.e. $X_i: \Omega \to \mathbb{R}$ for $i = 1, \ldots, n$. Then $X = (X_1, \ldots, X_n)'$ is called an n-dimensional random variable or an n-dimensional random vector.

Remark:
In the literature random vectors are often denoted by $X = (X_1, \ldots, X_n)'$ or, more simply, by $X_1, \ldots, X_n$.

For $n = 2$ it is common practice to write $X = (X, Y)'$ or $(X, Y)$ or $X, Y$.

Realizations are denoted by lowercase letters: $x = (x_1, \ldots, x_n)' \in \mathbb{R}^n$ or $x = (x, y)' \in \mathbb{R}^2$.

Now: Characterization of the probability distribution of the random vector $X$.

Definition 3.2: (Joint cumulative distribution function)
Let $X = (X_1, \ldots, X_n)'$ be an n-dimensional random vector. The function $F_{X_1,\ldots,X_n}: \mathbb{R}^n \to [0,1]$ defined by
$$F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = P(X_1 \leq x_1, X_2 \leq x_2, \ldots, X_n \leq x_n)$$
is called the joint cumulative distribution function of $X$.

Remark:
Definition 3.2 applies to discrete as well as to continuous random variables $X_1, \ldots, X_n$.

Some properties of the bivariate cdf ($n = 2$):
$F_{X,Y}(x, y)$ is monotone increasing in $x$ and in $y$
$\lim_{x \to -\infty} F_{X,Y}(x, y) = 0$ and $\lim_{y \to -\infty} F_{X,Y}(x, y) = 0$
$\lim_{x \to +\infty,\, y \to +\infty} F_{X,Y}(x, y) = 1$

Remark:
Analogous properties hold for the n-dimensional cdf $F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$.

Now: Joint discrete versus joint continuous random vectors.

Definition 3.3: (Joint discrete random vector)
The random vector $X = (X_1, \ldots, X_n)'$ is defined to be a joint discrete random vector if it can assume only a finite (or countably infinite) number of realizations $x = (x_1, \ldots, x_n)'$ such that
$$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) > 0$$
and
$$\sum P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = 1,$$
where the summation is over all possible realizations of $X$.

Definition 3.4: (Joint continuous random vector)
The random vector $X = (X_1, \ldots, X_n)'$ is defined to be a joint continuous random vector if and only if there exists a nonnegative function $f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$ such that
$$F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = \int_{-\infty}^{x_n} \cdots \int_{-\infty}^{x_1} f_{X_1,\ldots,X_n}(u_1, \ldots, u_n) \, du_1 \ldots du_n$$
for all $(x_1, \ldots, x_n)'$. The function $f_{X_1,\ldots,X_n}$ is defined to be a joint probability density function of $X$.

Example:
Consider $X = (X, Y)'$ with joint pdf
$$f_{X,Y}(x, y) = \begin{cases} x + y, & \text{for } (x, y) \in [0,1] \times [0,1] \\ 0, & \text{otherwise.} \end{cases}$$

[Figure: surface plot of the joint pdf $f_{X,Y}(x, y)$ over $[0,1] \times [0,1]$]

The joint cdf can be obtained as
$$F_{X,Y}(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v) \, du \, dv = \int_0^y \int_0^x (u + v) \, du \, dv = \ldots = \begin{cases} 0.5 (x^2 y + x y^2), & \text{for } (x, y) \in [0,1] \times [0,1] \\ 0.5 (x^2 + x), & \text{for } (x, y) \in [0,1] \times [1,\infty) \\ 0.5 (y^2 + y), & \text{for } (x, y) \in [1,\infty) \times [0,1] \\ 1, & \text{for } (x, y) \in [1,\infty) \times [1,\infty) \end{cases}$$
(Proof: Class)
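As a quick numerical sanity check of the piecewise expression on the unit square, one can integrate the joint pdf numerically and compare with the closed form. This is a minimal sketch using scipy; the helper names joint_pdf and cdf_closed_form are illustrative and not part of the lecture notes.

```python
from scipy.integrate import dblquad

def joint_pdf(x, y):
    """Joint pdf f(x,y) = x + y on the unit square, 0 elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def cdf_closed_form(x, y):
    """Closed-form cdf 0.5*(x^2*y + x*y^2) for (x,y) in the unit square."""
    return 0.5 * (x**2 * y + x * y**2)

for (x, y) in [(0.3, 0.7), (0.5, 0.5), (1.0, 1.0)]:
    # Numerically integrate the pdf over [0,x] x [0,y]; dblquad expects func(inner, outer).
    numeric, _ = dblquad(lambda v, u: joint_pdf(u, v), 0.0, x, 0.0, y)
    print(x, y, numeric, cdf_closed_form(x, y))  # the two values should agree
```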

Remarks:
If $X = (X_1, \ldots, X_n)'$ is a joint continuous random vector, then
$$\frac{\partial^n F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)}{\partial x_1 \cdots \partial x_n} = f_{X_1,\ldots,X_n}(x_1, \ldots, x_n).$$
The volume under the joint pdf represents probabilities:
$$P(a_1^u < X_1 \leq a_1^o, \ldots, a_n^u < X_n \leq a_n^o) = \int_{a_n^u}^{a_n^o} \cdots \int_{a_1^u}^{a_1^o} f_{X_1,\ldots,X_n}(u_1, \ldots, u_n) \, du_1 \ldots du_n$$

In this course: Emphasis on joint continuous random vectors. Analogous results hold for joint discrete random vectors (see Mood, Graybill, Boes (1974), Chapter IV).

Now: Determination of the distribution of a single random variable $X_i$ from the joint distribution of the random vector $(X_1, \ldots, X_n)'$ (the marginal distribution).

Definition 3.5: (Marginal distribution)
Let $X = (X_1, \ldots, X_n)'$ be a continuous random vector with joint cdf $F_{X_1,\ldots,X_n}$ and joint pdf $f_{X_1,\ldots,X_n}$. Then
$$F_{X_1}(x_1) = F_{X_1,\ldots,X_n}(x_1, +\infty, +\infty, \ldots, +\infty, +\infty)$$
$$F_{X_2}(x_2) = F_{X_1,\ldots,X_n}(+\infty, x_2, +\infty, \ldots, +\infty, +\infty)$$
$$\vdots$$
$$F_{X_n}(x_n) = F_{X_1,\ldots,X_n}(+\infty, +\infty, +\infty, \ldots, +\infty, x_n)$$
are called marginal cdfs, while

$$f_{X_1}(x_1) = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f_{X_1,\ldots,X_n}(x_1, x_2, \ldots, x_n) \, dx_2 \ldots dx_n$$
$$f_{X_2}(x_2) = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f_{X_1,\ldots,X_n}(x_1, x_2, \ldots, x_n) \, dx_1 \, dx_3 \ldots dx_n$$
$$\vdots$$
$$f_{X_n}(x_n) = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f_{X_1,\ldots,X_n}(x_1, x_2, \ldots, x_n) \, dx_1 \, dx_2 \ldots dx_{n-1}$$
are called marginal pdfs of the one-dimensional (univariate) random variables $X_1, \ldots, X_n$.

Example:
Consider the bivariate pdf
$$f_{X,Y}(x, y) = \begin{cases} 40 (x - 0.5)^2 y^3 (3 - 2x - y), & \text{for } (x, y) \in [0,1] \times [0,1] \\ 0, & \text{otherwise.} \end{cases}$$

[Figure: surface plot of the bivariate pdf $f_{X,Y}(x, y)$ over $[0,1] \times [0,1]$]

The marginal pdf of $X$ is obtained as
$$f_X(x) = \int_0^1 40 (x - 0.5)^2 y^3 (3 - 2x - y) \, dy = 40 (x - 0.5)^2 \int_0^1 (3 y^3 - 2x y^3 - y^4) \, dy$$
$$= 40 (x - 0.5)^2 \left[ \frac{3}{4} y^4 - \frac{2x}{4} y^4 - \frac{1}{5} y^5 \right]_0^1 = 40 (x - 0.5)^2 \left( \frac{3}{4} - \frac{2x}{4} - \frac{1}{5} \right) = -20 x^3 + 42 x^2 - 27 x + 5.5$$

[Figure: plot of the marginal pdf $f_X(x)$ on $[0,1]$]

The marginal pdf of $Y$ is obtained as
$$f_Y(y) = \int_0^1 40 (x - 0.5)^2 y^3 (3 - 2x - y) \, dx = 40 y^3 \int_0^1 (x - 0.5)^2 (3 - 2x - y) \, dx = -\frac{10}{3} y^3 (y - 2)$$
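Both marginal densities can be double-checked by numerical integration; the following sketch (with illustrative helper names) integrates the joint pdf over the other variable on a grid of points and compares with the closed-form expressions.

```python
import numpy as np
from scipy.integrate import quad

def joint_pdf(x, y):
    """Joint pdf 40*(x-0.5)^2 * y^3 * (3 - 2x - y) on the unit square."""
    return 40.0 * (x - 0.5) ** 2 * y ** 3 * (3.0 - 2.0 * x - y)

f_X = lambda x: -20 * x**3 + 42 * x**2 - 27 * x + 5.5   # closed-form marginal of X
f_Y = lambda y: -(10.0 / 3.0) * y**3 * (y - 2.0)         # closed-form marginal of Y

for x in np.linspace(0.1, 0.9, 5):
    numeric, _ = quad(lambda y: joint_pdf(x, y), 0.0, 1.0)  # integrate out y
    assert abs(numeric - f_X(x)) < 1e-8

for y in np.linspace(0.1, 0.9, 5):
    numeric, _ = quad(lambda x: joint_pdf(x, y), 0.0, 1.0)  # integrate out x
    assert abs(numeric - f_Y(y)) < 1e-8

print("closed-form marginal pdfs confirmed on the test grid")
```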

[Figure: plot of the marginal pdf $f_Y(y)$ on $[0,1]$]

Remarks:
When considering the marginal instead of the joint distributions we incur an information loss (the joint distribution uniquely determines all marginal distributions, but the converse does not hold in general).
Besides the respective univariate marginal distributions, there are also multivariate marginal distributions which can be obtained from the joint distribution of $X = (X_1, \ldots, X_n)'$.

Example:
For $n = 5$ consider $X = (X_1, \ldots, X_5)'$ with joint pdf $f_{X_1,\ldots,X_5}$. Then the marginal pdf of $Z = (X_1, X_3, X_5)'$ is obtained as
$$f_{X_1,X_3,X_5}(x_1, x_3, x_5) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{X_1,\ldots,X_5}(x_1, x_2, x_3, x_4, x_5) \, dx_2 \, dx_4$$
(integrate out the irrelevant components).

3.2 Conditional Distribution and Stochastic Independence

Now: Distribution of a random variable $X$ under the condition that another random variable $Y$ has already taken on the realization $y$ (the conditional distribution of $X$ given $Y = y$).

Definition 3.6: (Conditional distribution)
Let $(X, Y)'$ be a bivariate continuous random vector with joint pdf $f_{X,Y}(x, y)$. The conditional density of $X$ given $Y = y$ is defined to be
$$f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$
Analogously, the conditional density of $Y$ given $X = x$ is defined to be
$$f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.$$

Remark:
Conditional densities of random vectors are defined analogously, e.g.
$$f_{X_1,X_2,X_4 | X_3=x_3, X_5=x_5}(x_1, x_2, x_4) = \frac{f_{X_1,X_2,X_3,X_4,X_5}(x_1, x_2, x_3, x_4, x_5)}{f_{X_3,X_5}(x_3, x_5)}.$$

Example:
Consider the bivariate pdf
$$f_{X,Y}(x, y) = \begin{cases} 40 (x - 0.5)^2 y^3 (3 - 2x - y), & \text{for } (x, y) \in [0,1] \times [0,1] \\ 0, & \text{otherwise} \end{cases}$$
with marginal pdf $f_Y(y) = -\frac{10}{3} y^3 (y - 2)$ (cf. the example above).

It follows that
$$f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{40 (x - 0.5)^2 y^3 (3 - 2x - y)}{-\frac{10}{3} y^3 (y - 2)} = \frac{12 (x - 0.5)^2 (3 - 2x - y)}{2 - y}$$
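A conditional density must integrate to one over its support for every admissible value of the conditioning variable. The short check below (an illustrative sketch, not part of the original notes) confirms this for a few values of $y$.

```python
from scipy.integrate import quad

def cond_pdf_x_given_y(x, y):
    """Conditional density f_{X|Y=y}(x) = 12*(x-0.5)^2*(3-2x-y)/(2-y) on [0,1]."""
    return 12.0 * (x - 0.5) ** 2 * (3.0 - 2.0 * x - y) / (2.0 - y)

for y in (0.01, 0.5, 0.95):
    total, _ = quad(lambda x: cond_pdf_x_given_y(x, y), 0.0, 1.0)
    print(y, total)  # prints 1.0 (up to numerical error) for each y
```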

[Figure: conditional pdf $f_{X|Y=0.01}(x)$ of $X$ given $Y = 0.01$]

[Figure: conditional pdf $f_{X|Y=0.95}(x)$ of $X$ given $Y = 0.95$]

Now: Combine the concepts of joint distribution and conditional distribution to define the notion of stochastic independence (first for two random variables).

Definition 3.7: (Stochastic independence [I])
Let $(X, Y)'$ be a bivariate continuous random vector with joint pdf $f_{X,Y}(x, y)$. $X$ and $Y$ are defined to be stochastically independent if and only if
$$f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y) \quad \text{for all } x, y \in \mathbb{R}.$$

Remarks:
Alternatively, stochastic independence can be defined via the cdfs: $X$ and $Y$ are stochastically independent if and only if
$$F_{X,Y}(x, y) = F_X(x) \cdot F_Y(y) \quad \text{for all } x, y \in \mathbb{R}.$$
If $X$ and $Y$ are independent, we have
$$f_{X|Y=y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{f_X(x) \cdot f_Y(y)}{f_Y(y)} = f_X(x)$$
$$f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{f_X(x) \cdot f_Y(y)}{f_X(x)} = f_Y(y)$$
If $X$ and $Y$ are independent and $g$ and $h$ are two continuous functions, then $g(X)$ and $h(Y)$ are also independent.
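To illustrate Definition 3.7 with the earlier example $f_{X,Y}(x, y) = x + y$: both marginals equal $t + 0.5$ (integrate $x + y$ over the other variable on $[0,1]$), and since $(x + 0.5)(y + 0.5) \neq x + y$ in general, $X$ and $Y$ are not independent. A minimal sketch of this check (function names are illustrative):

```python
def joint_pdf(x, y):
    return x + y      # joint pdf on [0,1] x [0,1]

def marginal(t):
    return t + 0.5    # both marginals: integral of (t + s) ds over [0,1]

# Independence would require joint == product of marginals everywhere.
x, y = 0.2, 0.9
print(joint_pdf(x, y), marginal(x) * marginal(y))  # 1.1 vs 0.98 -> not independent
```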

Now: Extension to $n$ random variables.

Definition 3.8: (Stochastic independence [II])
Let $(X_1, \ldots, X_n)'$ be a continuous random vector with joint pdf $f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$ and joint cdf $F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$. $X_1, \ldots, X_n$ are defined to be stochastically independent if and only if, for all $(x_1, \ldots, x_n)' \in \mathbb{R}^n$,
$$f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdot \ldots \cdot f_{X_n}(x_n)$$
or
$$F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n).$$

Remarks:
For discrete random vectors we define: $X_1, \ldots, X_n$ are stochastically independent if and only if, for all $(x_1, \ldots, x_n)' \in \mathbb{R}^n$,
$$P(X_1 = x_1, \ldots, X_n = x_n) = P(X_1 = x_1) \cdot \ldots \cdot P(X_n = x_n)$$
or
$$F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n).$$
In the case of independence, the joint distribution results from the marginal distributions.
If $X_1, \ldots, X_n$ are stochastically independent and $g_1, \ldots, g_n$ are continuous functions, then $Y_1 = g_1(X_1), \ldots, Y_n = g_n(X_n)$ are also stochastically independent.

3.3 Expectation and Joint Moment Generating Functions

Now: Definition of the expectation of a function
$$g: \mathbb{R}^n \to \mathbb{R}, \quad (x_1, \ldots, x_n) \mapsto g(x_1, \ldots, x_n)$$
of a continuous random vector $X = (X_1, \ldots, X_n)'$.

Definition 3.9: (Expectation of a function)
Let $(X_1, \ldots, X_n)'$ be a continuous random vector with joint pdf $f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$ and $g: \mathbb{R}^n \to \mathbb{R}$ a real-valued continuous function. The expectation of the function $g$ of the random vector is defined to be
$$E[g(X_1, \ldots, X_n)] = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} g(x_1, \ldots, x_n) \, f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) \, dx_1 \ldots dx_n.$$

Remarks:
For a discrete random vector $(X_1, \ldots, X_n)'$ the analogous definition is
$$E[g(X_1, \ldots, X_n)] = \sum g(x_1, \ldots, x_n) \, P(X_1 = x_1, \ldots, X_n = x_n),$$
where the summation is over all realizations of the vector.
Definition 3.9 includes the expectation of a univariate random variable $X$: set $n = 1$ and $g(x) = x$, so that
$$E(X_1) \equiv E(X) = \int_{-\infty}^{+\infty} x f_X(x) \, dx.$$
Definition 3.9 includes the variance of $X$: set $n = 1$ and $g(x) = [x - E(X)]^2$, so that
$$\mathrm{Var}(X_1) \equiv \mathrm{Var}(X) = \int_{-\infty}^{+\infty} [x - E(X)]^2 f_X(x) \, dx.$$

Definition 3.9 includes the covariance of two variables: set $n = 2$ and $g(x_1, x_2) = [x_1 - E(X_1)] \cdot [x_2 - E(X_2)]$, so that
$$\mathrm{Cov}(X_1, X_2) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} [x_1 - E(X_1)][x_2 - E(X_2)] f_{X_1,X_2}(x_1, x_2) \, dx_1 \, dx_2.$$
Via the covariance we define the correlation coefficient:
$$\mathrm{Corr}(X_1, X_2) = \frac{\mathrm{Cov}(X_1, X_2)}{\sqrt{\mathrm{Var}(X_1)} \sqrt{\mathrm{Var}(X_2)}}$$
General properties of expected values, variances, covariances and the correlation coefficient: Class
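As an illustration of these formulas, the moments of the example density $f_{X,Y}(x, y) = x + y$ can be computed by numerical integration. This is a sketch with illustrative helper names; for this density the exact values are $E(X) = E(Y) = 7/12$, $\mathrm{Var}(X) = \mathrm{Var}(Y) = 11/144$, $\mathrm{Cov}(X, Y) = -1/144$ and $\mathrm{Corr}(X, Y) = -1/11$.

```python
import math
from scipy.integrate import dblquad

pdf = lambda x, y: x + y   # joint pdf on the unit square

def expect(g):
    """E[g(X,Y)] by numerical double integration over [0,1] x [0,1]."""
    val, _ = dblquad(lambda y, x: g(x, y) * pdf(x, y), 0.0, 1.0, 0.0, 1.0)
    return val

mx, my = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mx) ** 2)
var_y = expect(lambda x, y: (y - my) ** 2)
cov = expect(lambda x, y: (x - mx) * (y - my))
corr = cov / math.sqrt(var_x * var_y)
print(mx, var_x, cov, corr)   # approx. 7/12, 11/144, -1/144, -1/11
```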

Now: Expectations and variances of random vectors.

Definition 3.10: (Expected vector, covariance matrix)
Let $X = (X_1, \ldots, X_n)'$ be a random vector. The expected vector of $X$ is defined to be
$$E(X) = \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{pmatrix}.$$
The covariance matrix of $X$ is defined to be
$$\mathrm{Cov}(X) = \begin{pmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \ldots & \mathrm{Cov}(X_1, X_n) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \ldots & \mathrm{Cov}(X_2, X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(X_n, X_1) & \mathrm{Cov}(X_n, X_2) & \ldots & \mathrm{Var}(X_n) \end{pmatrix}.$$

Remark:
Obviously, the covariance matrix is symmetric by definition.

Now: Expected vectors and covariance matrices under linear transformations of random vectors. Let
$X = (X_1, \ldots, X_n)'$ be an n-dimensional random vector,
$A$ be an $(m \times n)$ matrix of real numbers,
$b$ be an $(m \times 1)$ column vector of real numbers.

Obviously, $Y = AX + b$ is an $(m \times 1)$ random vector:
$$Y = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} = \begin{pmatrix} a_{11} X_1 + a_{12} X_2 + \ldots + a_{1n} X_n + b_1 \\ a_{21} X_1 + a_{22} X_2 + \ldots + a_{2n} X_n + b_2 \\ \vdots \\ a_{m1} X_1 + a_{m2} X_2 + \ldots + a_{mn} X_n + b_m \end{pmatrix}$$

The expected vector of $Y$ is given by
$$E(Y) = \begin{pmatrix} a_{11} E(X_1) + a_{12} E(X_2) + \ldots + a_{1n} E(X_n) + b_1 \\ a_{21} E(X_1) + a_{22} E(X_2) + \ldots + a_{2n} E(X_n) + b_2 \\ \vdots \\ a_{m1} E(X_1) + a_{m2} E(X_2) + \ldots + a_{mn} E(X_n) + b_m \end{pmatrix} = A E(X) + b.$$
The covariance matrix of $Y$ is given by
$$\mathrm{Cov}(Y) = \begin{pmatrix} \mathrm{Var}(Y_1) & \mathrm{Cov}(Y_1, Y_2) & \ldots & \mathrm{Cov}(Y_1, Y_m) \\ \mathrm{Cov}(Y_2, Y_1) & \mathrm{Var}(Y_2) & \ldots & \mathrm{Cov}(Y_2, Y_m) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(Y_m, Y_1) & \mathrm{Cov}(Y_m, Y_2) & \ldots & \mathrm{Var}(Y_m) \end{pmatrix} = A \, \mathrm{Cov}(X) \, A'.$$
(Proof: Class)
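Both transformation rules can be illustrated by simulation: draw a large sample of $X$, transform each draw by $A$ and $b$, and compare the sample moments of $Y$ with $A E(X) + b$ and $A \, \mathrm{Cov}(X) \, A'$. The particular $A$, $b$ and distribution of $X$ below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])                  # E(X), chosen for illustration
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, -0.2],
                  [0.1, -0.2, 0.5]])             # Cov(X)
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, -1.0]])                 # (2 x 3) matrix
b = np.array([4.0, -1.0])

X = rng.multivariate_normal(mu, Sigma, size=200_000)   # draws of X (one per row)
Y = X @ A.T + b                                         # Y = A X + b for each draw

print(Y.mean(axis=0), A @ mu + b)                       # sample mean vs. A E(X) + b
print(np.cov(Y, rowvar=False))                          # sample covariance matrix
print(A @ Sigma @ A.T)                                  # vs. A Cov(X) A'
```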

Remark:
Cf. the analogous results for univariate variables:
$$E(aX + b) = a E(X) + b, \qquad \mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X).$$

Up to now: Expected values for unconditional distributions.
Now: Expected values for conditional distributions (cf. Definition 3.6).

Definition 3.11: (Conditional expected value of a function)
Let $(X, Y)'$ be a continuous random vector with joint pdf $f_{X,Y}(x, y)$ and let $g: \mathbb{R}^2 \to \mathbb{R}$ be a real-valued function. The conditional expected value of the function $g$ given $X = x$ is defined to be
$$E[g(X, Y) \mid X = x] = \int_{-\infty}^{+\infty} g(x, y) \, f_{Y|X=x}(y) \, dy.$$

Remarks:
An analogous definition applies to a discrete random vector $(X, Y)'$.
Definition 3.11 naturally extends to higher-dimensional distributions.
For $g(x, y) = y$ we obtain the special case $E[g(X, Y) \mid X = x] = E(Y \mid X = x)$.
Note that $E[g(X, Y) \mid X = x]$ is a function of $x$.

Example:
Consider the joint pdf
$$f_{X,Y}(x, y) = \begin{cases} x + y, & \text{for } (x, y) \in [0,1] \times [0,1] \\ 0, & \text{otherwise.} \end{cases}$$
The conditional distribution of $Y$ given $X = x$ is given by
$$f_{Y|X=x}(y) = \begin{cases} \dfrac{x + y}{x + 0.5}, & \text{for } (x, y) \in [0,1] \times [0,1] \\ 0, & \text{otherwise.} \end{cases}$$
For $g(x, y) = y$ the conditional expectation is given as
$$E(Y \mid X = x) = \int_0^1 y \, \frac{x + y}{x + 0.5} \, dy = \frac{1}{x + 0.5} \left( \frac{x}{2} + \frac{1}{3} \right).$$
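The closed-form conditional expectation can again be checked numerically; the sketch below (helper names are illustrative) integrates $y \, f_{Y|X=x}(y)$ over $[0,1]$ for a few values of $x$.

```python
from scipy.integrate import quad

cond_pdf = lambda y, x: (x + y) / (x + 0.5)                  # f_{Y|X=x}(y) on [0,1]
closed_form = lambda x: (x / 2.0 + 1.0 / 3.0) / (x + 0.5)    # E(Y | X = x)

for x in (0.1, 0.5, 0.9):
    numeric, _ = quad(lambda y: y * cond_pdf(y, x), 0.0, 1.0)
    print(x, numeric, closed_form(x))   # the two values agree
```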

Remarks:
Consider the function $g(x, y) = g(y)$ (i.e. $g$ does not depend on $x$).
Denote $h(x) = E[g(Y) \mid X = x]$.
We calculate the unconditional expectation of the transformed variable $h(X)$.
We have

$$E\{E[g(Y) \mid X = x]\} = E[h(X)] = \int_{-\infty}^{+\infty} h(x) f_X(x) \, dx = \int_{-\infty}^{+\infty} E[g(Y) \mid X = x] \, f_X(x) \, dx$$
$$= \int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{+\infty} g(y) \, f_{Y|X=x}(y) \, dy \right] f_X(x) \, dx = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(y) \, f_{Y|X=x}(y) \, f_X(x) \, dy \, dx$$
$$= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(y) \, f_{X,Y}(x, y) \, dy \, dx = E[g(Y)]$$

Theorem 3.12:
Let $(X, Y)'$ be an arbitrary discrete or continuous random vector. Then
$$E[g(Y)] = E\{E[g(Y) \mid X = x]\}$$
and, in particular,
$$E[Y] = E\{E[Y \mid X = x]\}.$$
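For the running example $f_{X,Y}(x, y) = x + y$, Theorem 3.12 can be verified directly: $E(Y) = \int_0^1 \int_0^1 y (x + y) \, dx \, dy = 7/12$, and averaging the conditional mean $E(Y \mid X = x)$ over the marginal of $X$ gives the same value. A short numerical sketch:

```python
from scipy.integrate import quad, dblquad

f_X = lambda x: x + 0.5                                    # marginal pdf of X
cond_mean = lambda x: (x / 2.0 + 1.0 / 3.0) / (x + 0.5)    # E(Y | X = x)

# Left-hand side: E(Y) from the joint pdf.
ey, _ = dblquad(lambda y, x: y * (x + y), 0.0, 1.0, 0.0, 1.0)

# Right-hand side: E{E[Y | X]} = integral of E(Y | X = x) * f_X(x) dx.
ee, _ = quad(lambda x: cond_mean(x) * f_X(x), 0.0, 1.0)

print(ey, ee)   # both approximately 7/12 = 0.5833...
```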

Now: Three important rules for conditional and unconditional expected values.

Theorem 3.13:
Let $(X, Y)'$ be an arbitrary discrete or continuous random vector and let $g_1(\cdot)$, $g_2(\cdot)$ be two unidimensional functions. Then
1. $E[g_1(Y) + g_2(Y) \mid X = x] = E[g_1(Y) \mid X = x] + E[g_2(Y) \mid X = x]$,
2. $E[g_1(Y) \cdot g_2(X) \mid X = x] = g_2(x) \cdot E[g_1(Y) \mid X = x]$,
3. if $X$ and $Y$ are stochastically independent, then $E[g_1(X) \cdot g_2(Y)] = E[g_1(X)] \cdot E[g_2(Y)]$.

Finally: Moment generating functions for random vectors.

Definition 3.14: (Joint moment generating function)
Let $X = (X_1, \ldots, X_n)'$ be an arbitrary discrete or continuous random vector. The joint moment generating function of $X$ is defined to be
$$m_{X_1,\ldots,X_n}(t_1, \ldots, t_n) = E\left[ e^{t_1 X_1 + \ldots + t_n X_n} \right],$$
if this expectation exists for all $t_1, \ldots, t_n$ with $-h < t_j < h$ for some value $h > 0$ and for all $j = 1, \ldots, n$.

Remarks:
Via the joint moment generating function $m_{X_1,\ldots,X_n}(t_1, \ldots, t_n)$ we can derive the following mathematical objects:
the marginal moment generating functions $m_{X_1}(t_1), \ldots, m_{X_n}(t_n)$,
the moments of the marginal distributions,
the so-called joint moments.

Important result: (cf. Theorem 2.23, Slide 85)
For any given joint moment generating function $m_{X_1,\ldots,X_n}(t_1, \ldots, t_n)$ there exists a unique joint cdf $F_{X_1,\ldots,X_n}(x_1, \ldots, x_n)$.

3.4 The Multivariate Normal Distribution

Now: Extension of the univariate normal distribution.

Definition 3.15: (Multivariate normal distribution)
Let $X = (X_1, \ldots, X_n)'$ be a continuous random vector. $X$ is defined to have a multivariate normal distribution with parameters
$$\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \ldots & \sigma_{1n} \\ \vdots & \ddots & \vdots \\ \sigma_{n1} & \ldots & \sigma_n^2 \end{pmatrix},$$
if for $x = (x_1, \ldots, x_n)' \in \mathbb{R}^n$ its joint pdf is given by
$$f_X(x) = (2\pi)^{-n/2} \, [\det(\Sigma)]^{-1/2} \exp\left\{ -\frac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \right\}.$$

Remarks:
See Chang (1984, p. 92) for a definition and the properties of the determinant $\det(A)$ of the matrix $A$.
Notation: $X \sim N(\mu, \Sigma)$.
$\mu$ is a column vector with $\mu_1, \ldots, \mu_n \in \mathbb{R}$.
$\Sigma$ is a regular, positive definite, symmetric $(n \times n)$ matrix.
Role of the parameters: $E(X) = \mu$ and $\mathrm{Cov}(X) = \Sigma$.
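The density in Definition 3.15 can be evaluated directly with numpy and compared against scipy.stats.multivariate_normal; the parameter values below are arbitrary illustrative choices.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
x = np.array([0.3, 0.7])

# Density evaluated via the formula of Definition 3.15.
n = len(mu)
diff = x - mu
quad_form = diff @ np.linalg.inv(Sigma) @ diff
pdf_manual = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma) ** (-0.5) \
             * np.exp(-0.5 * quad_form)

# Same density via scipy.
pdf_scipy = multivariate_normal(mean=mu, cov=Sigma).pdf(x)

print(pdf_manual, pdf_scipy)   # identical up to floating-point error
```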

Joint pdf of the multivariate standard normal distribution $N(0, I_n)$:
$$\phi(x) = (2\pi)^{-n/2} \exp\left\{ -\frac{1}{2} x'x \right\}$$
Cf. the analogy to the univariate pdf in Definition 2.24, Slide 91.

Properties of the $N(\mu, \Sigma)$ distribution:
Partial vectors (marginal distributions) of $X$ also have multivariate normal distributions, i.e. if
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right),$$
then
$$X_1 \sim N(\mu_1, \Sigma_{11}), \qquad X_2 \sim N(\mu_2, \Sigma_{22}).$$

Thus, all univariate components of $X = (X_1, \ldots, X_n)'$ have univariate normal distributions:
$$X_1 \sim N(\mu_1, \sigma_1^2), \quad X_2 \sim N(\mu_2, \sigma_2^2), \quad \ldots, \quad X_n \sim N(\mu_n, \sigma_n^2).$$
The conditional distributions are also (univariate or multivariate) normal:
$$X_1 \mid X_2 = x_2 \sim N\left( \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2), \; \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \right).$$
Linear transformations: Let $A$ be an $(m \times n)$ matrix, $b$ an $(m \times 1)$ vector of real numbers and $X = (X_1, \ldots, X_n)' \sim N(\mu, \Sigma)$. Then
$$AX + b \sim N(A\mu + b, \; A \Sigma A').$$

Example:
Consider
$$X \sim N(\mu, \Sigma) = N\left( \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 & 0.5 \\ 0.5 & 2 \end{pmatrix} \right).$$
Find the distribution of $Y = AX + b$, where
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 2 \end{pmatrix}.$$
It follows that $Y \sim N(A\mu + b, \; A \Sigma A')$. In particular,
$$A\mu + b = \begin{pmatrix} 3 \\ 6 \end{pmatrix} \quad \text{and} \quad A \Sigma A' = \begin{pmatrix} 11 & 24 \\ 24 & 53 \end{pmatrix}.$$
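A short numpy check of this example reproduces the vector $(3, 6)'$ and the matrix with entries 11, 24, 24, 53, and a simulated sample of $Y$ shows matching empirical moments.

```python
import numpy as np

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([1.0, 2.0])

print(A @ mu + b)        # [3. 6.]
print(A @ Sigma @ A.T)   # [[11. 24.], [24. 53.]]

# Simulation check of the distribution of Y = A X + b.
rng = np.random.default_rng(1)
X = rng.multivariate_normal(mu, Sigma, size=500_000)
Y = X @ A.T + b
print(Y.mean(axis=0), np.cov(Y, rowvar=False))
```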

Now: Consider the bivariate case ($n = 2$), i.e. $X = (X, Y)'$,
$$E(X) = \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{YX} & \sigma_Y^2 \end{pmatrix}.$$
We have
$$\sigma_{XY} = \sigma_{YX} = \mathrm{Cov}(X, Y) = \sigma_X \sigma_Y \, \mathrm{Corr}(X, Y) = \sigma_X \sigma_Y \rho.$$
The joint pdf follows from Definition 3.15 with $n = 2$:
$$f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_X)^2}{\sigma_X^2} - \frac{2\rho (x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right] \right\}$$
(Derivation: Class)

[Figure: bivariate normal pdf $f_{X,Y}(x, y)$ for $\mu_X = \mu_Y = 0$, $\sigma_X = \sigma_Y = 1$ and $\rho = 0$]

[Figure: bivariate normal pdf $f_{X,Y}(x, y)$ for $\mu_X = \mu_Y = 0$, $\sigma_X = \sigma_Y = 1$ and $\rho = 0.9$]

Remarks:
The marginal distributions are given by $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$.
An interesting result for the normal distribution: if $(X, Y)'$ has a bivariate normal distribution, then $X$ and $Y$ are independent if and only if $\rho = \mathrm{Corr}(X, Y) = 0$.
The conditional distributions are given by
$$X \mid Y = y \sim N\left( \mu_X + \rho \frac{\sigma_X}{\sigma_Y} (y - \mu_Y), \; \sigma_X^2 (1 - \rho^2) \right)$$
$$Y \mid X = x \sim N\left( \mu_Y + \rho \frac{\sigma_Y}{\sigma_X} (x - \mu_X), \; \sigma_Y^2 (1 - \rho^2) \right)$$
(Proof: Class)
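The conditional-distribution formula can be illustrated by simulation: draw bivariate normal pairs, keep those whose $X$-value falls in a narrow band around a chosen $x$, and compare the conditional sample mean and variance of $Y$ with $\mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - \mu_X)$ and $\sigma_Y^2 (1 - \rho^2)$. A rough sketch (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
mu_x, mu_y, sd_x, sd_y, rho = 0.0, 1.0, 1.0, 2.0, 0.7
Sigma = np.array([[sd_x**2, rho * sd_x * sd_y],
                  [rho * sd_x * sd_y, sd_y**2]])

sample = rng.multivariate_normal([mu_x, mu_y], Sigma, size=2_000_000)
x0 = 0.5
band = sample[np.abs(sample[:, 0] - x0) < 0.01]   # keep draws with X close to x0

cond_mean_theory = mu_y + rho * sd_y / sd_x * (x0 - mu_x)
cond_var_theory = sd_y**2 * (1 - rho**2)

print(band[:, 1].mean(), cond_mean_theory)   # conditional mean of Y given X = x0
print(band[:, 1].var(), cond_var_theory)     # conditional variance of Y given X = x0
```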