(R1) R is nondecreasing. (R2) U Uniform(0, 1) = R(U) X.

Similar documents
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

Microeconomic Theory: Basic Math Concepts

2.2 Derivative as a Function

Review of Fundamental Mathematics

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Simple linear regression

88 CHAPTER 2. VECTOR FUNCTIONS. . First, we need to compute T (s). a By definition, r (s) T (s) = 1 a sin s a. sin s a, cos s a

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

Lecture 13 - Basic Number Theory.

Lecture 7: Continuous Random Variables

1 if 1 x 0 1 if 0 x 1

1 Sufficient statistics

3 Some Integer Functions

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

Lecture Notes on Elasticity of Substitution

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

TOPIC 4: DERIVATIVES

Scalar Valued Functions of Several Variables; the Gradient Vector

1.7 Graphs of Functions

Chapter 3 RANDOM VARIATE GENERATION

Microeconomics Sept. 16, 2010 NOTES ON CALCULUS AND UTILITY FUNCTIONS

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

36 CHAPTER 1. LIMITS AND CONTINUITY. Figure 1.17: At which points is f not continuous?

LOGNORMAL MODEL FOR STOCK PRICES

Exploratory Data Analysis

The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1].

Note on growth and growth accounting

Mathematical Induction

Formal Languages and Automata Theory - Regular Expressions and Finite Automata -

E3: PROBABILITY AND STATISTICS lecture notes

Physics Lab Report Guidelines

Chapter 4 Lecture Notes

Unified Lecture # 4 Vectors

Mathematics Review for MS Finance Students

Every Positive Integer is the Sum of Four Squares! (and other exciting problems)

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

or, put slightly differently, the profit maximizing condition is for marginal revenue to equal marginal cost:

Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization

Chapter 3. Distribution Problems. 3.1 The idea of a distribution The twenty-fold way

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March Due:-March 25, 2015.

1 The Brownian bridge construction

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

Handout #1: Mathematical Reasoning

5 Numerical Differentiation

Notes on metric spaces

LS.6 Solution Matrices

Chapter 27: Taxation. 27.1: Introduction. 27.2: The Two Prices with a Tax. 27.2: The Pre-Tax Position

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

What does the number m in y = mx + b measure? To find out, suppose (x 1, y 1 ) and (x 2, y 2 ) are two points on the graph of y = mx + b.

The Taxman Game. Robert K. Moniot September 5, 2003

Practice with Proofs

Mathematics for Econometrics, Fourth Edition

Math 1B, lecture 5: area and volume

Metric Spaces. Chapter Metrics

Lecture 16 : Relations and Functions DRAFT

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

Lemma 5.2. Let S be a set. (1) Let f and g be two permutations of S. Then the composition of f and g is a permutation of S.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

Fixed Point Theorems

Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan

Gamma Distribution Fitting

Session 7 Bivariate Data and Analysis

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

Math 4310 Handout - Quotient Vector Spaces

3. Mathematical Induction

Math 181 Handout 16. Rich Schwartz. March 9, 2010

The Graphical Method: An Example

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

Logo Symmetry Learning Task. Unit 5

9. Quotient Groups Given a group G and a subgroup H, under what circumstances can we find a homomorphism φ: G G ', such that H is the kernel of φ?

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

the points are called control points approximating curve

Maximum Likelihood Estimation

3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

Continued Fractions and the Euclidean Algorithm

Section 1.4. Difference Equations

Notes on Continuous Random Variables

In order to describe motion you need to describe the following properties.

8 Divisibility and prime numbers

4. Continuous Random Variables, the Pareto and Normal Distributions

Math 461 Fall 2006 Test 2 Solutions

A Determination of g, the Acceleration Due to Gravity, from Newton's Laws of Motion

Binomial lattice model for stock prices

Math 319 Problem Set #3 Solution 21 February 2002

Math 120 Final Exam Practice Problems, Form: A

Year 9 set 1 Mathematics notes, to accompany the 9H book.

LINEAR INEQUALITIES. Mathematics is the art of saying many things in many different ways. MAXWELL

Slope and Rate of Change

Correlation key concepts:

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

2.2. Instantaneous Velocity

PLOTTING DATA AND INTERPRETING GRAPHS

The last three chapters introduced three major proof techniques: direct,

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables

Week 4: Standard Error and Confidence Intervals

Nonparametric adaptive age replacement with a one-cycle criterion

9. Momentum and Collisions in One Dimension*

You know from calculus that functions play a fundamental role in mathematics.

Introduction to Probability

Transcription:

9:3 //2 TOPIC Representing and quantile functions This section deals with the quantiles of a random variable We start by discussing how quantiles are related to the inverse df of the preceding section, and to the closely related notion of a representing function We then look at a useful parametric family of distributions, the so-called Tukey family, that are defined in terms of their quantiles Finally, we present a graphical technique for comparing two distributions, the so-called Q/Q-plot Representing functions Let X be a random variable with df F A mapping R from (, ) to R such that (R) R is nondecreasing and (R2) R(U) X whenever U Uniform(, ) is called a representing function for X According to the IPT Theorem (Theorem 5), the left-continuous inverse F to F is a representing function The motivating idea behind a representing function R is the use we put F to in proving Theorem 6, namely, if you want to prove something about X, it may be helpful to try representing X as R(U) for a standard uniform variable U Theorem Suppose R is a representing function for X and Y := T (X) is a transformation of X () If T is nondecreasing, then the mapping u T ( R(u) ) is a representing function for Y (2) If T is nonincreasing, then the mapping u T ( R( u) ) is a representing function for Y Proof () The mapping u T ( R(u) ) is nondecreasing in u and for U Uniform(, ), R(U) X = T ( R(U) ) T (X) = Y (2) In this case the mapping u T ( R( u) ) is nondecreasing in u for U Uniform(, ), U U = R( U) R(U) X = T ( R( U) ) T (X) = Y (R) R is nondecreasing (R2) U Uniform(, ) = R(U) X The theorem covers the important special case where T (x) = a+bx; this transformation is nondecreasing if b, and nonincreasing if b There is always at least one representing function for X, namely F Are there any others? That question is answered by: Theorem 2 Suppose X has df F and left-continuous inverse df F A mapping R from (, ) to R is a representing function for X if and only if F (u) R(u) F (u+) for all u (, ) F (u) F (u+) Consequently F is the only representing function iff F is strictly increasing over { x R : < F (x) < } (see Exercise ) Proof ( =) Suppose () holds (R) holds: Indeed, suppose < u < w < Then R(u) F (u+) (by () for u) F (w) R(w) (since F is nondecreasing) (by () for w) (R2) holds: Indeed, suppose U Uniform(, ) Then F (U) X I am going to show that P [ R(U) = F (U) ] = ; (2) by Exercise 2 this implies R(U) F (U) and hence R(U) X, as required To get (2), let J = {u (, ) : F (u) F (u+) } be the set of jumps of F Exercise 9 asserts that J is countable Thus P [ R(U) F (U) ] P [ U J ] (by ()) = P [ U = u ] = (since P [ U = u ] = for all u) u J u F () 2 2 2

(R) R is nondecreasing (R2) U Uniform(, ) = R(U) X (): F (u) R(u) F (u+) (5): u F (x) F (u) x (R) R is nondecreasing (R2) U Uniform(, ) = R(U) X (): F (u) R(u) F (u+) (= ) Let R be a representing function for X, and let U Uniform(, ) F (u) R(u) for all u (, ): Indeed, let u (, ) be given By the switching formula (5), F (u) R(u) u F ( R(u) ) Now use F ( R(u) ) = P [ X R(u) ] (since X has df F ) = P [ R(U) R(u) ] (by (R2)) P [ U u ] (by (R)) = u (since U is uniform) R(u) F (u+) for all u (, ): Indeed, let u (, ) be given By Exercise 8, F (u+) R(u) u F ( ((R(u)) ) Now use F ( (R(u)) ) = P [ X < R(u) ] = P [ R(U) < R(u) ] P [U < u] = u As the proof pointed out, there can be at most countably many values of u such that F (u) < F (u+) Example Suppose X takes the values, /2, and with probabilities /3 each We have seen that F has the following graph: F (u) 2 2 3 3 2 3 u Consequently R: (, ) R is a representing function for X if and only if, if < u < /3, r, if u = /3, R(u) = /2, if /3 < u < 2/3, r 2, if u = 2/3,, if 2/3 < u < for some numbers r [, /2] and r 2 [/2, ] Suppose R is a representing function for a random variable X with df F Let n be a positive integer and let U, U 2,, U n be independent random variables, each uniformly distributed over (, ) Then X := R(U ), X 2 := R(U 2 ),, X n := R(U n ) are independent and each distributed like X; in brief, X,, X n is a random sample of size n from F Now let X (), X (2),, X (n) be the random variables resulting from arranging the X i s in increasing order; these are called the order statistics for the X i s Similarly, let U () U (2) U (n) be the order statistics for the U i s Since R is nondecreasing we have X () := R(U () ), X (2) := R(U (2) ),, X (n) := R(U (n) ) Consequently to study properties of order statistics, it often suffices to first work out the uniform case (ie, for X U) and then transform the result through R This is a standard technique 2 4

9:3 //2 Quantiles We turn now to the quantiles of a random variable X Here are a couple of examples to motivate the general definition given below Case A: X N(, ) Area u x Area u Case B: X Uniform on, 2, 3 /3 /3 /3 2 In case A, X has a density For u (, ), the u th -quantile of X is defined to be the number x such that the area under the density to the left of x equals u, or, equivalently, the area to the right of x equals u This is the same as requiring P [ X x ] = u and P [ X x ] = u (3) In case B, X has a probability mass function For u = 5, the u th -quantile, or median, of X should clearly be taken to be x = However, (3) doesn t hold for this u and x, rather we have P [ X x ] u and P [ X x ] u (4) We make this the general definition For any random variable X, be it continuous, discrete, or whatever, a u th -quantile of X is defined to be a number x such that (4) holds According to the following theorem, there is always at least one such x; in some situations (as in Case B above, with u = /3 or u = 2/3) there may be more than one Obviously (3) always implies (4) However, (4) implies (3) only when P [ X = x ] = ; (5) see Exercise 3 A mapping Q from (, ) is called a quantile function for X if Q(u) is a u th -quantile for X, for all u (, ) 2 5 Theorem 2: R: (, ) R is a representing function for a random variable X with df F iff F (u) R(u) F (u+) for all u (, ) x is a u th -quantile for X iff P [ X x ] u and P [ X x ] u Q: (, ) R is a quantile function for X iff Q(u) is a u th -quantile for each u (, ) Theorem 3 Let X be a random variable A mapping Q from (, ) to R is a quantile function for X if and only if Q is a representing function for X Proof Let F be the df of X and F the left-continuous inverse to F A number x is a u th quantile for X if and only and u P [ X x] = F (x) F (u) x (6) u P [ X x] = P [ X < x ] = F (x ) u F (x ) F (u+) x; (7) here (6) used the switching formula (5) and (7) used the alternate switching formula given in Exercise 8 Consequently, Q is a quantile function for X F (u) Q(u) F (u+) for all u (, ) Q is a representing function for X, the last step following from Theorem 2 Theorems and 3 imply Theorem 4 Suppose Q is a quantile function for X and Y = T (X) () If T is nondecreasing, then the mapping u T ( Q(u) ) is a quantile function for Y (2) If T is nonincreasing, then the mapping u T ( Q( u) ) is a quantile function for Y 2 6

The Tukey family For λ (, ) let R λ : (, ) R be defined by u λ ( u) λ, if λ R λ (u) := λ ( u ) (8) log, if λ = u for < u < Note that d du R λ(u) = u λ + ( u) λ > for each u, so R λ is continuous and strictly increasing Consequently if U Uniform(, ), then X λ := R λ (U) is a random variable having R λ as its (unique) representing (= quantile) function The distribution of X λ is called the Tukey distribution T(λ) with shape parameter λ (and scale parameter ) The Tukey family is { T(λ) : < λ < } This family has a number of interesting/useful properties It is very easy to simulate a draw from T(λ): you just take a uniform variate u and compute R λ (u) The Tukey family is therefore well-suited for empirical robustness studies, and that is why Tukey introduced it 2 The distributions vary smoothly with λ This is clear for λ Moreover lim R u λ ( u) λ λ(u) = lim λ λ λ = lim λ d dλ (uλ ( u) λ ) d dλ λ u λ log u ( u) λ log( u) = lim λ ( u ) = log = R (u) u In statistics, R is known as logit transformation 2 7 (by l Hospital s rule) R λ (u) = ( u λ ( u) λ) /λ for λ R (u) = log ( u/( u) ) 3 The Tukey family contains (or almost contains) some common distributions For λ =, we have R (u) = u ( u) = 2u for each u Thus X = R (U) Uniform(, ) R (u) T(4) is very close to N(, 2); see Figure 4 below u For any λ, the df F λ of X λ is just the ordinary mathematical inverse R λ to the continuous strictly increasing function R λ, so F λ (x) is the u such that x = R λ (u) For λ = we can get a closed form expression for F (x), as follows: ( u ) x = R (u) = log = u u u = ex = u = ex + e x ; thus F (x) = + e x (9) for < x < Since F (x) is continuously differentiable in x, the distribution has a density which we can find by differentiating the df: f (x) := d dx F (x) = e x ( + e x ) 2 = F (x) ( F (x) ), () again for < x < This distribution is called the (standard) logistic distribution T( ) is very close to the distribution of πt, where T has a t-distribution with degree of freedom; see Figure 5 below 4 As λ goes from to (and the distribution goes (modulo scaling) from uniform, to almost normal, to logistic, to almost t ), the tails of the T(λ) distribution go from very light to very heavy This is another reason why the family is good for robustness studies 2 8

9:3 //2 Q/Q-plots The distributions of two random variables are often compared by making a simultaneous plot of their dfs, or densities, or probability mass functions This section talks about another way to make the comparison, the so-called Q/Q-plot, in which the quantiles of one of the variables are ploted against the corresponding quantiles of the other variable To simplify the discussion, suppose that all quantile functions involved are unique, so we can talk about the u th -quantile, instead of just a u th -quantile Let then X be a random variable with df F and quantile function Q = F, and, similarly, let X 2 be a random variable with df F 2 and quantile function Q 2 = F2 A plot of the points ( Q (u), Q 2 (u) ) for u in (some subinterval) of (, ) is called a Q/Q-plot of X 2 against X, or, equivalently, of F 2 against F, or of Q 2 against Q As in the illustration below, it is good practice to include a probability axis showing the range and spacing of the u s covered by the plot Quantiles of X2 8 6 4 2 Probabilities 5 25 5 75 95 Figure Q/Q-plot of X 2 χ 2 3 versus X N(, ) (Q (5), Q 2 (5)) (Q (25), Q 2 (25)) 5 5 5 5 Quantiles of X (Q (75), Q 2 (75)) Two special cases are worth noting (A) if X 2 is standard uniform, then the plot displays ( Q (u), u ) for u (, ); this is essentially just a plot of the df of X (B) if X is standard uniform, then the plot displays ( u, Q 2 (u) ) for u (, ); this is a graph of the inverse df F 2 2 9 Interpreting Q/Q-plots Suppose X and X 2 are two random variables such that X 2 a + bx ( for some constants a and b with b > ; this says that the distribution of X 2 is obtained from that of X by a change of scale (multiplication by b) and a change in location (addition of a) Let Q be the quantile function for X Since the transformation x a + bx is increasing, Theorem 4 implies that the quantile function for X 2 is given by Q 2 (u) = a + bq (u) (2 Consequently a Q/Q-plot of X 2 against X is simply a straight line with Q 2 -intercept a and slope b: Quantiles of X2 Figure 2: Q/Q-plot of X 2 a + bx against X slope b Q 2 -intercept a Quantiles of X This is one of the primary reasons why Q/Q-plots are of interest The Q/Q-plot in Figure is not a straight line However, at least over the range of probabilities covered in that figure, the plot is not too far from a straight line with intercept a and slope b (for what values of a and b?); that means that the so-called χ 2 3 distribution is approximately the same as that of a + bz for a standard normal random variable Z 2

We have seen that linearity in a Q/Q-plot means that the distributions of the two variables involved are related by a change in location and scale, in the sense that (2) and, equivalently, () hold What does curvature mean? Consider the following three Q/Q-plots; in each plot the quantiles of some random variable X 2 are plotted against the quantiles of some X Figure 4 Q-Q plot of Tukey (4) vs Standard normal Probabilities 25 5 75 9 99 999 9999 (A) Figure 3: Three Q/Q-plots (B) In case (A), the quantiles of X 2 grow ever more rapidly than do those of X as u tends to and to ; this means that X 2 has heavier tails than does X In case (B), X 2 has lighter tails than X In case (C), the right tail of X 2 is heavier than that of X, but the left tail is lighter Looking back at Figure, you can see that the right tail of the χ 2 3 is somewhat heavier than normal, and the left tail is somewhat lighter The Tukey family, revisited The next three pages exhibit Q/Qplots for various members of the Tukey family In each case you should think about what the plots say about the Tukey distributions In Figure 4 the solid, nearly-diagonal line is the Q/Q-plot of T(4) versus standard normal The dotted line is the least squares linear fit to the solid line; the slope of the least squares line is almost 2 In Figure 5 the solid line is the Q/Q-plot of T( ) versus the t distribution with degree of freedom The dotted line is the straight line with Q 2 -intercept and slope π Figure 6 is a simultaneous Q/Q-plot of the Tukey distributions for λ = to by 2, each against standard normal (C) Tukey (4) quantiles -4-2 2 4-4 -2 2 4 Normal quantiles 2 2 2

9:3 //2 Figure 5 Q-Q plot of Tukey (-) vs t with df Probabilities 25 5 5 99995 9975 999 Figure 6 Q-Q plot of Tukey vs Standard Normal Probabilities 5 5 25 5 75 9 95 99995 Tukey quantiles - -5 5 Tukey quantiles -4-2 2 4 - -8-6 -4-2 2 4 6 8 la= 2 la= 4 la= 6 la= 8 la= -3-2 - 2 3 t quantiles -2-2 Normal quantiles 2 3 2 4

Exercise Prove the claim made after the statement of Theorem 2, namely, that F (u) = F (u+) for all u (, ) F is strictly increasing over { x R : < F (x) < } [Hint For =, suppose x = F (u) satisfies F (x) < Argue that F (u+) y for each y > x] For two random variables Y and Z, the notation Y Z means P [ Y B ] = P [ Z B ] for each subset B of R (3) (technically, B needs to be a so-called Borel measurable set, but we ll ignore that in this course) It is sufficient that (3) hold for sets B of the form (, x] for x R, ie, that Y and Z have the same df Exercise 2 (a) Suppose Y and Z are two random variables such that Y Z Show that T (Y ) T (Z) for any transformation T (b) Suppose Y and Z are defined on the same probability space Show that if P [ Y = Z ] =, then Y Z Exercise 3 Show that (3) holds if and only if both (4) and (5) hold Exercise 4 The standard exponential distribution has density { e x, for x, f(x) =, for x < What are its quantiles? Exercise 5 Suppose Z is a standard normal random variable, with density φ(z) = exp( z 2 /2)/ 2π for < z < (a) Show that P [Z z] = ( + o())φ(z)/z as z ; here o() denotes a quantity that tends to as z [Hint: integrate by parts] (b) Let q α be the ( α) th -quantile of Z Show that q α = 2 log(/α) log(log(/α)) log(4π) + o() (4) as α ; here o() denotes a quantity that tends to as α 2 5 Exercise 6 The text pointed out several reasons why the Tukey family is good for robustness studies Give at least one reason why it is not good Exercise 7 Show that T() and T(2) are the same distribution, up to a change in scale Is there a generalization of this fact? Exercise 8 Below Figure, the text states that a plot of (Q (u), u) for u (, ) is essentially just a plot of the df of X The term essentially just was used because the two plots can in fact be slightly different Explain how Exercise 9 Figure 3 shows the Q/Q-plots for three random variables, X, X 2, and X 3, each plotted against Z N(, ) The plots are drawn to different scales In scrambled order, the distributions of the X i s are: (i) standard uniform, (ii) standard exponential, and (iii) t with degree of freedom Match the plots with the distributions Exercise Let X be a standard exponential random variable (see Exercise 4) and let X 2 be a random variable distributed like X (i) Draw a Q/Q-plot of X 2 against X, by plotting Q 2 (u) against Q (u) (ii) Now plot Q 2 ( u) versus Q (u) What is the general lesson to be learned from (i) and (ii)? Exercise Let Q be the quantile function of the t-distribution with degree of freedom, with density / ( π(+x 2 ) ) for < x <, and let Q 2 be the quantile function of the Tukey distribution T( ) with parameter λ = (a) Show that r(u) := Q 2 (u)/q (u) is a nondecreasing function of u for /2 < u < (this is hard to do analytically, but easy to demonstrate by a graph) with r(/2+) = 8/π and r( ) = π (this is not difficult) (b) Use the result of part (a) to explain why the Q/Q-plot in Figure 5 looks like a straight line with Q 2 -intercept and slope π 2 6