Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics




STAT355 Chapter 6: Point Estimation, Fall 2011

Chap 6 - Point Estimation

6.1 Some General Concepts of Point Estimation: Point Estimate; Unbiasedness; Principle of Minimum Variance Unbiased Estimation

6.2 Methods of Point Estimation: Maximum Likelihood Estimation; The Method of Moments

Definitions

Definition. A point estimate of a parameter θ is a single number that can be regarded as a sensible value for θ. A point estimate is obtained by selecting a suitable statistic and computing its value from the given sample data. The selected statistic is called the point estimator of θ.

Examples

Suppose, for example, that the parameter of interest is µ, the true average lifetime of batteries of a certain type. A random sample of n = 3 batteries might yield observed lifetimes (hours) x₁ = 5.0, x₂ = 6.4, x₃ = 5.9. The computed value of the sample mean lifetime is x̄ = 5.77. It is reasonable to regard 5.77 as a very plausible value of µ, our best guess for the value of µ based on the available sample information.
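The battery calculation above can be sketched in a few lines of Python (a minimal illustration using the slide's data):

```python
# The sample mean as a point estimate of the true average lifetime mu,
# computed from the three observed battery lifetimes on the slide.
lifetimes = [5.0, 6.4, 5.9]  # observed lifetimes (hours), n = 3

x_bar = sum(lifetimes) / len(lifetimes)  # point estimate of mu
print(round(x_bar, 2))  # 5.77
```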

Notations

Suppose we want to estimate a parameter of a single population (e.g., µ, σ, or λ) based on a random sample of size n. When discussing general concepts and methods of inference, it is convenient to have a generic symbol for the parameter of interest. We will use the Greek letter θ for this purpose. The objective of point estimation is to select a single number, based on sample data, that represents a sensible value for θ.

Some General Concepts of Point Estimation

We know that before data become available, the sample observations must be considered random variables X₁, X₂, ..., Xₙ. It follows that any function of the Xᵢ's (that is, any statistic), such as the sample mean X̄ or the sample standard deviation S, is also a random variable.

Example: In the battery example just given, the estimator used to obtain the point estimate of µ was X̄, and the point estimate of µ was 5.77. If the three observed lifetimes had instead been x₁ = 5.6, x₂ = 4.5, and x₃ = 6.1, use of the estimator X̄ would have resulted in the estimate x̄ = (5.6 + 4.5 + 6.1)/3 = 5.40.

The symbol θ̂ ("theta hat") is customarily used to denote both the estimator of θ and the point estimate resulting from a given sample.

Some General Concepts of Point Estimation

Thus θ̂ = X̄ is read as "the point estimator of θ is the sample mean X̄." The statement "the point estimate of θ is 5.77" can be written concisely as θ̂ = 5.77.

In the best of all possible worlds, we could find an estimator θ̂ for which θ̂ = θ always. However, θ̂ is a function of the sample Xᵢ's, so it is a random variable. For some samples, θ̂ will yield a value larger than θ, whereas for other samples θ̂ will underestimate θ. If we write

θ̂ = θ + error of estimation,

then an accurate estimator would be one resulting in small estimation errors, so that estimated values will be near the true value.

Some General Concepts of Point Estimation

A sensible way to quantify the idea of θ̂ being close to θ is to consider the squared error (θ̂ − θ)². For some samples, θ̂ will be quite close to θ and the resulting squared error will be near 0. Other samples may give values of θ̂ far from θ, corresponding to very large squared errors. A measure of accuracy is the expected or mean square error (MSE)

MSE = E[(θ̂ − θ)²].

If a first estimator has smaller MSE than does a second, it is natural to say that the first estimator is the better one.
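The MSE comparison can be made concrete by simulation. The following sketch (the sample size, parameter values, and choice of competing estimator are hypothetical, not from the slides) approximates the MSE of two estimators of a normal mean, the sample mean and the sample median, by averaging squared errors over many samples:

```python
# Approximate MSE = E[(theta_hat - theta)^2] by Monte Carlo simulation:
# draw many samples, compute each estimator, and average the squared errors.
import random
import statistics

random.seed(0)
theta, sigma, n, reps = 10.0, 2.0, 15, 20000  # hypothetical settings

se_mean = se_median = 0.0
for _ in range(reps):
    sample = [random.gauss(theta, sigma) for _ in range(n)]
    se_mean += (statistics.mean(sample) - theta) ** 2
    se_median += (statistics.median(sample) - theta) ** 2

mse_mean, mse_median = se_mean / reps, se_median / reps
# For normal data the sample mean has the smaller MSE (about sigma^2 / n).
print(mse_mean < mse_median)  # True
```

This also illustrates the point made on the next slide: which estimator wins can depend on the underlying distribution, so a uniformly smallest-MSE estimator typically does not exist.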

Some General Concepts of Point Estimation

However, MSE will generally depend on the value of θ. What often happens is that one estimator has a smaller MSE for some values of θ and a larger MSE for other values, so finding an estimator with the smallest MSE for every θ is typically not possible. One way out of this dilemma is to restrict attention to estimators that have some specified desirable property and then find the best estimator in this restricted group. A popular property of this sort in the statistical community is unbiasedness.

Unbiasedness

Definition. A point estimator θ̂ is said to be an unbiased estimator of θ if E(θ̂) = θ for every possible value of θ. If θ̂ is not unbiased, the difference E(θ̂) − θ is called the bias of θ̂.

That is, θ̂ is unbiased if its probability (i.e., sampling) distribution is always centered at the true value of the parameter.

Unbiasedness - Example

The sample proportion X/n can be used as an estimator of p, where X, the number of sample successes, has a binomial distribution with parameters n and p. Thus

E(p̂) = E(X/n) = (1/n) E(X) = (1/n)(np) = p.

Proposition. When X is a binomial rv with parameters n and p, the sample proportion p̂ = X/n is an unbiased estimator of p. No matter what the true value of p is, the distribution of the estimator will be centered at the true value.
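The proposition can be checked empirically. This small simulation (the values of n, p, and the repetition count are hypothetical) averages p̂ over many binomial draws and confirms the average sits close to the true p:

```python
# Empirical check that p_hat = X/n is unbiased: average p_hat over many
# binomial samples and compare with the true success probability p.
import random

random.seed(1)
n, p, reps = 50, 0.3, 100000  # hypothetical settings

total = 0.0
for _ in range(reps):
    x = sum(1 for _ in range(n) if random.random() < p)  # binomial draw
    total += x / n  # p_hat for this sample

print(abs(total / reps - p) < 0.01)  # True: average of p_hat is close to p
```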

Principle of Minimum Variance Unbiased Estimation

Among all estimators of θ that are unbiased, choose the one that has minimum variance. The resulting estimator is called the minimum variance unbiased estimator (MVUE) of θ. The most important result of this type for our purposes concerns estimating the mean µ of a normal distribution.

Theorem. Let X₁, ..., Xₙ be a random sample from a normal distribution with parameters µ and σ². Then the estimator µ̂ = X̄ is the MVUE for µ.

In some situations, it is possible to obtain an estimator with small bias that would be preferred to the best unbiased estimator.

Reporting a Point Estimate: The Standard Error

Besides reporting the value of a point estimate, some indication of its precision should be given. The usual measure of precision is the standard error of the estimator used.

Definition. The standard error of an estimator θ̂ is its standard deviation σ_θ̂ = √V(θ̂).
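For θ̂ = X̄ we have V(X̄) = σ²/n, so the estimated standard error is s/√n, with s the sample standard deviation. A minimal sketch using the earlier battery data (reusing the slide's three lifetimes purely for illustration):

```python
# Estimated standard error of the sample mean: s / sqrt(n),
# since V(X_bar) = sigma^2 / n.
import math
import statistics

sample = [5.0, 6.4, 5.9]  # battery lifetimes from the earlier example

s = statistics.stdev(sample)     # sample standard deviation
se = s / math.sqrt(len(sample))  # estimated standard error of X_bar
print(round(se, 3))
```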

Maximum Likelihood Estimation

Definition. Let X₁, ..., Xₙ have a joint pmf or pdf

f(x₁, ..., xₙ; θ₁, ..., θₘ)   (1)

where the parameters θ₁, ..., θₘ have unknown values. When x₁, ..., xₙ are the observed sample values and (1) is regarded as a function of θ₁, ..., θₘ, it is called the likelihood function:

L(θ₁, ..., θₘ) = f(x₁, ..., xₙ; θ₁, ..., θₘ).

Definition. The maximum likelihood estimates (mle's) θ̂₁, ..., θ̂ₘ are those values of the θᵢ's that maximize the likelihood function, so that

L(θ̂₁, ..., θ̂ₘ) ≥ L(θ₁, ..., θₘ) for all θ₁, ..., θₘ.

Maximum Likelihood Estimation - Example

Let X₁, ..., Xₙ be a random sample from the normal distribution. Then

f(x₁, ..., xₙ; µ, σ²) = ∏ᵢ₌₁ⁿ (1/√(2πσ²)) e^(−(xᵢ−µ)²/(2σ²)) = (1/(2πσ²))^(n/2) e^(−Σᵢ₌₁ⁿ (xᵢ−µ)²/(2σ²)).

The natural logarithm of the likelihood function is

ln(f(x₁, ..., xₙ; µ, σ²)) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σᵢ₌₁ⁿ (xᵢ − µ)².

To find the maximizing values of µ and σ², we must take the partial derivatives of ln(f) with respect to µ and σ², equate them to zero, and solve the resulting two equations. Omitting the details, the resulting mle's are

µ̂ = X̄;  σ̂² = Σᵢ₌₁ⁿ (Xᵢ − X̄)² / n.
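The closed-form mle's above can be computed directly; the sketch below (with hypothetical sample values) also evaluates the log-likelihood to check that it is at least as large at (µ̂, σ̂²) as at a nearby parameter point. Note the mle of σ² divides by n, not n − 1:

```python
# MLEs for a normal sample: mu_hat is the sample mean, sigma2_hat divides
# the sum of squared deviations by n (not n - 1).
import math

x = [4.8, 5.3, 6.1, 5.6, 5.2]  # hypothetical observations
n = len(x)

mu_hat = sum(x) / n
sigma2_hat = sum((xi - mu_hat) ** 2 for xi in x) / n  # MLE uses n

def log_lik(mu, s2):
    """Normal log-likelihood, matching the formula on the slide."""
    return (-n / 2) * math.log(2 * math.pi * s2) \
        - sum((xi - mu) ** 2 for xi in x) / (2 * s2)

# The log-likelihood at the MLE is at least as large as anywhere else.
print(log_lik(mu_hat, sigma2_hat) >= log_lik(mu_hat + 0.5, sigma2_hat + 0.5))
```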

The Method of Moments

The basic idea of this method is to equate certain sample characteristics, such as the mean, to the corresponding population expected values. Solving these equations for the unknown parameter values then yields the estimators.

Definition. Let X₁, ..., Xₙ be a random sample from a pmf or pdf f(x). For k = 1, 2, 3, ..., the kth population moment, or kth moment of the distribution f(x), is E(Xᵏ). The kth sample moment is (1/n) Σᵢ₌₁ⁿ Xᵢᵏ.

Thus the first population moment is E(X) = µ, and the first sample moment is Σ Xᵢ/n = X̄. The second population and sample moments are E(X²) and Σ Xᵢ²/n, respectively. The population moments will be functions of any unknown parameters θ₁, θ₂, ...

The Method of Moments

Definition. Let X₁, X₂, ..., Xₙ be a random sample from a distribution with pmf or pdf f(x; θ₁, ..., θₘ), where θ₁, ..., θₘ are parameters whose values are unknown. Then the moment estimators θ̂₁, ..., θ̂ₘ are obtained by equating the first m sample moments to the corresponding first m population moments and solving for θ₁, ..., θₘ.

If, for example, m = 2, then E(X) and E(X²) will be functions of θ₁ and θ₂. Setting E(X) = (1/n) Σ Xᵢ (= X̄) and E(X²) = (1/n) Σ Xᵢ² gives two equations in θ₁ and θ₂. The solution then defines the estimators.

The Method of Moments - Example

Let X₁, X₂, ..., Xₙ represent a random sample of service times of n customers at a certain facility, where the underlying distribution is assumed exponential with parameter λ. Since there is only one parameter to be estimated, the estimator is obtained by equating E(X) to the first sample moment X̄. Since E(X) = 1/λ for an exponential distribution, this gives 1/λ = X̄, i.e., λ = 1/X̄. The moment estimator of λ is then λ̂ = 1/X̄.
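The exponential example can be sketched as a quick simulation (the true rate and sample size below are hypothetical): generate service times, equate the sample mean to E(X) = 1/λ, and check that λ̂ = 1/X̄ lands near the true rate.

```python
# Method-of-moments estimator for the exponential rate: set the first
# population moment E(X) = 1/lambda equal to the sample mean x_bar,
# giving lambda_hat = 1 / x_bar.
import random

random.seed(2)
true_lambda = 0.5  # hypothetical true rate
times = [random.expovariate(true_lambda) for _ in range(5000)]

x_bar = sum(times) / len(times)
lambda_hat = 1 / x_bar  # moment estimator

print(abs(lambda_hat - true_lambda) < 0.05)  # True: close to the true rate
```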