Problems with solution to the written Master's Examination-Option I


Probability and Statistics, Spring 7

[1] Math 3 - MS Exam, Spring 7

If $A$ is a real symmetric $n \times n$ matrix, show that $A$ is idempotent if and only if $A = PP^T$, where $P$ is an $n \times r$ matrix such that $P^T P = I_r$ and $r$ is the rank of $A$.

Solution - Math 3 - MS Exam, Spring 7 - Majumdar

If part: $A^2 = PP^T PP^T = P (P^T P) P^T = PP^T = A$.

Only if part: By the spectral decomposition, $A = Q \Lambda Q^T$, where $QQ^T = Q^T Q = I_n$ and $\Lambda$ is diagonal. Since the eigenvalues of $A$ are 0 or 1 (because $A^2 = A$), we can write

$$\Lambda = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix},$$

where $r = \operatorname{rank}(A)$. Writing $Q = [P_1 ; P_2]$, where $P_1$ is $n \times r$, we get

$$A = [P_1 ; P_2] \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} P_1^T \\ P_2^T \end{bmatrix} = P_1 P_1^T.$$

Moreover,

$$I_n = Q^T Q = \begin{bmatrix} P_1^T \\ P_2^T \end{bmatrix} [P_1 ; P_2] = \begin{bmatrix} P_1^T P_1 & P_1^T P_2 \\ P_2^T P_1 & P_2^T P_2 \end{bmatrix},$$

so $P_1^T P_1 = I_r$, and we may take $P = P_1$.

[2] Math 33 - MS Exam, Spring 7

Let $f$ and $g$ be continuous on $[0, 1]$, and suppose that $f(0) < g(0)$ and $f(1) > g(1)$. Show that there exists $x$ in $(0, 1)$ such that $f(x) = g(x)$. Deduce that the equation x + 3 = sin πx has a solution in ( , ).

Solution to Math 33 Problem - MS Exam, Spring 7 - Miescke

The function $f - g$ is continuous, and $(f - g)(0) < 0$, $(f - g)(1) > 0$. Hence, by the intermediate value theorem, there exists $c$ in $(0, 1)$ such that $(f - g)(c) = 0$, that is, $f(c) = g(c)$. For the given equation, use f(x) = sin πx and g(x) = x + 3, where x is in [ , ].
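The intermediate value argument above can also be followed numerically: bisecting on the sign of $h = f - g$ closes in on a crossing point. The sketch below is only an illustration. The functions $f(x) = x$ and $g(x) = \cos x$ are hypothetical stand-ins (they are not the exam's equation), chosen because $f(0) < g(0)$ and $f(1) > g(1)$, and the helper name `crossing_point` is likewise an assumption.

```python
import math

def crossing_point(f, g, a, b, tol=1e-12):
    """Bisect on h = f - g, assuming h is continuous with h(a) < 0 < h(b)."""
    h = lambda x: f(x) - g(x)
    if not (h(a) < 0 < h(b)):
        raise ValueError("need (f - g)(a) < 0 < (f - g)(b)")
    while b - a > tol:
        mid = (a + b) / 2
        if h(mid) < 0:
            a = mid   # h changes sign to the right of mid
        else:
            b = mid   # h changes sign at or to the left of mid
    return (a + b) / 2

# Hypothetical example (not the exam's equation): f(x) = x, g(x) = cos x on [0, 1].
c = crossing_point(lambda x: x, math.cos, 0.0, 1.0)
print(c, abs(c - math.cos(c)))
```

The loop keeps the invariant $h(a) < 0 \le h(b)$, which is exactly the sign condition the intermediate value theorem needs on every subinterval.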

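[3] Stat 4 - MS Exam, Spring 7

The joint p.d.f. of random variables $X$ and $Y$ is given below. Compute the correlation coefficient of $X$ and $Y$. Are they independent?

$$f(x, y) = \begin{cases} 2, & \text{if } 0 < x < 1,\ 0 < y < 1,\ x + y < 1, \\ 0, & \text{otherwise.} \end{cases}$$

Solution to Stat 4 problem - MS Exam, Spring 7 - Majumdar

$$E(X) = \int_0^1 \int_0^{1-x} 2x \, dy \, dx = 2 \int_0^1 x(1 - x) \, dx = \frac{1}{3}.$$

Similarly $E(Y) = \frac{1}{3}$.

$$E(X^2) = \int_0^1 \int_0^{1-x} 2x^2 \, dy \, dx = 2 \int_0^1 x^2 (1 - x) \, dx = \frac{1}{6} = E(Y^2).$$

$$E(XY) = \int_0^1 \int_0^{1-x} 2xy \, dy \, dx = \int_0^1 x (1 - x)^2 \, dx = \frac{1}{12}.$$

$$V(X) = E(X^2) - [E(X)]^2 = \frac{1}{6} - \frac{1}{9} = \frac{1}{18} = V(Y),$$

$$\operatorname{cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{1}{12} - \frac{1}{9} = -\frac{1}{36},$$

$$\rho = \operatorname{Corr}(X, Y) = \frac{-1/36}{\sqrt{\frac{1}{18} \cdot \frac{1}{18}}} = -\frac{1}{2}.$$

Since $\rho \neq 0$, $X$ and $Y$ are not independent.

[4] Stat 4 - (Chapter 5,6,7) - MS Exam, Spring 7

Suppose $X_1, \ldots, X_n$ are iid with the pdf

$$f(x; \theta) = \frac{2x}{\theta^2}, \quad 0 < x \le \theta.$$

(a) Find the mle $\hat\theta$ for $\theta$.
(b) Find the mle for the median of the distribution.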

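Solution to Stat 4 Problem from (Chapter 5,6,7) - MS Exam, Spring 7 - Yang

(a) The likelihood function is

$$L(\theta) = \prod_{i=1}^n \frac{2 x_i}{\theta^2} = \theta^{-2n} \, 2^n \prod_{i=1}^n x_i, \quad 0 < x_1, \ldots, x_n \le \theta.$$

The log likelihood function is

$$l(\theta) = \log L(\theta) = -2n \log \theta + n \log 2 + \log\left( \prod_{i=1}^n x_i \right).$$

The first partial derivative of $l$ is

$$\frac{\partial l(\theta)}{\partial \theta} = -\frac{2n}{\theta} < 0, \quad \text{for } \theta \ge \max\{x_1, \ldots, x_n\} > 0.$$

So $l$ is decreasing on the admissible range of $\theta$ and is maximized at its smallest admissible value, which implies that $\hat\theta = \max\{X_1, \ldots, X_n\}$ is the mle for $\theta$.

(b) Because the distribution is continuous, the median $m$ is the constant satisfying

$$\int_0^m \frac{2x}{\theta^2} \, dx = \frac{1}{2}.$$

Note that $0 < m < \theta$ and

$$\int_0^m \frac{2x}{\theta^2} \, dx = \left. \frac{x^2}{\theta^2} \right|_0^m = \frac{m^2}{\theta^2}.$$

Then $\frac{m^2}{\theta^2} = \frac{1}{2}$, so the median is

$$m = \frac{\theta}{\sqrt{2}} = \frac{\sqrt{2}}{2} \theta.$$

By part (a), the mle for $\theta$ is $\hat\theta = \max\{X_1, \ldots, X_n\}$. So, by the invariance property of the mle, the mle for the median is $\frac{\sqrt{2}}{2} \max\{X_1, \ldots, X_n\}$.

[5] Stat 4 - (Chapter 8,9) - MS Exam, Spring 7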

Let $X = (X_1, \ldots, X_n)$ denote a random sample from the distribution $N(\theta, 1)$ that has the pdf

$$f(x; \theta) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(x - \theta)^2}{2} \right), \quad -\infty < x < \infty.$$

It is desired to test the hypothesis $H_0 : \theta = 0$ against the alternative hypothesis $H_1 : \theta = 1$.

(a) Show that the likelihood ratio $L(\theta = 0; X) / L(\theta = 1; X)$ is based upon the statistic $Y = \sum_{i=1}^n X_i$.
(b) If $n = 16$, find the best critical region of size $\alpha = 0.05$ for the hypothesis test. (Hint: Use the normal table attached.)
(c) If $n = 16$, find the power of the test in part (b).

Solution to Stat 4 Problem - (Chapter 8,9) - MS Exam, Spring 7 - Yang

(a) Proof: The likelihood function of $\theta$ given $x = (x_1, \ldots, x_n)$ is

$$L(\theta; x) = \prod_{i=1}^n f(x_i; \theta) = \left( \frac{1}{\sqrt{2\pi}} \right)^n \exp\left\{ -\frac{1}{2} \left[ \sum_{i=1}^n x_i^2 - 2\theta \sum_{i=1}^n x_i + n\theta^2 \right] \right\}.$$

Therefore the likelihood ratio is

$$\frac{L(\theta = 0; X)}{L(\theta = 1; X)} = \exp\left\{ -\sum_{i=1}^n X_i + \frac{n}{2} \right\}.$$

So it is based upon the statistic $Y = \sum_{i=1}^n X_i$.

(b) For any positive constant $k$,

$$\frac{L(\theta = 0; x)}{L(\theta = 1; x)} \le k \iff \sum_{i=1}^n x_i \ge \frac{n}{2} - \log k = c.$$

By the Neyman-Pearson Theorem, the best critical region $C$ takes the form $\{(x_1, \ldots, x_n) : \sum_{i=1}^n x_i \ge c\}$, where the constant $c$ is determined by

$$P_{H_0}(X \in C) = \alpha.$$

If $n = 16$, then $Y \sim N(0, 16)$ under the null hypothesis. So $Y/4 \sim N(0, 1)$ and

$$0.05 = P_{H_0}(Y \ge c) = P(Y/4 \ge c/4).$$

By the normal table attached, $c/4 = (1.64 + 1.65)/2 = 1.645$. So the best critical region is

$$C = \left\{ (x_1, \ldots, x_n) : \sum_{i=1}^n x_i \ge 6.58 \right\}.$$

(c) If $n = 16$, then under the alternative hypothesis $H_1 : \theta = 1$, $Y \sim N(16, 16)$. Then $Z = (Y - 16)/4 = Y/4 - 4 \sim N(0, 1)$. The power is

$$P_{H_1}(Y \ge 6.58) = P(Z \ge -2.355) = P(Z \le 2.355) = 0.5\,(0.9906 + 0.9909) \approx 0.9907.$$

[6] Stat 46 - MS Exam, Spring 7

The following are the numbers of minutes it took workers in a factory to complete a certain task, measured once before and once after each worker had received special training for completing this task.

Worker             1     2     3     4     5     6     7     8     9     10
After Training     4.6   5.8   6.9   7.2   8.    6.5   6.9   7.2   7.3   8.7
Before Training    8.1   7.3   7.2   7.7   8.    7.2   6.3   6.8   8.5   8.8

(a) Explain the model under which both the Sign Test and the Wilcoxon Signed-Rank Test can be used.
(b) Specify the appropriate null hypothesis and alternative.
(c) Compute the test statistic of the Sign Test for the given data.
(d) Compute the test statistic of the Wilcoxon Signed-Rank Test for the given data.
(e) Under what changes of the model could you still use the Sign Test, but not the Wilcoxon Signed-Rank Test?

Solution to Stat 46 Problem - MS Exam, Spring 7 - Miescke

(a) The differences $Y_i - X_i$, $i = 1, \ldots, n$, are independent. $Y_i - X_i$ has a c.d.f. $F_i$ with a density $f_i$ that is symmetric about $\theta$, $i = 1, \ldots, n$.

(b) $H_0 : \theta = 0$ versus $H_A : \theta > 0$, where the $Y_i$ are from Before Training.

(c) $B = \#\{i : Y_i > X_i,\ i = 1, \ldots, n\} = 7$, and $n$ is reduced to 9 (Worker 5 has a zero difference).

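(d)

Worker                  1     2     3     4     5     6     7     8     9     10
Y_i - X_i               3.5   1.5   0.3   0.5   N/A   0.7   -0.6  -0.4  1.2   0.1
Rank of |Y_i - X_i|     9     8     2     4     N/A   6     5     3     7     1
Signed rank             9     8     2     4     N/A   6     -5    -3    7     1

The sum of the positive signed ranks is $T^+ = 9 + 8 + 2 + 4 + 6 + 7 + 1 = 37$, and $n$ is reduced to 9 (Worker 5).

(e) If the densities $f_i$ are not all symmetric, then only the Sign Test can be used.

[7] Stat 43 - MS Exam, Spring 7

A random sample of size $n$ has been drawn from a finite population of $N$ units by the SRSWOR method, and the sample so drawn is denoted by $s$. Subsequently, a sub-sample $s'$ of size $n'$ [$< n$] is drawn from the chosen sample $s$, again by the SRSWOR method. Denote by $\bar{y}_s$ and $\bar{y}_{s'}$ the respective sample means for a study variable $Y$ based on the two samples. Assume:

$$E[\bar{y}_s] = \bar{Y}, \qquad Var[\bar{y}_s] = S^2 \left[ \frac{1}{n} - \frac{1}{N} \right],$$

where $S^2$ denotes the population variance based on the $Y$-values.

(a) Show that $\bar{y}_{s'}$ also serves as an unbiased estimate for the population mean $\bar{Y}$.
(b) Compute $Var(\bar{y}_{s'})$ and $Cov(\bar{y}_s, \bar{y}_{s'})$.
(c) Find the best linear combination of $\bar{y}_s$ and $\bar{y}_{s'}$ to serve as a pooled estimate of $\bar{Y}$.

You may use conditional expectation arguments. Note that a subsample ($s'$) behaves like an induced sample with reference to the subpopulation captured by the original sample ($s$).

Solution to Stat 43 Problem - MS Exam, Spring 7 - Hedayat

Essential steps. Below, $E_2$ and $V_2$ denote conditional expectation and variance given $s$, and $E_1$ and $V_1$ denote expectation and variance over the choice of $s$.

(a) $E[\bar{y}_{s'} \mid s] = E_2[\bar{y}_{s'}] = \bar{y}_s$, and hence $E[\bar{y}_{s'}] = E_1 E_2[\bar{y}_{s'}] = E[\bar{y}_s] = \bar{Y}$.

(b) $$V[\bar{y}_{s'}] = V_1 E_2[\bar{y}_{s'}] + E_1 V_2[\bar{y}_{s'}] = S^2 \left[ \frac{1}{n} - \frac{1}{N} \right] + S^2 \left[ \frac{1}{n'} - \frac{1}{n} \right] = S^2 \left[ \frac{1}{n'} - \frac{1}{N} \right].$$

(c) $E[\bar{y}_s \, \bar{y}_{s'}] = E_1 E_2[\bar{y}_s \, \bar{y}_{s'}] = E[\bar{y}_s^2] = \bar{Y}^2 + Var[\bar{y}_s]$. Therefore,

$$Cov[\bar{y}_s, \bar{y}_{s'}] = Var[\bar{y}_s] = S^2 \left[ \frac{1}{n} - \frac{1}{N} \right].$$

The rest follows from a standard result in pooling two dependent estimates.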
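The conditional-expectation results for the two-phase sample can be checked by brute force on a tiny population. The sketch below enumerates every equally likely (sample, subsample) pair and compares the exact moments with $\bar{Y}$, $S^2 (1/n' - 1/N)$ and $S^2 (1/n - 1/N)$. The population values, the sizes $N = 5$, $n = 3$, $n' = 2$, and the function names are all hypothetical, and $S^2$ is assumed to use the divisor $N - 1$.

```python
from fractions import Fraction
from itertools import combinations

def mean(values):
    """Exact arithmetic mean of a list of Fractions."""
    return sum(values, Fraction(0)) / len(values)

def two_phase_moments(y, n, n_sub):
    """Exact E[ybar_s'], Var[ybar_s'] and Cov[ybar_s, ybar_s'] under
    SRSWOR of size n followed by SRSWOR of size n_sub from the sample."""
    y = [Fraction(v) for v in y]
    pairs = []  # every (sample mean, subsample mean) pair; all equally likely
    for s in combinations(y, n):
        for sub in combinations(s, n_sub):
            pairs.append((mean(list(s)), mean(list(sub))))
    e_s = mean([a for a, _ in pairs])
    e_sub = mean([b for _, b in pairs])
    var_sub = mean([(b - e_sub) ** 2 for _, b in pairs])
    cov = mean([(a - e_s) * (b - e_sub) for a, b in pairs])
    return e_sub, var_sub, cov

y = [1, 4, 6, 9, 10]            # hypothetical population, N = 5
N, n, n_sub = len(y), 3, 2      # hypothetical sample and subsample sizes
ybar = mean([Fraction(v) for v in y])
S2 = sum((Fraction(v) - ybar) ** 2 for v in y) / (N - 1)

e_sub, var_sub, cov = two_phase_moments(y, n, n_sub)
print(e_sub == ybar)                                          # unbiasedness
print(var_sub == S2 * (Fraction(1, n_sub) - Fraction(1, N)))  # variance
print(cov == S2 * (Fraction(1, n) - Fraction(1, N)))          # covariance
```

Exact rational arithmetic is used so the comparisons are equalities rather than floating-point approximations.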

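[8] Stat 46 - MS Exam, Spring 7

A Markov chain $X_0, X_1, X_2, \ldots$ has the transition probability matrix

$$P = \begin{bmatrix} 0.3 & 0.2 & 0.5 \\ 0.5 & 0.1 & 0.4 \\ 0 & 0 & 1 \end{bmatrix}$$

on the states 0, 1, 2, and is known to start in state $X_0 = 0$. Let $T = \min\{n : X_n = 2\}$. Find the probability that $T$ is an odd number.

Solution to Stat 46 Problem - MS Exam, Spring 7 - El-Neweihi

Let $u_0 = P(T \text{ is odd} \mid X_0 = 0)$ and $u_1 = P(T \text{ is odd} \mid X_0 = 1)$. By first step analysis,

$$u_0 = (0.3)(1 - u_0) + (0.2)(1 - u_1) + 0.5,$$
$$u_1 = (0.5)(1 - u_0) + (0.1)(1 - u_1) + 0.4.$$

These rearrange to $1.3\,u_0 + 0.2\,u_1 = 1$ and $0.5\,u_0 + 1.1\,u_1 = 1$. Solving for $u_0$ we get $u_0 = \frac{90}{133} \approx 0.677$.

[9] Stat 48 - MS Exam, Spring 7

Consider a normal population $X \sim N(\mu, \sigma^2 = 20)$. To test $H_0 : \mu = 5$ against $H_1 : \mu < 5$, determine the sample size $n$ and the rejection region $\{\bar{X} < c\}$ to satisfy the significance level 0.025 and power level 0.975 at $\mu = 3$. [Given: $\Phi(1.96) = 0.975$.]

Solution to Problem [9]: Stat 48 - MS Exam, Spring 7 - Wang

The significance level $\alpha$ and the power level $\beta$ provide the following equations:

$$\alpha = P\{\text{Reject } H_0 \mid H_0 \text{ is true}\}, \qquad \beta = P\{\text{Reject } H_0 \mid H_0 \text{ is false}\},$$

that is,

$$P(\bar{X} < c \mid \mu = 5) = 0.025, \qquad P(\bar{X} < c \mid \mu = 3) = 0.975.$$

Standardizing,

$$\Phi\left( \frac{c - 5}{\sqrt{20/n}} \right) = 0.025, \qquad \Phi\left( \frac{c - 3}{\sqrt{20/n}} \right) = 0.975,$$

so

$$\frac{c - 5}{\sqrt{20/n}} = -1.96, \qquad \frac{c - 3}{\sqrt{20/n}} = 1.96.$$

Adding the two equations gives $c = 4$, and then $\sqrt{20/n} = 1/1.96$, so $n = 20\,(1.96)^2 \approx 76.8$. Hence we take the sample size $n = 77$, with $c = 4$ and the rejection region $\{\bar{X} < 4\}$.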
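For Problem [8], the two first-step equations form a 2 x 2 linear system that can be solved exactly. The sketch below is a check under stated assumptions: the states are taken to be 0, 1, 2, the chain is absorbed on reaching state 2, and the rows out of states 0 and 1 are read as (0.3, 0.2, 0.5) and (0.5, 0.1, 0.4), so that each row sums to 1. The variable names are illustrative.

```python
from fractions import Fraction as F

# Assumed transition probabilities out of the transient states 0 and 1.
p = {0: (F(3, 10), F(2, 10), F(5, 10)),
     1: (F(5, 10), F(1, 10), F(4, 10))}

# u_i = p_i0 (1 - u_0) + p_i1 (1 - u_1) + p_i2, and each row sums to 1, so
#   (1 + p_00) u_0 +       p_01 u_1 = 1
#        p_10  u_0 + (1 + p_11) u_1 = 1
a11, a12 = 1 + p[0][0], p[0][1]
a21, a22 = p[1][0], 1 + p[1][1]

det = a11 * a22 - a12 * a21
u0 = (a22 - a12) / det   # Cramer's rule with right-hand side (1, 1)
u1 = (a11 - a21) / det
print(u0, u1)            # prints 90/133 80/133
```

Under these assumptions the probability that $T$ is odd, starting from state 0, is $90/133 \approx 0.677$.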

[10] Stat 48 Problem - MS Exam, Spring 7

A multiple linear regression model

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$$

was used to fit $n = 20$ data points. It is calculated that $SSTO = 200$, $SSR = 66$.

(1) Construct the ANOVA table.
(2) Calculate and interpret the coefficient of determination $R^2$.
(3) Test $H_0 : \beta_1 = \beta_2 = 0$ at the $\alpha = 0.05$ significance level.
(4) Explain the meaning of the coefficient $\beta_1$. Find the statistic and its sampling distribution for an individual test $H_0 : \beta_1 = 0$ against $H_1 : \beta_1 \neq 0$.

[Given: $F(0.05; 2, 17) = 3.59$]

Solution to Problem [10]: Stat 48 - MS Exam, Spring 7 - Wang

(1) ANOVA Table

Source        DF    SS     MS      F
Regression    2     66     33      4.19
Error         17    134    7.88
Total         19    200

(2) The coefficient of determination is

$$R^2 = \frac{SSR}{SSTO} = \frac{66}{200} = 33\%.$$

It means that 33% of the variation in the response is explained by the linear regression model, or equivalently that the variation in the data is reduced by 33% after introducing the two predictors.

(3) The $F$-statistic in the ANOVA table is $F = 4.19 > F(0.05; 2, 17) = 3.59$. Therefore, we reject the null hypothesis at level 0.05, i.e. at least one predictor contributes significantly to explaining the variation in the response.

(4) The regression coefficient $\beta_1$ is the effect on the mean response of a unit increase in the predictor variable $X_1$, while holding the other predictor $X_2$ constant. Use a $t$-test for the individual test $H_0 : \beta_1 = 0$; the statistic is

$$t_{\hat\beta_1} = \frac{\hat\beta_1}{s(\hat\beta_1)},$$

where $\hat\beta_1$ and $s(\hat\beta_1)$ are the least squares estimate and its standard error for the linear coefficient $\beta_1$. Under $H_0$, the test statistic $t_{\hat\beta_1} \sim t(n - p)$, i.e. $t_{\hat\beta_1} \sim t(17)$.
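The ANOVA entries follow mechanically from $SSTO$, $SSR$, $n$ and the number of coefficients $p$. The sketch below recomputes them, assuming $n = 20$, $p = 3$ (intercept plus two slopes), $SSTO = 200$ and $SSR = 66$ as read from the problem. The function name `anova_table` is illustrative, and the critical value 3.59 is the one quoted in the problem rather than computed.

```python
def anova_table(ssto, ssr, n, p):
    """ANOVA quantities for a linear model with p coefficients (intercept included)."""
    sse = ssto - ssr                  # error sum of squares
    df_reg, df_err = p - 1, n - p     # regression and error degrees of freedom
    msr, mse = ssr / df_reg, sse / df_err
    return {
        "df": (df_reg, df_err, n - 1),
        "SSE": sse,
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
        "R2": ssr / ssto,
    }

table = anova_table(ssto=200, ssr=66, n=20, p=3)
F_CRIT = 3.59   # F(0.05; 2, 17), as given in the problem statement
for key, value in table.items():
    print(key, value)
print("reject H0:", table["F"] > F_CRIT)
```

Comparing the computed $F$ with the quoted critical value reproduces the decision in part (3).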