Statistics - Written Examination MEC Students - BOVISA

Prof.ssa A. Guglielmi 26.0.2

All rights reserved. Legal action will be taken against infringement. Reproduction is prohibited without prior consent.

Name: Student Id. Number:

Properly justify all your answers. Explicitly define all the random variables you are going to use in the solution which have not already been introduced in the text.

Exercises

Exercise 1

A survey revealed that young people no longer read newspapers. It has been estimated that only 2% of college students read at least one newspaper a day. Consider a sample class from Politecnico di Milano, composed of $n$ students who behave independently of each other.

1. Find the probability that, in a class of 150 students, at least 3 of them read at least one newspaper a day.
2. How many students should there be in the sample class so that at least one student reads at least one newspaper a day with probability greater than 0.8?
3. Instead of the point estimate used up to this point, compute an upper confidence limit of level 95% for the proportion of students reading at least one newspaper a day, knowing that, in a class of 220 students, only 13 read at least one newspaper every day.

1. Let $X$ be the random variable that counts the number of students reading at least one newspaper a day in a class of 150 people; $X$ has binomial distribution with parameters $n = 150$ and $p = 0.02$. Then:

$$P(X \ge 3) = 1 - P(X \le 2) = 1 - \left[ P(X = 0) + P(X = 1) + P(X = 2) \right] = 1 - \left[ (1-p)^{150} + \binom{150}{1} p (1-p)^{149} + \binom{150}{2} p^2 (1-p)^{148} \right] \approx 1 - 0.42093 = 0.57907.$$

Using the Poisson approximation with $\lambda = np = 3$, the required probability is approximately

$$1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!}\right) \approx 1 - 0.42319 = 0.57681.$$

2. If $Y$ is the number of students reading at least one newspaper a day in a class of $n$ students, then $Y \sim \mathrm{Bin}(n, 0.02)$; we have to compute

$$P(Y \ge 1) = 1 - P(Y = 0) = 1 - (0.98)^n.$$

Then

$$1 - (0.98)^n > 0.8 \iff (0.98)^n < 0.2 \iff n \log(0.98) < \log(0.2) \iff n > \frac{\log(0.2)}{\log(0.98)} \approx 79.66447,$$

where the inequality reverses in the last step because $\log(0.98) < 0$. The solution is $n \ge 80$.

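3. Since $z_\alpha = z_{0.05} = 1.645$ and $\hat p = 13/220 \approx 0.059$, the requested CI for $p$ (with $n = 220$) is

$$\left( -\infty,\ \hat p + z_\alpha \sqrt{\frac{\hat p (1 - \hat p)}{n}} \right) = (-\infty,\ 0.085).$$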
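The three answers of Exercise 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; it assumes the values $n = 150$, $p = 0.02$ and 13 readers out of 220, which are the ones consistent with the figures reported in the solution, and all variable names are illustrative.

```python
import math
from statistics import NormalDist

# Assumed inputs: class size and probability of reading a newspaper.
n, p = 150, 0.02

def binom_pmf(k, n, p):
    """Binomial probability P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Point 1: P(X >= 3) = 1 - P(X <= 2), exact and Poisson-approximated.
p_exact = 1 - sum(binom_pmf(k, n, p) for k in range(3))
lam = n * p
p_poisson = 1 - math.exp(-lam) * (1 + lam + lam**2 / 2)

# Point 2: smallest integer n with 1 - 0.98**n > 0.8.
n_min = math.floor(math.log(0.2) / math.log(0.98)) + 1

# Point 3: one-sided upper confidence limit of level 95% for p.
readers, class_size = 13, 220  # assumed observed counts
p_hat = readers / class_size
z = NormalDist().inv_cdf(0.95)  # about 1.645
upper = p_hat + z * math.sqrt(p_hat * (1 - p_hat) / class_size)

print(round(p_exact, 5), round(p_poisson, 5), n_min, round(upper, 3))
```

The exact and approximate probabilities differ only in the third decimal place, which is why the Poisson approximation is acceptable here (large $n$, small $p$).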

Exercise 2

There are two different lines, XX and YY, for the electrical wire extrusion production, to be compared. According to the manufacturer, XX produces wires with higher resistance to traction, but the standard deviation of the resistance of the wires produced by XX is 1.408 (in 0 psi), while that of YY is 0.528. Line XX produced a sample of 20 wires with sample mean resistance equal to 84.7; on the other hand, line YY produced a sample of 10 wires with sample mean resistance equal to 82.. Assume that the data are Gaussian and that the two samples are independent.

1. Is there evidence to indicate that the mean resistance of line XX is higher than that of YY? Use a significance level $\alpha = 1\%$.
2. Compute the p-value of the test at point 1. What conclusion would you draw?
3. Compute the power of the test at point 1 when the difference between the expected resistance of XX and that of YY is equal to 1.5 (Hint: remember the definition of power, as the probability of rejecting $H_0$ as a function of the true value of the parameter).
4. Find a 99% two-sided confidence interval for the difference between the expected resistances of XX and of YY, based on the two observed sample means.

Let $X$ and $Y$ be the random variables representing the resistance of the wires produced by XX and YY, respectively, and let $\mu_X = E(X)$, $\mu_Y = E(Y)$. Let $(x_1, \dots, x_{20})$ and $(y_1, \dots, y_{10})$ be the observed samples.

1. We need to test the null hypothesis $H_0: \mu_X = \mu_Y$ versus the alternative $H_1: \mu_X > \mu_Y$. Since the two samples are independent and the variances are known,

$$\bar X - \bar Y \sim N\left( \mu_X - \mu_Y,\ \frac{1.408^2}{20} + \frac{0.528^2}{10} \right) = N(\mu_X - \mu_Y,\ 0.127).$$

The rejection region at level 1% is

$$C = \left\{ (\bar x, \bar y) : \frac{\bar x - \bar y}{\sqrt{0.127}} \ge 2.33 = z_{0.01} \right\}.$$

Since $(\bar x - \bar y)/\sqrt{0.127} = 6.74$, we reject $H_0$ at the given level.

2. The p-value is the probability, under $H_0$, that the test statistic $(\bar X - \bar Y)/\sqrt{0.127}$ exceeds the value 6.74. The test statistic has distribution $N(0,1)$ under $H_0$ and, from the tables, $1 - \Phi(3.99) = 0.00003$, so the p-value is lower than $10^{-4}$. There is very strong evidence that $\mu_X > \mu_Y$.

3. Let $Z = \bar X - \bar Y$ and $\mu_Z = \mu_X - \mu_Y$. The power function of the test is

$$\pi(\mu_Z) = P_{\mu_Z}\left( \frac{Z}{\sqrt{0.127}} \ge 2.33 \right) = 1 - \Phi\left( 2.33 - \frac{\mu_Z}{\sqrt{0.127}} \right) = \Phi\left( \frac{\mu_Z}{\sqrt{0.127}} - 2.33 \right).$$

With $\mu_Z = 1.5$ we obtain $\pi(1.5) = \Phi(1.88) = 0.9699$.

4. Since $z_{\alpha/2} = z_{0.005} = 2.576$, a 99% CI for $\mu_X - \mu_Y$ is

$$\left( \bar x - \bar y - z_{\alpha/2}\sqrt{0.127},\ \bar x - \bar y + z_{\alpha/2}\sqrt{0.127} \right) = (1.4850,\ 3.3210).$$
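The z-test, its power and the confidence interval of Exercise 2 can be reproduced with the standard library. This is a minimal sketch under stated assumptions: the standard deviations 1.408 and 0.528 and the sample sizes 20 and 10 are the values consistent with the variance 0.127, and `diff_obs = 2.403` is a hypothetical observed difference of the sample means, taken as the midpoint of the reported 99% interval because the individual sample means are not fully legible in the text.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

# Assumed known standard deviations and sample sizes for lines XX and YY.
sigma_x, n_x = 1.408, 20
sigma_y, n_y = 0.528, 10

# Variance and standard error of the difference of the sample means.
var_diff = sigma_x**2 / n_x + sigma_y**2 / n_y
se = math.sqrt(var_diff)

# Hypothetical observed difference (midpoint of the reported interval).
diff_obs = 2.403

# Points 1 and 2: one-sided z-test at level 1% and its p-value.
z_crit = std_normal.inv_cdf(0.99)  # about 2.326, tabulated as 2.33
z_obs = diff_obs / se
reject = z_obs >= z_crit
p_value = 1 - std_normal.cdf(z_obs)

def power(mu_z, crit=2.33):
    """Probability of rejecting H0 when the true mean difference is mu_z."""
    return std_normal.cdf(mu_z / se - crit)

# Point 4: two-sided 99% confidence interval for the mean difference.
z_half = std_normal.inv_cdf(0.995)  # about 2.576
ci = (diff_obs - z_half * se, diff_obs + z_half * se)

print(round(var_diff, 3), round(z_obs, 2), reject, round(power(1.5), 4), ci)
```

The power function uses the tabulated critical value 2.33 rather than the exact quantile so that it matches the hand calculation.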

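Exercise 3

The income of Italian tourists who rented a house in Cortina during the last Christmas holidays is described by a random variable $X$ that, expressed in hundreds of thousands of euros, has density

$$f_X(x) = \frac{3}{x^4}\, \mathbf{1}_{(1,+\infty)}(x).$$

1. Find the distribution function and the median of $X$.
2. Let $T = 1/X^3$. Find the distribution function of $T$. Which distribution is it?
3. Determine the expected value and variance of $X$.
4. Let us consider 75 people, chosen at random among those who rented a house in Cortina. Find the approximate value of the probability that their average income exceeds 175 thousand euros.

1. If $x < 1$, $F_X(x) = 0$; if $x \ge 1$,

$$F_X(x) = \int_1^x \frac{3}{u^4}\, du = \left[ -\frac{1}{u^3} \right]_1^x = 1 - \frac{1}{x^3}.$$

The median is the value $m$ such that

$$F_X(m) = \frac{1}{2} \iff 1 - \frac{1}{m^3} = \frac{1}{2} \iff m^3 = 2 \iff m = 2^{1/3} \approx 1.26,$$

which is about 126 thousand euros.

2. If $0 < t < 1$,

$$F_T(t) = P\left( \frac{1}{X^3} \le t \right) = P\left( X \ge t^{-1/3} \right) = 1 - F_X\left( t^{-1/3} \right) = t;$$

$T$ has uniform distribution on the interval $(0,1)$.

3.

$$E(X) = \int_1^{+\infty} x\, \frac{3}{x^4}\, dx = \int_1^{+\infty} \frac{3}{x^3}\, dx = \left[ -\frac{3}{2x^2} \right]_1^{+\infty} = \frac{3}{2},$$

$$E(X^2) = \int_1^{+\infty} x^2\, \frac{3}{x^4}\, dx = \int_1^{+\infty} \frac{3}{x^2}\, dx = \left[ -\frac{3}{x} \right]_1^{+\infty} = 3,$$

$$\mathrm{Var}(X) = 3 - \left( \frac{3}{2} \right)^2 = \frac{3}{4}.$$

4. If $X_i$ represents the income of the $i$-th person among the 75 randomly chosen, then $X_1, \dots, X_{75}$ are iid. If $\bar X_{75} = (X_1 + \dots + X_{75})/75$ is the average income, then, by the Central Limit Theorem, $\bar X_{75}$ approximately follows a $N\left( E(X_1) = 1.5,\ \mathrm{Var}(X_1)/75 = 1/100 \right)$ distribution, so that

$$P(\bar X_{75} > 1.75) = 1 - P(\bar X_{75} \le 1.75) \approx 1 - \Phi\left( \frac{1.75 - 1.5}{\sqrt{1/100}} \right) = 1 - \Phi(2.5) = 0.0062.$$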
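The results of Exercise 3 can be verified numerically. The sketch below assumes the density $3x^{-4}$ on $(1, +\infty)$; the helper `expect` is a hypothetical name and evaluates $E[g(X)]$ by the substitution $u = 1/x$, which turns the integral over $(1, +\infty)$ into $\int_0^1 g(1/u)\, 3u^2\, du$, computed with the midpoint rule.

```python
import math
from statistics import NormalDist

def cdf_x(x):
    """Distribution function of X: 0 below 1, 1 - x^(-3) from 1 onwards."""
    return 0.0 if x < 1 else 1 - x**-3

# Point 1: the median solves 1 - m^(-3) = 1/2.
median = 2 ** (1 / 3)

def expect(g, steps=100_000):
    """E[g(X)] as the integral over (0, 1) of g(1/u) * 3u^2 (midpoint rule)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += g(1.0 / u) * 3 * u**2
    return total * h

# Point 3: mean and variance.
mean = expect(lambda x: x)
second_moment = expect(lambda x: x**2)
variance = second_moment - mean**2

# Point 4: CLT approximation for the average of 75 incomes.
approx = NormalDist(mean, math.sqrt(variance / 75))
prob = 1 - approx.cdf(1.75)

print(round(median, 4), round(mean, 4), round(variance, 4), round(prob, 4))
```

The substitution avoids truncating the infinite upper limit, so no tail cutoff has to be chosen.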

Exercise 4

It is important to establish the mechanical properties of a particular kind of rubber by a laboratory trial. With this aim, a sample of the material has undergone tension testing, and the results are reported in the following table.

                               1      2      3      4      5      6      7      8
imposed length x (cm)       6.50   7.00   7.50   8.00   8.50   9.00   9.50  10.00
observed strength y (MPa)   0.77   1.24   1.33   2.12   2.17   2.24   2.80   3.15

Assume that the recorded strength values of the fibers of the sample can be considered as affected by a zero mean Gaussian error, and that the observations were independent, with the same degree of uncertainty.

1. Defining a proper linear regression model, estimate the regression coefficients of the relation between $x$ and $Y$. Moreover, estimate the variance of the error.
2. Determine a confidence interval of level 90% for the slope of the regression line.
3. If a length of 10.3 cm were imposed on the sample, assuming that it is still in its linear elastic phase, what strength would we expect to observe?
4. Determine a prediction interval of level 95% for the strength when $x = 10.3$ cm.

1. We are interested in estimating the coefficients of the linear relation

$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, \dots, n \ (n = 8), \quad \epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2),$$

where the $\epsilon_i$ represent the errors, and in estimating $\sigma^2$. From the data we obtain

$$\bar x = 8.25, \quad \bar y = 1.9775, \quad S_{xy} = \sum x_i y_i - n \bar x \bar y = 6.81, \quad S_{xx} = \sum x_i^2 - n \bar x^2 = 10.5, \quad S_{yy} = \sum y_i^2 - n \bar y^2 = 4.5988.$$

The least squares estimates of the regression parameters and of the error variance are

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = 0.6486, \quad \hat\beta_0 = \bar y - \hat\beta_1 \bar x = -3.373, \quad \hat\sigma^2 = \frac{1}{n-2}\left( S_{yy} - \frac{S_{xy}^2}{S_{xx}} \right) = 0.0303.$$

2. Since $t_{\alpha/2, n-2} = t_{0.05, 6} = 1.9432$ and $\sqrt{\hat\sigma^2 / S_{xx}} = 0.0537$, we obtain the 90% CI for $\beta_1$:

$$\left( \hat\beta_1 - t_{0.05,6} \sqrt{\frac{\hat\sigma^2}{S_{xx}}},\ \hat\beta_1 + t_{0.05,6} \sqrt{\frac{\hat\sigma^2}{S_{xx}}} \right) = (0.544,\ 0.753).$$

3. A point estimate for $Y_{new} = \beta_0 + \beta_1 x_{new} + \epsilon_{new}$, with $x_{new} = 10.3$ and $\epsilon_{new} \sim N(0, \sigma^2)$, is

$$\hat y_{new} = \hat\beta_0 + \hat\beta_1 \cdot 10.3 = 3.307.$$

4. Since $t_{\alpha/2, n-2} = t_{0.025, 6} = 2.4469$, the required interval is

$$\left( \hat y_{new} - t_{0.025,6} \sqrt{\hat\sigma^2 \left[ 1 + \frac{1}{n} + \frac{(10.3 - \bar x)^2}{S_{xx}} \right]},\ \hat y_{new} + t_{0.025,6} \sqrt{\hat\sigma^2 \left[ 1 + \frac{1}{n} + \frac{(10.3 - \bar x)^2}{S_{xx}} \right]} \right) = (2.781,\ 3.833).$$
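The regression quantities of Exercise 4 can be recomputed from the data with a short script. This is a sketch under stated assumptions: the strength values in `y` are the ones consistent with the reported summaries ($\bar y = 1.9775$, $S_{xy} = 6.81$, $S_{yy} = 4.5988$), $x_{new} = 10.3$ is assumed, and the Student t quantiles are hardcoded table values because the standard library has no t distribution.

```python
import math

# Imposed lengths (cm) and observed strengths (MPa); y values are assumed.
x = [6.50, 7.00, 7.50, 8.00, 8.50, 9.00, 9.50, 10.00]
y = [0.77, 1.24, 1.33, 2.12, 2.17, 2.24, 2.80, 3.15]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
s_xx = sum(a * a for a in x) - n * x_bar**2
s_yy = sum(b * b for b in y) - n * y_bar**2

# Point 1: least squares estimates and unbiased error variance estimate.
beta1 = s_xy / s_xx
beta0 = y_bar - beta1 * x_bar
sigma2 = (s_yy - s_xy**2 / s_xx) / (n - 2)

# Table quantiles of the t distribution with 6 degrees of freedom.
T_05_6 = 1.9432   # upper 5% point, for the 90% interval
T_025_6 = 2.4469  # upper 2.5% point, for the 95% interval

# Point 2: 90% confidence interval for the slope.
se_beta1 = math.sqrt(sigma2 / s_xx)
ci_slope = (beta1 - T_05_6 * se_beta1, beta1 + T_05_6 * se_beta1)

# Points 3 and 4: point prediction and 95% prediction interval at x_new.
x_new = 10.3
y_new = beta0 + beta1 * x_new
half_width = T_025_6 * math.sqrt(
    sigma2 * (1 + 1 / n + (x_new - x_bar) ** 2 / s_xx)
)
pred_interval = (y_new - half_width, y_new + half_width)

print(round(beta1, 4), round(beta0, 3), round(sigma2, 4))
print(ci_slope, round(y_new, 3), pred_interval)
```

The prediction interval is wider than a confidence interval for the mean response at the same point because of the extra term 1 inside the square root, which accounts for the error of the new observation.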