Statistics - Written Examination
MEC Students - BOVISA
Prof. A. Guglielmi
26.0.2

All rights reserved. Legal action will be taken against infringement. Reproduction is prohibited without prior consent.

Name:
Student Id. Number:

Properly justify all your answers. Explicitly define all the random variables you are going to use in the solution which have not already been introduced in the text.

Exercises

Exercise 1

A survey revealed that young people do not read newspapers any more. It has been estimated that only 2% of college students read at least one newspaper a day. Consider a sample class from Politecnico di Milano, composed of $n$ students, who behave independently of each other.

1. Find the probability that, in a class of 150 students, at least 3 of them read at least one newspaper a day.
2. How many students should there be in the sample class to have at least one student reading at least one newspaper a day with a probability greater than 0.8?
3. Instead of the point estimate used up to this point, compute an upper confidence limit of level 95% for the proportion of students reading at least one newspaper a day, knowing that, in a class of 220 students, only 13 read at least one newspaper every day.

1. Let $X$ be the random variable that counts the number of students reading at least one newspaper a day in a class of 150 people; $X$ has binomial distribution with parameters $n = 150$ and $p = 0.02$. Then:
\[
\begin{aligned}
P(X \ge 3) &= 1 - P(X \le 2) = 1 - \left[ P(X = 0) + P(X = 1) + P(X = 2) \right] \\
&= 1 - \left[ (1-p)^{150} + \binom{150}{1} p (1-p)^{149} + \binom{150}{2} p^2 (1-p)^{148} \right] \approx 1 - 0.42093 = 0.57907.
\end{aligned}
\]
Using the Poisson approximation with $\lambda = np = 3$, the required probability is
\[
1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!}\right) \approx 1 - 0.42319 = 0.57681.
\]

2. If $Y$ is the number of students reading at least one newspaper a day in a class of $n$ students, then $Y \sim \mathrm{Bin}(n, 0.02)$; we have to compute $P(Y \ge 1) = 1 - P(Y = 0) = 1 - (0.98)^n$. Then
\[
1 - (0.98)^n > 0.8 \iff n \log(0.98) < \log(0.2) \iff n > \frac{\log(0.2)}{\log(0.98)} \approx 79.66447.
\]
The solution is $n \ge 80$.
3. Since $z_\alpha = z_{0.05} = 1.645$ and $\hat p = 13/220 = 0.0591$, the requested CI for $p$ is
\[
\left( -\infty,\ \hat p + z_\alpha \sqrt{\frac{\hat p (1 - \hat p)}{n}} \right) = (-\infty,\ 0.085).
\]
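The three answers of Exercise 1 can be checked numerically. The following is only a sketch using the Python standard library; the class size of 150 and the count of 13 readers out of 220 are assumptions, taken as the values consistent with the figures in the solution ($\lambda = np = 3$ and $\hat p = 0.0591$).

```python
from math import comb, exp, floor, log, sqrt
from statistics import NormalDist

p = 0.02   # proportion of daily readers stated in the text
n = 150    # ASSUMPTION: class size consistent with lambda = n * p = 3


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability mass function of a Bin(n, p) variable at k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part 1: P(X >= 3), exactly and with the Poisson approximation.
exact = 1 - sum(binom_pmf(k, n, p) for k in range(3))
lam = n * p
poisson = 1 - exp(-lam) * (1 + lam + lam**2 / 2)

# Part 2: smallest integer n with 1 - 0.98**n > 0.8,
# i.e. the smallest integer strictly greater than log(0.2) / log(0.98).
n_min = floor(log(0.2) / log(1 - p)) + 1

# Part 3: one-sided upper confidence limit of level 95% for p.
readers, class_size = 13, 220   # ASSUMPTION: 13 readers, so p_hat = 0.0591
p_hat = readers / class_size
z = NormalDist().inv_cdf(0.95)  # about 1.645
upper = p_hat + z * sqrt(p_hat * (1 - p_hat) / class_size)

print(f"exact = {exact:.5f}, Poisson = {poisson:.5f}")
print(f"minimum class size = {n_min}")
print(f"upper confidence limit = {upper:.4f}")
```

The quantile is computed rather than read from a table, so the last digits of the upper limit may differ slightly from a hand calculation with 1.645.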
Exercise 2

Two different production lines for electrical wire extrusion, XX and YY, are to be compared. According to the manufacturer, XX produces wires with higher resistance to traction, but the standard deviation of the resistance of the wires produced by XX is 1.408 (in $10^3$ psi), while that of YY is 0.528. Line XX produced a sample of 20 wires with sample mean resistance equal to 84.7; on the other hand, line YY produced a sample of 10 wires with sample mean resistance equal to 82. Assume that data are Gaussian and that the two samples are independent.

1. Is there evidence to indicate that the mean resistance of line XX is higher than that of YY? Use a significance level $\alpha = 1\%$.
2. Compute the p-value of the test at point 1. What conclusion would you draw?
3. Compute the power of the test at point 1 when the difference between the expected resistance of XX and that of YY is equal to 1.5 (Hint: remember the definition of power, as the probability of rejecting $H_0$ as a function of the true value of the parameter).
4. Find a 99% two-sided confidence interval for the difference between the expected resistances of XX and of YY, based on the two observed sample means.

Let $X$ and $Y$ be the random variables representing the resistance of the wires produced by XX and YY, respectively, and let $\mu_X = E(X)$, $\mu_Y = E(Y)$. Let $(x_1, \ldots, x_{20})$ and $(y_1, \ldots, y_{10})$ be the observed samples.

1. We need to test the null hypothesis $H_0 : \mu_X = \mu_Y$ versus the alternative $H_1 : \mu_X > \mu_Y$. If we assume that the two samples are independent, then
\[
\bar X - \bar Y \sim N\left( \mu_X - \mu_Y,\ \frac{1.408^2}{20} + \frac{0.528^2}{10} \right) = N(\mu_X - \mu_Y,\ 0.127).
\]
The rejection region at level 1% is
\[
C = \left\{ (\bar x, \bar y) : \frac{\bar x - \bar y}{\sqrt{0.127}} \ge 2.33 = z_{0.01} \right\}.
\]
Since $(\bar x - \bar y)/\sqrt{0.127} = 6.74$, we reject $H_0$ at the given level.

2. The p-value is given by the probability, under $H_0$, that the test statistic $(\bar X - \bar Y)/\sqrt{0.127}$ exceeds the value 6.74. The test statistic has distribution $N(0,1)$ under $H_0$, while, from the tables, $1 - \Phi(3.99) = 0.00003$, so the p-value is lower than $10^{-4}$.
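There is very strong evidence that $\mu_X > \mu_Y$.

3. Let $Z = \bar X - \bar Y$ and $\mu_Z = \mu_X - \mu_Y$. The power function of the test is
\[
\pi(\mu_Z) = P_{\mu_Z}\left( \frac{Z}{\sqrt{0.127}} \ge 2.33 \right) = 1 - \Phi\left( 2.33 - \frac{\mu_Z}{\sqrt{0.127}} \right) = \Phi\left( \frac{\mu_Z}{\sqrt{0.127}} - 2.33 \right).
\]
With $\mu_Z = 1.5$ we obtain $\pi(1.5) = \Phi(1.88) = 0.9699$.

4. Since $z_{\alpha/2} = z_{0.005} = 2.576$, a 99% CI for $\mu_X - \mu_Y$ is
\[
\left( \bar x - \bar y - z_{\alpha/2}\sqrt{0.127},\ \bar x - \bar y + z_{\alpha/2}\sqrt{0.127} \right) = (1.4850,\ 3.3210).
\]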
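The test, its p-value, the power and the confidence interval of Exercise 2 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the standard deviations 1.408 and 0.528 and the sample sizes 20 and 10 are the values appearing in the variance formula of the solution, and the observed difference of the sample means is taken as 2.403, the midpoint of the confidence interval reported in point 4 (it also matches the test statistic 6.74).

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

sigma_x, n_x = 1.408, 20   # known standard deviation and sample size, line XX
sigma_y, n_y = 0.528, 10   # known standard deviation and sample size, line YY
diff = 2.403               # ASSUMPTION: observed x_bar - y_bar (midpoint of the CI)

# Standard deviation of X_bar - Y_bar (its square is about 0.127).
se = sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)

# Point 1: one-sided z test at level 1%.
z_stat = diff / se
z_crit = std_normal.inv_cdf(0.99)      # about 2.33
reject = z_stat >= z_crit

# Point 2: p-value, computed in the lower tail for numerical accuracy.
p_value = std_normal.cdf(-z_stat)

# Point 3: power when the true difference mu_X - mu_Y equals 1.5.
power = std_normal.cdf(1.5 / se - z_crit)

# Point 4: two-sided 99% confidence interval for mu_X - mu_Y.
z_half = std_normal.inv_cdf(0.995)     # about 2.576
ci = (diff - z_half * se, diff + z_half * se)

print(f"z = {z_stat:.3f}, reject H0: {reject}, p-value = {p_value:.2e}")
print(f"power at 1.5 = {power:.4f}")
print(f"99% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Because the quantiles are computed instead of being rounded to 2.33 and 2.576, the power and the interval endpoints can differ from the hand calculation in the third or fourth decimal place.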
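Exercise 3

The income of Italian tourists who rented a house in Cortina during the last Christmas holidays is described by a random variable $X$ that, expressed in hundreds of thousands of euros, has density
\[
f_X(x) = 3 x^{-4}\, \mathbf{1}_{(1,+\infty)}(x).
\]

1. Find the distribution function and the median of $X$.
2. Let $T = 1/X^3$. Find the distribution function of $T$. Which distribution is it?
3. Determine the expected value and variance of $X$.
4. Let us consider 75 people, chosen at random among those who rented a house in Cortina. Find the approximate value of the probability that their average income exceeds 175 thousand euros.

1. If $x < 1$, $F_X(x) = 0$; if $x \ge 1$,
\[
F_X(x) = \int_1^x 3u^{-4}\,du = \left[ -u^{-3} \right]_1^x = 1 - x^{-3}.
\]
The median is the value $m$ such that
\[
F_X(m) = \frac{1}{2} \iff 1 - \frac{1}{m^3} = \frac{1}{2} \iff m^3 = 2 \iff m = 2^{1/3},
\]
which is about 126 thousand euros.

2. If $0 < t \le 1$,
\[
F_T(t) = P\left( \frac{1}{X^3} \le t \right) = P\left( X \ge \left(\frac{1}{t}\right)^{1/3} \right) = 1 - F_X\left(t^{-1/3}\right) = t;
\]
$T$ has uniform distribution on the interval $(0,1)$.

3.
\[
E(X) = \int_1^{+\infty} x \cdot 3x^{-4}\,dx = 3\int_1^{+\infty} x^{-3}\,dx = \left[ -\frac{3}{2} x^{-2} \right]_1^{+\infty} = \frac{3}{2},
\]
\[
E(X^2) = \int_1^{+\infty} x^2 \cdot 3x^{-4}\,dx = 3\int_1^{+\infty} x^{-2}\,dx = \left[ -3x^{-1} \right]_1^{+\infty} = 3,
\]
\[
\mathrm{Var}(X) = 3 - \left(\frac{3}{2}\right)^2 = \frac{3}{4}.
\]

4. If $X_i$ represents the income of the $i$-th person among the 75 randomly chosen, then $X_1, \ldots, X_{75}$ are iid. If $\bar X_{75} = (X_1 + \cdots + X_{75})/75$ is the average income, then, by the Central Limit Theorem, $\bar X_{75}$ approximately follows a $N\left(E(X_1) = 1.5,\ \mathrm{Var}(X_1)/75 = \frac{1}{100}\right)$ distribution, so that
\[
P(\bar X_{75} > 1.75) = 1 - P(\bar X_{75} \le 1.75) \approx 1 - \Phi\left( \frac{1.75 - 1.5}{\sqrt{1/100}} \right) = 1 - \Phi(2.5) = 0.0062.
\]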
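The results of Exercise 3 can be cross-checked in Python. The sketch below assumes the density $3x^{-4}$ on $(1, +\infty)$ and the transformation $T = 1/X^3$; since $T$ is uniform on $(0,1)$, values of $X$ can be simulated as $U^{-1/3}$ with $U$ uniform. The seed and the number of simulated values are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, median


def cdf_x(x: float) -> float:
    """Distribution function F_X(x) = 1 - x**(-3) for x >= 1, and 0 otherwise."""
    return 0.0 if x < 1 else 1 - x**-3


# Closed-form quantities from points 1 and 3.
median_exact = 2 ** (1 / 3)          # about 1.26, i.e. 126 thousand euros
mean_x = 3 / 2
var_x = 3.0 - mean_x**2              # E(X^2) - E(X)^2 = 3/4

# Point 4: CLT approximation for the mean of n = 75 incomes.
n = 75
clt_prob = 1 - NormalDist(mean_x, sqrt(var_x / n)).cdf(1.75)

# Simulation by inverse transform: T = 1/X^3 ~ U(0,1), hence X = T**(-1/3).
rng = random.Random(0)               # fixed seed, for reproducibility only
sample = [(1 - rng.random()) ** (-1 / 3) for _ in range(200_000)]

print(f"median: exact {median_exact:.4f}, simulated {median(sample):.4f}")
print(f"mean:   exact {mean_x:.4f}, simulated {mean(sample):.4f}")
print(f"P(average income > 1.75) is approximately {clt_prob:.4f}")
```

The simulated variance is deliberately not compared with 3/4: the distribution has a heavy right tail, so the sample variance converges slowly.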
Exercise 4

It is important to establish the mechanical properties of a particular kind of rubber by a laboratory trial. With this aim, a sample of the material has undergone tension testing, and the results are reported in the following table.

| i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| imposed length x (cm) | 6.50 | 7.00 | 7.50 | 8.00 | 8.50 | 9.00 | 9.50 | 10.00 |
| observed strength y (MPa) | 0.77 | 1.24 | 1.33 | 2.12 | 2.17 | 2.24 | 2.80 | 3.15 |

Assume that the recorded strength values of the fibers of the sample can be considered as affected by a zero mean Gaussian error, and that the observations were independent, with the same degree of uncertainty.

1. Defining a proper linear regression model, estimate the regression coefficients of the relation between $x$ and $Y$. Moreover, estimate the variance of the error.
2. Determine a 90% level confidence interval for the slope of the regression line.
3. If a length of 10.3 cm were imposed on the sample, assuming that it is still in its linear elastic phase, what strength would we expect to observe?
4. Determine a prediction interval of level 95% for the strength when $x = 10.3$ cm.

1. We are interested in estimating the coefficients of the linear relation
\[
Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, \ldots, n \quad (n = 8),
\]
with $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$ representing the errors, and in estimating $\sigma^2$. From the data we obtain
\[
\bar x = 8.25, \qquad \bar y = 1.9775,
\]
\[
S_{xy} = \sum x_i y_i - n \bar x \bar y = 6.81, \qquad S_{xx} = \sum x_i^2 - n (\bar x)^2 = 10.5, \qquad S_{yy} = \sum y_i^2 - n (\bar y)^2 = 4.5988.
\]
The least squares estimates of the regression parameters and of the error variance are
\[
\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = 0.6486, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x = -3.3735, \qquad \hat\sigma^2 = \frac{1}{n-2}\left( S_{yy} - \frac{S_{xy}^2}{S_{xx}} \right) = 0.0303.
\]

2. Since $t_{\alpha/2,\,n-2} = t_{0.05,6} = 1.9432$ and $\sqrt{\hat\sigma^2 / S_{xx}} = 0.0537$, we obtain the 90% CI for $\beta_1$:
\[
\left( \hat\beta_1 - t_{0.05,6} \sqrt{\frac{\hat\sigma^2}{S_{xx}}},\ \hat\beta_1 + t_{0.05,6} \sqrt{\frac{\hat\sigma^2}{S_{xx}}} \right) = (0.5442,\ 0.7530).
\]

3. A point estimate for $Y_{new} = \beta_0 + \beta_1 x_{new} + \epsilon_{new}$, with $x_{new} = 10.3$ and $\epsilon_{new} \sim N(0, \sigma^2)$, is
\[
\hat y_{new} = \hat\beta_0 + \hat\beta_1 \cdot 10.3 = 3.3071.
\]

4. Since $t_{\alpha/2,\,n-2} = t_{0.025,6} = 2.4469$, the required interval is
\[
\left( \hat y_{new} - t_{0.025,6} \sqrt{\hat\sigma^2 \left[ 1 + \frac{1}{n} + \frac{(10.3 - \bar x)^2}{S_{xx}} \right]},\ \hat y_{new} + t_{0.025,6} \sqrt{\hat\sigma^2 \left[ 1 + \frac{1}{n} + \frac{(10.3 - \bar x)^2}{S_{xx}} \right]} \right) = (2.7810,\ 3.8332).
\]
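The regression computations of Exercise 4 can be reproduced directly from the data. This is a minimal sketch in plain Python; the strength values used are an assumption, namely the ones consistent with the summary statistics in the solution ($\bar y = 1.9775$, $S_{xy} = 6.81$, $S_{yy} = 4.5988$), the new length is taken as 10.3 cm, and the two Student t quantiles are copied from the solution rather than computed.

```python
from math import sqrt

# Imposed lengths (cm) and observed strengths (MPa).
x = [6.50, 7.00, 7.50, 8.00, 8.50, 9.00, 9.50, 10.00]
y = [0.77, 1.24, 1.33, 2.12, 2.17, 2.24, 2.80, 3.15]  # ASSUMPTION: see lead-in
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
s_xx = sum(a * a for a in x) - n * x_bar**2
s_yy = sum(b * b for b in y) - n * y_bar**2

# Point 1: least squares estimates and unbiased estimate of the error variance.
beta1 = s_xy / s_xx
beta0 = y_bar - beta1 * x_bar
sigma2 = (s_yy - s_xy**2 / s_xx) / (n - 2)

# Student t quantiles with 6 degrees of freedom, as quoted in the solution.
T_05_6 = 1.9432    # t_{0.05, 6}
T_025_6 = 2.4469   # t_{0.025, 6}

# Point 2: 90% confidence interval for the slope.
half_slope = T_05_6 * sqrt(sigma2 / s_xx)
slope_ci = (beta1 - half_slope, beta1 + half_slope)

# Points 3 and 4: point prediction and 95% prediction interval at x_new.
x_new = 10.3
y_new = beta0 + beta1 * x_new
half_pred = T_025_6 * sqrt(sigma2 * (1 + 1 / n + (x_new - x_bar) ** 2 / s_xx))
pred_int = (y_new - half_pred, y_new + half_pred)

print(f"beta1 = {beta1:.4f}, beta0 = {beta0:.4f}, sigma2 = {sigma2:.4f}")
print(f"90% CI for the slope: ({slope_ci[0]:.4f}, {slope_ci[1]:.4f})")
print(f"prediction at x = {x_new}: {y_new:.4f}")
print(f"95% prediction interval: ({pred_int[0]:.4f}, {pred_int[1]:.4f})")
```

The code carries full precision through every step, whereas the hand solution rounds intermediate results such as the slope, so the two can differ in the fourth decimal place.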