Inference for Multivariate Means


Statistics 407, ISU

Inference for the Population Mean

This section focuses on the question: is a pre-determined mean a plausible value for the normal population mean, given the sample that we have collected?

In hypothesis testing language, this means we want to test

$$H_0: \mu = \mu_0 \quad \text{vs} \quad H_A: \mu \neq \mu_0$$

where $\mu = (\mu_1, \mu_2, \ldots, \mu_p)$ is the true population mean and $\mu_0 = (\mu_{01}, \mu_{02}, \ldots, \mu_{0p})$ is the hypothesized mean.

Univariate Testing of a Mean

Recall, in the univariate situation, when we want to test

$$H_0: \mu = \mu_0 \quad \text{vs} \quad H_A: \mu \neq \mu_0$$

we would calculate the t-statistic, as follows:

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

Then we would compare $t$ with values from a t-distribution with $(n-1)$ degrees of freedom, and reject $H_0$ only if $|t|$ is larger than the critical value.

Confidence Interval for a Mean

We could also compute a $100(1-\alpha)\%$ confidence interval for the population mean, $\mu$, using

$$\bar{X} \pm t_{n-1}(\alpha/2)\,\frac{s}{\sqrt{n}}$$
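The univariate test and interval can be sketched in R as follows. This is a minimal illustration, not code from the slides: the data vector `x`, the hypothesized value `mu0` and the level `alpha` are made-up assumptions.

```r
# Hypothetical sample; the values are made up for illustration only.
x     <- c(4.2, 5.1, 3.8, 4.9, 5.4, 4.4, 4.7, 5.0)
mu0   <- 4      # hypothesized mean (assumption)
alpha <- 0.05
n     <- length(x)

# t-statistic: (xbar - mu0) / (s / sqrt(n))
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided critical value from the t-distribution with n - 1 df
t_crit <- qt(1 - alpha / 2, df = n - 1)
reject <- abs(t_stat) > t_crit

# 100(1 - alpha)% confidence interval for the mean
ci <- mean(x) + c(-1, 1) * t_crit * sd(x) / sqrt(n)

# Cross-check against R's built-in one-sample t-test
tt <- t.test(x, mu = mu0, conf.level = 1 - alpha)
```

The hand computation and `t.test` should agree on both the statistic and the interval, which is a quick way to check the formulas.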

Multivariate Testing of a Mean

The multivariate analog of the t-statistic is:

$$T^2 = n(\bar{X} - \mu_0)'\, S^{-1}\, (\bar{X} - \mu_0)$$

where $S$ is the sample variance-covariance matrix. It is called Hotelling's $T^2$.

$T^2$ follows a $\frac{(n-1)p}{n-p} F_{p,\,n-p}$ distribution. (Remember that, in the univariate case, when $t \sim t_{n-1}$, then $t^2 \sim F_{1,\,n-1}$. Note in the multivariate analog, that the degrees of freedom incorporate the dimensionality of the data.)

Null hypothesis: $H_0: \mu = \mu_0$
Alternative hypothesis: $H_A: \mu \neq \mu_0$

Why aren't there one-sided alternatives?
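A minimal R sketch of the one-sample Hotelling's $T^2$ test is given below. The helper name `hotelling_one_sample` and the simulated data are assumptions made for illustration; they do not come from the slides.

```r
# One-sample Hotelling's T^2 (illustrative helper, not from the slides).
hotelling_one_sample <- function(X, mu0) {
  X <- as.matrix(X)
  n <- nrow(X); p <- ncol(X)
  d <- colMeans(X) - mu0                       # xbar - mu0
  T2 <- n * drop(t(d) %*% solve(cov(X), d))    # n (xbar - mu0)' S^{-1} (xbar - mu0)
  F_stat <- (n - p) / ((n - 1) * p) * T2       # rescaled: F(p, n - p) under H0
  p_value <- pf(F_stat, p, n - p, lower.tail = FALSE)
  list(T2 = T2, F = F_stat, df = c(p, n - p), p.value = p_value)
}

# Simulated data for illustration only
set.seed(1)
X <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)
res <- hotelling_one_sample(X, mu0 = c(0, 0, 0))
```

With a single variable ($p = 1$) the statistic reduces to the square of the usual t-statistic, which matches the reminder that $t^2 \sim F_{1,\,n-1}$.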

Example: Sweat data

Perspiration of 20 healthy females for 3 components: Sweat Rate, Sodium Content and Potassium Content.

[The sample mean vector $\bar{X}$, the sample variance-covariance matrix $S$ and its inverse $S^{-1}$ were given numerically on the slide; the values are not recoverable from this transcription.]

We want to test

$$H_0: \mu = (4, 50, 10) \quad \text{vs} \quad H_A: \mu \neq (4, 50, 10)$$

Calculate $T^2$. [The value is not recoverable from this transcription.]

Taking $\alpha = 0.10$,

$$\frac{(n-1)p}{n-p} F_{p,\,n-p}(0.10) = 8.18.$$

Then we would reject $H_0$ at the 10% level of significance.

Taking $\alpha = 0.05$, the critical value is $\frac{(n-1)p}{n-p} F_{p,\,n-p}(0.05)$ [value not recoverable from this transcription]. Then we would NOT reject $H_0$ at the 5% level of significance.

Confidence Regions and Simultaneous Comparisons of Component Means

A $100(1-\alpha)\%$ confidence region for the mean of a p-dimensional normal distribution is the ellipsoid determined by all $\mu$ such that

$$n(\bar{X} - \mu)'\, S^{-1}\, (\bar{X} - \mu) \le \frac{p(n-1)}{n-p} F_{p,\,n-p}(\alpha)$$

where $X_1, \ldots, X_n$ are sample observations from a $N_p(\mu, \Sigma)$ population. This equation defines an ellipse in p-space.
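The critical values that $T^2$ is compared against in the sweat example ($n = 20$, $p = 3$) can be computed directly in R. The sketch below is illustrative: the function names `t2_crit` and `in_region` are assumptions, and the region check expects the user to supply `xbar` and `S`, since the numeric summaries are not available here.

```r
n <- 20   # sample size from the sweat example
p <- 3    # number of variables

# Critical value (n - 1) p / (n - p) * F_{p, n-p}(alpha), using the upper-alpha F quantile
t2_crit <- function(alpha, n, p) {
  (n - 1) * p / (n - p) * qf(1 - alpha, df1 = p, df2 = n - p)
}

crit_10 <- t2_crit(0.10, n, p)   # threshold for a 10% level test
crit_05 <- t2_crit(0.05, n, p)   # threshold for a 5% level test (larger)

# Is a candidate mean vector mu inside the 100(1 - alpha)% confidence region?
# xbar and S are user-supplied sample summaries.
in_region <- function(mu, xbar, S, n, alpha) {
  d <- xbar - mu
  n * drop(t(d) %*% solve(S, d)) <= t2_crit(alpha, n, length(xbar))
}
```

Because the 5% threshold is larger than the 10% threshold, a $T^2$ value that falls between the two leads to rejection at the 10% level but not at the 5% level.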

Lengths of the major axes

The shape of the confidence ellipse is dictated by the variance-covariance matrix, $S$. The orientation is given by the eigenvectors of $S$, and the half-lengths of the axes of the confidence ellipse (measured from the centre $\bar{X}$ along each eigenvector) are given by

$$\sqrt{\lambda_i}\,\sqrt{\frac{p(n-1)}{n(n-p)} F_{p,\,n-p}(\alpha)}$$

where $\lambda_i$ is the $i$'th eigenvalue of $S$.

Example: Sweat data

[Figure: 3D plot of the 90% confidence region, showing the data points (x), the sample mean (+), the hypothesized mean and the confidence region.]

Notice that the hypothesized mean is outside the confidence region, which corresponds to the result for the hypothesis test.
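The axis computation can be sketched in R with an eigen decomposition. The covariance matrix `S`, the sample size and the level below are made-up assumptions chosen only to illustrate the formula.

```r
# Hypothetical inputs: S is a made-up 2 x 2 sample covariance matrix.
S <- matrix(c(4,   1.5,
              1.5, 2), nrow = 2, byrow = TRUE)
n <- 20; alpha <- 0.10
p <- ncol(S)

e <- eigen(S, symmetric = TRUE)   # eigenvalues (decreasing) and unit eigenvectors

# Common scale factor: sqrt( p (n - 1) / (n (n - p)) * F_{p, n-p}(alpha) )
scale <- sqrt(p * (n - 1) / (n * (n - p)) * qf(1 - alpha, p, n - p))

half_lengths <- sqrt(e$values) * scale   # distance from the centre along each axis
directions   <- e$vectors                # column i gives the direction of axis i
```

A point at distance `half_lengths[i]` from the centre along `directions[, i]` lies exactly on the boundary of the confidence region, which is one way to sanity-check the scale factor.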

Relationship between Interval and Hypothesis Test

Hypothesis testing is equivalent to examining the hypothesized value in relation to the simultaneous confidence ellipse. If the hypothesized value lies outside the ellipse, then the null hypothesis is rejected.

Simultaneous Confidence Intervals: Bonferroni Method vs $T^2$ Intervals

One method for computing simultaneous confidence intervals is called Scheffé's method:

$$\bar{X}_i \pm \sqrt{\frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha)}\;\frac{s_i}{\sqrt{n}}$$

These intervals are the confidence ellipse projected onto the coordinate axes.

Simultaneous Confidence Intervals

A single $100(1-\alpha)\%$ t-interval,

$$\bar{X}_i \pm t_{n-1}(\alpha/2)\,\frac{s_i}{\sqrt{n}},$$

is designed to contain the true mean with probability $(1-\alpha)$. When computing multiple ($p$) confidence intervals, the joint probability that they contain the true means needs to be at least $(1-\alpha)$. To achieve this, each interval will need to have a higher probability that it contains the true mean. One method for ensuring this is the Bonferroni correction:

$$\bar{X}_i \pm t_{n-1}\!\left(\frac{\alpha}{2p}\right)\frac{s_i}{\sqrt{n}}$$

Intervals vs Region

[Figure: 3D plot of the 90% confidence region with the Bonferroni confidence intervals drawn as a box, showing the data points (x), the sample mean (+) and the hypothesized mean.]

The ellipse extends outside of the box in some directions. The hypothesized mean is inside the box.
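To see how the three kinds of interval differ in width, the multipliers of $s_i/\sqrt{n}$ can be compared in R. The values of `n`, `p` and `alpha` and the helper `interval` are assumptions chosen for illustration.

```r
n <- 20; p <- 3; alpha <- 0.10   # illustrative values

# Multiplier of s_i / sqrt(n) for each kind of interval
m_single     <- qt(1 - alpha / 2, df = n - 1)         # one-at-a-time t-interval
m_bonferroni <- qt(1 - alpha / (2 * p), df = n - 1)   # Bonferroni-corrected interval
m_t2         <- sqrt((n - 1) * p / (n - p) *
                     qf(1 - alpha, p, n - p))         # T^2 (Scheffe) interval

# Lower and upper limits; xbar_i and s_i are user-supplied summaries
interval <- function(xbar_i, s_i, n, multiplier) {
  xbar_i + c(-1, 1) * multiplier * s_i / sqrt(n)
}
```

For these settings the one-at-a-time multiplier is the smallest and the $T^2$ multiplier the largest, so the Bonferroni box sits inside the box formed by the projected ellipse.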

Paired Samples

- Subtract the second value from the first and treat as a one sample problem.
- Example: Municipal wastewater treatment plants are required by law to monitor their discharges into rivers and streams on a regular basis. Concern about the reliability of data from one of these self-monitoring programs led to a study in which samples of effluent were divided and sent to two laboratories for testing. One half of each sample was sent to the Wisconsin state lab and the other half was sent to a private commercial lab routinely used in the monitoring program. Measurements of biochemical oxygen demand (BOD) and suspended solids (SS) were obtained on n = 11 samples.
- Are the analyses producing equivalent results?

Data

[Table: BOD and SS measurements from each lab for each sample; the values are not recoverable from this transcription.]

Plots

- Plot the measurements from lab 1 vs lab 2. Are the values for BOD the same from labs 1 & 2? Are the values for SS the same from labs 1 & 2?
- Plot in multivariate plots. Is there a systematic shift from one rep to another?
- Plot the differences, along with the sample mean and hypothesized mean. How similar are the sample mean and hypothesized mean?

[Figures: scatterplots of Lab 1 vs Lab 2 for BOD and for SS, and of the BOD and SS differences.]

[The mean difference vector $\bar{d}$ and the covariance matrix of the differences $S_d$ were given numerically on the slide; the values are not fully recoverable from this transcription.]

$$T^2 = n\,\bar{d}'\,S_d^{-1}\,\bar{d} = 13.6, \qquad \frac{p(n-1)}{n-p} F_{2,9}(0.05) = 9.47$$

Reject $H_0$.

Two Sample Hotelling's $T^2$

For two samples collected independently from multivariate normal populations, $N(\mu_1, \Sigma)$, $N(\mu_2, \Sigma)$, the test statistic for $H_0: \mu_1 = \mu_2$ vs $H_A: \mu_1 \neq \mu_2$ is:

$$T^2 = (\bar{X}_1 - \bar{X}_2)' \left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}\right]^{-1} (\bar{X}_1 - \bar{X}_2)$$

which is distributed as $\frac{(n_1+n_2-2)p}{n_1+n_2-p-1} F_{p,\,n_1+n_2-p-1}$, and

$$S_{pooled} = \frac{n_1-1}{n_1+n_2-2} S_1 + \frac{n_2-1}{n_1+n_2-2} S_2.$$
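A minimal R sketch of the two-sample test follows. The helper name `hotelling_two_sample` and the simulated groups are assumptions for illustration; the sketch assumes a common covariance matrix, as stated for the two populations.

```r
# Two-sample Hotelling's T^2 with a pooled covariance matrix (illustrative helper).
hotelling_two_sample <- function(X1, X2) {
  X1 <- as.matrix(X1); X2 <- as.matrix(X2)
  n1 <- nrow(X1); n2 <- nrow(X2); p <- ncol(X1)
  d  <- colMeans(X1) - colMeans(X2)
  S_pooled <- ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)
  T2  <- drop(t(d) %*% solve((1 / n1 + 1 / n2) * S_pooled, d))
  df2 <- n1 + n2 - p - 1
  F_stat <- df2 / ((n1 + n2 - 2) * p) * T2     # F(p, df2) under H0
  list(T2 = T2, F = F_stat, df = c(p, df2),
       p.value = pf(F_stat, p, df2, lower.tail = FALSE))
}

# Simulated groups for illustration only
set.seed(2)
X1 <- matrix(rnorm(15 * 2), ncol = 2)
X2 <- matrix(rnorm(12 * 2, mean = 0.5), ncol = 2)
res2 <- hotelling_two_sample(X1, X2)
```

For paired data, such as the two-laboratory example, the one-sample statistic applied to the within-pair differences is the appropriate tool instead; this helper is for independent samples.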

Testing for Equality of Variance-Covariance Matrices

Test for homogeneity of variance-covariances: Bartlett's test for homogeneity of variances, for univariate data, generalizes to the covariance matrix.

To test the hypothesis $H_0: \Sigma_1 = \Sigma_2 = \ldots = \Sigma_g$ against the alternative that they are not all equal, SAS PROC DISCRIM uses a likelihood ratio test (LRT):

$$\Lambda = \frac{\prod_{i=1}^{g} |S_i|^{(n_i-1)/2}}{|S_{pooled}|^{(n-g)/2}}$$

(This M statistic extends Bartlett's univariate test.) It has no exact distribution, but for large samples the $\chi^2$ distribution can be used to obtain p-values. Box's M test also has no exact distribution; there is R code for this. BUT, both tend to be sensitive to departures from normality.

Comparing several Multivariate Means - MANOVA

Data: $g$ samples of size $n_1, \ldots, n_g$; $p$ variables.

Population: $g$ populations; different means $(\mu_1, \ldots, \mu_g)$, same covariance $(\Sigma)$, multivariate normal.

Arises from (1) designed experimental studies, where there are multiple responses measured for one or more treatments, (2) classification problems, where we want to classify a new observation into a class, based on two or more measured variables.

Recall Univariate ANOVA

We have random samples $X_{l1}, X_{l2}, \ldots, X_{ln_l}$ from $N(\mu_l, \sigma^2)$, $l = 1, \ldots, g$. We are interested to know if the population means of the groups are different, that is, whether $\mu_1 = \mu_2 = \ldots = \mu_g$ fails to hold.

The problem is parametrized into the following model formulation:

$$X_{lj} = \mu + \tau_l + e_{lj}$$

with the constraint $\sum_{l=1}^{g} n_l \tau_l = 0$, which leads to the null hypothesis notation:

$$H_0: \tau_1 = \tau_2 = \ldots = \tau_g = 0$$

The model is fit to the data by splitting each observation into components:

$$X_{lj} = \bar{X} + (\bar{X}_l - \bar{X}) + (X_{lj} - \bar{X}_l), \quad j = 1, \ldots, n_l;\; l = 1, \ldots, g$$

and computing the sums of squares for each component:

$$\sum_{l=1}^{g}\sum_{j=1}^{n_l} (X_{lj} - \bar{X})^2 = \sum_{l=1}^{g} n_l (\bar{X}_l - \bar{X})^2 + \sum_{l=1}^{g}\sum_{j=1}^{n_l} (X_{lj} - \bar{X}_l)^2$$

ANOVA Table

Source                 SS           df
Treatment (Between)    SS_Tr        g - 1
Error (Within)         SS_Res       sum_l n_l - g
Total                  SS_CorTot    sum_l n_l - 1

Then we test the hypothesis $H_0: \tau_1 = \tau_2 = \ldots = \tau_g = 0$ vs $H_A$: not all equal 0, using:

$$F = \frac{SS_{Tr}/(g-1)}{SS_{Res}/\left(\sum_{l=1}^{g} n_l - g\right)}$$

and reject $H_0$ when $F$ exceeds $F_{g-1,\;\sum n_l - g}(\alpha)$.

Multivariate ANOVA

The corresponding model formulation is

$$X_{lj} = \mu + \tau_l + e_{lj}$$

where $X_{lj}$ and $e_{lj}$ are now $p$-dimensional vectors (stacked over $j$, they form $n_l \times p$ matrices) and $\mu$, $\tau_l$ are $p$-dimensional vectors, giving the corresponding data decomposition:

$$X_{lj} = \bar{X} + (\bar{X}_l - \bar{X}) + (X_{lj} - \bar{X}_l), \quad j = 1, \ldots, n_l;\; l = 1, \ldots, g$$
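The univariate sums of squares and F statistic can be computed by hand in R and checked against `anova`. The simulated data, group labels and group sizes below are assumptions for illustration only.

```r
# Simulated data for illustration only: g = 3 groups of unequal size
set.seed(3)
group <- factor(rep(c("A", "B", "C"), times = c(5, 7, 6)))
x <- rnorm(length(group), mean = c(10, 11, 12)[as.integer(group)])

grand_mean  <- mean(x)
group_means <- as.vector(tapply(x, group, mean))
n_l <- as.vector(table(group))
g   <- nlevels(group)

ss_tr  <- sum(n_l * (group_means - grand_mean)^2)      # between-group sum of squares
ss_res <- sum((x - group_means[as.integer(group)])^2)  # within-group sum of squares
ss_tot <- sum((x - grand_mean)^2)                      # corrected total

F_stat  <- (ss_tr / (g - 1)) / (ss_res / (length(x) - g))
p_value <- pf(F_stat, g - 1, length(x) - g, lower.tail = FALSE)
```

The identity `ss_tot == ss_tr + ss_res` (up to rounding) is the sums-of-squares decomposition; the multivariate version replaces each squared term with an outer product.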

The corresponding sums of squares decomposition is given by:

$$\sum_{l=1}^{g}\sum_{j=1}^{n_l} (X_{lj} - \bar{X})(X_{lj} - \bar{X})' = \sum_{l=1}^{g} n_l (\bar{X}_l - \bar{X})(\bar{X}_l - \bar{X})' + \sum_{l=1}^{g}\sum_{j=1}^{n_l} (X_{lj} - \bar{X}_l)(X_{lj} - \bar{X}_l)' = B + W$$

MANOVA Table

Source                 SS       df
Treatment (Between)    B        g - 1
Error (Within)         W        sum_l n_l - g
Total                  B + W    sum_l n_l - 1

Testing the hypothesis, $H_0: \tau_1 = \ldots = \tau_g = 0$

The big complication here is that we are comparing matrices now instead of single numbers. The matrices correspond to variance ellipses in p-d space. So there are numerous test statistics. The most popular is Wilks' lambda:

$$\Lambda = \frac{|W|}{|B + W|}$$

If $\Lambda$ is too small then reject $H_0$. If $\sum n_l = n$ is large, $-\left(n - 1 - \frac{p+g}{2}\right) \ln \frac{|W|}{|B+W|}$ is distributed as $\chi^2_{p(g-1)}$. For smaller $n$, and some combinations of $p$ and $g$, the test statistic has an F distribution.
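The matrices $B$ and $W$, Wilks' lambda and the large-sample $\chi^2$ approximation can be sketched in R as below. The simulated responses `Y` and the grouping factor are assumptions for illustration; they are not the course data.

```r
# Simulated data for illustration only: g = 3 groups, p = 2 responses
set.seed(4)
group <- factor(rep(c("A", "B", "C"), times = c(10, 12, 8)))
Y <- cbind(y1 = rnorm(30), y2 = rnorm(30)) + 0.8 * as.integer(group)

n <- nrow(Y); p <- ncol(Y); g <- nlevels(group)

# Within matrix W: cross-products of deviations from the group means
fitted_means <- apply(Y, 2, function(col) ave(col, group))
W <- crossprod(Y - fitted_means)

# Total cross-products about the grand mean; B follows by subtraction
total <- crossprod(sweep(Y, 2, colMeans(Y)))
B <- total - W

wilks <- det(W) / det(B + W)

# Large-sample approximation: -(n - 1 - (p + g)/2) ln(Lambda) ~ chi^2 with p(g - 1) df
chi2    <- -(n - 1 - (p + g) / 2) * log(wilks)
p_value <- pchisq(chi2, df = p * (g - 1), lower.tail = FALSE)
```

The hand-computed `wilks` should match the value reported by `summary(manova(Y ~ group), test = "Wilks")`; note that R reports an F approximation there rather than the $\chi^2$ one.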

Note that we can also express $\Lambda$ as

$$\Lambda = \prod_{i=1}^{\min(p,\,g-1)} \frac{1}{1 + \lambda_i}$$

where $\lambda_i$, $i = 1, \ldots, \min(p, g-1)$, are the eigenvalues of $W^{-1}B$.

Also note that, being based on determinants of the matrices, Wilks' lambda is essentially comparing volumes of the corresponding ellipses. Another way of calculating volumes of ellipses is the product of the lengths of the leading axes (eigenvalues).

Other Test Statistics

- Roy's largest eigenvalue of $BW^{-1}$
- Lawley and Hotelling's: $T = \mathrm{trace}(BW^{-1})$
- Pillai's trace: $V = \mathrm{trace}(B(B + W)^{-1})$
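The eigenvalue form of Wilks' lambda and the other three statistics can be illustrated in R. The matrices `B` and `W` below are made-up assumptions with $p = 2$; because this `B` has full rank, both eigenvalues are non-zero (any zero eigenvalue would simply contribute a factor of 1 to the product).

```r
# Hypothetical B and W matrices (p = 2), made up for illustration only
B <- matrix(c(6, 2,
              2, 3), nrow = 2, byrow = TRUE)
W <- matrix(c(20, 5,
              5, 15), nrow = 2, byrow = TRUE)

# Eigenvalues of W^{-1} B (real and non-negative for these symmetric inputs)
lambda <- Re(eigen(solve(W, B), only.values = TRUE)$values)

wilks_det <- det(W) / det(B + W)      # determinant form
wilks_eig <- prod(1 / (1 + lambda))   # eigenvalue form

roy    <- max(lambda)                       # Roy's largest eigenvalue
lawley <- sum(diag(B %*% solve(W)))         # Lawley-Hotelling trace
pillai <- sum(diag(B %*% solve(B + W)))     # Pillai's trace
```

All four statistics are functions of the same eigenvalues: the Lawley-Hotelling trace is their sum, and Pillai's trace is the sum of $\lambda_i / (1 + \lambda_i)$.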

Example: Dominican Republic Student Testing

82 students, enrolled in one of the Technology (23), Architecture (38) or Medical Technology (21) programs. The students were given 4 tests: Aptitude, Math, Language, General Knowledge. Is there a difference in the average test scores for each student type?

Summary Statistics

$n = 82$, $n_1 = 23$, $n_2 = 38$, $n_3 = 21$, $p = 4$, $g = 3$

[The group mean vectors $\bar{X}_1, \bar{X}_2, \bar{X}_3$ and covariance matrices $S_1, S_2, S_3$ were given numerically on the slide; the values are not recoverable from this transcription.]

Can we consider that the population variance-covariances might be equal?

Descriptive graphics

Can we consider that the population variances might be equal? Are the population means different?

[Figure: plots of Apt, Math, Lang and GenKnow by student type (Tech, Arch, MedTech).]

Descriptive graphics

Can we consider that the population variance-covariances might be equal? Are the means different?

[Figure: scatterplot matrix of Apt, Math, Lang and GenKnow with pairwise correlations.]

Descriptive graphics

Can we consider that the population variance-covariances might be equal? Are the means different?

[Figure not recoverable from this transcription.]

Test

$H_0: \mu_1 = \mu_2 = \mu_3$ OR $H_0: \tau_1 = \tau_2 = \tau_3 = 0$

[The matrices $B$ and $W$ were given numerically on the slide; the values are not recoverable from this transcription.]

> summary(manova(cbind(Apt, Math, Lang, GenKnow) ~ StudType, data = domrep), test = "Wilks")

[R output: the Wilks test for StudType has a p-value of order e-07 (***), with 79 residual degrees of freedom; the Df, Wilks, approx F, num Df and den Df values are not recoverable from this transcription.]

Which variables?

Analysis of variance on each variable:

            Pr(>F)
Apt         ...e-07   ***
Math        ...       **
Lang        ...e-05   ***
GenKnow     ...       *

(The Df, Sum Sq, Mean Sq and F value columns are not recoverable from this transcription.)

All variables have a significant difference between means! In order of importance: [list not recoverable from this transcription]

Which groups?

Tukey's multiple comparisons

[R output: for each of Apt, Math, Lang and GenKnow, a table with columns diff, lwr, upr and p adj and rows Arch-Tech, MedTech-Tech and MedTech-Arch; the numeric values are not recoverable from this transcription.]

Summary

In general, using a multivariate test is preferable to multiple univariate tests. This controls for the correlation structure in the data. It's possible to get non-significant results in the univariate tests, but significant results in the multivariate tests. That is, the univariate tests may not detect the differences, in the presence of correlation between the responses.

This work is licensed under the Creative Commons Attribution-Noncommercial 3.0 United States License. To view a copy of this license, visit …us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.


More information

Didacticiel - Études de cas

Didacticiel - Études de cas 1 Topic Linear Discriminant Analysis Data Mining Tools Comparison (Tanagra, R, SAS and SPSS). Linear discriminant analysis is a popular method in domains of statistics, machine learning and pattern recognition.

More information

Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

More information

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics. Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

More information

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7. THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

More information

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures. Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

3. The Multivariate Normal Distribution

3. The Multivariate Normal Distribution 3. The Multivariate Normal Distribution 3.1 Introduction A generalization of the familiar bell shaped normal density to several dimensions plays a fundamental role in multivariate analysis While real data

More information

Multivariate analyses

Multivariate analyses 14 Multivariate analyses Learning objectives By the end of this chapter you should be able to: Recognise when it is appropriate to use multivariate analyses (MANOVA) and which test to use (traditional

More information

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form. One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.

More information

Linear Discriminant Analysis

Linear Discriminant Analysis Fiche TD avec le logiciel : course5 Linear Discriminant Analysis A.B. Dufour Contents 1 Fisher s iris dataset 2 2 The principle 5 2.1 Linking one variable and a factor.................. 5 2.2 Linking a

More information

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

More information

1. Complete the sentence with the correct word or phrase. 2. Fill in blanks in a source table with the correct formuli for df, MS, and F.

1. Complete the sentence with the correct word or phrase. 2. Fill in blanks in a source table with the correct formuli for df, MS, and F. Final Exam 1. Complete the sentence with the correct word or phrase. 2. Fill in blanks in a source table with the correct formuli for df, MS, and F. 3. Identify the graphic form and nature of the source

More information

Analysis of Variance. MINITAB User s Guide 2 3-1

Analysis of Variance. MINITAB User s Guide 2 3-1 3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

More information

1 Simple Linear Regression I Least Squares Estimation

1 Simple Linear Regression I Least Squares Estimation Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

More information

SPSS and AMOS. Miss Brenda Lee 2:00p.m. 6:00p.m. 24 th July, 2015 The Open University of Hong Kong

SPSS and AMOS. Miss Brenda Lee 2:00p.m. 6:00p.m. 24 th July, 2015 The Open University of Hong Kong Seminar on Quantitative Data Analysis: SPSS and AMOS Miss Brenda Lee 2:00p.m. 6:00p.m. 24 th July, 2015 The Open University of Hong Kong SBAS (Hong Kong) Ltd. All Rights Reserved. 1 Agenda MANOVA, Repeated

More information

SPSS Advanced Statistics 17.0

SPSS Advanced Statistics 17.0 i SPSS Advanced Statistics 17.0 For more information about SPSS Inc. software products, please visit our Web site at http://www.spss.com or contact SPSS Inc. 233 South Wacker Drive, 11th Floor Chicago,

More information

12: Analysis of Variance. Introduction

12: Analysis of Variance. Introduction 1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider

More information

An analysis method for a quantitative outcome and two categorical explanatory variables.

An analysis method for a quantitative outcome and two categorical explanatory variables. Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

More information

General Regression Formulae ) (N-2) (1 - r 2 YX

General Regression Formulae ) (N-2) (1 - r 2 YX General Regression Formulae Single Predictor Standardized Parameter Model: Z Yi = β Z Xi + ε i Single Predictor Standardized Statistical Model: Z Yi = β Z Xi Estimate of Beta (Beta-hat: β = r YX (1 Standard

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ 1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test

General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996) MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

More information

Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

More information

ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups

ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups ANOVA ANOVA Analysis of Variance Chapter 6 A procedure for comparing more than two groups independent variable: smoking status non-smoking one pack a day > two packs a day dependent variable: number of

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

THE SIMPLE PENDULUM. Objective: To investigate the relationship between the length of a simple pendulum and the period of its motion.

THE SIMPLE PENDULUM. Objective: To investigate the relationship between the length of a simple pendulum and the period of its motion. THE SIMPLE PENDULUM Objective: To investiate the relationship between the lenth of a simple pendulum and the period of its motion. Apparatus: Strin, pendulum bob, meter stick, computer with ULI interface,

More information

Experimental Design for Influential Factors of Rates on Massive Open Online Courses

Experimental Design for Influential Factors of Rates on Massive Open Online Courses Experimental Design for Influential Factors of Rates on Massive Open Online Courses December 12, 2014 Ning Li nli7@stevens.edu Qing Wei qwei1@stevens.edu Yating Lan ylan2@stevens.edu Yilin Wei ywei12@stevens.edu

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

T-test & factor analysis

T-test & factor analysis Parametric tests T-test & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue

More information

Null Hypothesis H 0. The null hypothesis (denoted by H 0

Null Hypothesis H 0. The null hypothesis (denoted by H 0 Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

More information

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

More information

Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

More information

Week 5: Multiple Linear Regression

Week 5: Multiple Linear Regression BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Neil H. Timm Applied Multivariate Analysis With 42 Figures Springer Contents Preface Acknowledgments List of Tables List of Figures vii ix xix xxiii 1 Introduction 1 1.1 Overview 1 1.2 Multivariate Models

More information

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. 277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

More information

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

More information

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

More information

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015 Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

Lecture 28: Chapter 11, Section 1 Categorical & Quantitative Variable Inference in Paired Design

Lecture 28: Chapter 11, Section 1 Categorical & Quantitative Variable Inference in Paired Design Lecture 28: Chapter 11, Section 1 Categorical & Quantitative Variable Inference in Paired Design Inference for Relationships: 2 Approaches CatQuan Relationship: 3 Designs Inference for Paired Design Paired

More information

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217 Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

IBM SPSS Advanced Statistics 20

IBM SPSS Advanced Statistics 20 IBM SPSS Advanced Statistics 20 Note: Before using this information and the product it supports, read the general information under Notices on p. 166. This edition applies to IBM SPSS Statistics 20 and

More information

Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

Analysis of Variance ANOVA

Analysis of Variance ANOVA Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Chapter 7. One-way ANOVA

Chapter 7. One-way ANOVA Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

More information

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Online 12 - Sections 9.1 and 9.2-Doug Ensley Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

More information

Econ 424/Amath 462 Hypothesis Testing in the CER Model

Econ 424/Amath 462 Hypothesis Testing in the CER Model Econ 424/Amath 462 Hypothesis Testing in the CER Model Eric Zivot July 23, 2013 Hypothesis Testing 1. Specify hypothesis to be tested 0 : null hypothesis versus. 1 : alternative hypothesis 2. Specify significance

More information

Elementary hypothesis testing

Elementary hypothesis testing Elementary hypothesis testing Introduction Some distributions related to normal distribution Types of hypotheses Types of errors Critical regions Example when variance is known Example when variance is

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything

More information

Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Linear Models and Conjoint Analysis with Nonlinear Spline Transformations

Linear Models and Conjoint Analysis with Nonlinear Spline Transformations Linear Models and Conjoint Analysis with Nonlinear Spline Transformations Warren F. Kuhfeld Mark Garratt Abstract Many common data analysis models are based on the general linear univariate model, including

More information

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test. Neyman-Pearson lemma 9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

More information