Inference for Multivariate Means

Size: px
Start display at page:

Download "Inference for Multivariate Means"

Transcription

1 Inference for Multivariate Means Statistics 407, ISU Inference for the Population Mean Inference for the Population Mean Inference for the Population Mean his section focuses on the question: This section focuses on the question: Inference This section focuses for on the the question: Population Mean Whether a pre-determined mean is a plausible value for the normal Whether population a pre-determined mean, iven mean theisissample a plausible thatvalue we for for This section focuses on the question: have collected. the normal population mean, iven the sample that we we have collected. Whether a pre-determined mean is a value for the normal population hypothesis mean, testin iven the sample lanuae, that we this have means collected. we want to test In hypothesis testin lanuae, this means we want to totest In hypothesis testin lanuae, this means we want to test H o : µ = µ H o : 0 vs H µ = µ 0 vs A : µ = µ H A : : 0 µ = = µ 00 here µ where where =(µ,µ µ =(µ =(µ 2,...,µ is true mean is the,µ,µ p 2 ),...,µ is the p ) true is is thepopulation the true true population mean population mean and mean and µ and 0 = µ µ 0 = hypothesized mean. 0 = µ 0,µ 02 (µ,...,µ (µ 0,µ 0,µ 02 0p,...,µ ) is the 02,...,µ 0p ) hypothesized is hypothesized mean. mean. 0p is the hypothesized mean. 2

2 Univariate Testin of a Mean Univariate of Recall, in the univariate situation, when we want to test Recall, in the univariate situation, when we want to test H o : µ = µ 0 vs H A : µ = µ 0 we we would calculate calculate the t-statistic, the t-statistic, as follows: as follows: Then we would compare t t with = X µ 0 s/ values n. from a t-distribution with (n-) derees of freedom, and reject Ho only if t is larer than the critical values. Then we would compare t with values from a t-distribution with (n ) derees of freedom, and reject H o only if t is larer than the critical values. 2 3 Confidence Interval for a Mean ould also We could compute also compute a 00( a 00(- α)% confidence interval interval for the for ould also compute a 00( α)% confidence interval fo opulation mean, µ, usin population mean, µ,, usin X ± t n (α/2) s n X ± t n (α/2) s n 3 4

3 Multivariate Testin of a Mean Multivariate Testin Testin of of a Mean a Mean Multivariate Testin of a Mean Mean The multivariate analo of the t-statistic is: Multivariate Testin of a Mean Multivariate Testin of a Mean he The multivariate multivariate analo analo of of the the t-statistic t-statistic is: is: The multivariate The multivariate analo analo of the t-statistic is: te analo The multivariate of the t-statistic analo T 2 of the t-statistic = of is: n( the X t-statistic µ 0 ) is: Sis: T 2 T= 2 n( = X n( T 2 X µ = µ n( X 0 ) S µ 0 n ) S ( X n ( µ X 0 ) 0 ) S n ( X µ 0 ) µ 0 ) T 2 = n( X µ 0 ) S It T 2 is = called n( X µ Hotellins 0 ) S n ( X µ 0 ) It is called It is called It ishotellins called Hotellins T 2 T 2 n ( X µ 0 ). t is called T 2.. T 2. It is called Hotellins T 2. n ( X µ 0 ) tellins follows an (Remember? In T follows 2 2. T 2 follows an (n )p F p,n p,α distribution. in Tfollows 2 follows an an an (n )p n p F p,n p,α distribution. (Remember that, in the univariate n p (n )p case, when t t n, then t 2 that, in n p F p,n p,α distribution. (Remember that, in F,n. Note in the he (n )p the univariate multivariate case, when t t analo, that the n, then t 2 F derees of freedom,n.). Note in the incorporate the n p F case, p,n p,α distribution. when t t (Remember t 2 n, then t 2 F that,,n F,n. Note in. Note in the in the multivariate dimensionality analo, that that of the the the data.) derees freedom incorporate the case, when t t n of, the then t 2 derees of offreedom incorporate the the F,n. Note in the dimensionality of of the the of data.) the data.) nalo, that the derees of freedom incorporate the 4 of the data.) T 2 follows an (n )p n p F p,n p,α distribution. (Remember that, in the univariate case, when t t n, then t 2 F,n. Note in the multivariate analo, that the derees of freedom incorporate the BRIEF ARTICLE 4 5 THE AUTHOR Multivariate Testin of a Mean Null hypothesis: H 0 : µ = µ 0 Alternative hypothesis: H A : µ = µ 0 Why aren t there one-sided alternatives? 6

4 Example: Sweat data Example: Sweat data Perspiration Perspirationofof20 20 healthy females for 3 components: Sweat Sweat Rate, Perspiration Rate, Sodium Sodiumof Content Content 20 healthy and and females Potassium Potassium for 3 components: Content. Sweat Rate, Sodium Content and Potassium Content SweatRate SweatRate Na Na K K X = 45.6 X S n = S n S S n = n = We want to test if We want to test H o : µ = (4 50 0) vs H A : µ = (4 50 0) We want to test if H o : µ = ( ) vs H A : µ = (4 50 0) Calculate: T 2 = T 2 = = =

5 Takin α =0.0, (n )p Takin Usin =0.0, (n )p n p F n p p,n p(0.0) = p,n p(0.0) = 8.8. Then Then we then would we would reject H we would reject o at the atho the at the 0% 0% 0% level level level of of of sinificance. sinificance. Takin Takin Usin α =0.05, =0.05, (n )p (n )p n p F n p p,n p(0.05) p,n p(0.05) = = Then Then we then we would Ho we would would NOT NOT reject reject H o at at the theat 5% 5% the 5% level level level of of of sinificance. sinificance Confidence Reions and Simultaneous Confidence Reions and Simultaneous Comparisons of Comparisons Confidence Component Reions and of Component Simultaneous Means Means Comparisons Means of fidence Reions and Simultaneous Comparisons of Confidence Reions Component and Simultaneous Means Comparisons of Component Means A 00( α)% confidence reion for the mean of a p-dimensional A 00( α)% normal distribution confidence is the reion ellipsoid for the determined mean of a p-dimensional by all µ such that A 00( α)% confidence reion for the mean of a p-dimensional normal distribution is the ellipsoid determined by all µ such that that normal distribution is the ellipsoid determined by all µ such that n( X µ) S n( X µ) n n ( p(n ) X µ) p(n ) X µ) n p F p,n p(α) p,n p(α) n( X n ( µ) p(n ) X S n µ) ( p(n ) X µ) n p n p F F p,n p(α) where X,...,X n are sample observations from a N p (µ, Σ) population.,...,x are sample observations from p (µ, Σ) pop- where where X,...,X n are sample observations from a N p (µ, Σ) population. population. This equation defines an ellipse in p-space. This This equation defines an ellipse in in p-space. p-space. in p-space. α)% confidence reion for the mean of a p-dimensional l distribution is the ellipsoid determined by all µ such that n( X µ) S X,...,X n are sample observations from a N p (µ, Σ) pop-. quation defines an ellipse in p-space

6 Lenths of the major axes. Lenths of the major axes. The shape of the confidence ellipse is dictated by the variancecovariance matrix, S. The orientation is iven by the eienvectors of S, and the lenths of the major axes of the confidence The shape of the confidence ellipse is dictated by the variancecovariance matrix,. The is iven by the eienvectors, ellipse ellipse are are iven iven by by and the of the major axes of the confidence ellipse are iven by p(n ) λ i p(n ) λ n(n p) F i p,n p(α) where λ i is the i th eienvalue of S. where is the i'th eienvalue. Lenths of the major axes. The shape of the confidence ellipse is dictated by the varia covariance matrix, S. The orientation is iven by the eien tors of S, and the lenths of the major axes of the confide n(n p) F p,n p(α) where λ i is the i th eienvalue of S. Inference for the Populati 9 9 This section focuses on the question: Inference for the Population Mea Example: Sweat data This section Whether focuses on a pre-determined the question: mean is a the normal population mean, iven t Multivariate Testin of a Mea have collected. Whether a pre-determined mean is a plausibl thethe normal multivariate population 90% confidence analomean, of reion theiven t-statistic the is: samp have collected. In hypothesis testin lanuae, this mean x data points + sample T 2 mean, = n( X µ 0 ) S n ( X µ 0 ) In hypothesis testin hypothesized lanuae, H o : mean, µ this = µ 0 means vs we H A wa : µ confidence reion (3D) It is called Hotellins T 2. where µ =(µ,µ 2,...,µ p ) is the true popu Notice H that is the (µ 0,µ 02,...,µ o : µ = µ confidence 0p ) is 0 vs H the hypothesized A : µ = µ 0 T 2 follows an (n )p mea n p reion, F p,n p,α which distribution. (Re the univariate corresponds case, when to the t t where µ =(µ,µ result 2,...,µ for p the ) ishypothesis true n, then t test. population 2 F m multivariate analo, that derees of freedom (µ 0,µ 02,...,µ 0p ) is the hypothesized mean. dimensionality of the data.) 2

7 Relationship between Interval and Hypothesis Test Hypothesis testin is equivalent to examinin the hypothesized value in relation to the simultaneous confidence ellipse. If the hypothesized value lies the ellipse, then the null hypothesis is. 3 Simultaneous Confidence Intervals Bonferroni Method vs T 2 intervals method for computin simultaneous confidence intervals Scheffe s: (n )p X i ± n p F p,n p,α s i n called Scheffe s method. These intervals are the confilipse projected into the coordinate axes. 4

8 0% t-interval Simultaneous Confidence Intervals s i X i ± t n, α 2 n o contain the true mean with probability ( α). in multiple (p) confidence intervals the joint probey contain the true means needs to be at least hieve this each interval will need to have a hiher t it contains the true mean. One method for enhe Bonferroni correction: Bonferroni: X i ± t n, α 2p s i n Inference for the Populati This section focuses on the question: 5 Intervals vs Reion Whether a pre-determined mean is a the normal population mean, iven t Multivariate Testin of a Mea have collected. The multivariate analo of the t-statistic is: In hypothesis 90% confidence testinreion lanuae, this mean x data points + sample T 2 mean, = n( X µ 0 ) S n ( X µ 0 ) hypothesized H o : mean, µ = µ 0 vs H A : µ confidence reion (3D) It is called Hotellins T where µ =(µ Bonferroni confidence 2.,µ 2,...,µ p ) is the true popu interval, reion T 2 follows (µ 0,µ 02 an,...,µ (n )p 0p ) is the hypothesized mea n p F p,n p,α distribution. (Re the univariate Ellipse case, extends whenoutside t t n of box,, then in t 2 F multivariatesome analo, directions. that Hypothesized derees of freedom dimensionality mean of is the inside data.) box always. 6

9 Paired Samples! the second value from the first and treat as a one sample problem.! Example: Municipal wastewater treatment plants are required by law to monitor their dischares into rivers and streams on a reular basis. Concern about the reliability of data from one of these self-monitorin prorams led to a study in which samples of effluent were divided and sent to two laboratories were divided and sent to two laboratories for testin. One half of each sample was sent to the Wisconsin state lab and the other half sent to a private commercial lab routinely used in the monitorin proram. Measurements were made on biochemical oxyen demand (BOD) and suspended solids (SS) were obtained, on n= samples.! Are the analyses producin equivalent results? 7 Data Sample BOD SS BOD SS BOD SS

10 Plots BOD SS Lab D 2= SS SS Lab Differences x SS S d =! Plot vs! Are the values for BOD the same from labs & 2?! Are the values for SS the same from labs & 2?! Plot in multivariate plots. +! Is there a systematic shift 222 from x one rep to another ! Plot BOD, alon BOD with sample mean and hypothesized mean.! How similar are the 99.3 sample mean 88.3 and hyp mean? SS Differences Lab BOD T = [ ] S d = 88.3Two 48.6 Sample Hotellins T 2 o Sample Two Sample Hotellins T 2 BOD F Two Sample Hotellins T 2 For two 2,9(0.05) samples Two = collected 9.47 Sample independently Reject Hotellins H o T. 2 from multivariate normal populations, For two N(µ samples 2 Two Sample Hotellins T, Σ),N(µ collected 2, Σ), independently the test statistic from multivariate for H 0 : n For two For samples two mal collected populations, collected independently 9.36 = from 3.6 multivariate normal normal populations, µ N(µ, Σ),N(µ , Σ),, the test statistic for for H 0 : µ = For µ 2 two vs Hsamples A : µ collected = µ 2 is: N(µ, Σ),N(µ 2, Σ), the test statistic for H independently from multivariate normal populations, = µ 2 vs H A : µ N(µ µ = µ 2 vs H A : µ, Σ),N(µ = µ 2 is: 2, Σ), the test statistic for H 0 : vs = µ 2 is: is ected independently from multivaria µ = µ 2 vs H A : µ = µ 2 is: µ Reject, Σ),N(µ T 2 =( X H o. 2 X, ) Σ), (( + the T test statistic f 2 T µ 2 is: 2 =( X X 2 ) (( =( X )S pooled ) T 2 =( X + )S X 2 ) (( n X + n 2 ) (( + )S( pooled X ) X ( 2 X ) X 2 ) n )S pooled ) n 2 ( X X X 22 )) n n n 2 n 2 2 is distributed is as distributed (n +n 2 2)p as is distributed is distributed as (n as n +n (n +n 2 2)p 2 p F (n +n 2 2)p 2)p n n +n +n 2 p 2 p F p,n F np,n +n +n 2 p 2 p,α F p,n +n and 2 p,α and p,n +n +n 2 p,α 2 p,α and and S S pooled S pooled = = n n +n n +n 2 2 S n pooled = n S pooled n n n +n 2 2 S 2. S +n 2 2 S = n + n n +n2 2 2 S n +n 2 2. S + n 2 n 2. +n 2 2 S 2. X 2 ) (( + )S pooled ) 22 ( X X 22 2 ) n n = 22

11 THE AUTHOR BRIEFn i ARTICLE # K! wij yij yi yij yi ni ni j Testin 3 " 83 " " > = for 8 6 if Equality of Variance-Covariance S i = % & K THE 0 if n AUTHOR i $ ' Test for Homoeneity of Variancecovariances H 0 : µ = µ 2 Bartlett s test for homoeneity of variances, for univariate data is: Covariance Matrix H A : µ = µ 2 H 0 : µ which To test eneralizes = µ % 2 the hypothesis to " S " > H 0 : Σ = Σ 2 =... = Σ aainst the alternative S = &K! that they n i 6 0 are n not 5 i if n i= all equal, SAS PROC DISCRIM uses a LRT test 'K 0H A : µ = µ 2 if n$ Λ = i= S i (n i )/2 H 0 : Σ = Σ 2 =... = Σ S pooled (n p)/2 M Statistic (extends Bartlett s univariate test.) No exact distribution, but for lare M % & ' χ 2 data can be used to obtain p-values. Also Box s M test, has no exact distribution. K0n" 5lo S " n =! i " 6lo S i if S > 0, there s R code for this. BUT, both tend to be K i= sensitive, SYSMIS to!! if S $ Comparin several Multivariate Means - MANOVA Comparin Comparin several several Multivariate Means - MANOVA Comparin several Multivariate Means - MANOVA Data: Data:! Data: samples samples samples of of sizeof of size nsize,...n n n,...n,...n ; p variables. ; ; p variables. ; p variables. Population:! Population: populations; populations; different different means means (µ, Population: populations;, different means,...,µ (µ (µ,...,µ,...,µ ), same ), same ), same covariance covariance (Σ), (Σ), multivariate multivariate normal. normal. covariance (Σ), multivariate normal.! Arises Arises Arises from from () from desined () () desined desined experimental experimental experimental studies, studies, studies, where where where there there are are there are multiple responses measured for one or more treatments, (2) in Arises multiple multiple from responses () responses desined measured measured experimental for one foror one more studies, treatments, more treatments, where (2) there in (2) are in classification problems, where we want to classify a new observation multiple classification classification into responses problems, a class, based problems, measured where we on two where or more for want we oneto measured want orclassify more to a variables. classify treatments, newaobser- observation into ainto class, (2) in classification vation problems, a class, based based on where twonor we two more want measured more to classify measured variables. a variables. new observation into a class, based on two or more measured variables

12 Recall Univariate ANOVA Recall Univariate ANOVA We have random samples X l,x l2,...,x lnl from N(µ l, σ 2 ), l =,...,. We are interested to know if the population means of Recall Recall Univariate ANOVA the roups are different, that is, if µ = µ 2 =...= µ. e We have The have We random problem have random issamples parametrized XX l,x l2 l, l,x into l2,...,x the followin lnl from model N(µ formulation: l=, We..., We are. We are interested are to toknow if if ifthe population population means of means the of of l, σ 2 ), ), l l =,...,.,...,. e the roups roups are are different, that that is, if is, is, if if µ = µ 2 =...=. µ. The problem is Xparametrized lj = µ + into the τ l followin + emodel lj formulation: he The problem problem is parametrized is parametrized into into the the followin model formulation: formulaon: with with the the constraint l= n lτ l = 0,, which leadsto tothethe null hypothesis null hypothesis notation: notation: X lj = µ + τ l lj X lj = µ + τ l + e lj H with the constraint o : τ = τ 2 =...= τ =0 l= n lτ l = 0, which leads to the null hy- constraint notation: ithpothesis l= n 29 lτ l = 0, which leads to the null hyothesis notation: H o : τ = τ 2 =...= τ =0 H o : τ = τ 2 =...= τ =0 The model is fit to the data by splittin each observation into The model is fit to the data by splittin each each observation into into components: components: X lj = X + ( X l X) + (X lj X l ), j =,...,n l ; l =,..., X lj = X + ( X l X) + (X lj X l ), j =,...,n l ; l =,..., and computin the sums of squares for each component: and computin the sums of squares for each component: and computin the sums of squares for each component: n l (X lj X) 2 = n l ( X l X) 2 n l n l (X lj X l ) (X 2 l= j= lj X) 2 = n l= l ( X l X) 2 n l + (X l= j= lj X l ) 2 l= j= l= l= j=

13 ANOVA Table ANOVA Table SourceTreatment (Between) SS SS Tr df Treatment (Between) SS Tr Error (Within) SS Res l= n Error (Within) SS Res l Total SS CorTot l= n l Total SS CorTot l= n l Then Then we we test test the hypothesisusin H o : τ = τ 2 =... = τ = 0 vs H A : not all equal 0, usin: F = ANOVA Table Source SS df Then we test the hypothesis H o : τ = τ 2 =... = τ = 0 vs H A : not all equal 0, usin: F = SS Tr /( ) SS Res /( l= n l ) F, n l (α) SS Tr /( ) SS Res /( l= n l ) F, n l (α) Multivariate ANOVA Multivariate ANOVA The correspondin model formulation is The correspondin model formulation is X lj = µ + τ l + e lj And the correspondin data decomposition where X lj, e l j are n l p matrices, µ, τ l are p-d vectors, ivin the correspondin data representation: X lj = X + ( X l X) + (X lj X l ), j =,...,n l ; l =,..., 32 26

14 H A : µ = µ 2 H 0 : Σ = Σ 2 =... = Σ The correspondin sums of squares χ The correspondin sums of squares 2 is iven by: is iven by: n l l= j= n l (X lj X)(X lj X) = j= n l (X lj X l )(X lj X l ) l= l= (X lj X)(X lj X) = l= j= n l ( X l X)( X l X) + l= = B + W n l ( X l X)( X l X) + n l l= j= (X lj X l )(X lj MANOVA Table Source SS df Treatment (Between) B Error (Within) W l= n l Total B + W l= n l 33 Testin the hypothesis, H o : τ =...= τ l 27 = 0 he bi complication Testinhere the hypothesis, Hthat 0 : µ = we µ are H o comparin : τ 2 =...= τmatrices l = 0 now nstead The bi Testin of sinle complication the numbers. here is hypothesis, The thatmatrices we are comparin H o : correspond matrices τ =...= toτ variance now l = 0 llipses instead The p-d of sinle bi space. complication numbers. So there The H A : isµ are matrices that = µ 2 numerous we are correspond comparin test statistics. tomatrices variance now The ellipses most instead p-d popular of space. sinleis So numbers. Wilks there are The lambda: numerous matrices test correspond statistics. to variance The ost popular The bi complication is Wilks lambda: here H 0 : is Σ that = Σwe 2 =... are = comparin Σ matrices now most instead popular ellipses in of sinle is p-d Wilks space. numbers. lambda: So there are numerous test statistics. The The matrices correspond to variance ellipses most popular is Wilks lambda: χ in p-d space. So there are numerous 2 test statistics. The most popular is Wilks' lambda: Λ = W W n l Λ = W (X lj X)(X lj B + W B X) = n + W l ( X l X)( X l X) + l= j= Λ = W l= B + W e bi complication here is that we are comparin matrices now tead of sinle numbers. The matrices correspond to variance ipses in p-d space. So there are numerous test statistics. The st popular is Wilks lambda: f ΛIf is Λ If too is too is too small small then then reject H. If o. If l= j= n l n= l = n is n lare, is lare, (n (n If Λ is too W p + )/2) (p ln W W small then reject H o is Λ distributed = W. If (p + )/2) ln is distributed as asχ B+W 2 χ as as χ 2 B+W 2 n l = n is lare, (n (n (p + )/2) ln W W (α). p( ) For p( ) p( ) smaller n, (p + )/2) ln B+W is distributed as χ n, B+W 2 (α). For. For smaller n, p( ) some combinations of p, the test statistic has an F distribution. ome combinations n, some combinations of p, of of p, the p, the test the test statistic has an has F distribution. an F distribution. some combinations of p, B the + W test statistic has an F distribu Λ is too small then reject H o. If n l = n is lare, (n + )/2) ln Testin the hypothesis, THE AUTHOR H o : τ =...= τ l = 0 Testin the hypothesis, H o : τ =...= τ l = 0 The bi complication here is that we are comparin matrice instead Testin of sinle the numbers. hypothesis The matrices correspond to var ellipses in p-d space. So there are numerous test statistics. If Λ is too small then reject H o. If n l = n is lare, (n W B+W BRIEF ARTICLE n l (X lj X l )(X lj X l ) (α). For smal is distributed as χ 2 28 (α). For smaller n p( ) 34

15 Note that we can also express Λ as Note that we can also express Note that Note we that can we can also also express Λ Λ as as as Λ min(p, ) = Λmin(p, ) min(p, ) = i= +λ i i= +λ i Λ = +λ i i= where where λ i ; i λ=,...,min(p, ) are the eienvalues of W i ; i ) are the eienvalues of of WB. B. values of W B.. Also note, Also note, bein bein based based on on determinants of the of the matrices matrices that that e Wilks matrices lambda Wilks Also that note, lambda isbein essentially isbased essentially on comparin determinants comparin volumes of volumes the matrices of the of the incorrespond- ellipses. lambda ellipses. Another is essentially Another way of calculatin of calculatin volumes of volumes the correspondin of ellipses of ellipses is the is the that correspond- Wilks f the products products ellipses. of the Another oflenths way lenths of the calculatin ofleadin the leadin volumes axes axes of ellipses (eienvalues). is the of ellipses is the products of the lenths of the leadin axes (eienvalues). envalues). where λ i ; i =,...,min(p, ) are the eienvalues of W B. Also note, bein based on determinants of the matrices that Wilks lambda is essentially comparin volumes of the correspondin ellipses. Another way of calculatin volumes of ellipses is the products of the lenths of the leadin axes (eienvalues) Other Test Statistics Other Test Statistics Other Test Statistics Other Test Statistics Roy s! Roy's larest eienvalue of ofbw BW Roy s larest eienvalue of BW Lawley! Lawley and and Hotellins: T trace(bw ) ) Lawley and Hotellins: T = trace(bw ) Pillai s trace: trace(b(b + W) ) Pillai s! Pillai s Pillai's trace: trace: V V = trace(b(b + W) W) ) )

16 Example: Dominican Republic Student Testin 82 Students, enrolled in one of Technoloy (23), Architecture (38), Medical Technoloy (2) prorams. The students were iven 4 tests: Aptitude, Math, Lanuae, General Knowlede. Is there is a BRIEF difference ARTICLE in the averae test scores for each student type? THE AUTHOR H 0 : µ = µ 2 H A : µ = µ 2 H 0 : Σ = Σ 2 =... = Σ χ 2 3 n l (X lj X)(X lj X) = l= j= l= n Summary l (X lj Statistics X l )(X lj X l ) l= j= n l ( X l X)( X l X) + n=82, n =23, n 2 =38, n 3 =2, p=4, =3 2 (n (p + )/2) ln W THE AUTHOR B+W THE AUTHOR X = X = X 2 = X = THE AUTHOR S = S = S = S 2 = S 2 = S 2 = S 3 = S 3 = Can we consider that S 3 = the population variance-covariances miht be equal? 32

17 Descriptive raphics Can we consider that the population variances miht be equal? Are the population means different? Apt Math Lan GenKnow 5 count Tech Arch MedTech StudType Tech Arch MedTech value Descriptive raphics Apt Corr: Corr: 0.52 Corr: 0.39 Can we consider that the population variance-covariances miht be equal? Are the means different? Lan Math Corr: 0.43 Corr: Corr: GenKnow Apt Math Lan GenKnow 34

18 Descriptive raphics Can we consider that the population variance-covariances miht be equal? Are the means different? 35 Test OR B = W =

19 S 2 = S 3 = Test µ = µ 2 = µ 3 τ = τ 2 = τ 3 = 0 H o : µ = µ 2 = µ 3 H o : τ = τ 2 = τ 3 = S 3 = OR B = THE AUTHOR W = S = > summary(manova(cbind(apt, Math, Lan, GenKnow)~StudType, S 2 = domrep), test="wilks") Df Wilks approx 67.5 F num 28.3 Df den 0.57 Df 68.2 Pr(>F) StudType e-07 *** Residuals 79 S 3 = H o : µ = µ 2 = µ 3 H o : τ = τ 2 = τ 3 = 0 B = Which variables? Analysis of variance on each variable W = Df Sum Sq Mean Sq F value Pr(>F) Apt e-07 *** Math ** Lan e-05 *** GenKnow * All variables have sinificant difference between means! In order of importance: 38

20 B = W = Which roups? Df Sum Sq Mean Sq F value Pr(>F) e-07 *** h ** e-05 *** Know * BRIEF ARTICLE 3 BRIEF ARTICLE Tukey s multiple comparisons Apt diff lwr lwr upr padj Arch-Tech MedTech-Tech MedTech-Arch Lan diff lwr upr p adj Arch-Tech MedTech-Tech MedTech-Arch Apt: diff lwr upr p adj Arch-Tech Math: MedTech-Tech Lan: MedTech-Arch GenKnow: BRIEF ARTICLE diff lwr upr p adj Arch-Tech MedTech-Tech MedTech-Arch diff lwr upr p adj Math diff lwr upr adj Arch-Tech Arch-Tech MedTech-Tech MedTech-Arch GenKnow diff lwr lwr upr p adj Arch-Tech MedTech-Tech MedTech-Arch diff lwr upr p adj Arch-Tech MedTech-Tech MedTech-Arch Summary In eneral, usin a multivariate test, such as, is preferable to multiple univariate tests, such as. This controls for the correlation structure in the data. Its possible to et non-sinificant results in the univariate tests, but sinificant results in the multivariate tests. That is, the univariate tests may not detect the differences, in the presence of correlation between the responses. 40

21 This work is licensed under the Creative Commons Attribution- Noncommercial 3.0 United States License. To view a copy of this license, visit us/ or send a letter to Creative Commons, 7 Second Street, Suite 300, San Francisco, California, 9405, USA. 4

Multivariate Analysis of Variance (MANOVA): I. Theory

Multivariate Analysis of Variance (MANOVA): I. Theory Gregory Carey, 1998 MANOVA: I - 1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the

More information

Multivariate Normal Distribution

Multivariate Normal Distribution Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

More information

Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases:

Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases: Profile Analysis Introduction Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases: ) Comparing the same dependent variables

More information

How to calculate an ANOVA table

How to calculate an ANOVA table How to calculate an ANOVA table Calculations by Hand We look at the following example: Let us say we measure the height of some plants under the effect of different fertilizers. Treatment Measures Mean

More information

AEROSOL STATISTICS LOGNORMAL DISTRIBUTIONS AND dn/dlogd p

AEROSOL STATISTICS LOGNORMAL DISTRIBUTIONS AND dn/dlogd p Concentration (particle/cm3) [e5] Concentration (particle/cm3) [e5] AEROSOL STATISTICS LOGNORMAL DISTRIBUTIONS AND dn/dlod p APPLICATION NOTE PR-001 Lonormal Distributions Standard statistics based on

More information

A KERNEL MAXIMUM UNCERTAINTY DISCRIMINANT ANALYSIS AND ITS APPLICATION TO FACE RECOGNITION

A KERNEL MAXIMUM UNCERTAINTY DISCRIMINANT ANALYSIS AND ITS APPLICATION TO FACE RECOGNITION A KERNEL MAXIMUM UNCERTAINTY DISCRIMINANT ANALYSIS AND ITS APPLICATION TO FACE RECOGNITION Carlos Eduardo Thomaz Department of Electrical Enineerin, Centro Universitario da FEI, FEI, Sao Paulo, Brazil

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim

More information

Water Quality and Environmental Treatment Facilities

Water Quality and Environmental Treatment Facilities Geum Soo Kim, Youn Jae Chan, David S. Kelleher1 Paper presented April 2009 and at the Teachin APPAM-KDI Methods International (Seoul, June Seminar 11-13, on 2009) Environmental Policy direct, two-stae

More information

DISCRIMINANT FUNCTION ANALYSIS (DA)

DISCRIMINANT FUNCTION ANALYSIS (DA) DISCRIMINANT FUNCTION ANALYSIS (DA) John Poulsen and Aaron French Key words: assumptions, further reading, computations, standardized coefficents, structure matrix, tests of signficance Introduction Discriminant

More information

ANOVA. February 12, 2015

ANOVA. February 12, 2015 ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

More information

Multivariate analyses

Multivariate analyses 14 Multivariate analyses Learning objectives By the end of this chapter you should be able to: Recognise when it is appropriate to use multivariate analyses (MANOVA) and which test to use (traditional

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

Linear Discriminant Analysis

Linear Discriminant Analysis Fiche TD avec le logiciel : course5 Linear Discriminant Analysis A.B. Dufour Contents 1 Fisher s iris dataset 2 2 The principle 5 2.1 Linking one variable and a factor.................. 5 2.2 Linking a

More information

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics. Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

More information

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form. One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.

More information

1 Simple Linear Regression I Least Squares Estimation

1 Simple Linear Regression I Least Squares Estimation Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

SPSS Advanced Statistics 17.0

SPSS Advanced Statistics 17.0 i SPSS Advanced Statistics 17.0 For more information about SPSS Inc. software products, please visit our Web site at http://www.spss.com or contact SPSS Inc. 233 South Wacker Drive, 11th Floor Chicago,

More information

An analysis method for a quantitative outcome and two categorical explanatory variables.

An analysis method for a quantitative outcome and two categorical explanatory variables. Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Analysis of Variance. MINITAB User s Guide 2 3-1

Analysis of Variance. MINITAB User s Guide 2 3-1 3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups

ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups ANOVA ANOVA Analysis of Variance Chapter 6 A procedure for comparing more than two groups independent variable: smoking status non-smoking one pack a day > two packs a day dependent variable: number of

More information

Experimental Design for Influential Factors of Rates on Massive Open Online Courses

Experimental Design for Influential Factors of Rates on Massive Open Online Courses Experimental Design for Influential Factors of Rates on Massive Open Online Courses December 12, 2014 Ning Li nli7@stevens.edu Qing Wei qwei1@stevens.edu Yating Lan ylan2@stevens.edu Yilin Wei ywei12@stevens.edu

More information

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. 277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Neil H. Timm Applied Multivariate Analysis With 42 Figures Springer Contents Preface Acknowledgments List of Tables List of Figures vii ix xix xxiii 1 Introduction 1 1.1 Overview 1 1.2 Multivariate Models

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

IBM SPSS Advanced Statistics 20

IBM SPSS Advanced Statistics 20 IBM SPSS Advanced Statistics 20 Note: Before using this information and the product it supports, read the general information under Notices on p. 166. This edition applies to IBM SPSS Statistics 20 and

More information

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996) MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

More information

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

More information

Linear Models and Conjoint Analysis with Nonlinear Spline Transformations

Linear Models and Conjoint Analysis with Nonlinear Spline Transformations Linear Models and Conjoint Analysis with Nonlinear Spline Transformations Warren F. Kuhfeld Mark Garratt Abstract Many common data analysis models are based on the general linear univariate model, including

More information

Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

More information

Week 5: Multiple Linear Regression

Week 5: Multiple Linear Regression BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Factor analysis. Angela Montanari

Factor analysis. Angela Montanari Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

More information

Use of deviance statistics for comparing models

Use of deviance statistics for comparing models A likelihood-ratio test can be used under full ML. The use of such a test is a quite general principle for statistical testing. In hierarchical linear models, the deviance test is mostly used for multiparameter

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

On the Influence of the Prediction Horizon in Dynamic Matrix Control

On the Influence of the Prediction Horizon in Dynamic Matrix Control International Journal of Control Science and Enineerin 203, 3(): 22-30 DOI: 0.5923/j.control.203030.03 On the Influence of the Prediction Horizon in Dynamic Matrix Control Jose Manue l Lope z-gue de,*,

More information

G*Power 3: A flexible statistical power analysis program. for the social, behavioral, and biomedical sciences

G*Power 3: A flexible statistical power analysis program. for the social, behavioral, and biomedical sciences Runnin Head: G*Power 3 G*Power 3: A fleible statistical power analysis proram for the social, behavioral, and biomedical sciences (in press. Behavior Research Methods. Franz Faul Christian-Albrechts-Universität

More information

Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant

Statistical Analysis. NBAF-B Metabolomics Masterclass. Mark Viant Statistical Analysis NBAF-B Metabolomics Masterclass Mark Viant 1. Introduction 2. Univariate analysis Overview of lecture 3. Unsupervised multivariate analysis Principal components analysis (PCA) Interpreting

More information

ON AN APPROACH TO STATEMENT OF SQL QUESTIONS IN THE NATURAL LANGUAGE FOR CYBER2 KNOWLEDGE DEMONSTRATION AND ASSESSMENT SYSTEM

ON AN APPROACH TO STATEMENT OF SQL QUESTIONS IN THE NATURAL LANGUAGE FOR CYBER2 KNOWLEDGE DEMONSTRATION AND ASSESSMENT SYSTEM 216 International Journal "Information Technoloies & Knowlede" Volume 9, Number 3, 2015 ON AN APPROACH TO STATEMENT OF SQ QUESTIONS IN THE NATURA ANGUAGE FOR CYBER2 KNOWEDGE DEMONSTRATION AND ASSESSMENT

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling

Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling Psychology 405: Psychometric Theory Homework on Factor analysis and structural equation modeling William Revelle Department of Psychology Northwestern University Evanston, Illinois USA June, 2014 1 / 20

More information

10. Analysis of Longitudinal Studies Repeat-measures analysis

10. Analysis of Longitudinal Studies Repeat-measures analysis Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

More information

CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13. Experimental Design and Analysis of Variance CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

More information

Introduction. and set F = F ib(f) and G = F ib(g). If the total fiber T of the square is connected, then T >

Introduction. and set F = F ib(f) and G = F ib(g). If the total fiber T of the square is connected, then T > A NEW ELLULAR VERSION OF LAKERS-MASSEY WOJIEH HAHÓLSKI AND JÉRÔME SHERER Abstract. onsider a push-out diaram of spaces A, construct the homotopy pushout, and then the homotopy pull-back of the diaram one

More information

Chapter 7. One-way ANOVA

Chapter 7. One-way ANOVA Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

More information

N-Way Analysis of Variance

N-Way Analysis of Variance N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data

More information

Data Analysis in SPSS. February 21, 2004. If you wish to cite the contents of this document, the APA reference for them would be

Data Analysis in SPSS. February 21, 2004. If you wish to cite the contents of this document, the APA reference for them would be Data Analysis in SPSS Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Heather Claypool Department of Psychology Miami University

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

Partial Least Squares (PLS) Regression.

Partial Least Squares (PLS) Regression. Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component

More information

Overview of Factor Analysis

Overview of Factor Analysis Overview of Factor Analysis Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August 1,

More information

Section 1: Simple Linear Regression

Section 1: Simple Linear Regression Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Online 12 - Sections 9.1 and 9.2-Doug Ensley Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

More information

AP Statistics 2010 Scoring Guidelines

AP Statistics 2010 Scoring Guidelines AP Statistics 2010 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

XBRL Generation, as easy as a database. Christelle BERNHARD - christelle.bernhard@addactis.com Consultant & Deputy Product Manager of ADDACTIS Pillar3

XBRL Generation, as easy as a database. Christelle BERNHARD - christelle.bernhard@addactis.com Consultant & Deputy Product Manager of ADDACTIS Pillar3 XBRL Generation, as easy as a database. Christelle BERNHARD - christelle.bernhard@addactis.com Consultant & Deputy Product Manaer of ADDACTIS Pillar3 Copyriht 2015 ADDACTIS Worldwide. All Rihts Reserved

More information

Multivariate Analysis of Ecological Data

Multivariate Analysis of Ecological Data Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology

More information

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013 Indices of Model Fit STRUCTURAL EQUATION MODELING 2013 Indices of Model Fit A recommended minimal set of fit indices that should be reported and interpreted when reporting the results of SEM analyses:

More information

A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

The Wondrous World of fmri statistics

The Wondrous World of fmri statistics Outline The Wondrous World of fmri statistics FMRI data and Statistics course, Leiden, 11-3-2008 The General Linear Model Overview of fmri data analysis steps fmri timeseries Modeling effects of interest

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models

Factor Analysis. Principal components factor analysis. Use of extracted factors in multivariate dependency models Factor Analysis Principal components factor analysis Use of extracted factors in multivariate dependency models 2 KEY CONCEPTS ***** Factor Analysis Interdependency technique Assumptions of factor analysis

More information

Notes. Statistical consulting is like a final exam on steroids.

Notes. Statistical consulting is like a final exam on steroids. Notes Statistical consulting is like a final exam on steroids. A statistical consultant usually works as part of a team on a project and provides statistical knowledge for the team. That team may comprise

More information

August 2012 EXAMINATIONS Solution Part I

August 2012 EXAMINATIONS Solution Part I August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

More information

Crash Course on Basic Statistics

Crash Course on Basic Statistics Crash Course on Basic Statistics Marina Wahl, marina.w4hl@gmail.com University of New York at Stony Brook November 6, 2013 2 Contents 1 Basic Probability 5 1.1 Basic Definitions...........................................

More information

1. Introduction to multivariate data

1. Introduction to multivariate data . Introduction to multivariate data. Books Chat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall Krzanowski, W.J. Principles of multivariate analysis. Oxford.000 Johnson,

More information

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln Log-Rank Test for More Than Two Groups Prepared by Harlan Sayles (SRAM) Revised by Julia Soulakova (Statistics)

More information

Simple Linear Regression

Simple Linear Regression STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0. Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

More information

A degrees-of-freedom approximation for a t-statistic with heterogeneous variance

A degrees-of-freedom approximation for a t-statistic with heterogeneous variance The Statistician (1999) 48, Part 4, pp. 495±506 A degrees-of-freedom approximation for a t-statistic with heterogeneous variance Stuart R. Lipsitz and Joseph G. Ibrahim Harvard School of Public Health

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Factorial Invariance in Student Ratings of Instruction

Factorial Invariance in Student Ratings of Instruction Factorial Invariance in Student Ratings of Instruction Isaac I. Bejar Educational Testing Service Kenneth O. Doyle University of Minnesota The factorial invariance of student ratings of instruction across

More information

Attitudes Toward Science of Students Enrolled in Introductory Level Science Courses at UW-La Crosse

Attitudes Toward Science of Students Enrolled in Introductory Level Science Courses at UW-La Crosse Attitudes Toward Science of Students Enrolled in Introductory Level Science Courses at UW-La Crosse Dana E. Craker Faculty Sponsor: Abdulaziz Elfessi, Department of Mathematics ABSTRACT Nearly fifty percent

More information

Multivariate Analysis. Overview

Multivariate Analysis. Overview Multivariate Analysis Overview Introduction Multivariate thinking Body of thought processes that illuminate the interrelatedness between and within sets of variables. The essence of multivariate thinking

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

Non-Inferiority Tests for One Mean

Non-Inferiority Tests for One Mean Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

More information

Unit 26: Small Sample Inference for One Mean

Unit 26: Small Sample Inference for One Mean Unit 26: Small Sample Inference for One Mean Prerequisites Students need the background on confidence intervals and significance tests covered in Units 24 and 25. Additional Topic Coverage Additional coverage

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

More information

Directions for using SPSS

Directions for using SPSS Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

More information