1 Learning the Language of the Statistician
2 The following slides contain many of the symbols we will be using in formulas in this class. While I do not require you to memorize all of the formulas, it is important that you know what these symbols mean. You will be expected to memorize a few of the simpler formulas for the departmental final. To do responsible research, you must assimilate, integrate and apply. This PowerPoint presentation concentrates on assimilating this basic information.
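3                         Sample        Sampling Distribution                           Population
Individual Score          $y_i$                                                         $y_i$
Sample Size               $n$                                                           $N$
Mean                      $\bar{y}$                                                     $\mu$ (Mu)
Standard Deviation        $s$           $\sigma/\sqrt{n}$, estimated by $s/\sqrt{n}$    $\sigma$ (Sigma)
Variance                  $s^2$                                                         $\sigma^2$
Sum                       $\Sigma$
Proportion                $p$                                                           $\pi$ (Pi)
Hypothesized Mean         $\bar{y}$                                                     $\mu_0$
Hypothesized Proportion   $p$                                                           $\pi_0$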
4 Stating Hypotheses with Symbols
One-Sample Hypothesis Test for a Proportion
Null hypothesis: $p = \pi$. The sample proportion is the same as the population proportion.
Research hypothesis: $p \neq \pi$. The sample proportion is NOT the same as the population proportion. If you have a theory, you can use a one-tailed test and indicate that it is greater or less than the population proportion.
One-Sample Hypothesis Test for a Mean
Null hypothesis: $\bar{y} = \mu$. The sample mean is the same as the population mean.
Research hypothesis: $\bar{y} \neq \mu$. The sample mean is not the same as the population mean. If you have a theory, you can use a one-tailed test and indicate that it is greater or less than the population mean.
5 Stating Hypotheses with Symbols
Chi-Square
Null hypothesis: $H_0: E = O$. The expected value equals the observed value. The dependent variable is NOT contingent on the independent variable in the population.
Research hypothesis: $H_1: E \neq O$. The expected value does not equal the observed value. The dependent variable is contingent on the independent variable in the population.
NOTE: For an Elaborated Chi-Square, you simply state that $E = O$ for all of the independent/dependent combinations for the null hypothesis. For the research hypothesis, you state that $E \neq O$ for at least one of the combinations. You would actually test each dependent/independent combination separately.
6 Stating Hypotheses with Symbols
One-Way Anova with 2 groups
Null hypothesis: $H_0: \mu_1 = \mu_2$. The means are equal, or the mean of Group 1 is the same as the mean of Group 2 in the population.
Research hypothesis, two-tailed (the one the computer uses): $H_1: \mu_1 \neq \mu_2$. The means are not equal, or the mean of Group 1 is not the same as the mean of Group 2 in the population.
Research hypothesis, one-tailed (state a direction): $H_1: \mu_1 < \mu_2$ or $H_1: \mu_1 > \mu_2$. The mean of Group 1 is lower than the mean of Group 2 in the population, or the mean of Group 1 is higher than the mean of Group 2 in the population.
7 Stating Hypotheses with Symbols
One-Way Anova with more than 2 groups*
Null hypothesis: $H_0: \mu_1 = \mu_2 = \dots = \mu_k$. The means of all the groups are equal.
Research hypothesis, two-tailed (the one the computer uses): $H_1: \mu_i \neq \mu_j$ for at least one pair of groups. The means are not all equal. The mean of one group is not equal to the mean of at least one other group.
* This is still bivariate. You don't have more variables, only more categories in the categorical variable.
8 Stating Hypotheses with Symbols
Bivariate Regression
Null hypothesis: $H_0: \beta_1 = 0$. The regression slope is not different from 0 in the population. There is no relationship between the independent and dependent variables in the population.
Research hypothesis: $H_1: \beta_1 \neq 0$. The slope is different from 0 in the population. There is a relationship between the independent and dependent variables in the population.
Multivariate Regression
Null hypothesis: $H_0: \beta_1 = \dots = \beta_k = 0$. None of the regression slopes is different from 0 in the population. There is no relationship between the independent variables and the dependent variable in the population.
Research hypothesis: $H_1$: at least one of $\beta_1, \dots, \beta_k \neq 0$. At least one of the slopes is different from 0 in the population. There is a relationship between the dependent variable and at least one of the independent variables in the population.
9 Matching Variables with Types of Analysis
Chi-square (2 categorical variables): type of car you drive by gender; race by political preference; race by eye color; gender by YES/NO questions.
Anova (1 categorical and 1 continuous variable): gender by yearly income; gender by score on self-esteem index; race by yearly income; political preference by yearly income; age by whether or not you have children.
Bivariate Regression (two continuous variables): yearly income by years of education; years married by marital satisfaction (scale score); age by number of children.
Multiple Regression (continuous/dummy independent and continuous dependent): number of dates per year by yearly income, age, height, gender (dummy variable); poverty rates by sex ratio, percent single-headed household, percent employed.
10 Statistics That Do Not Use Hypotheses
Confidence Intervals: We generally do not state a hypothesis for a Confidence Interval. Confidence Intervals are used to estimate a population mean or proportion based on a sample mean or proportion. Opinion polls use Confidence Intervals to predict election results, etc.
Pearson Correlation (correlation coefficient, or r): We generally do not associate Pearson Correlation Matrices with hypotheses. We generally use Pearson Correlation Matrices for diagnostic purposes and to test the strength of bivariate relationships.
11 Equations/Formulas
Z Scores (Z Tests)
$Z = \frac{y_i - \mu}{\sigma}$
Where:
$y_i$ = individual's score
$\mu$ = population mean
$\sigma$ = population standard deviation
Information needed: population mean and standard deviation.
Example of when we would use this: If you knew an individual's SAT/ACT score, you could determine what percentile they scored in (i.e., the 95th percentile), OR if you know what percentile they are in, you can determine their score.
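The two directions described on this slide (score to percentile, percentile to score) can be sketched in Python. This is only an illustration: it assumes the scores are normally distributed, and the mean of 500 and standard deviation of 100 are made-up numbers, not values from the course.

```python
from statistics import NormalDist

def z_score(y, mu, sigma):
    """Z = (y - mu) / sigma for one individual's score."""
    return (y - mu) / sigma

def percentile_from_score(y, mu, sigma):
    """Percent of the population scoring at or below y (assumes normality)."""
    return NormalDist().cdf(z_score(y, mu, sigma)) * 100

def score_from_percentile(percentile, mu, sigma):
    """Score that sits at the given percentile (assumes normality)."""
    z = NormalDist().inv_cdf(percentile / 100)
    return mu + z * sigma

# Hypothetical test scale: population mean 500, standard deviation 100.
print(z_score(650, 500, 100))                # Z for a score of 650
print(percentile_from_score(650, 500, 100))  # percentile of that score
print(score_from_percentile(95, 500, 100))   # score at the 95th percentile
```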
12 Equations for Inferential Statistics
Summary Statistics
Mean: $\bar{y} = \frac{\sum y_i}{n}$
Median: position $\frac{n + 1}{2}$ (order the values and count up this far)
Variance: $s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$
Standard Deviation: $s = \sqrt{s^2}$
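As a rough check on these four summary statistics, here is a short Python sketch that computes each one by hand. The function name `summarize` and the six example scores are invented for illustration.

```python
import math

def summarize(scores):
    """Mean, median, sample variance (n - 1 denominator) and standard deviation."""
    n = len(scores)
    mean = sum(scores) / n

    # Median: order the values and count up to position (n + 1) / 2.
    ordered = sorted(scores)
    if n % 2 == 1:
        median = ordered[(n + 1) // 2 - 1]
    else:
        # Position falls halfway between the two middle values.
        median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

    variance = sum((y - mean) ** 2 for y in scores) / (n - 1)
    return {"mean": mean, "median": median,
            "variance": variance, "sd": math.sqrt(variance)}

# Made-up scores for illustration.
print(summarize([2, 4, 4, 5, 7, 8]))
```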
13 Inferring a Population Mean or Proportion Based on a Sample Mean or Proportion
The following slides focus on how to estimate a population mean or proportion if we ONLY have a random sample. In these cases we estimate one point in the population (i.e., the mean IQ of USU students), BUT we build a confidence interval around this single point, generally a 95% confidence interval.
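The interval formulas themselves are not part of this section, so the following Python sketch assumes the usual large-sample forms: the point estimate plus or minus 1.96 standard errors, with the standard error of a mean estimated by $s/\sqrt{n}$ and the standard error of a proportion by $\sqrt{p(1-p)/n}$. The function names and the input numbers are hypothetical.

```python
import math

def ci_mean(ybar, s, n, z=1.96):
    """Large-sample confidence interval for a population mean.

    ybar = sample mean, s = sample standard deviation, n = sample size.
    z = 1.96 gives roughly 95% confidence (an assumed large-sample value).
    """
    se = s / math.sqrt(n)  # estimated standard error of the mean
    return (ybar - z * se, ybar + z * se)

def ci_proportion(p, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    se = math.sqrt(p * (1 - p) / n)  # estimated standard error of p
    return (p - z * se, p + z * se)

# Hypothetical sample: mean IQ 100, standard deviation 15, 225 students.
print(ci_mean(100, 15, 225))
# Hypothetical poll: 50% support among 400 respondents.
print(ci_proportion(0.5, 400))
```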
17 A One or Large Sample Hypothesis Test
In the following slides we compare a sample mean or proportion with a population mean or proportion. We want to know if our sample mean or proportion is different from the population mean or proportion. The population mean or proportion could actually be a mean/proportion that is specified by a theory or by past research (rather than a number computed from a population data set).
18 Equations/Formulas for One-Sample Hypothesis Tests
The equations are outlined in red. What do the symbols mean?
One-sample hypothesis test for a proportion:
$p$ = proportion in the sample
$\pi_0$ = proportion or hypothesized proportion in the population
$n$ = sample size
$Z$ = computed statistic
One-sample hypothesis test for a mean:
$\bar{y}$ = mean in the sample
$\mu_0$ = mean or hypothesized mean in the population
$n$ = sample size
$s_{\bar{y}}$ = standard error, an estimate of the standard deviation of the sampling distribution of the mean
$s/\sqrt{n}$ = computation for estimating the standard error: the standard deviation of the sample divided by the square root of the sample size
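The equations themselves did not survive in this text, so the Python sketch below assumes the standard large-sample forms that match the symbols listed: $Z = (p - \pi_0)/\sqrt{\pi_0(1-\pi_0)/n}$ for a proportion and $Z = (\bar{y} - \mu_0)/(s/\sqrt{n})$ for a mean. The function names and inputs are made up, and for small samples a t distribution would normally replace the normal curve used here.

```python
import math
from statistics import NormalDist

def z_test_proportion(p, pi0, n):
    """Computed Z for a one-sample test of a proportion (assumed standard form)."""
    se = math.sqrt(pi0 * (1 - pi0) / n)
    return (p - pi0) / se

def z_test_mean(ybar, mu0, s, n):
    """Computed Z for a large one-sample test of a mean (assumed standard form)."""
    se = s / math.sqrt(n)  # estimated standard error
    return (ybar - mu0) / se

def two_tailed_p(z):
    """Two-tailed p-value from the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: 56% in a sample of 400 against a hypothesized 50%.
z_prop = z_test_proportion(0.56, 0.50, 400)
print(z_prop, two_tailed_p(z_prop))

# Hypothetical data: sample mean 103, hypothesized mean 100, s = 15, n = 225.
z_mean = z_test_mean(103, 100, 15, 225)
print(z_mean, two_tailed_p(z_mean))
```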
21 Symbols for Statistics that Infer the Relationship in the Sample to the Population
Symbol(s)          Interpretation
Chi-Square
$\chi^2$           Chi-Square statistic
Regression
$\beta$            beta, slope in population
$b$                slope in sample
$a$                alpha, intercept or constant in prediction formula
$X_1$              value of the X variables
$\hat{Y}$          y-hat, or predicted Y
$\bar{y}$          Y-bar, or the mean of Y
Anova
$\mu$              Mu, or mean in population
$y_i - \bar{y}$    deviation of an individual score from the mean
22 Chi-Square Equation
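The equation on this slide is not reproduced in the text. Assuming it is the usual Pearson chi-square statistic, $\chi^2 = \sum (O - E)^2 / E$, with each expected count $E$ computed as (row total × column total) / n, a minimal Python sketch looks like this. The function name and the 2 × 2 table of counts are invented for illustration.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    observed is a list of rows, each row a list of observed counts (O).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count (E)
            statistic += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Made-up counts: rows are two groups, columns are YES / NO answers.
print(chi_square([[20, 30], [30, 20]]))
```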
23 Equations/Formulas for Inferential Statistics
Pearson Correlation Coefficient and $R^2$
$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$
$R^2$ = r squared
Multiple Regression Prediction Equation
$\hat{Y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3$
$\hat{Y}$ = predicted score for the dependent variable
$a$ = intercept or constant
$b$ = slope or parameter estimate for the independent variables: the unit increase in the Y variable for every 1-unit increase in X
$x$ = value of the X variables, taken from the codebook
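A small Python sketch of both formulas on this slide: the Pearson correlation computed from deviations around the means, and a prediction from an intercept and a list of slopes. The function names, data, and coefficient values are made up for illustration and are not estimates from any course data set.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from deviations around the two means."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return cross / math.sqrt(ss_x * ss_y)

def predict(a, slopes, x_values):
    """Y-hat = a + b1*x1 + b2*x2 + ... for one case."""
    return a + sum(b * x for b, x in zip(slopes, x_values))

# Made-up paired observations.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(r, r ** 2)  # r and R squared

# Hypothetical intercept and slopes applied to one case.
print(predict(1.0, [2.0, 0.5], [3, 4]))
```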
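24 Equations/Formulas for Inferential Statistics
Anova
$TSS = \sum y_{ij}^2 - \frac{G^2}{n}$
$SSB = \sum_i \frac{T_i^2}{n_i} - \frac{G^2}{n}$
$SSW = TSS - SSB$
$s_B^2 = SSB/(k - 1)$
$s_W^2 = SSW/(n - k)$
$F = s_B^2 / s_W^2$ (the F statistic)
TSS = Total Sum of Squares
SSB = Sum of Squares Between
SSW = Sum of Squares Within
$G$ = grand total of all scores, $T_i$ = total of the scores in group $i$, $n_i$ = size of group $i$, $k$ = number of groups
df between = $k - 1$
df within = $n - k$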
25 Anova and Regression Sums of Squares
Anova
TSS = Total Sum of Squares
SSW = Sum of Squares within each group
SSB = Sum of Squares between the groups
SSB/TSS = R square, or the proportion of the total sum of squares that is explained by group membership
Regression
TSS = Total Sum of Squares
SSM = Sum of Squares Model
SSE = Sum of Squares Error
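The Anova sums of squares can be worked through in a few lines of Python. This sketch assumes the shortcut formulas use G for the grand total of all scores and T for each group's total. The three groups of scores are made up.

```python
def one_way_anova(groups):
    """Sums of squares, F statistic and R square for a one-way Anova.

    groups is a list of lists, one list of scores per group.
    """
    k = len(groups)                           # number of groups
    n = sum(len(g) for g in groups)           # total number of cases
    grand_total = sum(sum(g) for g in groups) # G
    correction = grand_total ** 2 / n         # G^2 / n

    tss = sum(y ** 2 for g in groups for y in g) - correction
    ssb = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ssw = tss - ssb

    ms_between = ssb / (k - 1)  # s^2 between, df = k - 1
    ms_within = ssw / (n - k)   # s^2 within, df = n - k
    return {"TSS": tss, "SSB": ssb, "SSW": ssw,
            "F": ms_between / ms_within,
            "R2": ssb / tss,
            "df_between": k - 1, "df_within": n - k}

# Made-up scores for three groups.
print(one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```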
26 Equations/Formulas for Inferential Statistics
Two-Sample T-test
$t = \frac{\bar{y}_1 - \bar{y}_2}{s_{\bar{y}_1 - \bar{y}_2}}$
The denominator is the estimated standard error of the difference between the two means, computed as follows:
$s_{\bar{y}_1 - \bar{y}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Pooled standard deviation:
$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
What the symbols mean:
$t$ = computed test statistic
$\bar{y}_1$ = mean of sample one
$\bar{y}_2$ = mean of sample two
$s_1$ = standard deviation of sample 1 and $s_2$ = standard deviation of sample 2
$n_1$ = size of sample 1 and $n_2$ = size of sample 2
Degrees of freedom = df = $n_1 + n_2 - 2$
Uses a T distribution.
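A minimal Python sketch of the pooled two-sample t computation, working from summary statistics. The function name and the input values are hypothetical. The result would still need to be compared against a T distribution with the returned degrees of freedom.

```python
import math

def pooled_t(ybar1, s1, n1, ybar2, s2, n2):
    """Two-sample t statistic using the pooled standard deviation."""
    df = n1 + n2 - 2
    # Pooled standard deviation from the two sample standard deviations.
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    # Estimated standard error of the difference between the two means.
    se_diff = sp * math.sqrt(1 / n1 + 1 / n2)
    t = (ybar1 - ybar2) / se_diff
    return t, df

# Hypothetical summary statistics for two samples of 8 cases each.
print(pooled_t(10, 2, 8, 8, 2, 8))
```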
27 Equations/Formulas for Inferential Statistics
Mann-Whitney
Focuses on ranks (medians) rather than on means. Two groups.
$Z = \frac{T_1 - E(T_1)}{\sqrt{Var(T_1)}}$
$E(T_1) = \frac{n_1 (n + 1)}{2}$
$Var(T_1) = \frac{n_1 n_2}{n} s^2$
$s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$, computed on the ranks
Steps:
Rank values from smallest to largest.
Sum ranks in smaller group = $T_1$.
Compute $E(T_1)$.
Compute the variance $Var(T_1)$.
Uses a Z distribution.
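The Mann-Whitney steps can be sketched in Python as follows. This assumes n is the combined size of both groups and that the variance s² is computed on the ranks. Tied values receive the average of their ranks, a common convention that the slide does not spell out. The function names and data are invented.

```python
import math

def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    positions = {}
    for rank, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(rank)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def mann_whitney_z(smaller_group, larger_group):
    """Z statistic built from the rank sum T1 of the smaller group."""
    n1, n2 = len(smaller_group), len(larger_group)
    n = n1 + n2
    ranks = average_ranks(list(smaller_group) + list(larger_group))

    t1 = sum(ranks[:n1])         # sum of ranks in the smaller group
    expected = n1 * (n + 1) / 2  # E(T1)

    mean_rank = sum(ranks) / n
    s2 = sum((r - mean_rank) ** 2 for r in ranks) / (n - 1)  # variance of ranks
    variance = n1 * n2 / n * s2  # Var(T1)
    return (t1 - expected) / math.sqrt(variance)

# Made-up scores for two groups.
print(mann_whitney_z([1, 2, 3], [4, 5, 6, 7]))
```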
28 Equations/Formulas for Inferential Statistics
Kruskal-Wallis
Focuses on ranks (medians) rather than on means. More than two groups.
$\chi^2 = \frac{12}{n (n + 1)} \sum_k \frac{T_k^2}{n_k} - 3 (n + 1)$
$T_k$ = sum of ranks for the kth sample
$n$ = total number of cases
$n_k$ = number of cases for the kth sample
Uses the $\chi^2$ distribution.
Degrees of freedom = $k - 1$ (where k is the number of groups)
Use when you want to compare more than two groups and the distribution is not normal.
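A Python sketch of the Kruskal-Wallis statistic, assuming the standard form 12 / (n(n + 1)) × Σ(T² / n_k) − 3(n + 1), with all cases ranked together and tied values sharing their average rank. No correction for ties is applied. The function names and data are made up.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    positions = {}
    for rank, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(rank)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def kruskal_wallis(groups):
    """Kruskal-Wallis statistic and its degrees of freedom (k - 1)."""
    combined = [y for g in groups for y in g]
    n = len(combined)
    ranks = average_ranks(combined)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # T for this sample
        total += rank_sum ** 2 / len(g)
        start += len(g)

    statistic = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return statistic, len(groups) - 1

# Made-up scores for three groups.
print(kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```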
29 Equations/Formulas for Inferential Statistics
Formulas for Sample Size
Sample size: $n = \frac{0.9604\,N}{d^2 (N - 1) + 0.9604}$
$d$ = margin of error (usually .05)
$N$ = population size
.9604 = a constant related to being at least 95% sure
This sample size is large enough that we can be at least 95% sure we can generalize to the population with a margin of error of .05.
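The sample-size formula is hard to read in this text, so the Python sketch below rests on an assumption: that it is the common finite-population form n = 0.9604 N / (d²(N − 1) + 0.9604), where 0.9604 = 1.96² × 0.5 × 0.5. The function name is hypothetical, and the result is rounded up because a sample cannot include a fraction of a case.

```python
import math

def sample_size(population_size, d=0.05, constant=0.9604):
    """Sample size for a finite population (assumed standard form).

    d is the margin of error; constant = 1.96**2 * 0.5 * 0.5 for 95% confidence.
    """
    n = (constant * population_size) / (d ** 2 * (population_size - 1) + constant)
    return math.ceil(n)  # round up to a whole number of cases

# Hypothetical population sizes.
print(sample_size(100))
print(sample_size(1000))
```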