Interpretation of Computer Analysis Output for Fundamental Statistical Tests Volume One T-test P.Y. Cheng



Preface

When I first came to the department in 1985, PCs were still not common at all. There was only an Apple computer in the office for secretarial work. However, we could run SPSS on our university's mainframe computers, which were connected to our department through terminals and an RS232 interface! At that time, we had a very good neighbour on the same floor, the Department of Community Medicine. Their staff ran statistical programs heavily and were very good consultants for us whenever we had questions about statistics!

In the following decades, I have been trying to improve my knowledge of statistics, nearly always carrying statistics books with me and taking annual leave for self-study every summer! We Chinese have a common belief that hard work can compensate for stupidity, and I really want to compensate for my stupidity with decades of hard work! Several years ago, I had the chance to start writing some books to summarize my painful experience with statistical problems over these decades, trying to help others find a quicker way when handling similar problems!

Volumes 1, 2 and 3 of Interpretation of Computer Analysis Output for Fundamental Statistical Tests are the newest books I have written so far, and their characteristics are:

1) Compared with the several books published previously, I have said more about the basic theories underlying the statistical tests! I have tried to use many clear and convincing pictures and graphs to explain the important theories and first principles, avoiding dull or complicated approaches! Even if you cannot digest and understand these theories immediately, you can still use this book as a cook book, fitting your problems to the nearest examples in it to solve them!

2) In our environment, with mostly animal experiments, there are usually only a few animals in each group being compared! I have often heard the worry that the sample size is too small for the t test, ANOVA or even regression to be used correctly! In this book, we discuss this issue and try to clarify whether the statistical tests can still be used when the sample size is rather small, e.g. N = 10 or N = 5, or even fewer!

3) We talk about the non-parametric statistical tests corresponding to the parametric ones, for use when the assumptions of the parametric tests do not hold, or when the distributions are not known at all!

4) For every test run in SPSS, we introduce an alternative way of getting the same results! This might be a manual method, a free Excel Add-In such as PHStat4, or even some web tools, so that readers can still solve their problems without being stopped by the absence of expensive software such as SPSS!

Acknowledgments

First of all, I would like to thank Prof. C.M. Wong and Dr L.M. Ho, who gave me the courage to start writing. They have been consultants for many staff of the University of Hong Kong and are always helpful to any HKU staff who approach them with statistical problems! I would also like to thank everybody who has contributed to the publishing of this book! This includes the publishing company (which is not yet known) and the authors of the reference books and internet material that helped me a lot during the writing and checking process! I would like to thank my friends and relatives who have been encouraging me during the publishing of my books, especially my son (Andy) and my wife (Betty), who have shown their patience and understanding while I concentrated on the production of this new book! Lastly, I would like to thank, in advance, any future readers of this book and hope they can find some useful material to help them solve statistical problems. I also hope they can enjoy reading this book in full colour, with hundreds of brilliant pictures and many cook-book examples!

Medical Faculty, The University of Hong Kong
Cheng Ping Yuen (Senior Technician)
Bachelor of Life Science (BSc), Napier University, UK
Master of Public Health (MPH), Hong Kong University
Certificate of the Hong Kong Statistical Society (HKSS)
Fellow of the Royal Statistical Society (RSS)
Microsoft Certified Professional (MCP)
Hong Kong Registered Medical Technologist (Class I)
Phone : (852) / hrmacpy@hku.hk

Contents

1.1 Some basic concepts
    1.1.a The Normal Distribution
    1.1.b Distribution of Sample Means
    1.1.c The Standard Normal Distribution and T-Distribution
    1.1.d Area (Probability) under Z-Distribution and T-Distribution
    1.1.e Testing of Hypothesis
    1.1.f The worry about sample size N being too small for statistical tests
1.2 Computer Analysis of t-distribution
    1.2.a One Sample T-Test
    1.2.b Two Samples T-Test
    1.2.c Paired T-Test
1.3 Sample Size N and Power for T-Test
    1.3.a Using a hand calculator
    1.3.b Sample size N in estimating a population average µ, by PHStat4
    1.3.c Calculation of Sample Size and Power using G Power
        i) Calculation of Sample Size using the free software G Power
        ii) Calculation of Power using the free software G Power
1.4 Testing Hypothesis for Proportions (instead of Means)
    1.4.a Approximation of a binomial distribution by a normal distribution
    1.4.b Using PHStat4 to solve the problem in 1.4.a
    1.4.c Using PHStat4 to solve problems with two proportions
1.5 Non-parametric tests corresponding to various T-Tests
    1.5.a Wilcoxon signed rank test (corresponding to the 1-sample t-test)
    1.5.b Nonparametric tests for 2 samples (corresponding to the 2-independent-samples t-test)
        i) Mann-Whitney Test for 2 independent samples in SPSS
        ii) Wilcoxon Rank Sum Test for 2 independent samples in PHStat4
    1.5.c Wilcoxon matched-pairs signed-rank test for paired samples (corresponding to the paired t-test)
    1.5.d Calculation of Sample Size and Power of Wilcoxon Tests using G Power


1.1 Some basic concepts

1.1.a The Normal Distribution

The normal distribution is the most important distribution in statistics, not only because so many natural phenomena (e.g. weight, height, class mark, IQ score) follow it, but also because it can be used to solve many other statistical problems! The probability density function of a normal distribution is:

f(x) = (1 / (σ√(2π))) e^( −(x − µ)² / (2σ²) )

The curve is, thus, determined by two parameters: 1. the population mean µ, and 2. the population standard deviation σ (or σ², the variance).

1.1.b Distribution of Sample Means

If we could always measure every individual of a population (e.g. the height of all children born in 1995 in the UK), then we might not need statistical tests to draw conclusions about it! However, it is usually impossible, or too costly, to make such a measurement! We usually take a sample from the population, run a statistical test on the sample data, and use distribution and probability theory to decide whether to accept a hypothesis or not! This is also called making an inference about the population from a sample. The following is a sample of heights from the population (e.g. children born in the UK in 1995).

Imagine we could measure an infinite number of sample means (although in practice we never would) and plot the frequency histogram: we would then get a distribution of sample means (histogram of sample averages against frequency).

Formation of a distribution of sample means (of size N) (*please don't mix this up with the z or t distributions discussed later): the individual sample curves have peaks lower than the population curve (since N < total population), and the peak height of the combined curve increases as these sample curves are accumulated. It is important to learn that the larger the sample size, the smaller the spread of the distribution curve of the sample means (the standard error of the mean), σ_m = σ_p/√N.

Because of the property above, it is always better to use a larger sample size rather than a smaller one! With a larger sample size, we reduce the possible deviation of the measured mean from the real mean of the population! In the graph above, we assume there is an unlimited number of sample means for plotting the distribution, so the sampling error approaches zero! But what happens if only a limited number of sample means is available?

Sampling error when a limited number of sample means is used: the sampling distribution might be positively skewed, negatively skewed or not skewed, depending on chance. But when the number of samples increases, the skewness decreases and the distribution approaches normal!

For example, for the following population of 4,000 students, using 20 sample means (each of size N = 10) to plot the distribution curve gives a smaller possible error than just plotting the 200 individual data points of 200 students (N = 1). The mean from the 20 samples would be closer to the population mean of the 4,000 students than the mean obtained from the individual data of the 200 students alone! (Figure: distribution of a limited number of sample means of different size N.)

1.1.c The Standard Normal Distribution and T-Distribution

The standard normal distribution and the t-distribution are extremely important because of the following three conditions:

1) If the population is normal and the variance is known, then the random variable z = (x̄ − µ)/(σ/√n) is exactly standard normal (mean = 0, S.D. = 1), no matter how small the sample size is.

2) If the population is normal and the variance is unknown, the random variable t = (x̄ − µ)/(s/√n) has exactly a t-distribution (mean = 0, S.D. approaching 1 as n increases) with n − 1 degrees of freedom, no matter how small the sample size is.

3) If the population is not normal and the variance may or may not be known, the random variable z = (x̄ − µ)/(σ/√n), or the random variable t = (x̄ − µ)/(s/√n) (which one is used depends on whether the variance is known or unknown), is approximately standard normal if the sample size is sufficiently large (at least thirty).

Where:
x̄ is the mean of the sample
µ is the mean of the population
σ is the known standard deviation of the population
n is the sample size
s is the standard deviation calculated from the sample

The calculation of these random variables is also called the standardization of the original distributions. Without the possibility of standardizing normal (or non-normal) distributions to obtain a standard normal distribution, it would be impossible (or very difficult) to make use of these most important distributions in statistics to test various hypotheses!

1) If the population is normal and the variance is known, then the random variable z = (x̄ − µ)/(σ/√n) is exactly standard normal (mean = 0, S.D. = 1), no matter how small the sample size is.

All four distributions in the figure above are normal distributions, but only the GREEN one is a standard normal distribution, with µ = 0 and σ² = 1 (σ = 1)!

2) If the population is normal and the variance is unknown, the random variable t = (x̄ − µ)/(s/√n) has exactly a t-distribution (mean = 0, S.D. approaching 1 as n increases) with n − 1 degrees of freedom, no matter how small the sample size is. Here s, the sample standard deviation, is used instead of the population standard deviation!

Please notice that when the underlying population is normal, we can apply the t-distribution for statistical tests no matter how small the sample size (degrees of freedom) is!! A t-distribution is similar to the z-distribution in that both are symmetric and bell-shaped, but its central peak is lower and its two tails are higher! As df (N − 1) increases, it becomes more and more like a z-distribution; by df = 120, we might say there is almost no difference at all!

** Please don't mix this up with the 3rd condition below, in which the underlying population is NOT NORMAL and we can only apply the z-distribution approximation when N is at least 30, without involving any t-distribution!!

3) If the population is not normal and the variance may or may not be known, the random variable z = (x̄ − µ)/(σ/√n) or t = (x̄ − µ)/(s/√n) (the one used depends on whether the variance is known or unknown) is approximately standard normal if the sample size is sufficiently large (at least thirty). This is also called the Central Limit Theorem! Please notice that the sample size N must be equal to or greater than 30 for this Central Limit Theorem to apply, and with it we can use the z-distribution approximation for running statistical tests!

A graph like the one below might be confusing if it is not clearly stated whether the underlying population is normal or not. If it is normal, then the t-distribution curves can be applied no matter how small N is! Some books suggest switching to the z-distribution approximation when N >= 30, but I myself don't feel this is necessary (why not just use the t-distribution with N − 1 degrees of freedom?). If it is not normal, apply the z-distribution approximation when N >= 30, and just don't use any t-distributions!
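The three conditions above can be checked numerically. The following is a minimal Python sketch (Python/SciPy is not used in this book, which works with SPSS and Excel; the sample values here are made up for illustration), showing the two standardizations z = (x̄ − µ)/(σ/√n) and t = (x̄ − µ)/(s/√n):

```python
# A minimal sketch of the two standardizations above (hypothetical numbers).
import numpy as np
from scipy import stats

sample = np.array([78.0, 85.0, 82.0, 80.0, 88.0])  # hypothetical sample
mu = 80.0        # hypothesized population mean
sigma = 5.0      # population S.D. (assumed known for the z case)
n = len(sample)
xbar = sample.mean()

# Condition 1: population normal, sigma known -> z is exactly N(0, 1)
z = (xbar - mu) / (sigma / np.sqrt(n))

# Condition 2: population normal, sigma unknown -> t has a t-distribution, df = n - 1
s = sample.std(ddof=1)                # sample S.D. (n - 1 in the denominator)
t = (xbar - mu) / (s / np.sqrt(n))

print(z, t, n - 1)
```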

1.1.d Area (Probability) under the Z-Distribution and T-Distribution

By knowing the percentage of the area under the z-distribution and the t-distribution (total area = 1), we know the probability of getting the z-value or t-value calculated from the formulas in the three conditions discussed above!

Area under the Z-Distribution (Standard Normal Distribution): the area between the central axis (mean = 0) and 1 standard deviation is 0.3413 of the total area (which equals 1). This implies that the probability of z falling between zero and 1σ is 0.3413!! Under the standard normal curve σ = 1, so 1σ = 1, 2σ = 2, 3σ = 3 on the x-axis!!

Area under the T-Distribution for different df: as t increases, the area in the right tail decreases.
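If a table is not at hand, the areas described in this section can be obtained from SciPy; a small sketch (not part of the book's SPSS/Excel workflow):

```python
# Areas (probabilities) under the z- and t-distributions, as described above.
from scipy import stats

# Standard normal: area between 0 and 1 sigma
print(stats.norm.cdf(1) - stats.norm.cdf(0))     # ~0.3413

# t-distribution: right-tail area beyond t = 2 for several df
for df in (4, 24, 120):
    print(df, stats.t.sf(2.0, df))               # tail area shrinks toward the z value
print(stats.norm.sf(2.0))                        # z right-tail beyond 2, for comparison
```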

1.1.e Testing of Hypothesis (Significance)

With the z-distribution (standard normal distribution) and the t-distribution, whose areas are well known (either from tables or from a computer), we can carry out hypothesis testing, using samples to make inferences about the underlying population! A probability of 0.05 is usually used as the critical probability for hypothesis testing!!

Z-Distribution (figure: 2.5% of the area in each tail for a two-tailed test at 0.05): if the mean and the variance of a population are known, then we can run a normal (z) test on a sample using the z-distribution (standard normal distribution). For example, an education department wants to know whether the average mark of students in Mathematics this year is the same as in past years (mean = 80 and S.D. = 5). A random sample of 25 students is taken, and their marks have mean = 83.

Null hypothesis H0: mean of this year = mean of past years
Alternative hypothesis Ha: mean of this year ≠ mean of past years

z = (83 − 80) / (5/√25) = 3/1 = 3

z = 3 >> 1.96. The probability of getting z > 1.96 or z < −1.96 by chance only (sampling error) is 0.05. Thus the probability of getting such a high z value of 3 by chance only is << 0.05!! The sample mean is significantly different from the population mean used in the z test! So we reject the null hypothesis that the average mark is the same as in past years!
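The same z test, reproduced as a short Python sketch with the numbers above (mean 80, S.D. 5, N = 25, sample mean 83):

```python
# The z test above, reproduced in Python.
import math
from scipy import stats

mu, sigma, n, xbar = 80.0, 5.0, 25, 83.0
z = (xbar - mu) / (sigma / math.sqrt(n))         # = 3.0
p_two_tailed = 2 * stats.norm.sf(abs(z))         # ~0.0027, far below 0.05
print(z, p_two_tailed, stats.norm.ppf(0.975))    # critical value ~1.96
```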

T-Distribution (figure: t curves for different df, including df = 24):

Suppose the S.D. of the underlying population of student marks in the section above is unknown, and the sample standard deviation calculated from the sample data is 6 instead of 5. Then:

t = (83 − 80) / (6/√25) = 3/1.2 = 2.5

t = 2.5 >> 2.064 (from the table: df = 24, 5% two-tailed probability of the same population mean). The probability of getting t > 2.064 or t < −2.064 by chance (sampling error) is 0.05. Thus the probability of getting such a high t value of 2.5 by chance only is << 0.05!! The sample mean is significantly different from the population mean used in the t test! So we reject the null hypothesis that the average mark is the same as in past years!

(Please remember that, to make use of the t-distribution for the probability calculation above, we assume that the underlying population is normal; the t-distribution curves can then be used no matter how small the sample size is!)
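And the t version, with a sample standard deviation of 6, again as a sketch in Python:

```python
# The t version (sigma unknown, sample S.D. 6, N = 25), reproduced in Python.
import math
from scipy import stats

mu, s, n, xbar = 80.0, 6.0, 25, 83.0
df = n - 1
t = (xbar - mu) / (s / math.sqrt(n))             # = 2.5
t_crit = stats.t.ppf(0.975, df)                  # ~2.064 (two-tailed, alpha = 0.05)
p_two_tailed = 2 * stats.t.sf(abs(t), df)        # ~0.02 < 0.05
print(t, t_crit, p_two_tailed)
```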

One-Tailed Test and Two-Tailed Test

(Figure: one-tailed and two-tailed critical values for a 5% chance, with regions marked significant / not significant.)

In (a), we test whether the sample mean is greater than or smaller than the population mean, such that the total probability of getting the z value is 0.05, i.e. 0.025 on each side! The probability on each side is only 0.025! In (b), we just test whether the sample mean is greater than the population mean, such that the probability of getting the z value is 0.05 on the right-hand extreme only! This also implies that rejecting the null hypothesis (that there is no real difference in means) is easier to achieve, with twice the chance!! One-tailed significance = two-tailed significance divided by 2!

As in the case shown in the graph above, the test for the difference between the sample mean and the population mean is not significant in a two-tailed test (z = 1.8 < 1.96), but is significant in the one-tailed test (z = 1.8 > 1.645)!!
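A short sketch of the z = 1.8 case above, showing how the one-tailed p-value is half the two-tailed one:

```python
# One-tailed versus two-tailed probability for z = 1.8.
from scipy import stats

z = 1.8
p_two = 2 * stats.norm.sf(z)     # ~0.072 -> not significant at 0.05
p_one = stats.norm.sf(z)         # ~0.036 -> significant at 0.05 (one-tailed)
print(p_two, p_one, stats.norm.ppf(0.975), stats.norm.ppf(0.95))   # 1.96, 1.645
```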

Type I Error and Type II Error

We might say that, by default, the type I error is the error we try to avoid first! This is the error of saying that there is a difference between two groups while there is, in fact, none! (If having a difference were a crime, the type I error would be the error of convicting a person of a crime he has not, in fact, committed!) If we accept the null hypothesis that there is no difference, then we do not run the risk of committing the type I error. However, we are then immediately at risk of committing the type II error, i.e. saying that there is no difference while there is, in fact, one! (Saying that a person has not committed a crime while he, in fact, has.)

(Figure: the critical t-value for rejecting the null hypothesis that there is no real difference between the two groups (only one curve, the red one); the shaded range of t-values is where we would commit the type II error, i.e. say there is no difference while there are, in fact, two curves; the area to its right is 1 − β.)

In the graph above, using α/2 as the critical point, we do not reject the null hypothesis that there is no real difference between the 2 populations while t is less than 2! We accept the null hypothesis since we don't want to commit the type I error (convicting a person of a crime while he is innocent)! We think there is only ONE curve (the red one)!! However, if there is, in fact, a real difference between the two populations (two curves exist), then we have already committed the type II error (letting the accused person go while he has committed the crime)!! The range of t values that would make us commit such a mistake is shown in the graph above! The probability of committing such an error is the area represented by β! This probability depends on how large the real difference is. β is important for the calculation of Power (1 − β) and of the sample size N. We will talk about the calculation of sample size and power in later sections!

1.1.f The worry about sample size N being too small for statistical tests

In our laboratory experiment environment, we often hear the worry that the sample size N in the different groups is too small for running common statistical tests such as the t test! There are often only a few animals, e.g. 5 to 10, in each group being compared with the other groups. A histogram of so few points might hardly look normally distributed! Let's look at the condition of N = 5. As stated previously:

1) If the population is normal and the variance is known, z = (x̄ − µ)/(σ/√n) is exactly normal (0, 1), no matter how small the sample size N is.

2) If the population is normal and the variance is unknown, t = (x̄ − µ)/(s/√n) has exactly a t-distribution with N − 1 degrees of freedom, no matter how small the sample size N is.

(Since N = 5 only, we don't consider the 3rd condition - population not normal and N >= 30!)

As you can see, if we assume the underlying population is normal in 1) and 2) above, we still finally arrive at a standard normal distribution, or a t-distribution, no matter how small N is!! Then what is the role of N in these conditions? For both z and t, the smaller the N, the lower the value of z or t for the same difference, and the lower the chance of exceeding the critical z or t value needed for a significant result to reject the null hypothesis - and vice versa!

Previous example with smaller N:

An education department wants to know whether the average mark of students in Mathematics this year is the same as in past years (mean = 80 and S.D. = 5). A random sample of 25 students is taken, and their marks have mean = 83.

Null hypothesis H0: mean of this year = mean of past years
Alternative hypothesis Ha: mean of this year ≠ mean of past years

z = (83 − 80) / (5/√25) = 3/1 = 3;  z = 3 >> 1.96 (reject the null hypothesis)

If only 5 students are taken instead of 25, then:

z = (83 − 80) / (5/√5) = 3/2.236 ≈ 1.34;  z = 1.34 << 1.96 (accept the null hypothesis)

Suppose the S.D. of the underlying population of student marks is unknown and the sample standard deviation calculated from the sample data is 6 instead of 5. Then:

t = (83 − 80) / (6/√25) = 3/1.2 = 2.5;  t = 2.5 >> 2.064 (from the table: df = 24, α = 0.05, reject the null hypothesis)

If only 5 students are taken instead of 25, then:

t = (83 − 80) / (6/√5) = 3/2.683 ≈ 1.12;  t = 1.12 << 2.776 (from the table: df = 4, α = 0.05, accept the null hypothesis)

Previous example with smaller N but a larger group difference:

An education department wants to know whether the average mark of students in Mathematics this year is the same as in past years (mean = 80 and S.D. = 5). A random sample of 25 students is taken, and their marks have mean = 83.

Null hypothesis H0: mean of this year = mean of past years
Alternative hypothesis Ha: mean of this year ≠ mean of past years

z = (83 − 80) / (5/√25) = 3/1 = 3;  z = 3 >> 1.96 (reject the null hypothesis)

If only 5 students are taken instead of 25, but the sample mean is 93 this time:

z = (93 − 80) / (5/√5) = 13/2.236 ≈ 5.81;  z = 5.81 >> 1.96 (reject the null hypothesis)

Suppose the S.D. of the underlying population is unknown and the sample standard deviation calculated from the sample data is 6 instead of 5. Then:

t = (83 − 80) / (6/√25) = 3/1.2 = 2.5;  t = 2.5 >> 2.064 (from the table: df = 24, reject the null hypothesis)

If only 5 students are taken instead of 25, but the sample mean is 93 this time:

t = (93 − 80) / (6/√5) = 13/2.683 ≈ 4.85;  t = 4.85 >> 2.776 (from the table: df = 4, reject the null hypothesis)

Single subject test

So what happens if only one subject can be measured for comparison?

An education department wants to know whether the average mark of students in Mathematics this year is the same as in past years (mean = 80 and S.D. = 5). A random sample of 25 students is taken, and their marks have mean = 83.

Null hypothesis H0: mean of this year = mean of past years
Alternative hypothesis Ha: mean of this year ≠ mean of past years

z = (83 − 80) / (5/√25) = 3/1 = 3;  z = 3 >> 1.96 (reject the null hypothesis)

If only 1 student is taken instead of 25, and his mark is 83:

z = (83 − 80) / (5/√1) = 3/5 = 0.6;  z = 0.6 << 1.96 (accept the null hypothesis)

If only 1 student is taken instead of 25, but his mark is 93 this time:

z = (93 − 80) / (5/√1) = 13/5 = 2.6;  z = 2.6 >> 1.96 (reject the null hypothesis)

However, if the population standard deviation σ is not known, then t = (x̄ − µ)/(s/√n) cannot be calculated for a single-subject case (the sample standard deviation s needs at least two observations, and df = N − 1 = 0)!

So, should we worry about the sample size N being too small, e.g. only 5? The answer is: a bit, depending on the conditions! We might try to summarize the issue as follows:

1) If the sample size decreases and the group difference stays nearly the same, then it becomes more difficult to get a z or t value larger than the critical value needed for a significant result (see the sketch after this list)!

2) However, if the group difference is large enough, the test can still be significant even when N is not very large; it is quite common to have fewer than 10 animals in a group for comparison!

3) In our experimental environment, increasing the sample size often increases the experimental cost rapidly, so it would be a waste if a smaller N is already enough!

4) Even if the sample size is only three, two or just one, instead of five, the rules of the game still hold: passing the test or not is our own business - there is nothing wrong with the tests themselves! The underlying distribution and probability theories are still valid!

5) Everything above concerns the risk of committing the type I error - we worry about whether N is large enough to avoid saying that there is a difference when there is, in fact, none! However, N is also important for avoiding the type II error, saying that there is no difference when there is, in fact, one! The calculation of the sample size N needed to get enough Power for detecting a real difference is discussed in later sections!

6) Please don't confuse the curve of the distribution of sample means of sample size N (A) with the z or t distributions (B, C) obtained after standardization by z = (x̄ − µ)/(σ/√n) or t = (x̄ − µ)/(s/√n), although they are closely related!
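As promised in item 1, here is a small illustrative Python sketch (using the earlier marks example; the numbers are the same ones worked by hand above) of how the same 3-mark difference translates into z for different sample sizes:

```python
# How the same 3-mark difference translates into z for different sample sizes.
import math
from scipy import stats

mu, sigma, diff = 80.0, 5.0, 3.0
z_crit = stats.norm.ppf(0.975)                    # ~1.96
for n in (1, 5, 10, 25):
    z = diff / (sigma / math.sqrt(n))
    print(n, round(z, 2), "significant" if abs(z) > z_crit else "not significant")
```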

1.2 Computer Analysis of t-distribution

1.2.a One Sample T-Test

The vendor of a new medicine claims that it can bring the depression score below 70 after being given to patients for 2 weeks! A sample of 25 patients is chosen to take the new medicine, and the depression score is measured after two weeks. (The underlying population here is the unlimited collection of averages of such samples of 25 patients!)

1) The resulting scores, in SPSS, are:

2) Analyze, Compare Means, One-Sample T Test
3) You would then see the dialog: move Dep_Score to Test Variable(s), and input 70 as the Test Value
4) Click OK

4) Results for the One-Sample T Test in SPSS:

4a) One-Sample Statistics results, with annotations: Sample Size N = 25, Sample Mean x̄ = 66.36, Sample Std. Deviation s = 4.748, and Std. Error of the Mean s_m = s/√N = 4.748/√25 ≈ 0.950.

4b) One-Sample Test results, with annotations: the t value calculated is compared with the critical value for degrees of freedom = N − 1 = 24; the probability of getting a t value this far out by chance is < 0.05, so we reject H0 that the means are equal. The difference between the sample mean (66.36) and the hypothesized mean (70) is 66.36 − 70 = −3.64, and there is 95% confidence that the mean difference falls between the two confidence limits shown.

Conclusion: the 25 patients taking the new medicine have depression scores with mean 66.36 and S.D. 4.748. A t value of about −3.83 is obtained, which is significant even for a 2-tailed test! We can reject the null hypothesis H0 that the sample mean is the same as the comparison value of 70! The vendor might be right that their new medicine can bring patients to a depression score different from 70!

One-Tailed Test for the example above:

The computer output above is a 2-tailed test output! For a 2-tailed test:
H0: the population mean = 70; Ha: the population mean ≠ 70.
In a 1-tailed test:
H0: the population mean >= 70; Ha: the population mean < 70.

We just want to decide whether the population mean of the depression score is less than 70, without considering whether it might be greater than 70 on average. NOTHING NEEDS TO BE CHANGED IN THE RUNNING OF THE 2-TAILED TEST ABOVE! What you need to know is how to interpret the same computer output. For rejecting the null hypothesis:

Step 1: t must be negative in this case, where the negative tail is being tested; it must be positive if the positive tail is being tested!
Step 2: divide the 2-tailed significance by 2, as the 1-tailed probability is half the 2-tailed one!
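For readers without SPSS or PHStat4, the same one-sample t test can be reproduced from the summary statistics in the output above (N = 25, mean 66.36, S.D. 4.748, test value 70). This is only a sketch, not part of the book's own workflow:

```python
# Reproducing the one-sample t test from the SPSS summary statistics.
import math
from scipy import stats

n, xbar, s, test_value = 25, 66.36, 4.748, 70.0
se = s / math.sqrt(n)                      # standard error of the mean, ~0.95
t = (xbar - test_value) / se               # ~ -3.83
df = n - 1
p_two_tailed = 2 * stats.t.sf(abs(t), df)  # the 2-tailed significance
p_one_tailed = p_two_tailed / 2            # one-tailed: halve, and check the sign of t
print(t, p_two_tailed, p_one_tailed)
```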

One-Sample T-Test using Excel (with the PHStat4 Add-In)

(For installation of the PHStat4 Add-In, please refer to the Appendix: Installation of Free Software.)

1) PHStat, One-Sample Tests, t Test for the Mean, sigma unknown

2) Input the information for running a 2-tailed test.

3) The results are nearly the same as when using SPSS above; small differences may be due to rounding (the t statistic matches the SPSS value of about −3.83). Note that the 2-tailed p-value is 2 times the 1-tailed p-value.

4) Input the information for running a 1-tailed test.

5) The results are again nearly the same as when using SPSS above; small differences may be due to rounding (the t statistic matches the SPSS value of about −3.83). Note that the critical value shifts towards the central axis, and the 1-tailed p-value is 0.5 times the 2-tailed p-value.

1.2.b Two Samples T-Test

This is also called the independent t test, meaning that the two samples do not affect each other in the measurement of the data values. Two t-distributed populations are tested using one sample from each of them. The basic question is: how different must the two means below (Mean of Treatment Group vs Mean of Control Group) be, such that the chance of exceeding the critical t-value is less than 5% (or 2.5% in each tail for a two-tailed test)?

For example, 25 patients are chosen to take a traditional medicine for treating depression (Group 1, control), and another 25 patients are chosen to take the new medicine (Group 2)! The depression score is taken after 2 weeks and input into SPSS as shown.

1) Analyze, Compare Means, Independent-Samples T Test
2) Move Dep_Score into Test Variable(s) and Group into Grouping Variable:

3) Click Define Groups
4) Input the values 1 and 2 for the definition of the Groups
5) Click OK

6) SPSS output:

7a) Group Statistics: Group 2 has a lower mean and Std. Deviation than Group 1.

7b) T Test results (parts A, B and C below)

Part A - Test of the equal-variance assumption and the t value obtained (columns: Levene's Test for Equality of Variances - F, Sig.; t-test for Equality of Means - t, df; rows: Equal variances assumed / Equal variances not assumed)

Levene's test is run to test the hypothesis that the variances of the two groups are equal! The larger the value of F, the higher the chance that the variances are different! Here Sig. < 0.05, meaning that the variances of the two groups are significantly different! This implies that equal variances cannot be assumed, and we should use the Equal variances not assumed row (its t, df, etc.) instead! The t value and df in that row are calculated under the assumption that the variances of the 2 groups are different: the separate variances, instead of a pooled variance, are used!

Part B - Significance, Mean Difference, and Std. Error Difference (columns: Sig. (2-tailed), Mean Difference, Std. Error Difference)

The 2-tailed probability = 0.504 > 0.05, so we cannot reject the null hypothesis that the two population means are equal! The Mean Difference is the Sample Mean of Group 1 minus the Sample Mean of Group 2.

Part C - 95% Confidence Interval of the Difference (columns: Lower, Upper)

We have 95% confidence that the difference between the two groups (Mean of Group 1 − Mean of Group 2) falls between the lower limit shown and 5.039!

One-Tailed Test:

As stated previously, there is no need to make any changes to the running of the test! Just make sure which tail (positive or negative) you are testing, and see whether you can get a significant result after the doubling of the chance! For example, if you just want to test whether the new medicine produces a lower depression score in Group 2, this is the same as asking whether Group 1 produces a higher score than Group 2! Then we are testing the positive tail of (Mean of Group 1 − Mean of Group 2), and we follow step 1 and step 2 below on the same output tables:

Step 1: the t-value must be positive (instead of negative).
Step 2: 0.504/2 = 0.252, still > 0.05, so the 1-tailed test still cannot find a significant difference between the 2 groups!
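A sketch of the same kind of two-sample comparison in Python. The raw scores below are hypothetical stand-ins (the book's actual data live in the SPSS file); the point is the Levene test followed by the unequal-variance (Welch) t test, mirroring the SPSS output:

```python
# Levene's test, then the unequal-variance (Welch) t test, with hypothetical data.
import numpy as np
from scipy import stats

group1 = np.array([68, 72, 65, 70, 66, 71, 64, 69, 67, 63], dtype=float)  # control (hypothetical)
group2 = np.array([66, 60, 70, 64, 68, 62, 65, 69, 63, 67], dtype=float)  # new medicine (hypothetical)

# Levene's test for equality of variances (SPSS's "Levene's Test" columns)
lev_stat, lev_p = stats.levene(group1, group2)

# If lev_p < 0.05, use the unequal-variance (Welch) t test, as in SPSS's
# "Equal variances not assumed" row:
t, p_two_tailed = stats.ttest_ind(group1, group2, equal_var=False)
print(lev_p, t, p_two_tailed, p_two_tailed / 2)   # last value: one-tailed p
```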

Running the 2-samples test with Excel

(For installing the Data Analysis Tools Add-In of Excel, please refer to the Appendix: Installation of Free Software.)

1) DATA, Data Analysis

2) Test the assumption of equal variances
3) Input the required information
4) The results show that the equal-variance assumption does not hold: the p-value is << 0.05, so the variances of the two groups are significantly different!

5) Choose t-Test: Two-Sample Assuming Unequal Variances
6) Input the required information. The Hypothesized Mean Difference is a very useful item! If we just want to test whether there is any difference between the means of the two groups, just leave it blank (meaning zero)! If we want to test whether the 2 groups differ by a particular value, just fill in that value.

7) The results are similar to the results from SPSS; the small differences may be due to rounding. The separate-variance t test statistic is the same as in SPSS (two-tailed).

Two-tailed and one-tailed test results from the PHStat4 Excel Add-In: both outputs list, for each group, the sample size (25), the sample mean (67.6 for the Group 1 sample) and the sample standard deviation, followed by the intermediate calculations (numerator and denominator of the degrees of freedom, total degrees of freedom, degrees of freedom, standard error, difference in sample means) and the separate-variance t test statistic. The two-tail test reports lower and upper critical values and a p-value, and the upper-tail test reports an upper critical value and a p-value; in both cases the conclusion is "Do not reject the null hypothesis".

1.2.c Paired T-Test

A paired T-Test is used when, for example, the same subjects are measured at two time points, or pairs of twins are studied in an experiment, etc. The key point is that we assume there is a particular relationship such that the measured data values are not independent of each other! Simply speaking, it is an analysis of the differences within each pair of data:

t = (d̄ − 0) / (s_d / √n_d)

For example, a company running two shops wants to know whether there is a real difference in income between them. Using SPSS:

1) Analyze, Compare Means, Paired-Samples T Test

2) Put Shop_1 under Variable1 and Shop_2 under Variable2
3) Click OK
4) Output:

Enlarged pictures:

Paired Samples Test - Paired Differences (columns: Mean, Std. Deviation, Std. Error Mean, and the Lower and Upper limits of the 95% Confidence Interval of the Difference, for Pair 1 Shop_1 - Shop_2). Annotations: the Mean is the mean of Shop 1 minus the mean of Shop 2; the Std. Deviation and Std. Error Mean are those of the differences; there is 95% confidence that the difference falls within the interval shown.

Paired Samples Test - t, df, Sig. (2-tailed) for Pair 1 Shop_1 - Shop_2: t (df = 9), Sig. (2-tailed) = 0.049 < 0.05, so the result is significant! We reject the hypothesis that the difference between the incomes of the two shops = 0! Shop 2 has an income different from Shop 1.

One-Tailed Test

Using the same Paired Samples Test output (t, df, Sig. (2-tailed) for Pair 1 Shop_1 - Shop_2):

Step 1: make sure the +/− sign of t agrees with the hypothesis you want to test, i.e. Shop_1 > Shop_2 or Shop_2 > Shop_1! If it is opposite, there is no need to test any more. If it agrees, go to Step 2!

Step 2: divide the 2-tailed probability by two and see whether it is < 0.05 for rejecting the null hypothesis! Here 0.049/2 = 0.0245 < 0.05, so we reject the null hypothesis and accept the alternative hypothesis that Shop 1 has a lower income than Shop 2.
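A sketch of the paired test outside SPSS/Excel. The ten income pairs below are hypothetical (matching only the df = 9 of the output above), so the numbers will not reproduce the book's results:

```python
# Paired t test with SciPy, on hypothetical income data for the two shops.
import numpy as np
from scipy import stats

shop_1 = np.array([52, 48, 50, 47, 55, 49, 51, 46, 53, 50], dtype=float)  # hypothetical
shop_2 = np.array([54, 50, 53, 49, 56, 52, 51, 48, 57, 52], dtype=float)  # hypothetical

t, p_two_tailed = stats.ttest_rel(shop_1, shop_2)   # tests mean(shop_1 - shop_2) = 0
print(t, p_two_tailed, p_two_tailed / 2)            # halve for a one-tailed decision
```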

Example in 1.2.c solved by Excel

1) Data, Data Analysis
2) Choose t-Test: Paired Two Sample for Means

3) Input the required fields. Besides zero, we can also test the hypothesis that variable 1 differs from variable 2 by a certain value!
4) The results are almost the same as in SPSS:

Output of the Excel Add-In PHStat4:

1.3 Sample Size N and Power for T Test

As stated previously in this example, if we cannot get a t value > 2 and we have accepted the null hypothesis that there is only one population mean (the red curve), then we do not commit the type I error, but we are immediately at risk of committing the type II error if there are, in fact, two populations (two curves) with different means!

(Figure: the area β and the area 1 − β; the traditional target value for 1 − β is 80%. If there is, in fact, another population (the blue curve), then a t value in the shaded range would make us commit the type II error of saying that there is only one curve, the red one!)

We might say that the probability of committing the type II error, i.e. of not detecting a real difference in population means, is the area β!! We could also say that 1 − β is the Power of the test for detecting a real difference between the two groups, i.e. for accepting the alternative hypothesis! If the power is too small, we have too high a chance of missing real population differences in a T Test! So we should know how to find its value, in order to know whether it is high enough or not! The calculation might be run before the experiment (planned, a priori) or only after the experiment (post hoc), depending, for example, on whether the population Std. Dev. is known, etc.!

Calculation of Sample Size and Power

The calculation of sample size and power could, by itself, fill a whole school term's course! But, for the simplest random sampling from a normal population, we can work through the following example.

1.3.a Using a hand calculator

When the confidence coefficient = (1 − α) = 95%, the half-width of the interval (the difference we want to be able to detect) = δ = 50, and the standard deviation obtained is σ = 150, the required sample size = [(1.96 × 150)/50]² = 34.6, rounded up to 35.

1.3.b Sample Size N in estimating a population average (µ), by PHStat4

In Excel, you must start the program by clicking its shortcut outside Excel!
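The same hand-calculator formula, n = (z·σ/δ)², written out as a small Python sketch:

```python
# The hand-calculator sample-size formula above, in Python.
import math
from scipy import stats

confidence = 0.95
sigma = 150.0        # standard deviation
delta = 50.0         # half-width of the interval we want to be able to detect
z = stats.norm.ppf(1 - (1 - confidence) / 2)        # ~1.96
n = math.ceil((z * sigma / delta) ** 2)             # (1.96*150/50)^2 = 34.6 -> 35
print(z, n)
```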


Annotations for the PHStat4 dialog: the Population Standard Deviation is the known σ from previous or other studies; if the population Std. Dev. is not available, just use (max. possible value − min. possible value)/4. The Sampling Error is the difference you expect between the population value and the sample value, and the Confidence Level is (1 − α).

Remarks:

1. The maximum possible value of the Std. Dev. occurs when the data points are split equally between the minimum and maximum possible values of the data set, e.g. 1 and 50 as the min. and max. possible values of a test score: with 2 data points, Std. Dev.(1, 50) ≈ 34.6; with 3 data points, Std. Dev.(1, 1, 50) ≈ 28.3 (or Std. Dev.(1, 50, 50) ≈ 28.3); with 4 data points, Std. Dev.(1, 1, 50, 50) ≈ 28.3; with 8 data points, Std. Dev.(1, 1, 1, 1, 50, 50, 50, 50) ≈ 26.2. The minimum possible value of the Std. Dev. is usually 0.

2. Please notice that this sample size of 35 only provides 95% confidence that the mean of the sample obtained will be within the mean of the population ± E, where E is the sampling error you can accept. This is the same result as using the hand calculator!

For the calculation of sample size when there might be a whole different population existing, and for the calculation of power, which is not needed here, please refer to the following sections using the G Power software, which is totally free!

1.3.c Calculation of Sample Size and Power using G Power

(For installation of G Power, please refer to the Appendix: Installation of G Power.)

In the examples above we haven't mentioned the term Power, although, in fact, the narrower the interval the test can detect, the higher its power! In any case, it is common to need to calculate the Power, the probability of detecting a real difference without committing the type II error (without wrongly accepting the null hypothesis)! We would like to introduce the excellent, free G Power software for the calculation of sample size and power!

1.3.c.i Calculation of Sample Size using the free software G Power

Suppose we want to run a test to see whether a high-salt-diet sample can be detected as different from the general population, with a blood pressure 5 mmHg higher, and with a power of 0.8! The Std. Dev. is known to be 10 mmHg. [This value might be found by checking previous studies or by taking some quick samples for measurement. If neither is possible, just take the range (maximum possible value − minimum possible value) and divide it by 4.]

1) Under Statistical test, choose Means: Difference from constant (one sample case)
2) Under Type of power analysis, choose A priori: Compute required sample size

3) Click Determine to see the small window on the right-hand side
4) Input the information and click Calculate
5) Input the calculated Effect size d (0.5 this time) into the main window

6) Input a Power (1 − β) of 0.8, unless otherwise specified.
7) Results: N = 27

8) Select Two (tails) for a two-tailed test
9) Results: N = 34

10) There are other similar tests, e.g. the two-samples test:
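For readers who prefer code to G Power, the a-priori sample-size search for the one-sample t test can be sketched with the noncentral t-distribution (this is my own sketch of the standard calculation, not G Power's code; the lower rejection tail is ignored in the two-tailed case, which is negligible here):

```python
# A-priori sample-size search for the one-sample t test, via the noncentral t.
import math
from scipy import stats

def required_n(d, alpha=0.05, power=0.80, two_tailed=False):
    for n in range(2, 1000):
        df = n - 1
        q = 1 - alpha / 2 if two_tailed else 1 - alpha
        t_crit = stats.t.ppf(q, df)
        achieved = stats.nct.sf(t_crit, df, d * math.sqrt(n))  # upper tail of noncentral t
        if achieved >= power:
            return n, achieved
    return None

print(required_n(0.5, two_tailed=False))   # should reproduce N = 27
print(required_n(0.5, two_tailed=True))    # should reproduce N = 34
```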

1.3.c.ii Calculation of Power with the free software G Power

Suppose that, in the previous blood pressure study, we instead already have 27 subjects measured, the sample mean is found to be 5 mmHg higher than the population mean, and the Std. Dev. is 10 mmHg as before; the power should then come out as before:

1) Under Test family choose t tests, and under Statistical test choose Means: Difference from constant (one sample case)
2) Under Type of power analysis, choose Post hoc: Compute achieved power - given α, sample size and effect size

3) Click Determine to see the small window on the right-hand side
4) Input the information and click Calculate
5) Input the calculated Effect size d (0.5 this time) into the main window

6) Input the Sample Size, 27
7) Result: Power (1 − β) is approximately 0.80, consistent with the sample size calculation above

8) Select Two (tails) for a two-tailed test
9) Result: the Power (1 − β) is lower than in the one-tailed case

10) There are other similar tests, e.g. the two-samples test:
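The corresponding post-hoc power calculation for N = 27 and d = 0.5, again as a noncentral-t sketch rather than G Power itself (the lower rejection tail is ignored in the two-tailed case):

```python
# Post-hoc power for N = 27, d = 0.5, alpha = 0.05, via the noncentral t.
import math
from scipy import stats

d, n, alpha = 0.5, 27, 0.05
df, nc = n - 1, d * math.sqrt(n)

power_one_tailed = stats.nct.sf(stats.t.ppf(1 - alpha, df), df, nc)
power_two_tailed = stats.nct.sf(stats.t.ppf(1 - alpha / 2, df), df, nc)
print(power_one_tailed, power_two_tailed)   # the two-tailed power is the lower of the two
```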

1.4 Testing Hypothesis for Proportions (instead of Means)

1.4.a Approximation of a binomial distribution by a normal distribution

As stated previously, the normal distribution is so important not only because so many natural phenomena follow it, but also because it can be used to solve many other problems by being superimposed on other distributions! Let's look at the following binomial distributions. Before any medicine is available for a disease and patients just recover by bed rest, the chance of recovery is 0.4 (and of failing to recover, 0.6):

As you can see for these binomial distributions, as N increases, the histogram looks more and more like a normal distribution! In fact, the closer p is to 0.5, the smaller N needs to be for a normal distribution to superimpose well on the histogram of the binomial distribution! Generally, we can carry out the approximation when Np and Nq are both greater than 5! Put the other way round, N = 5/p (rounded up) or N = 5/q (rounded up), whichever of p and q is smaller, gives the N required! For example, for p = 0.4 and q = 0.6, we have 5/0.4 = 12.5, rounded up to N = 13, for carrying out the approximation!

The figure below shows a normal (z) distribution. Suppose z = ±1.14; the area beyond each of these values is 12.71%. The area can easily be found in a Normal Distribution Table. The critical value for 0.05 (2-tailed test) is 1.96, and 1.14 < 1.96, so this is not significant!

If the superimposition of the normal distribution on the binomial one is acceptable, the latter can be treated as a normal distribution with mean = Np (= 20 × 0.4 = 8 in this case) and Std. Dev. = √(Npq) (= √(20 × 0.4 × 0.6) = 2.19 in this case)! The x-axis above has two scales: one is the k-value, the actual number of patients recovered, and the other is the z-value in standard deviations of the standard normal distribution curve. (The equation is adjusted by +0.5 if k < µ, and by −0.5 if k > µ; this is the continuity correction for approximating a discrete distribution by a continuous one.)

z = [(k − µ) + 0.5] / σ (if k < µ) and z = [(k − µ) − 0.5] / σ (if k > µ)

Therefore z = [(5 − 8) + 0.5] / 2.19 = −1.14 and z = [(11 − 8) − 0.5] / 2.19 = 1.14.

This means that having 5 or fewer patients recover (or having 11 or more patients recover) would each happen with a chance of about 12.7% under a binomial population with p = 0.4 of recovery with bed rest only!

Suppose now that a medicine under testing is applied to 1,000 patients and 430 of them finally recover; we would like to decide whether p > 0.4, the recovery rate with bed rest only! Without the normal approximation, we can imagine the difficulty of solving such a binomial question, involving factorials such as 1000! and summing hundreds of terms!! But now we can easily solve the problem by superimposing the normal distribution:

µ = 1000 × 0.4 = 400
σ = √(1000 × 0.4 × 0.6) = 15.49
z = [(430 − 400) − 0.5] / 15.49 ≈ 1.90

From the table, the chance of getting a z value of 1.90 or more is about 0.029! This means that the researchers would have about 97% confidence that the medicine increases the probability of recovery above the original rate of 0.4! They should go ahead and consider a large investment for its mass production!
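The same normal-approximation calculation (and, for comparison, the exact binomial tail) as a Python sketch:

```python
# Normal approximation with continuity correction for 430 recoveries out of 1,000.
import math
from scipy import stats

n, k, p0 = 1000, 430, 0.4
mu = n * p0                                  # 400
sigma = math.sqrt(n * p0 * (1 - p0))         # ~15.49
z = ((k - mu) - 0.5) / sigma                 # ~1.90 (k > mu, so subtract 0.5)
p_upper_tail = stats.norm.sf(z)              # ~0.029; about 97% confidence that p > 0.4
print(z, p_upper_tail)

# For comparison, the exact binomial tail probability P(X >= 430):
print(stats.binom.sf(k - 1, n, p0))
```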

1.4.b Using PHStat4 to solve the problem above

In fact, the problem above of superimposing a normal distribution on a binomial distribution amounts to running a z test for a proportion instead of a mean! It can be solved with PHStat as below:

1) Click the PHStat icon on the Desktop to enter Excel, and click Enable Macros
2) Click Add-Ins, PHStat

3) One-Sample Tests, Z Test for Proportion
4) You would then see:

5) The dialog inputs are annotated as follows: the probability (proportion) of patients recovering with bed rest only (0.4); the confidence level for rejecting the null hypothesis; the number of patients recovering after taking the medicine (430); and the total number of patients taking the medicine (1,000). The test is set to the upper tail only, since we are just testing whether p is greater than 0.4, without considering the negative tail.

6) The results are very close to the hand calculation: the p-value is very close to the value calculated by hand previously; the small difference seems to be due to the (−0.5) continuity correction and rounding.

1.4.c Using PHStat4 to solve problems with two proportions from binomial populations

We can also use PHStat4 to compare two proportions from 2 binomial populations. For example, suppose 25 out of 300 patients treated in one hospital passed away due to a certain disease, and 34 out of 350 patients treated for the same disease in another hospital passed away in the same year. Can we say that the proportions of patients passing away in that year are different? The two binomial populations (with proportions p1 and p2) are compared using the normal approximation:

Null hypothesis: (p1 − p2) = 0
Alternative hypothesis: (p1 − p2) ≠ 0
Test statistic: z = (sample p1 − sample p2) / σ(sample p1 − sample p2)
Rejection region (α = 0.05): |z| >= z_{α/2} = 1.96

1) Click the PHStat icon on the Desktop, click Enable Macros

2) Add-Ins, PHStat
3) PHStat, Two-Sample Tests (Summarized Data), Z Test for Differences in Two Proportions

4) You would then see:
5) Dialog inputs, as annotated: the hypothesized difference we want to detect (0); the level of significance for the test; the number of patients who passed away and the number treated in Hospital 1 (25 and 300); the number who passed away and the number treated in Hospital 2 (34 and 350); and a Two-Tail Test, testing both tails (either greater than or smaller than).

6) Results: p ≈ 0.54 > 0.05, so we cannot reject the null hypothesis that the two proportions of patients passing away from the same disease are equal - we cannot say they are different! Even for one tail, p ≈ 0.54/2 = 0.27, still not significant!
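The same two-proportion z test, computed with the pooled standard error as a Python sketch; it should give a z of about −0.61 and a two-tailed p of about 0.54, close to the PHStat4 output:

```python
# Two-proportion z test (25/300 vs 34/350) with the pooled standard error.
import math
from scipy import stats

x1, n1 = 25, 300
x2, n2 = 34, 350
p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se                             # ~ -0.61
p_two_tailed = 2 * stats.norm.sf(abs(z))       # ~0.54, not significant
print(z, p_two_tailed, p_two_tailed / 2)       # one-tailed ~0.27, still not significant
```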

1.5 Non-parametric tests corresponding to various t tests

So far in this book we have been testing means, standard deviations, proportions, etc., which are parameters of a population! For such tests, the population being tested must fulfil certain assumptions, e.g. being normally distributed, having equal variances, etc.! However, we often do not know whether these assumptions are fulfilled! We might not even know what distribution the population follows at all! In these situations, we should run the non-parametric tests corresponding to the various t-tests. They are:

- Wilcoxon signed rank test, corresponding to the one-sample t test
- Mann-Whitney test, or Wilcoxon rank sum test, corresponding to the two-samples t test (two independent samples; the 2 tests give identical results)
- Wilcoxon matched-pairs signed-rank test, corresponding to the paired t test (dependent samples; sometimes also called the Wilcoxon signed rank test in some books and software - be careful!)

(* Before giving up the parametric t test to run non-parametric tests, please also consider the robustness of the t test, which can still give the correct decision even when there are small to moderate violations of the assumptions of the parametric test, especially in equal-variance and equal-sample-size cases!)

1.5.a Wilcoxon signed rank test (corresponding to the one-sample t test)

The only assumptions for running the Wilcoxon signed rank test are:
1) The population is continuous
2) The population has a median
3) The population is symmetric

Running the Wilcoxon signed rank test in SPSS: for example, we have the following data set of 18 observations:

9, 11, 18, 16, 17, 21, 12, 10, 11, 11, 19, 16, 12, 13, 20, 14, 15, 13

We want to test the hypothesis:
H0: Median = 16
Ha: Median ≠ 16

1) Data in SPSS:

2) Analyze, Nonparametric Tests, One Sample
3) Click Assign Manually:

4) Move Data to Continuous:
5) Click OK

6) You would go back to the previous window; select the test again:
7) Select Automatically compare..., click Settings

8) Select Choose Tests, Customize tests, Compare median... (Wilcoxon signed-rank test), and input 16:
9) Choose Test Options; the Significance level and Confidence interval can be left at their defaults if there is no need to change them:

10) Results: p = 0.066 > 0.05, so we can't reject the hypothesis that the population median is equal to 16. But 0.066/2 = 0.033 < 0.05, significant for a 1-tailed test!

Calculation by hand, if SPSS is not available:

i) Subtracting 16 from each observation, we get −7, −5, 2, 0, 1, 5, −4, −6, −5, −5, 3, 0, −4, −3, 4, −2, −1, −3.
ii) Discarding the zeros and ranking the others in order of increasing absolute magnitude, we have 1, −1, 2, −2, 3, −3, −3, −4, −4, 4, −5, 5, −5, −5, −6, −7.
iii) The 1's occupy ranks 1 and 2; the mean (average) of these ranks is 1.5, and each 1 is given a rank of 1.5.
iv) The 2's occupy ranks 3 and 4; the mean of these ranks is 3.5, and each 2 is given a rank of 3.5.
v) In a similar manner, each 3 receives a rank of 6, each 4 a rank of 9, each 5 a rank of 12.5; the −6 is assigned a rank of 15, and the −7 a rank of 16.
vi) The sequence of the signed ranks is now 1.5, −1.5, −3.5, 3.5, 6, −6, −6, −9, −9, 9, −12.5, 12.5, −12.5, −12.5, −15, −16 (the minus signs indicate negative ranks).
vii) The positive rank sum = 32.5 and the negative rank sum = 103.5. The smaller rank sum is taken as T = 32.5.
viii) In the table for the Wilcoxon signed-rank test, in the column headed by α = 0.05 (two-tailed) with n = number of ranks = 16 (18 − 2), the critical value is 29. T = 32.5 is not less than or equal to 29, so we have to accept the null hypothesis that the median is 16! For a one-tailed test at α = 0.05, the critical value is 35, and T = 32.5 <= 35, so we can reject the null hypothesis!
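Finally, the same Wilcoxon signed-rank test as a SciPy sketch, using the data set listed in 1.5.a. SciPy discards zero differences by default, as in the hand calculation; its p-value may differ slightly from SPSS, which uses a normal approximation with corrections for ties and zeros:

```python
# Wilcoxon signed-rank test of median = 16 on the data set from section 1.5.a.
import numpy as np
from scipy import stats

data = np.array([9, 11, 18, 16, 17, 21, 12, 10, 11, 11, 19, 16, 12, 13, 20, 14, 15, 13],
                dtype=float)
hypothesized_median = 16

stat, p_two_tailed = stats.wilcoxon(data - hypothesized_median)
print(stat, p_two_tailed, p_two_tailed / 2)     # halve for the one-tailed decision
```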


More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

More information

Two Related Samples t Test

Two Related Samples t Test Two Related Samples t Test In this example 1 students saw five pictures of attractive people and five pictures of unattractive people. For each picture, the students rated the friendliness of the person

More information

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices: Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Standard Deviation Estimator

Standard Deviation Estimator CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217 Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

More information

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the

More information

Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 14 NONPARAMETRIC TESTS CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences

More information

Non-Inferiority Tests for Two Means using Differences

Non-Inferiority Tests for Two Means using Differences Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

More information

Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions

Paired T-Test. Chapter 208. Introduction. Technical Details. Research Questions Chapter 208 Introduction This procedure provides several reports for making inference about the difference between two population means based on a paired sample. These reports include confidence intervals

More information

SPSS Explore procedure

SPSS Explore procedure SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Independent samples t-test. Dr. Tom Pierce Radford University

Independent samples t-test. Dr. Tom Pierce Radford University Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of

More information

Difference tests (2): nonparametric

Difference tests (2): nonparametric NST 1B Experimental Psychology Statistics practical 3 Difference tests (): nonparametric Rudolf Cardinal & Mike Aitken 10 / 11 February 005; Department of Experimental Psychology University of Cambridge

More information

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1 Hypothesis testing So far, we ve talked about inference from the point of estimation. We ve tried to answer questions like What is a good estimate for a typical value? or How much variability is there

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

NCSS Statistical Software. One-Sample T-Test

NCSS Statistical Software. One-Sample T-Test Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Describing Populations Statistically: The Mean, Variance, and Standard Deviation

Describing Populations Statistically: The Mean, Variance, and Standard Deviation Describing Populations Statistically: The Mean, Variance, and Standard Deviation BIOLOGICAL VARIATION One aspect of biology that holds true for almost all species is that not every individual is exactly

More information

Two-sample inference: Continuous data

Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

1 Nonparametric Statistics

1 Nonparametric Statistics 1 Nonparametric Statistics When finding confidence intervals or conducting tests so far, we always described the population with a model, which includes a set of parameters. Then we could make decisions

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Skewed Data and Non-parametric Methods

Skewed Data and Non-parametric Methods 0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted

More information

Two Correlated Proportions (McNemar Test)

Two Correlated Proportions (McNemar Test) Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

More information

STAT 350 Practice Final Exam Solution (Spring 2015)

STAT 350 Practice Final Exam Solution (Spring 2015) PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

More information

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

More information

SPSS/Excel Workshop 3 Summer Semester, 2010

SPSS/Excel Workshop 3 Summer Semester, 2010 SPSS/Excel Workshop 3 Summer Semester, 2010 In Assignment 3 of STATS 10x you may want to use Excel to perform some calculations in Questions 1 and 2 such as: finding P-values finding t-multipliers and/or

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

THE KRUSKAL WALLLIS TEST

THE KRUSKAL WALLLIS TEST THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON

More information

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

More information

Principles of Hypothesis Testing for Public Health

Principles of Hypothesis Testing for Public Health Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine johnslau@mail.nih.gov Fall 2011 Answers to Questions

More information

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem) NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

More information

Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Chapter 7 Section 1 Homework Set A

Chapter 7 Section 1 Homework Set A Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the

More information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

More information

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so: Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Analysis of Variance ANOVA

Analysis of Variance ANOVA Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

More information

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other 1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. STT315 Practice Ch 5-7 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The length of time a traffic signal stays green (nicknamed

More information

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

More information

Analyzing Data with GraphPad Prism

Analyzing Data with GraphPad Prism 1999 GraphPad Software, Inc. All rights reserved. All Rights Reserved. GraphPad Prism, Prism and InStat are registered trademarks of GraphPad Software, Inc. GraphPad is a trademark of GraphPad Software,

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015 Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences. 1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

More information

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

More information