Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden), SS 2014. Analysis of Experiments: Introduction


1 Data Analysis
Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden)
Prof. Dr. Dr. h.c. Dieter Rombach, Dr. Andreas Jedlitschka
SS 2014
Analysis of Experiments: Introduction
Some parts of this lecture are adopted with permission from lectures given by Sira Vegas and Oscar Dieste at UPM.
2 Outline
Descriptive statistics
Statistical analysis
  Parametric tests
    Student's t-test
    Paired t-test
    One-way ANOVA
  Nonparametric tests
    Mann-Whitney
    Wilcoxon
    Sign test
Jedlitschka, Vegas, Dieste 2014 Slide 2
3 DESCRIPTIVE STATISTICS
4 Important notice
In inferential statistics, population parameters are clearly distinguished from estimators (parameters calculated from samples).
Population parameters are designated by Greek letters: μ, σ², σ
Estimators are designated by Latin letters: m, s², s
In most cases, symbols carry a subscript denoting the associated sample (usually a treatment): μ_a, s_b
5 Important notice
The notation matters because some estimators are calculated differently from the corresponding population parameters. The variance is the concrete case:
Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - m)^2$ (the population variance σ² divides by N instead and uses μ)
This also affects the standard deviation, as it is the square root of the variance.
(n − 1) is the number of degrees of freedom of the sample. This will be important soon.
6 Measures of central tendency
Dataset: {1, 2, 2, 2, 3, 14}
Arithmetic mean: (1 + 2 + 2 + 2 + 3 + 14) / 6 = 4
Median: the middle value of the ordered values: 2
Mode: the value that appears most often: 2
The measures differ in their response to outliers.
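The three measures can be checked with Python's standard `statistics` module. This is only a small sketch on the slide's dataset; the `trimmed` list (the same data without the outlier 14) is an addition made here to show how differently the measures react.

```python
import statistics

data = [1, 2, 2, 2, 3, 14]

mean = statistics.mean(data)      # 24 / 6
median = statistics.median(data)  # average of the two middle values (2 and 2)
mode = statistics.mode(data)      # most frequent value

# Illustration only: drop the outlier and recompute.
trimmed = [x for x in data if x != 14]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)

print("with outlier:   ", mean, median, mode)
print("without outlier:", trimmed_mean, trimmed_median, statistics.mode(trimmed))
```

Removing the single value 14 halves the mean, while the median and the mode stay at 2.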
7 Mean, Median, Mode
8 Dispersion (1/2)
Dataset: {1, 2, 2, 2, 3, 14}
Range {min, max}: {1, 14}
Standard deviation (SD)
  σ if the data is the whole population (divide by N, deviations from μ)
  s if the data is a sample (divide by N − 1, deviations from m)
  Informs about the variation from the average
  Is the square root of the variance: σ = 4.51 (s = 4.94)
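A short sketch of the two standard deviations for the same dataset, using the standard `statistics` module: `pstdev` divides by N (population), `stdev` divides by N − 1 (sample).

```python
import statistics

data = [1, 2, 2, 2, 3, 14]

value_range = (min(data), max(data))

sigma = statistics.pstdev(data)  # population SD: sqrt(sum((x - mean)^2) / N)
s = statistics.stdev(data)       # sample SD:     sqrt(sum((x - mean)^2) / (N - 1))

print("range:", value_range)
print("population SD:", round(sigma, 2))
print("sample SD:    ", round(s, 2))
```

The value 4.51 quoted on the slide is the population form; treating the six values as a sample gives the slightly larger 4.94.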
9 Dispersion (2/2)
Interquartile Range
10 Shape
Variance σ²: the average of the squared differences from the mean
Skewness
Kurtosis
11 Dependency
Linear regression
Correlation coefficient (Pearson): requires interval or ratio scale and normal distribution
More than two variables: multivariate analysis, e.g. principal component analysis
12 Motivation
STATISTICAL ANALYSIS
13 A simple experiment
Experiments don't have to be complicated. They can be as simple as comparing a technology to something else: 1 factor.
14 Distribution and Probability
Find out whether this is a fair die! What could be the idea?
15 Solution Approach
Either you have a trustworthy expectation,
or:
  Pick one of the dice at random
  Throw it one hundred times
  Note down each single outcome
  Derive the distribution
Now take the die in question and check whether it fulfils the expectation.
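The approach can be sketched as a small simulation. Everything below is illustrative: the function names, the fixed seed, and in particular the weights of the hypothetical loaded die are assumptions and do not come from the lecture.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so that a run is repeatable


def empirical_distribution(throw, n_throws=100):
    """Throw a die n_throws times and return the relative frequency of each face."""
    counts = Counter(throw() for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}


def fair_die():
    return rng.randint(1, 6)


def loaded_die():
    # Hypothetical loaded die: a six is three times as likely as any other face.
    return rng.choices(range(1, 7), weights=[1, 1, 1, 1, 1, 3])[0]


reference = empirical_distribution(fair_die)    # the "expectation"
candidate = empirical_distribution(loaded_die)  # the die under suspicion

for face in range(1, 7):
    print(face, reference[face], candidate[face])
```

Even the fair die will not show exactly 1/6 per face after one hundred throws, which is the point the following slides build on: observed differences have to be judged against what chance alone produces.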
16 A simple experiment
Experiments don't have to be complicated. They can be as simple as comparing a pair of techniques: 1 factor with 2 levels.
In cases like these, we don't need expensive tools (SPSS, STATA, etc.) to analyze the experimental results.
A scholar wants to know if technique A (say, functional testing) is better than B (say, inspection).
He performs an experiment with some students and gets the following data (metric: a higher value means "better"):
Technique: A A B B A B B B A A B
Measure: (the values are listed per technique on slide 17)
17 Question
How can we decide which technique (A, B) is better?
SPSS
The most obvious option is looking at the data:
  Descriptive statistics: median, means, quartiles, variances, standard deviation
  Suitable plots: box plots

             A        B
            29.9     26.6
            11.4     23.7
            25.3     28.5
            16.5     14.2
            21.1     17.9
                     24.3
N            5        6
mean        20.84    22.53
variance    52.50    29.51
std. dev.    7.25     5.43
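The summary rows of the table can be reproduced without any statistics package. A minimal sketch with Python's standard library, using the sample (N − 1) forms of variance and standard deviation:

```python
from statistics import mean, stdev, variance

# Measurements per technique, as listed on the slide.
a = [29.9, 11.4, 25.3, 16.5, 21.1]
b = [26.6, 23.7, 28.5, 14.2, 17.9, 24.3]

summary = {}
for name, values in (("A", a), ("B", b)):
    summary[name] = {
        "N": len(values),
        "mean": round(mean(values), 2),
        "variance": round(variance(values), 2),  # sample variance (N - 1)
        "std_dev": round(stdev(values), 2),      # sample standard deviation
    }
    print(name, summary[name])
```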
18 Box plot
[Figure: two box plots, each annotated with min, Q1, Q3, max]
19 Preliminary answer
B looks better, but the results are quite similar. We cannot be sure!
It is likely that the differences arise due to random chance.
  Don't believe it? Remember what we found out with the dice. Or think about tossing a coin four times (What do you expect? What do you get?).
As we can see from this example, many processes have an associated probability distribution.
How can we make a decision in this case?
20 Key question
Idea: if we knew the probability distribution, we could calculate the probability that B > A
Formally speaking: μ_b > μ_a
Problem: what happens if we do not know the probability distribution?
21 Reference distribution
Fisher claims that it is possible to relate the experimental results to a reference distribution that is built from the same experimental data.
Using this reference distribution, we can estimate how likely a given result is under the assumption that A and B do not differ (that is, supposing that μ_b = μ_a).
Does the difference between the two groups represent a real difference, or was it due to chance?
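One way to read this idea is as a randomization (permutation) test: if A and B do not differ, the labels are interchangeable, so every relabeling of the eleven measurements is equally likely. The sketch below builds that reference distribution exhaustively for the data of slide 17. It is an illustration under that reading, not necessarily the construction used in the lecture.

```python
from itertools import combinations
from statistics import mean

a = [29.9, 11.4, 25.3, 16.5, 21.1]
b = [26.6, 23.7, 28.5, 14.2, 17.9, 24.3]

observed = mean(b) - mean(a)
pooled = a + b

# Reference distribution: difference of means for every way of choosing
# which 5 of the 11 measurements carry the label "A".
diffs = []
for idx in combinations(range(len(pooled)), len(a)):
    chosen = set(idx)
    group_a = [pooled[i] for i in chosen]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diffs.append(mean(group_b) - mean(group_a))

# One-sided p-value: share of relabelings whose difference is at least as
# large as the observed one (small tolerance for floating-point noise).
p_value = sum(d >= observed - 1e-12 for d in diffs) / len(diffs)

print("relabelings:", len(diffs))
print("observed difference:", round(observed, 2))
print("p-value:", round(p_value, 3))
```

With only 462 relabelings the enumeration is instant on a computer; done by hand it is the "lot of effort" that motivates tabulated standard distributions.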
22 Standard distributions
Building the reference distribution, even for a small example, requires a lot of effort.
Under some assumptions, reference distributions are close to known probability distributions, such as the normal (Gauss) distribution or, in our particular case, Student's t.
  The t distribution is used instead of the normal distribution when the sample sizes involved are small.
The good thing is that standard distributions are tabulated. Significance levels can be obtained immediately from the tables.
23 Use the standard distribution
Calculate the actual difference between the means, say d = (m_b − m_a)
Locate d in the histogram
Calculate the area of the histogram that falls to the right of d
  That area is the probability of obtaining, purely by chance, a difference between means of (m_b − m_a) or higher
  We call it the p-value
If the p-value is below a cutoff value α (the significance level), we can affirm that techniques A and B are not alike
  α is arbitrarily set at 0.05
  We then say that we have obtained a significant result
24 Back to the Example
[Figure: observed difference]
Null hypothesis is not rejected.
25 Parametric Test / Independent Sample
T-TEST
26 T-Test
One-factor experiments with one level
One-sample t-test
  Compares the mean response of a group against a specific value
  The formula shows the general concept used by the following tests:
  $t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
  $\bar{x}$ = mean (of groups 1 and 2)
  μ_0 = specified value (e.g., population mean)
  n = number of subjects in groups (1 and 2) (equal!!!)
  s = standard deviation of group (1 and 2)
  df = n − 1
  Look up t in Student's t-distribution table to obtain the p-value.
27 T-Test
One-factor experiments with two levels
Two-sample t-test
  Checks the statistical significance of the difference between the mean responses of two levels of a factor
  Checks the null hypothesis that the samples belong to two subpopulations with the same mean: H0: μ_1 = μ_2
  Prerequisites:
    the two sample sizes (that is, the number n of participants in each group) are equal;
    it can be assumed that the two distributions have the same variance.
  $t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s_1^2 + s_2^2) / n}}$
  $\bar{x}_1, \bar{x}_2$ = means (of groups 1 and 2)
  n = number of subjects in groups (1 and 2) (equal!!!)
  s_1, s_2 = standard deviations of groups (1 and 2)
  s² = unbiased estimators of the variances
  df = 2n − 2
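28 T-Test
One-factor experiments with two levels: special cases
Unequal sample sizes, equal variance:
  $t_0 = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}}$ with $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
  df = n_1 + n_2 − 2
Equal or unequal sample sizes, unequal variances (also known as Welch's t-test):
  $t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
  $df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$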
29 T-Test
Program defect density per project:
   A       B
  3.42    3.44
  2.71    4.97
  2.84    4.76
  1.85    4.96
  3.22    4.10
  3.48    3.05
  2.68    4.09
  4.30    3.69
  2.49    4.21
  1.54    4.40
          3.49
1. Calculate means
2. Calculate difference of means
3. Use formula (unequal N)
4. Check obtained t value for respective df in t distribution table
5. Reject H0 if |t0| > t_{α/2,df} (two-sided), or if t0 > t_{α,df} (one-sided)
Data taken from Wohlin et al.
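The five steps can be followed in a few lines of Python. This sketch uses the pooled-variance form for unequal sample sizes and equal variances; the critical value 2.093 is the tabulated t for α/2 = 0.025 and df = 19, entered by hand rather than computed.

```python
from math import sqrt
from statistics import mean, variance

# Defect densities per project (data from the slide, after Wohlin et al.).
a = [3.42, 2.71, 2.84, 1.85, 3.22, 3.48, 2.68, 4.30, 2.49, 1.54]
b = [3.44, 4.97, 4.76, 4.96, 4.10, 3.05, 4.09, 3.69, 4.21, 4.40, 3.49]

n1, n2 = len(a), len(b)

# Steps 1 and 2: means and their difference.
diff = mean(a) - mean(b)

# Step 3: pooled variance and t statistic (unequal N, equal variances).
pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
t0 = diff / (sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

# Steps 4 and 5: compare with the tabulated two-sided critical value.
t_crit = 2.093  # t_{0.025, 19} from a t table
reject_h0 = abs(t0) > t_crit

print("t0 =", round(t0, 2), "df =", df, "reject H0:", reject_h0)
```

Here |t0| is well above the critical value, so the null hypothesis of equal mean defect density is rejected at the 5% level.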
30 t-distribution requirements
There are three requirements:
1. Samples must be independent and identically distributed (i.i.d.).
   In practice, this means that the assignment of levels (A's and B's) to experimental units (subjects) has to be performed in a randomized way.
   i.i.d. implies homoscedasticity and non-interaction.
2. Accordingly, the mean estimator should be normally distributed (or close to normality).
3. Response variables are measured on ratio scales. Ordinal metrics cannot be used.
Condition #1 is probably more important than conditions #2 and #3.
31 Nonparametric tests
If condition #2 does not hold (there are several tests to check normality)
or condition #3 does not hold (so that ordinal metrics can be used),
a nonparametric test can be applied.
Condition #1 must still hold.
The Wilcoxon rank-sum or Mann-Whitney test is one of the most popular tests. It is quite easy, but it requires a minimum sample size and has some technical problems (power calculation).
32 Parametric vs. nonparametric
The t-test is an instance of a parametric test.
The main difference between the two types of tests is the assumption about the distribution of the sample: nonparametric tests do not assume a particular distribution.
Nonparametric tests can be applied in situations where parametric tests cannot, but in turn they are more conservative (less power).
33 Nonparametric Test / Independent Sample
MANN-WHITNEY U TEST
34 Mann-Whitney U test
Nonparametric test for independent groups
It has greater efficiency than the t-test on non-normal distributions
Prerequisites:
  The responses are at least ordinal
  The distributions of both groups are equal under the null hypothesis
35 Mann-Whitney U test
Method 1: For small samples a direct method is recommended. It is very quick, and gives an insight into the meaning of the U statistic.
Choose the sample for which the ranks seem to be smaller (the only reason to do this is to make computation easier). Call this "sample 1," and call the other sample "sample 2."
For each observation in sample 1, count the number of observations in sample 2 that have a smaller rank (count a half for any that are equal to it).
The sum of these counts is U.
36 Mann-Whitney U test
Method 2: For larger samples, a formula can be used:
Add up the ranks of the observations that came from sample 1. Where there are tied groups, take the rank to be equal to the midpoint of the group. The sum of ranks in sample 2 is then determined, because the sum of all the ranks equals N(N + 1)/2, where N is the total number of observations.
U is then given by:
$U_i = n_1 n_2 + \frac{n_i (n_i + 1)}{2} - R_i, \quad i = 1, 2$
where n_i = number of observations in sample i and R_i = sum of ranks for the respective group
Reject H0 if min(U_1, U_2) is <= the critical value from the Mann-Whitney table
37 Mann-Whitney U test
Program defect density per project:
   A      Rank      B      Rank
  3.42     9       3.44     10
  2.71     5       4.97     21
  2.84     6       4.76     19
  1.85     2       4.96     20
  3.22     8       4.10     15
  3.48    11       3.05      7
  2.68     4       4.09     14
  4.30    17       3.69     13
  2.49     3       4.21     16
  1.54     1       4.40     18
                   3.49     12
Sum of ranks: 66 (A), 165 (B)
U_1 = 99 (use formula)
U_2 = 11 (use formula)
Check min(U_1, U_2) in the table (n of smaller sample = 10, n of larger sample = 11): 11 <= 26, so reject H0
Data taken from Wohlin et al.
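Both methods can be sketched in plain Python on the defect-density data. The helper `average_ranks` is a name introduced here for illustration; it assigns midpoint ranks to ties as described for Method 2. The critical value 26 is the tabulated value for samples of 10 and 11 and is entered by hand.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the midpoint of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        midrank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = midrank
        pos = end + 1
    return ranks


a = [3.42, 2.71, 2.84, 1.85, 3.22, 3.48, 2.68, 4.30, 2.49, 1.54]
b = [3.44, 4.97, 4.76, 4.96, 4.10, 3.05, 4.09, 3.69, 4.21, 4.40, 3.49]
n1, n2 = len(a), len(b)

# Method 2: rank the pooled data, sum the ranks per group, apply the formula.
ranks = average_ranks(a + b)
r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2

# Method 1: for each observation in A, count the observations in B that are
# smaller (half a count for ties).
u_direct = sum((y < x) + 0.5 * (y == x) for x in a for y in b)

critical_value = 26  # from a Mann-Whitney table for n = 10 and n = 11
reject_h0 = min(u1, u2) <= critical_value

print("R1 =", r1, "R2 =", r2)
print("U1 =", u1, "U2 =", u2, "direct count =", u_direct)
print("reject H0:", reject_h0)
```

The direct count reproduces the smaller of the two U values, which is the one compared against the table.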
38 Parametric Test / Dependent Sample
PAIRED T-TEST
39 Paired T-Test
Parametric test for dependent samples, e.g., repeated measures or matched pairs
The differences between all pairs must be calculated
$t_0 = \frac{\bar{d} - \mu_0}{s_D / \sqrt{n}}$
$\bar{d}$ = mean of the differences between pairs
μ_0 = (optional) specified value (e.g., population mean)
n = number of subjects
s_D = standard deviation of the differences
df = n − 1
40 Example Paired T-Test
1. Calculate differences (P1 − P2)
2. Calculate mean of differences
3. Calculate std. dev. of differences
4. Use formula
5. Check t value for respective df in table
6. Reject H0 if |t0| > t_{α/2,df} (two-sided), or if t0 > t_{α,df} (one-sided)

Programmer    P1        P2        P1 − P2
(ten rows; only fragments are legible: […].1 / 18.[…], […].9 / 16.[…], […].3 / 32.[…])
N             10        10
mean          131.10    127.73    3.37
variance      627.[…]   […].60    748.46
std. dev.     25.04     39.54     27.36
df (N − 1)                        9
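A sketch of the six steps in Python. The per-programmer values are not legible in this transcript, so the two lists below are an assumption: they are the values of the corresponding paired example in Wohlin et al., and they reproduce the summary rows quoted above (means 131.10 and 127.73, mean difference 3.37, standard deviation of the differences 27.36). The critical value 2.262 is the tabulated t for α/2 = 0.025 and df = 9.

```python
from math import sqrt
from statistics import mean, stdev

# ASSUMPTION: raw values taken from the matching example in Wohlin et al.;
# they are consistent with the summary statistics on the slide.
p1 = [105, 137, 124, 111, 151, 150, 168, 159, 104, 102]
p2 = [86.1, 115, 175, 94.9, 174, 120, 153, 178, 71.3, 110]

# Step 1: differences per programmer.
d = [x - y for x, y in zip(p1, p2)]
n = len(d)

# Steps 2 and 3: mean and sample standard deviation of the differences.
mean_d = mean(d)
sd_d = stdev(d)

# Step 4: t statistic against mu_0 = 0 (no difference between P1 and P2).
t0 = (mean_d - 0) / (sd_d / sqrt(n))
df = n - 1

# Steps 5 and 6: two-sided decision against the tabulated critical value.
t_crit = 2.262  # t_{0.025, 9} from a t table
reject_h0 = abs(t0) > t_crit

print("mean d =", round(mean_d, 2), "sd =", round(sd_d, 2))
print("t0 =", round(t0, 2), "df =", df, "reject H0:", reject_h0)
```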
41 T-Test Table
Reject H0 if |t0| > t_{α/2,df} (two-sided)
|t0| = 0.39 < t_{0.025,9} = 2.262 => do not reject H0!!!
42 Table for T-Test
SPSS Outputs
43 Nonparametric Test / Dependent Sample
WILCOXON SIGNED-RANK TEST
44 Wilcoxon
Nonparametric test for dependent samples; alternative to the paired t-test
Prerequisite: it must be possible to determine which value of a pair is larger and to rank the differences
d = P1 − P2; the absolute differences |d| are ranked

Programmer    P1        P2        P1 − P2    |P1 − P2|    Rank
(ten rows; only fragments are legible: […].1 / 18.9, […].9 / 16.1, […].3 / 32.7)
N             10        10
mean          131.10    127.73    3.37
variance      627.[…]   […].60    748.46
std. dev.     25.04     39.54     27.36

Sum of ranks: T− = 23 (negative d), T+ = 32 (positive d)
Check min(T+, T−) in the table: 23 > 8, so do not reject H0
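The rank sums can be recomputed with a short sketch. The list of differences is an assumption: it is taken from the matching example in Wohlin et al., and it agrees with the legible fragments (18.9, 16.1, 32.7) and with the mean difference of 3.37. The helper name `average_ranks` is introduced here for illustration, and the critical value 8 is the tabulated value for n = 10.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the midpoint of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        midrank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = midrank
        pos = end + 1
    return ranks


# ASSUMPTION: differences P1 - P2 from the matching example in Wohlin et al.
d = [18.9, 22, -51, 16.1, -23, 30, 15, -19, 32.7, -8]

# Zero differences carry no sign information and are dropped before ranking.
nonzero = [x for x in d if x != 0]
ranks = average_ranks([abs(x) for x in nonzero])

t_plus = sum(r for x, r in zip(nonzero, ranks) if x > 0)
t_minus = sum(r for x, r in zip(nonzero, ranks) if x < 0)

critical_value = 8  # from a Wilcoxon signed-rank table for n = 10
reject_h0 = min(t_plus, t_minus) <= critical_value

print("T+ =", t_plus, "T- =", t_minus, "reject H0:", reject_h0)
```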
45 Sign Test
Nonparametric test for dependent samples; alternative to the paired t-test
Used if it is not possible to rank the differences, but the data is still at least on an ordinal scale
Based on the signs of the differences
Formula: $p = \frac{1}{2^N} \sum_{i=0}^{n} \binom{N}{i}$, where N is the number of signed differences and n = min(T1, T2)

Programmer    P1        P2        P1 − P2    Sign
(ten rows; only fragments are legible: […].1 / 18.[…], […].9 / 16.[…], […].3 / 32.[…])
N             10        10
mean          131.10    127.73    3.37
variance      627.[…]   […].60    748.46
std. dev.     25.04     39.54     27.36

Count of "+" signs: 6 (positive d); count of "−" signs: 4 (negative d)
n = min(T1, T2) = 4
=> do not reject H0!!!
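A sketch of the sign test as a binomial tail probability. As before, the differences are an assumption taken from the matching example in Wohlin et al.; only their signs matter here. The decision rule (compare p with α/2 for a two-sided test) follows the usual textbook presentation and is stated as an assumption, since the slide does not spell it out.

```python
from math import comb

# ASSUMPTION: differences P1 - P2 from the matching example in Wohlin et al.
d = [18.9, 22, -51, 16.1, -23, 30, 15, -19, 32.7, -8]

n_pos = sum(x > 0 for x in d)
n_neg = sum(x < 0 for x in d)

N = n_pos + n_neg      # number of signed (non-zero) differences
n = min(n_pos, n_neg)  # count of the rarer sign

# Probability of seeing n or fewer of the rarer sign if both signs are
# equally likely (binomial with probability 1/2).
p = sum(comb(N, i) for i in range(n + 1)) / 2 ** N

alpha = 0.05
reject_h0 = p < alpha / 2  # two-sided rule (assumed)

print("+:", n_pos, "-:", n_neg, "p =", round(p, 4), "reject H0:", reject_h0)
```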
46 Parametric Methods / Independent Sample
ONE-FACTOR ANOVA
47 ONE-FACTOR ANOVA
One-factor experiments with more than two levels
Checks the statistical significance of the difference between the mean responses of one factor with several levels
Model: $Y_{ij} = \mu + \alpha_j + e_{ij}$, where μ is estimated by the grand mean $\bar{Y}$ and the effect α_j of level j by $\bar{Y}_j - \bar{Y}$
Steps:
1. Identify the mathematical model
2. Validate the basic model that relates the experimental variables
3. Calculate the factor-induced variation in the response variable
4. Calculate the statistical significance of the factor-induced variation
5. Establish consequences or recommendations on the alternative that provides the best response variable values
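Steps 3 and 4 amount to splitting the total variation into a between-level and a within-level part and forming their ratio F. The sketch below does this in plain Python. The function name `one_way_anova` and all data values are hypothetical: the measurements of the lecture's example are not available here, so the four groups only imitate its shape (four languages, 24 subjects, unequal group sizes).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one per level."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Factor-induced variation: spread of the level means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation: spread of the observations around their level mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# HYPOTHETICAL data: errors detected per program, grouped by language.
errors = {
    "ADA": [60, 62, 59, 63, 61, 61],
    "C": [68, 70, 66, 69, 71, 67, 70],
    "C++": [69, 72, 70, 68, 71],
    "JAVA": [58, 61, 60, 59, 62, 60],
}

f_value, df_b, df_w = one_way_anova(list(errors.values()))
print("F =", round(f_value, 2), "df =", (df_b, df_w))
```

The resulting F is then looked up in an F table for (df_between, df_within) degrees of freedom, in the same way a t value is looked up in the t table.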
48 Example: ANOVA
Factor = programming language; levels = {ADA, C, C++, JAVA}
Response variable = number of errors detected during three months after development ("Quality")
Number of subjects = 24
H0 = There is no effect of the programming language on the quality of the program
[Table: N and mean per programming language (ADA, C, C++, JAVA); grand mean = 64]
49 Example: ANOVA
Results: Descriptives
ADA leads to a quality of 61 ± 1.83
50 Example: ANOVA
Results:
=> do not reject H0: there are no significant differences between the variances of the groups, so the variances can be treated as equal.
There is a statistically significant difference between groups as determined by one-way ANOVA (F = […], p = .021).
What do we know now?
51 Example: ANOVA
Post hoc tests
Scheffé is used because the group sizes N differ; otherwise Tukey is preferred.
There is a statistically significant difference between ADA and C (p = 0.032), ADA and C++ (p = 0.002), JAVA and C (p = 0.009), and JAVA and C++ (p < 0.001).
There is no difference between ADA and JAVA, or between C and C++.
52 Example: ANOVA
Homogeneous Subsets
53 Example: ANOVA
Means Plot
54 Further Analysis
Two-way ANOVA
MANOVA
rANOVA (repeated-measures ANOVA)
Multitude of other tests
55 DECISION TREE
56 References
Wohlin, Runeson, Höst, Ohlsson, Regnell, Wesslén (2012). Experimentation in Software Engineering. Springer.
J. Bortz and N. Döring (2006). Forschungsmethoden und Evaluation für Human- und Sozialwissenschaftler (4. Auflage). Berlin: Springer Verlag.
N. Juristo and A. Moreno (2001). Basics of Software Engineering Experimentation. Kluwer Academic Publishers.
More informationHypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam
Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests
More informationAnalysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
More informationStatistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
More informationDescribe what is meant by a placebo Contrast the doubleblind procedure with the singleblind procedure Review the structure for organizing a memo
Readings: Ha and Ha Textbook  Chapters 1 8 Appendix D & E (online) Plous  Chapters 10, 11, 12 and 14 Chapter 10: The Representativeness Heuristic Chapter 11: The Availability Heuristic Chapter 12: Probability
More informationBiostatistics Lab Notes
Biostatistics Lab Notes Page 1 Lab 1: Measurement and Sampling Biostatistics Lab Notes Because we used a chance mechanism to select our sample, each sample will differ. My data set (GerstmanB.sav), looks
More informationWeek 7 Lecture: Twoway Analysis of Variance (Chapter 12) Twoway ANOVA with Equal Replication (see Zar s section 12.1)
Week 7 Lecture: Twoway Analysis of Variance (Chapter ) We can extend the idea of a oneway ANOVA, which tests the effects of one factor on a response variable, to a twoway ANOVA which tests the effects
More informationOne Way ANOVA. A method for comparing several means along a single variable
Analysis of Variance (ANOVA) One Way ANOVA A method for comparing several means along a single variable It is the same as an independent samples t test, test but for 3 or more samples Called one way when
More informationContents 1. Contents
Contents 1 Contents 3 Ksample Methods 2 3.1 Setup............................ 2 3.2 Classic Method Based on Normality Assumption..... 3 3.3 Permutation F test.................... 5 3.4 KruskalWallis
More informationOneWay Analysis of Variance
Spring, 000   Administrative Items OneWay Analysis of Variance Midterm Grades. Makeup exams, in general. Getting help See me today :0 or Wednesday from :0. Send an email to stine@wharton. Visit
More informationUnit 21 Student s t Distribution in Hypotheses Testing
Unit 21 Student s t Distribution in Hypotheses Testing Objectives: To understand the difference between the standard normal distribution and the Student's t distributions To understand the difference between
More informationSimple Linear Regression Chapter 11
Simple Linear Regression Chapter 11 Rationale Frequently decisionmaking situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related
More information4. Introduction to Statistics
Statistics for Engineers 41 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation
More informationStatistics for Management IISTAT 362Final Review
Statistics for Management IISTAT 362Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to
More informationSuggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009
Petter Mostad Matematisk Statistik Chalmers Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009 1. (a) To use a ttest, one must assume that both groups of
More informationHypothesis Testing. Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University
Hypothesis Testing Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 AMU / BonTech, LLC, JourniTech Corporation Copyright 2015 Learning Objectives Upon successful
More informationCREIGHTON UNIVERSITY GRADUATE COLLEGE Fall Semester 2014. Biostatistics & Analysis of Clinical Data for Evidencebased Practice
CREIGHTON UNIVERSITY GRADUATE COLLEGE Fall Semester 2014 Course Number: Course Title: Credit Allocation: Placement: CTS 601 Biostatistics & Analysis of Clinical Data for Evidencebased Practice 3 semester
More informationNonParametric TwoSample Analysis: The MannWhitney U Test
NonParametric TwoSample Analysis: The MannWhitney U Test When samples do not meet the assumption of normality parametric tests should not be used. To overcome this problem, nonparametric tests can
More informationANOVA Analysis of Variance
ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,
More informationData analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
More informationChapter 11: Linear Regression  Inference in Regression Analysis  Part 2
Chapter 11: Linear Regression  Inference in Regression Analysis  Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.
More informationSPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011
SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011 Statistical techniques to be covered Explore relationships among variables Correlation Regression/Multiple regression Logistic regression Factor analysis
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationBusiness Statistics. Lecture 8: More Hypothesis Testing
Business Statistics Lecture 8: More Hypothesis Testing 1 Goals for this Lecture Review of ttests Additional hypothesis tests Twosample tests Paired tests 2 The Basic Idea of Hypothesis Testing Start
More informationCHAPTER 14 NONPARAMETRIC TESTS
CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences
More informationTwoSample TTests Allowing Unequal Variance (Enter Difference)
Chapter 45 TwoSample TTests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one or twosided twosample ttests when no assumption
More informationCourse Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics
Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGrawHill/Irwin, 2010, ISBN: 9780077384470 [This
More informationChapter 3: Nonparametric Tests
B. Weaver (15Feb00) Nonparametric Tests... 1 Chapter 3: Nonparametric Tests 3.1 Introduction Nonparametric, or distribution free tests are socalled because the assumptions underlying their use are fewer
More informationFactorial Analysis of Variance
Chapter 560 Factorial Analysis of Variance Introduction A common task in research is to compare the average response across levels of one or more factor variables. Examples of factor variables are income
More informationHypothesis Testing & Data Analysis. Statistics. Descriptive Statistics. What is the difference between descriptive and inferential statistics?
2 Hypothesis Testing & Data Analysis 5 What is the difference between descriptive and inferential statistics? Statistics 8 Tools to help us understand our data. Makes a complicated mess simple to understand.
More informationOnesample normal hypothesis Testing, paired ttest, twosample normal inference, normal probability plots
1 / 27 Onesample normal hypothesis Testing, paired ttest, twosample normal inference, normal probability plots Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis
More informationUnit 24 Hypothesis Tests about Means
Unit 24 Hypothesis Tests about Means Objectives: To recognize the difference between a paired t test and a twosample t test To perform a paired t test To perform a twosample t test A measure of the amount
More informationNumerical Summarization of Data OPRE 6301
Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting
More informationSection 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
More informationComparing Means in Two Populations
Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we
More informationQuantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality
Quantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality 1 To help choose which type of quantitative data analysis to use either before
More informationDeath on the Titanic
Death on the Titanic Introduction On its maiden voyage, the cruise ship Titanic collided with an iceberg and sank. There was much loss of life. It is of interest to test how well sample proportions from
More information