Effect size and eta squared James Dean Brown (University of Hawai i at Manoa)



Similar documents
What Are Principal Components Analysis and Exploratory Factor Analysis?

Shiken: JLT Testing & Evlution SIG Newsletter. 5 (3) October 2001 (pp )

Choosing the Right Type of Rotation in PCA and EFA James Dean Brown (University of Hawai i at Manoa)

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

January 26, 2009 The Faculty Center for Teaching and Learning

The Effect of Static Visual Instruction on Students Online Learning: A Pilot Study

UNDERSTANDING THE TWO-WAY ANOVA

Assessing Measurement System Variation

California University of Pennsylvania Guidelines for New Course Proposals University Course Syllabus Approved: 2/4/13. Department of Psychology

A Performance Comparison of Native and Non-native Speakers of English on an English Language Proficiency Test ...

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

STATISTICS FOR PSYCHOLOGISTS

Main Effects and Interactions

PSYCHOLOGY 320L Problem Set #3: One-Way ANOVA and Analytical Comparisons

An analysis method for a quantitative outcome and two categorical explanatory variables.

Multivariate analyses

How to report the percentage of explained common variance in exploratory factor analysis

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Multivariate Analysis of Variance (MANOVA): I. Theory

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

A COMPARISON OF LOW PERFORMING STUDENTS ACHIEVEMENTS IN FACTORING CUBIC POLYNOMIALS USING THREE DIFFERENT STRATEGIES

Univariate Regression

Teacher Education Program Characteristics - A Review

EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

Chapter 7. One-way ANOVA

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

Data Analysis Tools. Tools for Summarizing Data

10. Analysis of Longitudinal Studies Repeat-measures analysis

ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups

Comparing User Engagement across Seven Interactive and Social-Media Ad Types.

Recall this chart that showed how most of our course would be organized:

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

The Effect of Varied Visual Scaffolds on Engineering Students Online Reading. Abstract. Introduction

Understanding Power and Rules of Thumb for Determining Sample Sizes

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Running head: THE FIELD OF INSTRUCTIONAL TECHNOLOGY. Instructional Design. Teresa Hill. University of Houston ClearLake

Contrast Coding in Multiple Regression Analysis: Strengths, Weaknesses, and Utility of Popular Coding Structures

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

Guidelines for Writing An APA Style Lab Report

Two-Way ANOVA tests. I. Definition and Applications...2. II. Two-Way ANOVA prerequisites...2. III. How to use the Two-Way ANOVA tool?...

Understanding and Quantifying EFFECT SIZES

What is Effect Size? effect size in Information Technology, Learning, and Performance Journal manuscripts.

Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases:

Section 13, Part 1 ANOVA. Analysis Of Variance

Instructional coaching at selected middle schools in south Texas and effects on student achievement

Drawing a histogram using Excel

The Relationship Between Epistemological Beliefs and Self-regulated Learning Skills in the Online Course Environment

The Effect of the After-School Reading Education Program for Elementary School on Multicultural Awareness

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

How To Run Statistical Tests in Excel

Two-Way ANOVA with Post Tests 1

RIFIS Ad Hoc Reports

Writing the Empirical Social Science Research Paper: A Guide for the Perplexed. Josh Pasek. University of Michigan.

SPSS Resources. 1. See website (readings) for SPSS tutorial & Stats handout

The importance of graphing the data: Anscombe s regression examples

Majors Matter: Differential Performance on a Test of General College Outcomes. Jeffrey T. Steedle Council for Aid to Education

Business Valuation Review

Gage Studies for Continuous Data

Factorial Invariance in Student Ratings of Instruction

5. Survey Samples, Sample Populations and Response Rates

A Comparison of Training & Scoring in Distributed & Regional Contexts Writing

1 J (Gr 6): Summarize and describe distributions.

Exploratory Factor Analysis

12: Analysis of Variance. Introduction

The Impact of Input Enhancement through Multimedia on the Improvement of Writing Ability

One-Way Analysis of Variance

EXTENDED LEARNING MODULE A

Multiple Regression. Page 24

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

A Basic Guide to Analyzing Individual Scores Data with SPSS

DATA VALIDATION and CONDITIONAL FORMATTING

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

Balanced Assessment Test Algebra 2008

Chapter 5 Analysis of variance SPSS Analysis of variance

Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine

James E. Bartlett, II is Assistant Professor, Department of Business Education and Office Administration, Ball State University, Muncie, Indiana.

The effect of reading education program using multicultural books on elementary student s multicultural awareness

4. Descriptive Statistics: Measures of Variability and Central Tendency

Chapter. Three-Way ANOVA CONCEPTUAL FOUNDATION. A Simple Three-Way Example. 688 Chapter 22 Three-Way ANOVA

CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont

Misuse of statistical tests in Archives of Clinical Neuropsychology publications

Evaluation of a Laptop Program: Successes and Recommendations

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

EVANGEL UNIVERSITY Behavioral Sciences Department

Experimental Designs (revisited)

Using Microsoft Excel to Manage and Analyze Data: Some Tips

Data Analysis in SPSS. February 21, If you wish to cite the contents of this document, the APA reference for them would be

Empirical Study of effect of using Weibull. NIFTY index options

Additional sources Compilation of sources:

METACOGNITIVE AWARENESS OF PRE-SERVICE TEACHERS

Introduction to Regression and Data Analysis

If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

ABSORBENCY OF PAPER TOWELS

Windows-Based Meta-Analysis Software. Package. Version 2.0

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

Consider a study in which. How many subjects? The importance of sample size calculations. An insignificant effect: two possibilities.

One-Way Analysis of Variance (ANOVA) Example Problem

Copyright subsists in all papers and content posted on this site.

Transcription:

Shiken: JALT Testing & Evaluation SIG Newsletter. 1 () April 008 (p. 38-43) Statistics Corner Questions and answers about language testing statistics: Effect size and eta squared James Dean Brown (University of Hawai i at Manoa) Question: In Chapter 6 of the 008 book on heritage language learning that you coedited with Kimi-Kondo Brown, a study comparing how three different groups of informants use intersentential referencing is outlined. On page 147 of that book, a MANOVA with a partial eta of.9 is outlined. There are several questions about this statistic. What does a partial eta measure? Are there other forms of eta that readers should know about? And how should one interpret a partial eta value of.9? Answer: I will answer your question about partial eta in two parts. I will start by defining and explaining eta. Then I will circle back and do the same for partial eta. Eta Eta can be defined as the proportion of variance associated with or accounted for by each of the main effects, interactions, and error in an ANOVA study (see Tabachnick & Fidell, 001, pp. 54-55, and Thompson, 006, pp. 317-319). Formulaically, eta, or, is defined as follows: Where: effect total effect the sums of squares for whatever effect is of interest total the total sums of squares for all effects, interactions, and errors in the ANOVA Eta is most often reported for straightforward ANOVA designs that (a) are balanced (i.e., have equal cell sizes) and (b) have independent cells (i.e., different people appear in each cell). For example, in Brown (007), I used an example ANOVA to demonstrate how to calculate power with SP. That was a x two-way ANOVA with anxiety and tension as the independent variables and trial 3 as the dependent variable (using the Anxiety.sav example file that comes with recent versions of the SP software). There were three people in each cell and the cells were independent. Notice in Table 1 that the p values (0.90, 0.55, & 0.10) indicate that there were no significant effects (i.e., no p values below.05) for Anxiety, Tension, or their interaction. Note also that there was not sufficient power to detect such effects (i.e., the power statistics of 0.05, 0.09, & 0.37 were not above.80 in any case). All of this led me to conclude that the study lacked sufficient power to detect any significant effects even if they exist in reality, which is reasonable given the very small sample size of 1. Table 1 Results of the Analysis Shown in Figure 3 of the Anxiety.sav used with SP Source df MS F p eta Power Anxiety 0.08 1 0.08 0.0 0.90 0.001 0.05 Tension.08 1.08 0.38 0.55 0.034 0.09 Anxiety x Tension 18.75 1 18.75 3.46 0.10 0.919 0.37 Error 43.33 8 5.4 0.6745 64.4 1

39 Table Descriptive Statistics for the Anxiety.sav Example Used with SP * Anxiety Tension M SD N 1 1 8.67 3.06 3 7.00.65 3 1 6.00.00 3 9.33 1.16 3 *Dependent Variable: Trial 3 Nonetheless, even a cursory look at the means shown in Table indicates that fairly large differences exist between means and something noteworthy is going on, so a better designed replication study with a larger sample size might be justified. Eta can help in interpreting the results by indicating the relative degree to which the variance that was found in the ANOVA was associated with each of the main effects (Anxiety and Tension) and their interaction. Eta values are easy to calculate. Simply add up all the sums of squares (), the total of which is 64.4 in the example; then, divide the for each of the main effects, the interaction, and the error term by that total. The results will be as follows: Anxiety Tension AxT Error AxT Anxiety Tension Error 18.75 64.4 0.08 64.4.08 64.4 0.0014533 0.001 0.0337858 0.034 0.918741 0.919 43.33 0.674501867 0.6745 64.4 Interpretation of these values is easiest if the decimal point is moved two places to the right in each case, the result of which can be interpreted as percentages of variance associated with each of the main effects, the interaction, and error. Starting with Anxiety, the value of 0.001 indicates that a mere 0.1% of the variance is accounted for by Anxiety, whereas Tension accounts for 3.4%, the Anxiety x Tension (A x T) interaction accounts for a much larger 9.19%, and a whopping 67.45% is accounted for by Error. Now let s consider the A x T interaction and Error separately in more detail. The 9.19% accounted for by the A x T interaction should lead the researcher to understand that this interaction effect is much more important than either of the individual main effects for Anxiety or Tension, a fact that, even though there are no significant effects, may help in designing future studies and understanding why the present one did not detect significant differences. Such an important interaction effect should lead the researcher to want to plot out that relationship as shown in Figure 1, where we see that the Tension groups 1 (dotted line) and (plain black line) do indeed have different means but in opposite relationships for Anxiety 1 and. That is, the Tension 1 group is higher than the Tension group when Anxiety is 1, but the Tension 1 group is lower than the Tension group when Anxiety is. 39

40 Thus there is a strong pattern but it is not consistent across Anxiety 1 and conditions (if it were consistent, the lines would be parallel). Thus, even with a nonsignificant interaction (where p.10), the eta value of.919 drew our attention to an important interaction effect that is revealing in itself, and which may help to understand why there were no significant main effects for Tension or Anxiety (i.e., because the interaction cancels out any such differences). Figure 1. Interaction of Anxiety with Tension using the Anxiety.sav example The whopping 67.45% accounted for by Error in the Table 1 indicates that more than two-thirds of the variance was not accounted for at all in this design. This error variance may be due to unreliable variance in the study due to poor design, other systematic variables that might be of interest (if they were operationalized and included in the study), and so forth. All in all, eta values indicate not only that the interaction effect and error are causing almost 97% of the variance in the study (67.45 + 9.19 96.64), but also ways to redesign the study so it will be more powerful and meaningful. One problem with eta is that the magnitude of eta for each particular effect depends to some degree on the significance and number of other effects in the design... One problem with eta is that the magnitude of eta for each particular effect depends to some degree on the significance and number of other effects in the design (Tabachnick & Fidell, 001, p. 54). One statistic that minimizes the effects of this issue is called partial eta. Partial Eta Partial eta can be defined as the ratio of variance accounted for by an effect and that effect plus its associated error variance within an ANOVA study. Formulaically, 40

41 partial eta, or partial effect effect partial + error, is defined as follows: Where: effect the sums of squares for whatever effect is of interest error the sums of squares for whatever error term is associated with that effect In applied linguistics studies, partial eta is most often reported for ANOVA designs that have non-independent cells (i.e., the same people appear in more than one cell). For example, in Brown, Hilgers, and Marsella (1991), students wrote compositions on two different types of topics (a narrative topic and an analytic topic) which were organized into ten prompt sets. The people who wrote on each of the ten prompt sets were different from each other (so this is also known as a between subjects effect). In contrast, every student wrote on each of the two topic types, so these were treated as repeated measures (also known as a within subjects effect). The cell sizes within subjects were exactly the same (which makes sense because they were the same people), whereas the cell sizes between subjects were different to small degrees. The original results of this 10 x two-way repeated-measures ANOVA for prompt sets and topic types are shown in Table 3. Table 3 Two-Way Repeated-Measures ANOVA for 1989 Prompt Sets and Topic Types (As presented in Brown et al, 1991) Source df MS F p Between Subjects Prompt Set 158.37 9 17.597 9.703 0.00 Error 3068.553 169 1.814 Within Subjects Topic Type 0.344 1 0.344 0.194 0.66 Prompt Set by Topic Type 137.57 9 15.86 8.611 0.00 Error 3003.548 169 1.775 Table 4 Two-Way Repeated-Measures ANOVA for 1989 Prompt Sets and Topic Types (Adapted from Brown et al, 1991 with Partial Eta Added) Source df MS F p Partial eta Between Subjects Prompt Set (PS) 158.37 9 17.597 9.703 0.00 0.0490 Error BS 3068.553 169 1.814 Within Subjects Topic Type (TT) 0.344 1 0.344 0.194 0.66 0.0001 Prompt Set by Topic Type 137.57 9 15.86 8.611 0.00 0.0438 Error WS 3003.548 169 1.775 From my present perspective (17 years later), the 1991Brown et al. study would have been strengthened by relabeling the effects and adding partial eta values to the two-way repeated-measures ANOVA table as shown in Table 4. These partial eta 41

4 values are easy to calculate. Simply divide the for each effect by the of that effect plus the for the error associated with that effect. The results will be as follows: PS 158.37 Partial PS 0.04907830 0. 0490 + 158.37 + 3068.553 PS Error BS TT 0.344 Partial TT 0.000114518 0. 0001 + 0.344 + 3003.548 TT Error WS PSxTT 137.57 Partial PSxTT 0.043797116 0. 0438 + 137.57 + 3003.548 PSxTT Error WS The interpretation of these partial eta values is similar to what we did above for eta in that we need to move the decimal point two places to the right in each case, and interpret the results as percentages of variance. However, this time the results indicate the percentage of variance in each of the effects (or interaction) and its associated error that is accounted for by that effect (or interaction). Starting with Prompt Sets, the value of 0.0490 indicates that 4.90% of the between subjects variance is accounted for by Prompt Sets, whereas Topic Types accounts for nearly none of the TT plus Error BS variance (0.01%), though the Prompt Sets by Topic Types interaction (PSxTT) accounts for a somewhat larger 4.38% of the PSxTT plus Error BS variance. Conclusion In direct answer to your question, Kondo-Brown and Fukuda (008) correctly chose to use partial eta because their design was a MANOVA, which by definition involves non-independent or repeated measures. When they reported that partial eta was.9, that meant that the effect for group differences in their MANOVA accounted for 9% of the group-differences plus associated error variance as explained above. This percentage was sufficient to lead them to do univariate follow-up ANOVAs that helped them to further isolate exactly where the significant and interesting means differences were to be found. In recent columns, I have covered a number of issues related to the ANOVA sorts of studies including: sampling and generalizability, sampling errors, sample size and power, and effect size and eta squared. All of these are ways to expand your thinking about ANOVA ways that are often ignored in applied linguistics. They have long been important to understanding ANOVA results in psychology, education, and other fields, and we ignore them to our detriment. To paraphrase something one of my stats teachers said back in the late 1970s: Reporting the traditional ANOVA source table (with, df, MS, F, and p) and discussing the associated significance levels isn t the Reporting the traditional ANOVA source table (with, df, MS, F, and p) and discussing the associated significance levels isn t the end of the study; it s just the beginning... end of the study; it s just the beginning because we can learn much more by carefully plotting and considering the interaction effects and doing follow up analyses like planned or post-hoc comparisons, power and effect size analyses, and so forth. I hope I have delivered that message loud and clear. References 4

43 Brown, J. D. (007). Statistics Corner. Questions and answers about language testing statistics: Sample size and power. Shiken: JALT Testing & Evaluation SIG Newsletter, 11(1), 31-35. Also retrieved from the World Wide Web at http://jalt.org/test/bro_5.htm Brown, J. D., Hilgers, T., & Marsella, J. (1991). Essay prompts and topics: Minimizing the effect of differences. Written Communication, 8(4), 53-555. Kondo-Brown, K. (005). Differences in language skills: Heritage language learner subgroups and foreign language learners. The Modern Language Journal, 89(4), 563 581. Kondo-Brown, K., & Brown, J. D. (Eds.) (008). Teaching Chinese, Japanese, and Korean heritage language students. New York: Lawrence Erlbaum Associates. Kondo-Brown, K., & Fukuda, C. (008). A separate-track for advanced heritage language students?: Japanese intersentential referencing. In K. Kondo-Brown & J. D. Brown (Eds.), Teaching Chinese, Japanese, and Korean heritage language students. New York: Lawrence Erlbaum Associates. Tabachnick, B. G., & Fidell, L. S. (001). Using Multivariate Statistics (5th ed.). Upper Saddle River, NJ: Pearson Allyn & Bacon. Thompson, B. (006). Foundations of behavioral statistics: An insight-based approach. New York: Guilford. HTML: http://jalt.org/test/bro_8.htm / PDF: http://jalt.org/test/pdf/brown8.pdf Copyright 008 by James Dean Brown & the Japan Association for Language Teaching 43