Analysis of Variance (ANOVA) Using Minitab



Similar documents
Process Capability Analysis Using MINITAB (I)

THE PROCESS CAPABILITY ANALYSIS - A TOOL FOR PROCESS PERFORMANCE MEASURES AND METRICS - A CASE STUDY

Chapter 2 Simple Comparative Experiments Solutions

Variables Control Charts

How To Check For Differences In The One Way Anova

Regression Analysis: A Complete Example

How To Test For Significance On A Data Set

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Recall this chart that showed how most of our course would be organized:

Multiple Linear Regression

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Getting Correct Results from PROC REG

Unit 31: One-Way ANOVA

Minitab Tutorials for Design and Analysis of Experiments. Table of Contents

Chapter 7. One-way ANOVA

2. Simple Linear Regression

1. How different is the t distribution from the normal?

1.5 Oneway Analysis of Variance

Additional sources Compilation of sources:

Simple Predictive Analytics Curtis Seare

Normality Testing in Excel

N-Way Analysis of Variance

Regression step-by-step using Microsoft Excel

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

International Statistical Institute, 56th Session, 2007: Phil Everson

Analysis of Variance. MINITAB User s Guide 2 3-1

Introduction to Regression and Data Analysis

MEASURES OF LOCATION AND SPREAD

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

INFLUENCE OF MEASUREMENT SYSTEM QUALITY ON THE EVALUATION OF PROCESS CAPABILITY INDICES

Chapter 7 Section 7.1: Inference for the Mean of a Population

Randomized Block Analysis of Variance

Getting Started with Minitab 17

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Testing for Lack of Fit

What is Data Analysis. Kerala School of MathematicsCourse in Statistics for Scientis. Introduction to Data Analysis. Steps in a Statistical Study

How To Run Statistical Tests in Excel

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

Factors affecting online sales

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

2013 MBA Jump Start Program. Statistics Module Part 3

Unit 27: Comparing Two Means

Independent t- Test (Comparing Two Means)

A Scientific Approach to Implementing Change

Teaching Business Statistics through Problem Solving

Premaster Statistics Tutorial 4 Full solutions

12: Analysis of Variance. Introduction

Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model

An Introduction to Statistics using Microsoft Excel. Dan Remenyi George Onofrei Joe English

Threshold Autoregressive Models in Finance: A Comparative Approach

Descriptive Statistics

Case Study Call Centre Hypothesis Testing

Data Analysis of Trends in iphone 5 Sales on ebay

Copyright PEOPLECERT Int. Ltd and IASSC

One-Way Analysis of Variance

Chapter 13 Introduction to Linear Regression and Correlation Analysis

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

BIOL 933 Lab 6 Fall Data Transformation

Online 12 - Sections 9.1 and 9.2-Doug Ensley

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Exercise 1.12 (Pg )

Section 13, Part 1 ANOVA. Analysis Of Variance

2 Sample t-test (unequal sample sizes and unequal variances)

One-Way Analysis of Variance (ANOVA) Example Problem

Simple Linear Regression Inference

Final Exam Practice Problem Answers

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

PROBABILITY AND STATISTICS. Ma To teach a knowledge of combinatorial reasoning.

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption

ABSORBENCY OF PAPER TOWELS

Non-Inferiority Tests for One Mean

APPRAISAL OF FINANCIAL AND ADMINISTRATIVE FUNCTIONING OF PUNJAB TECHNICAL UNIVERSITY

Motor and Household Insurance: Pricing to Maximise Profit in a Competitive Market

Teaching Multivariate Analysis to Business-Major Students

A Better Statistical Method for A/B Testing in Marketing Campaigns

Univariate Regression

Simple linear regression

Learning Equity between Online and On-Site Mathematics Courses

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 4 and 5 solutions

The Assumption(s) of Normality

CHAPTER 13. Experimental Design and Analysis of Variance

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012


ANALYZING NETWORK TRAFFIC FOR MALICIOUS ACTIVITY

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

Predictive Analytics Tools and Techniques

START Selected Topics in Assurance

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

A Brief Study of the Nurse Scheduling Problem (NSP)

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

AN ANALYSIS OF WORKING CAPITAL MANAGEMENT EFFICIENCY IN TELECOMMUNICATION EQUIPMENT INDUSTRY

Statistiek II. John Nerbonne. October 1, Dept of Information Science

Transcription:

Analysis of Variance (ANOVA) Using Minitab By Keith M. Bower, M.S., Technical Training Specialist, Minitab Inc. Frequently, scientists are concerned with detecting differences in means (averages) between various levels of a factor, or between different groups. What follows is an example of the ANOVA (Analysis of Variance) procedure using the popular statistical software package, Minitab. ANOVA was developed by the English statistician, R.A. Fisher (1890-1962). Though initially dealing with agricultural data[1], this methodology has been applied to a vast array of other fields for data analysis. Despite its widespread use, some practitioners fail to recognize the need to check the validity of several key assumptions before applying an ANOVA to their data. It is the hope that this article may provide certain useful guidelines for performing basic analysis using such a software package. For this example we shall consider a set of data from the Journal of the Electrochemical Society [1992][2]. This data originated from an experiment performed to investigate the low-pressure vapor deposition of polysilicon. Four wafer positions have been chosen and our goal is to detect whether there are any statistically significant differences between the means of these levels. As is discussed by Hogg and Ledolter [1987][3], assumptions that underpin the ANOVA procedure are: (1) The values for each level follow a Normal (a.k.a. Gaussian) distribution, and (2) The variances are the same for each level (Homogeneity of Variance). Traditionally, Normality has been investigated using Normal probability plots. However as is discussed by Ryan and Joiner [1976][4] inexperienced practitioners have difficulty in their interpretation, and considerable practice is sometimes necessary. With the advent of computer technology, statistical tests may be easily performed to investigate an assumed distributional form, e.g. the Anderson-Darling test. As this dataset is so small (only three observations for each level) it would be unusual to reject the null hypothesis of Normality, though an example of checking Normality will follow later. With regard to assumption (2), namely homogeneous variances, as the output in Fig. 1 shows, we are unable to reject the null hypothesis of equal variances at the 0.05 significance level as the p-value for Bartletts test (which assumes Normality within each factor level) is greater than 0.05. Levenes test does not assume Normality and also fails to reject the null hypothesis of equal variances.

Figure 1. We may therefore proceed with our analysis. One finds that the ANOVA procedure works quite well even if the Normality assumption has been violated, unless one or more of the distributions are highly skewed or if the variances are quite different. Transformations of the original dataset may correct these violations, but are outside the scope of this article (see Montgomery [1997][5] for further information). As the ANOVA table in Fig. 2 shows, we obtain an F-statistic of 8.29. Traditionally, one would seek to compare such a statistic with critical values from a table. With more sophisticated software packages, however, the p-value is automatically computed. In this example we are able to reject the null hypothesis, even at the 0.01 significance level. The p-value tells us that the lowest significance level attainable is only 0.008. This indicates that the wafer position level accounts for a significant amount of variation in the response variable of Uniformity. Therefore, there is very strong evidence to suggest that the means are not all equal. It is pertinent now to discover which levels have significantly different means.

Figure 2 The results from Tukeys simultaneous tests in Fig. 2 indicate that the mean level for wafer position 1 is significantly higher than that of the other wafer positions (the corresponding p-values being less than 0.05). However, wafer positions 2, 3 and 4 are not significantly different from each other.

Note that we may consider a model of the form where i = 1,2,3,4; j = 1,2,3. The final and necessary step is to check the errors (the ) in the model using residual diagnostic tools. We assume that the errors: (1) Exhibit constant variance, (2) Are Normally distributed, (3) Have a mean of zero, (4) Are independent from each other. When these assumptions hold, the ANOVA is an exact test of the null hypothesis of no difference in level means and we need to check these assumptions using the residuals. The residuals are the actual values minus the fitted values from the model. Minitab provides the fitted values and the residuals and we may assess these assumptions as follows. From Fig. 3 we see that the plot of residuals vs. fitted values has more spread in the points for the highest fitted values (corresponding to level 1), though with so few points it is difficult to reject assumption (1) of constant variance in the residuals. The residuals appear to have somewhat of a bell-shaped curve, though the Normal probability plot has two values off the straight line at either end (corresponding to high and low values in level 1). However, as is shown in Fig. 4 we see directly that assumptions (2) and (3) appear to be met as the p-value for the Anderson Darling test for Normality is greater than 0.05. Hence we are unable to reject the null hypothesis of Normality, and also that the mean of the residuals is zero. The I-chart (Individuals control chart) in the top right hand corner of Fig. 3 assesses the independence assumption, and does not exhibit any concerning features.

Figure 3 Figure 4

When appropriate, the use of the correct statistical technique can lead to very useful answers. The assumptions for the statistical techniques, although often overlooked, are crucial to the successful interpretation of results. With the aid of advanced statistical software packages results are quickly, easily and reliably obtained. Keith M. Bower is a Technical Training Specialist with Minitab Inc. Keith's main interests lie in SPC, especially control charting, capability indices and in business statistics. This article was originally published in the February 2000 issue of Scientific Computing and Instrumentation). [1] Fisher, R.A. (1925). Statistical Methods for Research Workers. Oliver & Boyd, Edinburgh. [2] Journal of the Electrochemical Society, Vol. 139, No. 2, 1992, pp. 524-532. [3] Hogg, R.V., Ledolter, J. (1987). Applied Statistics for Engineers and Physical Scientists. Macmillan Publishing Company, NY. [4] Ryan, T.A., Joiner, B.L. (1976). Normal Probability Plots and Tests for Normality. [5] Montgomery, D.C. (1997). Design and Analysis of Experiments, 4th Edition. John Wiley & Sons, Inc. Looking for other information or materials? For help, please contact Minitab's Marketing Department at: E-mail: PublicRelations@minitab.com