# STATISTICAL ANALYSIS OF CAUSE AND EFFECT

## Transcription

* Use a non-parametric technique. There are statistical methods, called non-parametric methods, that don't make any assumptions about the underlying distribution of the data. Rather than evaluating differences in parameters such as the mean or variance, non-parametric methods use other comparisons. For example, if the observations are paired, they may be compared directly to see whether the "after" is different from the "before." Or the method might examine the pattern of points above and below the median to see whether the before and after values are randomly scattered in the two regions. Or ranks might be analyzed. Non-parametric statistical methods are discussed later in this chapter.

#### Equal variance assumption

Many statistical techniques assume equal variances. ANOVA tests the hypothesis that the means are equal, not that the variances are equal. In addition to assuming normality, ANOVA assumes that variances are equal for each treatment. Models fitted by regression analysis are evaluated partly by looking for equal variances of residuals for different levels of the Xs and Y.

Minitab's test for equal variances is found in Stat > ANOVA > Test for Equal Variances. You need a column containing the data and one or more columns specifying the factor level for each data point. If the data have already passed the normality test, use the P-value from Bartlett's test to test the equal variances assumption. Otherwise, use the P-value from Levene's test.

The test shown in Figure 14.3 involved five factor levels. Minitab shows a confidence interval bar for sigma of each of the five samples; the tick mark in the center of the bar represents the sample sigma. These are the data from the sample of 100 analyzed earlier and found to be normally distributed, so Bartlett's test can be used. The P-value from Bartlett's test is 0.182, indicating that populations with equal variances would produce at least this much variability among the sample variances 18.2% of the time. Since this is greater than 5%, we fail to reject the null hypothesis of equal variances. Had the data not been normally distributed, we would have used Levene's test, whose P-value in the same output leads to the same conclusion.

### REGRESSION AND CORRELATION ANALYSIS

#### Scatter plots

**Definition:** A scatter diagram is a plot of one variable versus another. One variable is called the independent variable and it is usually shown on the horizontal (bottom) axis. The other variable is called the dependent variable and it is shown on the vertical (side) axis.
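For readers working outside Minitab, the Bartlett statistic used in the equal-variance check discussed under "Equal variance assumption" can be computed directly. The following is a minimal sketch, not Minitab's implementation: the function names `chi2_sf` and `bartlett_test` and the sample values are hypothetical, every group is assumed to have non-zero variance, and Levene's test is omitted because its P-value needs an F distribution, which the Python standard library does not provide.

```python
import math
from statistics import variance


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite closed-form sum.
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    # Odd df: start from erfc and add the half-integer terms.
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def bartlett_test(*samples):
    """Return (statistic, p_value) for Bartlett's test of equal variances."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    n_total = sum(sizes)
    variances = [variance(s) for s in samples]
    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    stat = numerator / correction
    return stat, chi2_sf(stat, k - 1)


# Hypothetical measurements for three factor levels (illustration only).
level_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
level_b = [10.4, 9.6, 10.1, 9.9, 10.3, 9.7]
level_c = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
stat, p_value = bartlett_test(level_a, level_b, level_c)
print(f"Bartlett statistic = {stat:.3f}, P-value = {p_value:.3f}")
print("Fail to reject equal variances" if p_value > 0.05 else "Reject equal variances")
```

The decision rule mirrors the text: a P-value above 0.05 means the null hypothesis of equal variances is not rejected.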

Figure 14.3. Output from Minitab's test for equal variances.

**Usage:** Scatter diagrams are used to evaluate cause and effect relationships. The assumption is that the independent variable is causing a change in the dependent variable. Scatter plots are used to answer such questions as "Does vendor A's material machine better than vendor B's?" and "Does the length of training have anything to do with the amount of scrap an operator makes?"

##### HOW TO CONSTRUCT A SCATTER DIAGRAM

1. Gather several paired sets of observations, preferably 20 or more. A paired set is one where the dependent variable can be directly tied to the independent variable.
2. Find the largest and smallest values of the independent variable and the largest and smallest values of the dependent variable.
3. Construct the vertical and horizontal axes so that the smallest and largest values can be plotted. Figure 14.4 shows the basic structure of a scatter diagram.
4. Plot the data by placing a mark at the point corresponding to each X-Y pair, as illustrated in the accompanying figure. If more than one classification is used, you may use different symbols to represent each group.
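The four construction steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed procedure: the helper names `axis_limits`, `scatter_layout`, and `text_scatter`, the 5% axis padding, and the training/scrap numbers are all hypothetical, and the crude character plot stands in for whatever charting tool is actually used. `text_scatter` assumes the X values are not all equal and the Y values are not all equal.

```python
def axis_limits(values, pad_fraction=0.05):
    """Return (low, high) axis limits covering min..max plus some padding."""
    smallest, largest = min(values), max(values)
    pad = (largest - smallest) * pad_fraction
    return smallest - pad, largest + pad


def scatter_layout(pairs, min_pairs=20):
    """Summarize what is needed to lay out a scatter diagram of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return {
        "n_pairs": len(pairs),
        "enough_pairs": len(pairs) >= min_pairs,  # step 1: 20 or more preferred
        "x_range": (min(xs), max(xs)),            # step 2: smallest and largest X
        "y_range": (min(ys), max(ys)),            # step 2: smallest and largest Y
        "x_axis": axis_limits(xs),                # step 3: axes that cover the data
        "y_axis": axis_limits(ys),
    }


def text_scatter(pairs, width=40, height=10):
    """Step 4: place one mark per (x, y) pair on a character grid."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in pairs:
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("".join(line) for line in grid)


# Hypothetical pairs: (hours of training, units of scrap). Illustration only.
pairs = [(2, 14), (4, 12), (5, 11), (7, 9), (8, 9), (10, 6), (12, 5)]
print(scatter_layout(pairs))
print(text_scatter(pairs))
```

With only seven pairs, `enough_pairs` is false, which matches the guidance to gather 20 or more before drawing conclusions.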

Figure 14.4. Layout of a scatter diagram.

Figure. Plotting points on a scatter diagram. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 66. Copyright © 1990 by Thomas Pyzdek.

##### EXAMPLE OF A SCATTER DIAGRAM

The orchard manager has been keeping track of the weight of peaches on a day-by-day basis. The data are provided in Table 14.1.

Table 14.1. Raw data for scatter diagram. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 67. Copyright © 1990 by Thomas Pyzdek. Columns: NUMBER, DAYS ON TREE, WEIGHT (OUNCES).

1. Organize the data into X-Y pairs, as shown in Table 14.1. The independent variable, X, is the number of days the fruit has been on the tree. The dependent variable, Y, is the weight of the peach.
2. Find the largest and smallest values for each data set. The largest and smallest values from Table 14.1 are shown in Table 14.2.

Table 14.2. Smallest and largest values. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 68. Copyright © 1990 by Thomas Pyzdek.

| VARIABLE | SMALLEST | LARGEST |
|---|---|---|
| Days on tree (X) | 75 | 90 |
| Weight of peach (Y), ounces | 4.4 | 6.1 |

3. Construct the axes. In this case, we need a horizontal axis that allows us to cover the range from 75 to 90 days. The vertical axis must cover the range from the smallest weight (4.4 ounces) to the largest weight (6.1 ounces). We will select values beyond these minimum requirements, because we want to estimate how long it will take for a peach to reach 6.5 ounces.
4. Plot the data. The completed scatter diagram is shown in the accompanying figure.

Figure. Completed scatter diagram. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 68. Copyright © 1990 by Thomas Pyzdek.

##### POINTERS FOR USING SCATTER DIAGRAMS

* Scatter diagrams display different patterns that must be interpreted; Figure 14.7 provides a scatter diagram interpretation guide.

Figure 14.7. Scatter diagram interpretation guide. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 69. Copyright © 1990 by Thomas Pyzdek.

* Be sure that the independent variable, X, is varied over a sufficiently large range. When X is changed only a small amount, you may not see a correlation with Y, even though the correlation really does exist.
* If you make a prediction for Y for an X value that lies outside of the range you tested, be advised that the prediction is highly questionable and should be tested thoroughly. Predicting a Y value beyond the X range actually tested is called extrapolation.
* Keep an eye out for the effect of variables you didn't evaluate. Often, an uncontrolled variable will wipe out the effect of your X variable. It is also possible that an uncontrolled variable will be causing the effect and you will mistake the X variable you are controlling as the true cause. This problem is much less likely to occur if you choose X levels at random. An example of this is our peaches. It is possible that any number of variables changed steadily over the time period investigated. It is possible that these variables, and not the independent variable, are responsible for the weight gain (e.g., was fertilizer added periodically during the time period investigated?).

* Beware of happenstance data! Happenstance data are data that were collected in the past for a purpose different from constructing a scatter diagram. Since little or no control was exercised over important variables, you may find nearly anything. Happenstance data should be used only to get ideas for further investigation, never for reaching final conclusions. One common problem with happenstance data is that the variable that is truly important is not recorded. For example, records might show a correlation between the defect rate and the shift. However, perhaps the real cause of defects is the ambient temperature, which also changes with the shift.
* If there is more than one possible source for the dependent variable, try using different plotting symbols for each source. For example, if the orchard manager knew that some peaches were taken from trees near a busy highway, he could use a different symbol for those peaches. He might find an interaction; that is, perhaps the peaches from trees near the highway have a different growth rate than those from trees deep within the orchard.

Although it is possible to do advanced analysis without plotting the scatter diagram, this is generally bad practice. It misses the enormous learning opportunity provided by the graphical analysis of the data.

#### Correlation and regression

Correlation analysis (the study of the strength of the linear relationships among variables) and regression analysis (modeling the relationship between one or more independent variables and a dependent variable) are activities of considerable importance in Six Sigma. A regression problem considers the frequency distributions of one variable when another is held fixed at each of several levels. A correlation problem considers the joint variation of two variables, neither of which is restricted by the experimenter. Correlation and regression analyses are designed to assist the analyst in studying cause and effect. They may be employed in all stages of the problem-solving and planning process. Of course, statistics cannot by themselves establish cause and effect. Proving cause and effect requires sound scientific understanding of the situation at hand. The statistical methods described in this section assist the analyst in performing this task.

##### LINEAR MODELS

A linear model is simply an expression of a type of association between two variables, x and y. A linear relationship simply means that a change of a given size in x produces a proportionate change in y. Linear models have the form:
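One common way to write such a model is sketched here. The notation is an assumption chosen to agree with the later use of $a$ for the intercept and $b_1$, $b_2$ for the coefficients of the independent variables; it is not taken from the original typeset equation.

```latex
\begin{align*}
  y &= a + b x                   && \text{one independent variable: intercept } a,\ \text{slope } b \\
  y &= a + b_1 x_1 + b_2 x_2     && \text{two independent variables, as in the staff and food example}
\end{align*}
```

In the first form, increasing $x$ by one unit changes the predicted $y$ by $b$ units, whatever the starting value of $x$.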

* **ANOVA, or ANalysis Of VAriance:** a table examining the hypothesis that the variation explained by the regression is zero. If this is so, then the observed association could be explained by chance alone. The rows and columns are those of a standard one-factor ANOVA table (see Chapter 17). For this example, the important item is the column labeled Significance F. The value shown, 0.00, indicates that the probability of getting these results due to chance alone is less than 0.01; i.e., the association is probably not due to chance alone. Note that the ANOVA applies to the entire model, not to the individual variables.

The next table in the output examines each of the terms in the linear model separately. The intercept is as described above, and corresponds to our term $a$ in the linear equation. Our model uses two independent variables. In our terminology, the staff coefficient is $b_1$ and the food coefficient is $b_2$. Thus, reading from the coefficients column, the linear model is $y = 1.188 + 0.902 \times \text{staff score} + b_2 \times \text{food score}$. The remaining columns test the hypotheses that each coefficient in the model is actually zero.

* **Standard error column:** gives the standard deviation of each term, i.e., the standard deviation of the intercept = 0.565, etc.
* **t Stat column:** the coefficient divided by the standard error, i.e., it shows how many standard deviations the observed coefficient is from zero.
* **P-value:** shows the area in the tail of a t distribution beyond the computed t value. For most experimental work, a P-value less than 0.05 is accepted as an indication that the coefficient is significantly different from zero. All of the terms in our model have significant P-values.
* **Lower 95% and Upper 95% columns:** a 95% confidence interval on the coefficient. If the confidence interval does not include zero, we reject the hypothesis that the coefficient is zero. None of the intervals in our example include zero.
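The relationship among the coefficient, standard error, t statistic, and confidence interval columns can be sketched as follows. This is a hedged illustration: the function name `coefficient_summary` is hypothetical, the intercept (1.188) and its standard error (0.565) are used as they appear in the text, and the critical value `t_crit = 2.0` is an assumed round number because the degrees of freedom of the example are not stated here.

```python
def coefficient_summary(coef, std_err, t_crit):
    """Return the t statistic, confidence interval, and zero-coefficient decision."""
    t_stat = coef / std_err            # how many standard errors from zero
    lower = coef - t_crit * std_err    # lower confidence limit
    upper = coef + t_crit * std_err    # upper confidence limit
    excludes_zero = lower > 0 or upper < 0
    return {
        "t_stat": t_stat,
        "lower": lower,
        "upper": upper,
        # An interval that excludes zero rejects the hypothesis coef == 0.
        "reject_zero": excludes_zero,
    }


# Intercept and standard error as quoted in the text; t_crit is an assumption.
summary = coefficient_summary(1.188, 0.565, t_crit=2.0)
print(summary)
```

With these inputs the interval lies entirely above zero, so the hypothesis that the intercept is zero is rejected. In practice the critical value would come from a t table for the model's residual degrees of freedom.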
##### CORRELATION ANALYSIS

As mentioned earlier, a correlation problem considers the joint variation of two variables, neither of which is restricted by the experimenter. Unlike regression analysis, which considers the effect of the independent variable(s) on a dependent variable, correlation analysis is concerned with the joint variation of one independent variable with another. In a correlation problem, the analyst has two measurements for each individual item in the sample. Unlike a regression study, where the analyst controls the values of the x variables, correlation studies usually involve spontaneous variation in the variables being studied. Correlation methods for determining the strength of the linear relationship between two or more variables are among the most widely applied statistical techniques. More advanced methods exist for studying situations with more than two variables (e.g., canonical analysis, factor analysis, principal components analysis, etc.); however, with the exception of multiple regression, our discussion will focus on the linear association of two variables at a time.

In most cases, the measure of correlation used by analysts is the statistic r, sometimes referred to as Pearson's product-moment correlation. Usually x and y are assumed to have a bivariate normal distribution. Under this assumption r is a sample statistic which estimates the population correlation parameter. One interpretation of r is based on the linear regression model described earlier, namely that $r^2$ is the proportion of the total variability in the y data which can be explained by the linear regression model. The equation for r is:

$$
r = \frac{s_{xy}}{s_x s_y} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \tag{14.7}
$$

and, of course, $r^2$ is simply the square of r. r is bounded at −1 and +1. When the assumptions hold, the significance of r is tested by the regression ANOVA.

Interpreting r can become quite tricky, so scatter plots should always be used (see above). When the relationship between x and y is non-linear, the explanatory power of r is difficult to interpret in precise terms and should be discussed with great care. While it is easy to see the value of very high correlations such as r = 0.99, it is not so easy to draw conclusions from lower values of r, even when they are statistically significant (i.e., they are significantly different from 0.0). For example, r = 0.5 does not mean the data show half as much clustering as a perfect straight-line fit. In fact, r = 0 does not mean that there is no relationship between the x and y data, as the accompanying figure shows. When r > 0, y tends to increase when x increases. When r < 0, y tends to decrease when x increases. In the figure, although r = 0, the relationship between x and y is perfect, albeit non-linear.

At the other extreme, r = 1, a perfect correlation, does not mean that there is a cause and effect relationship between x and y. For example, both x and y might be determined by a third variable, z. In such situations, z is described as a lurking variable which hides in the background, unknown to the experimenter. Lurking variables are behind some of the infamous silly associations, such as the association between teachers' pay and liquor sales (the lurking variable is general prosperity).\*

\*It is possible to evaluate the association of x and y by removing the effect of the lurking variable. This can be done using regression analysis and computing partial correlation coefficients. This advanced procedure is described in most texts on regression analysis.
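The computational form of Equation 14.7, and the point that r = 0 can coexist with a perfect non-linear relationship, can be checked with a short sketch. The function name `pearson_r` and the two small data sets are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the sums in the computational formula (Equation 14.7)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


xs = [-3, -2, -1, 0, 1, 2, 3]
straight_line = [2 * x + 1 for x in xs]  # perfect linear relationship
parabola = [x * x for x in xs]           # perfect but curvilinear relationship

print(pearson_r(xs, straight_line))  # prints 1.0
print(pearson_r(xs, parabola))       # prints 0.0
```

Every point of the parabola is determined exactly by x, yet r is zero because r measures only the linear component of the association. This is why the scatter plot should be examined before r is interpreted.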

Figure. Interpreting r = 0 for curvilinear data.

Establishing causation requires solid scientific understanding. Causation cannot be proven by statistics alone. Some statistical techniques, such as path analysis, can help determine if the correlations between a number of variables are consistent with causal assumptions. However, these methods are beyond the scope of this book.

### ANALYSIS OF CATEGORICAL DATA

#### Chi-square, tables

##### MAKING COMPARISONS USING CHI-SQUARE TESTS

In Six Sigma, there are many instances when the analyst wants to compare the percentage of items distributed among several categories. The groups being compared might be operators, methods, materials, or any other grouping of interest. From each of the groups a sample is taken, evaluated, and placed into one of several categories (e.g., high quality, marginal quality, reject quality). The results can be presented as a table with m rows representing the groups of interest and k columns representing the categories. Such tables can be analyzed to answer the question "Do the groups differ with regard to the proportion of items in the categories?" The chi-square statistic can be used for this purpose.
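A minimal sketch of the chi-square comparison for an m × k table follows. It assumes the usual test of homogeneity, with expected counts computed from row and column totals and (m − 1)(k − 1) degrees of freedom; the function names `chi2_sf` and `chi_square_test` and the operator counts are hypothetical, and every expected count is assumed to be greater than zero.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite closed-form sum.
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    # Odd df: start from erfc and add the half-integer terms.
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def chi_square_test(table):
    """Return (statistic, df, p_value) for an m x k table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if all groups shared the same category proportions.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts: rows are operators, columns are
# high quality, marginal quality, reject quality. Illustration only.
counts = [
    [40, 8, 2],
    [35, 10, 5],
    [30, 12, 8],
]
stat, df, p_value = chi_square_test(counts)
print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.3f}")
```

A small P-value would suggest that the groups differ in how their items are distributed across the categories; a large one gives no evidence of a difference.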


### Comparing two groups (t tests...)

Page 1 of 33 Comparing two groups (t tests...) You've measured a variable in two groups, and the means (and medians) are distinct. Is that due to chance? Or does it tell you the two groups are really different?

### EXCEL EXERCISE AND ACCELERATION DUE TO GRAVITY

EXCEL EXERCISE AND ACCELERATION DUE TO GRAVITY Objective: To learn how to use the Excel spreadsheet to record your data, calculate values and make graphs. To analyze the data from the Acceleration Due

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### San Jose State University Engineering 10 1

KY San Jose State University Engineering 10 1 Select Insert from the main menu Plotting in Excel Select All Chart Types San Jose State University Engineering 10 2 Definition: A chart that consists of multiple

### Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504

Inferential Statistics Katie Rommel-Esham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice

### Comparing three or more groups (one-way ANOVA...)

Page 1 of 36 Comparing three or more groups (one-way ANOVA...) You've measured a variable in three or more groups, and the means (and medians) are distinct. Is that due to chance? Or does it tell you the

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

### Using Excel for Statistical Analysis

Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

### Canonical Correlation

Chapter 400 Introduction Canonical correlation analysis is the study of the linear relations between two sets of variables. It is the multivariate extension of correlation analysis. Although we will present

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

### UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

### Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

### Chapter 8 Graphs and Functions:

Chapter 8 Graphs and Functions: Cartesian axes, coordinates and points 8.1 Pictorially we plot points and graphs in a plane (flat space) using a set of Cartesian axes traditionally called the x and y axes

### Research Methods & Experimental Design

Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### 1 SAMPLE SIGN TEST. Non-Parametric Univariate Tests: 1 Sample Sign Test 1. A non-parametric equivalent of the 1 SAMPLE T-TEST.

Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,