STATISTICAL ANALYSIS OF CAUSE AND EFFECT


 Charla Powers
 2 years ago
* Use a nonparametric technique. There are statistical methods, called nonparametric methods, that don't make any assumptions about the underlying distribution of the data. Rather than evaluating the differences of parameters such as the mean or variance, nonparametric methods use other comparisons. For example, if the observations are paired, they may be compared directly to see if the "after" is different from the "before." Or the method might examine the pattern of points above and below the median to see if the before and after values are randomly scattered in the two regions. Or ranks might be analyzed. Nonparametric statistical methods are discussed later in this chapter.

Equal variance assumption

Many statistical techniques assume equal variances. ANOVA tests the hypothesis that the means are equal, not that variances are equal. In addition to assuming normality, ANOVA assumes that variances are equal for each treatment. Models fitted by regression analysis are evaluated partly by looking for equal variances of residuals for different levels of Xs and Y.

Minitab's test for equal variances is found in Stat > ANOVA > Test for Equal Variances. You need a column containing the data and one or more columns specifying the factor level for each data point. If the data have already passed the normality test, use the P-value from Bartlett's test to test the equal variances assumption. Otherwise, use the P-value from Levene's test. The test shown in Figure 14.3 involved five factor levels, and Minitab shows a confidence interval bar for sigma of each of the five samples; the tick mark in the center of the bar represents the sample sigma. These are the data from the sample of 100 analyzed earlier and found to be normally distributed, so Bartlett's test can be used. The P-value from Bartlett's test is 0.182, indicating that we can expect this much variability from populations with equal variances 18.2% of the time. Since this is greater than 5%, we fail to reject the null hypothesis of equal variances. Had the data not been normally distributed, we would have used Levene's test, whose P-value leads to the same conclusion.

REGRESSION AND CORRELATION ANALYSIS

Scatter plots

Definition: A scatter diagram is a plot of one variable versus another. One variable is called the independent variable and it is usually shown on the horizontal (bottom) axis. The other variable is called the dependent variable and it is shown on the vertical (side) axis.
[Figure 14.3. Output from Minitab's test for equal variances.]

Usage: Scatter diagrams are used to evaluate cause and effect relationships. The assumption is that the independent variable is causing a change in the dependent variable. Scatter plots are used to answer such questions as "Does vendor A's material machine better than vendor B's?" "Does the length of training have anything to do with the amount of scrap an operator makes?" and so on.

HOW TO CONSTRUCT A SCATTER DIAGRAM

1. Gather several paired sets of observations, preferably 20 or more. A paired set is one where the dependent variable can be directly tied to the independent variable.
2. Find the largest and smallest independent variable and the largest and smallest dependent variable.
3. Construct the vertical and horizontal axes so that the smallest and largest values can be plotted. Figure 14.4 shows the basic structure of a scatter diagram.
4. Plot the data by placing a mark at the point corresponding to each X-Y pair, as illustrated by the figure "Plotting points on a scatter diagram." If more than one classification is used, you may use different symbols to represent each group.
[Figure 14.4. Layout of a scatter diagram.]

[Figure: Plotting points on a scatter diagram. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 66. Copyright © 1990 by Thomas Pyzdek.]
EXAMPLE OF A SCATTER DIAGRAM

The orchard manager has been keeping track of the weight of peaches on a day by day basis. The data are provided in Table 14.1.

[Table 14.1. Raw data for scatter diagram. Columns: NUMBER; DAYS ON TREE; WEIGHT (OUNCES). From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 67. Copyright © 1990 by Thomas Pyzdek.]

1. Organize the data into X-Y pairs, as shown in Table 14.1. The independent variable, X, is the number of days the fruit has been on the tree. The dependent variable, Y, is the weight of the peach.
2. Find the largest and smallest values for each data set. The largest and smallest values from Table 14.1 are shown in Table 14.2.
Table 14.2. Smallest and largest values. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 68. Copyright © 1990 by Thomas Pyzdek.

VARIABLE              SMALLEST   LARGEST
Days on tree (X)      75         90
Weight of peach (Y)   4.4        6.1

3. Construct the axes. In this case, we need a horizontal axis that allows us to cover the range from 75 to 90 days. The vertical axis must cover the smallest of the small weights (4.4 ounces) to the largest of the weights (6.1 ounces). We will select values beyond these minimum requirements, because we want to estimate how long it will take for a peach to reach 6.5 ounces.
4. Plot the data. The completed scatter diagram is shown in the figure below.

[Figure: Completed scatter diagram. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 68. Copyright © 1990 by Thomas Pyzdek.]

POINTERS FOR USING SCATTER DIAGRAMS

* Scatter diagrams display different patterns that must be interpreted; Figure 14.7 provides a scatter diagram interpretation guide.
[Figure 14.7. Scatter diagram interpretation guide. From Pyzdek's Guide to SPC, Volume One: Fundamentals, p. 69. Copyright © 1990 by Thomas Pyzdek.]

* Be sure that the independent variable, X, is varied over a sufficiently large range. When X is changed only a small amount, you may not see a correlation with Y, even though the correlation really does exist.
* If you make a prediction for Y, for an X value that lies outside of the range you tested, be advised that the prediction is highly questionable and should be tested thoroughly. Predicting a Y value beyond the X range actually tested is called extrapolation.
* Keep an eye out for the effect of variables you didn't evaluate. Often, an uncontrolled variable will wipe out the effect of your X variable. It is also possible that an uncontrolled variable will be causing the effect and you will mistake the X variable you are controlling as the true cause. This problem is much less likely to occur if you choose X levels at random. An example of this is our peaches. It is possible that any number of variables changed steadily over the time period investigated. It is possible that these variables, and not the independent variable, are responsible for the weight gain (e.g., was fertilizer added periodically during the time period investigated?).
* Beware of happenstance data! Happenstance data are data that were collected in the past for a purpose different than constructing a scatter diagram. Since little or no control was exercised over important variables, you may find nearly anything. Happenstance data should be used only to get ideas for further investigation, never for reaching final conclusions. One common problem with happenstance data is that the variable that is truly important is not recorded. For example, records might show a correlation between the defect rate and the shift. However, perhaps the real cause of defects is the ambient temperature, which also changes with the shift.
* If there is more than one possible source for the dependent variable, try using different plotting symbols for each source. For example, if the orchard manager knew that some peaches were taken from trees near a busy highway, he could use a different symbol for those peaches. He might find an interaction; that is, perhaps the peaches from trees near the highway have a different growth rate than those from trees deep within the orchard.

Although it is possible to do advanced analysis without plotting the scatter diagram, this is generally bad practice. It misses the enormous learning opportunity provided by the graphical analysis of the data.

Correlation and regression

Correlation analysis (the study of the strength of the linear relationships among variables) and regression analysis (modeling the relationship between one or more independent variables and a dependent variable) are activities of considerable importance in Six Sigma. A regression problem considers the frequency distributions of one variable when another is held fixed at each of several levels. A correlation problem considers the joint variation of two variables, neither of which is restricted by the experimenter. Correlation and regression analyses are designed to assist the analyst in studying cause and effect. They may be employed in all stages of the problem-solving and planning process. Of course, statistics cannot by themselves establish cause and effect. Proving cause and effect requires sound scientific understanding of the situation at hand. The statistical methods described in this section assist the analyst in performing this task.

LINEAR MODELS

A linear model is simply an expression of a type of association between two variables, x and y. A linear relationship simply means that a change of a given size in x produces a proportionate change in y. Linear models have the form:
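As a hedged reconstruction (the symbols a, b, b_1 and b_2 are taken from the later discussion of the regression output; the exact typesetting is an assumption), a linear model with one independent variable, and its extension to two independent variables, can be written as:

```latex
y = a + b x
\qquad\text{and}\qquad
y = a + b_1 x_1 + b_2 x_2
```

Here a is the intercept and each b is the change in y produced by a one-unit change in the corresponding x.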
ANOVA, or ANalysis Of VAriance: a table examining the hypothesis that the variation explained by the regression is zero. If this is so, then the observed association could be explained by chance alone. The rows and columns are those of a standard one-factor ANOVA table (see Chapter 17). For this example, the important item is the column labeled "Significance F." The value shown, 0.00, indicates that the probability of getting these results due to chance alone is less than 0.01; i.e., the association is probably not due to chance alone. Note that the ANOVA applies to the entire model, not to the individual variables.

The next table in the output examines each of the terms in the linear model separately. The intercept is as described above, and corresponds to our term a in the linear equation. Our model uses two independent variables. In our terminology staff = b1, food = b2. Thus, reading from the coefficients column, the linear model is: y = 1.188 + 0.902 × staff score + [...] × food score. The remaining columns test the hypotheses that each coefficient in the model is actually zero.

* Standard error column: gives the standard deviations of each term, i.e., the standard deviation of the intercept = 0.565, etc.
* t stat column: the coefficient divided by the standard error, i.e., it shows how many standard deviations the observed coefficient is from zero.
* P-value: shows the area in the tail of a t distribution beyond the computed t value. For most experimental work, a P-value less than 0.05 is accepted as an indication that the coefficient is significantly different from zero. All of the terms in our model have significant P-values.
* Lower 95% and Upper 95% columns: a 95% confidence interval on the coefficient. If the confidence interval does not include zero, we reject the hypothesis that the coefficient is zero. None of the intervals in our example include zero.
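The column definitions above reduce to two small calculations, sketched here in Python. The function names are hypothetical, the intercept and its standard error are the values quoted in the text, and the interval endpoints are made up purely for illustration; a real analysis would read all of these from the regression output.

```python
def t_statistic(coefficient, standard_error):
    """Number of standard errors separating the coefficient from zero."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    return coefficient / standard_error


def interval_excludes_zero(lower, upper):
    """True when a confidence interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# Intercept and its standard error as quoted in the text.
intercept_t = t_statistic(1.188, 0.565)
print(f"t stat for the intercept: {intercept_t:.2f}")

# Hypothetical interval endpoints, for illustration only.
print(interval_excludes_zero(0.5, 1.3))   # interval is entirely positive
print(interval_excludes_zero(-0.2, 0.9))  # interval straddles zero
```

An interval that excludes zero corresponds to rejecting the hypothesis that the coefficient is zero at the matching significance level.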
CORRELATION ANALYSIS

As mentioned earlier, a correlation problem considers the joint variation of two variables, neither of which is restricted by the experimenter. Unlike regression analysis, which considers the effect of the independent variable(s) on a dependent variable, correlation analysis is concerned with the joint variation of one independent variable with another. In a correlation problem, the analyst has two measurements for each individual item in the sample. Unlike a regression study where the analyst controls the values of the x variables, correlation studies usually involve spontaneous variation in the variables being studied. Correlation methods for determining the strength of the linear relationship between two or more variables are among the most widely applied statistical techniques. More advanced methods exist for studying situations with more than two variables (e.g., canonical analysis, factor analysis, principal components analysis, etc.); however, with the exception of multiple regression, our discussion will focus on the linear association of two variables at a time.

In most cases, the measure of correlation used by analysts is the statistic r, sometimes referred to as Pearson's product-moment correlation. Usually x and y are assumed to have a bivariate normal distribution. Under this assumption r is a sample statistic which estimates the population correlation parameter. One interpretation of r is based on the linear regression model described earlier, namely that r^2 is the proportion of the total variability in the y data which can be explained by the linear regression model. The equation for r is:

$$ r = \frac{s_{xy}}{s_x s_y} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \qquad (14.7) $$

and, of course, r^2 is simply the square of r. r is bounded at -1 and +1. When the assumptions hold, the significance of r is tested by the regression ANOVA.

Interpreting r can become quite tricky, so scatter plots should always be used (see above). When the relationship between x and y is nonlinear, the "explanatory power" of r is difficult to interpret in precise terms and should be discussed with great care. While it is easy to see the value of very high correlations such as r = 0.99, it is not so easy to draw conclusions from lower values of r, even when they are statistically significant (i.e., they are significantly different from 0.0). For example, r = 0.5 does not mean the data show half as much clustering as a perfect straight-line fit.
In fact, r = 0 does not mean that there is no relationship between the x and y data, as the figure "Interpreting r = 0 for curvilinear data" shows. When r > 0, y tends to increase when x increases. When r < 0, y tends to decrease when x increases. In the figure, although r = 0, the relationship between x and y is perfect, albeit nonlinear.

At the other extreme, r = 1, a "perfect correlation," does not mean that there is a cause and effect relationship between x and y. For example, both x and y might be determined by a third variable, z. In such situations, z is described as a lurking variable which "hides" in the background, unknown to the experimenter. Lurking variables are behind some of the infamous silly associations, such as the association between teachers' pay and liquor sales (the lurking variable is general prosperity).*

*It is possible to evaluate the association of x and y by removing the effect of the lurking variable. This can be done using regression analysis and computing partial correlation coefficients. This advanced procedure is described in most texts on regression analysis.
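The point that r = 0 can coexist with a perfect nonlinear relationship is easy to check numerically. The sketch below computes r from raw sums in the computational form of equation (14.7); the function name and the two small data sets are hypothetical, chosen only so that one relationship is an exact curve and the other an exact straight line.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw sums, following the computational form of Eq. (14.7)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least two points")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


xs = [-2, -1, 0, 1, 2]
curve = [x * x for x in xs]     # y is an exact function of x, but not a linear one
line = [2 * x + 1 for x in xs]  # y is an exact linear function of x

print(pearson_r(xs, curve))  # no linear association despite an exact relationship
print(pearson_r(xs, line))   # perfect positive linear association
```

Because the x values are symmetric about zero, the positive and negative products in the numerator cancel for the curve, which is why a scatter plot is needed to see the relationship that r misses.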
[Figure: Interpreting r = 0 for curvilinear data.]

Establishing causation requires solid scientific understanding. Causation cannot be proven by statistics alone. Some statistical techniques, such as path analysis, can help determine if the correlations between a number of variables are consistent with causal assumptions. However, these methods are beyond the scope of this book.

ANALYSIS OF CATEGORICAL DATA

Chi-square, tables

MAKING COMPARISONS USING CHI-SQUARE TESTS

In Six Sigma, there are many instances when the analyst wants to compare the percentage of items distributed among several categories. The things might be operators, methods, materials, or any other grouping of interest. From each of the groups a sample is taken, evaluated, and placed into one of several categories (e.g., high quality, marginal quality, reject quality). The results can be presented as a table with m rows representing the groups of interest and k columns representing the categories. Such tables can be analyzed to answer the question "Do the groups differ with regard to the proportion of items in the categories?" The chi-square statistic can be used for this purpose.
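As a minimal sketch of the calculation, assuming the usual Pearson chi-square statistic for an m × k table of counts, the Python below computes the statistic and its degrees of freedom. The function name and the counts (two operators, three quality categories) are hypothetical and serve only to illustrate the arithmetic.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an m x k table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one item")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if all groups shared the same proportions.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: rows are operators A and B;
# columns are high quality, marginal quality, reject quality.
counts = [
    [30, 15, 5],
    [20, 20, 10],
]
statistic, dof = chi_square_statistic(counts)
print(f"chi-square = {statistic:.3f} with {dof} degrees of freedom")
```

The statistic is then compared with a chi-square distribution having (m - 1)(k - 1) degrees of freedom; a large value suggests that the groups differ in how their items fall into the categories.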
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationThis chapter will demonstrate how to perform multiple linear regression with IBM SPSS
CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the
More information2. Simple Linear Regression
Research methods  II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationComparing two groups (t tests...)
Page 1 of 33 Comparing two groups (t tests...) You've measured a variable in two groups, and the means (and medians) are distinct. Is that due to chance? Or does it tell you the two groups are really different?
More informationEXCEL EXERCISE AND ACCELERATION DUE TO GRAVITY
EXCEL EXERCISE AND ACCELERATION DUE TO GRAVITY Objective: To learn how to use the Excel spreadsheet to record your data, calculate values and make graphs. To analyze the data from the Acceleration Due
More informationSydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.
Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under
More informationSan Jose State University Engineering 10 1
KY San Jose State University Engineering 10 1 Select Insert from the main menu Plotting in Excel Select All Chart Types San Jose State University Engineering 10 2 Definition: A chart that consists of multiple
More informationStatistical Significance and Bivariate Tests
Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Refamiliarize ourselves with basic statistics ideas: sampling distributions,
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 15 scale to 0100 scores When you look at your report, you will notice that the scores are reported on a 0100 scale, even though respondents
More informationBowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition
Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology StepbyStep  Excel Microsoft Excel is a spreadsheet software application
More informationSIMPLE REGRESSION ANALYSIS
SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two
More informationInferential Statistics. Probability. From Samples to Populations. Katie RommelEsham Education 504
Inferential Statistics Katie RommelEsham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice
More informationComparing three or more groups (oneway ANOVA...)
Page 1 of 36 Comparing three or more groups (oneway ANOVA...) You've measured a variable in three or more groups, and the means (and medians) are distinct. Is that due to chance? Or does it tell you the
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationCorrelational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots
Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship
More informationUsing Excel for Statistical Analysis
Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure
More informationCanonical Correlation
Chapter 400 Introduction Canonical correlation analysis is the study of the linear relations between two sets of variables. It is the multivariate extension of correlation analysis. Although we will present
More informationMINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x  x) B. x 3 x C. 3x  x D. x  3x 2) Write the following as an algebraic expression
More informationExercise 1.12 (Pg. 2223)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
More informationData Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
More informationVariables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.
The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide
More informationUCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates
UCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally
More informationGood luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:
Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours
More informationUsing Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
More informationQUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NONPARAMETRIC TESTS
QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NONPARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.
More informationChapter 23. Inferences for Regression
Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily
More informationChapter 8 Graphs and Functions:
Chapter 8 Graphs and Functions: Cartesian axes, coordinates and points 8.1 Pictorially we plot points and graphs in a plane (flat space) using a set of Cartesian axes traditionally called the x and y axes
More informationResearch Methods & Experimental Design
Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and
More informationAnalysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationNull Hypothesis H 0. The null hypothesis (denoted by H 0
Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property
More information1 SAMPLE SIGN TEST. NonParametric Univariate Tests: 1 Sample Sign Test 1. A nonparametric equivalent of the 1 SAMPLE TTEST.
NonParametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A nonparametric equivalent of the 1 SAMPLE TTEST. ASSUMPTIONS: Data is nonnormally distributed, even after log transforming.
More informationRegression stepbystep using Microsoft Excel
Step 1: Regression stepbystep using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
More informationStatistics and research
Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,
More informationModule 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More information