# Case Study in Data Analysis: Does a drug prevent cardiomegaly in heart failure?

## FIRST IMPRESSIONS, BEFORE STATISTICAL ANALYSIS

Before doing analyses, it is always wise to look carefully at the data. Here are some thoughts:

- The heart-failure surgery does what it is supposed to do: induce heart failure, with a resulting increase in heart weight. The mean heart weight of the vehicle-treated animals almost doubled with the surgery. Since this is an established method, I'd want to compare these results with results of prior runs of this experiment. Are these values similar to those commonly seen with this kind of experiment? The investigators tell me that indeed these data match previous experiments.
- The drug alone (i.e., in conjunction with sham surgery) increased heart weight, but this increase is much smaller than that induced by the heart-failure surgery. This complicates the results.
- The increase of heart weight with the heart-failure-inducing procedure is smaller in the drug-treated animals. But this is partly because the drug increases heart weight in the sham-surgery group, and partly because the drug decreases heart weight in the animals subjected to heart-failure surgery.
- The variation among values is much higher in the heart-failure-induced animals than in the sham-operated animals. Presumably this is because of subtle differences in the surgery from one animal to another, resulting in varying degrees of heart failure. This blurs the results a bit.
- There is a huge amount of overlap. Any conclusion will be about the differences between averages of different experimental groups. The effect of the drug is not so pronounced that the heart weight of every drug-treated animal is distinct from the heart weight of every control animal.

## HOW TO DISPLAY DATA

The first step in using GraphPad Prism is to decide what kind of data table to use. This experiment is defined by two grouping variables, so I entered the values into a Grouped data table in Prism.
Half the animals were subjected to surgery to create heart failure, and half underwent sham surgery. These two groups define the two data-set columns. Half the animals were injected with an experimental drug, and half were injected with vehicle as a control. These two groups define the two rows. Values from ten animals (replicates) are placed side by side in subcolumns. One value was missing, and its spot is simply left blank.

The preceding page displayed the raw data plotted from this grouped-data table. An alternative (and more commonly used) approach is to create bar graphs (see below), plotting mean and SD (left), or mean and SEM (right).

Copyright 2010 GraphPad Software Inc. All rights reserved.

I prefer to see the actual data, as on page 1, when possible. With ten rats in each group, a graph of the raw data is easy to understand and is not excessively cluttered. I don't see any advantage to plotting the mean and SD or SEM rather than the actual data. It is also possible to plot both the raw data and the error bars on one graph, as shown below with SEM error bars.

## TWO-WAY ANOVA

The outcome (heart weight) is affected by two factors: whether or not the animal had surgically induced heart failure, and whether or not the animal was given the drug. Therefore, my first thought was to analyze the data by two-way ANOVA (later I'll show alternatives).

Repeated measures ANOVA is not appropriate. Each animal underwent either surgery to create heart failure or sham surgery, and received either the drug or the vehicle. No animal received two treatments sequentially. This would be impossible, since the outcome being compared is the heart weight, which cannot be determined while the animals are alive.

The basic idea of ANOVA, partitioning the variation into components, is not a direct way to answer the scientific question this experiment was designed to answer. Most of the ANOVA results are irrelevant to the scientific question.

The terms "effects" and "interactions" are distracting. It is too easy to think that the scientific goal of this study (or others) is to seek effects or interactions, even when those values don't correspond to the scientific goals of the experiment.

The confidence interval we need is not computed by two-way ANOVA (but it isn't hard to compute by hand from the ANOVA results).

Two-way ANOVA is based on the assumption that all the values are sampled from populations that follow a Gaussian distribution, and that all four of these populations, whatever their means, have the same SD. It takes only a glance at the raw data (page 1) to see that this assumption is violated with these data. The scatter is much greater for the heart-failure animals with larger hearts than it is for the animals given sham surgery.

## ANALYZE THE DATA WITH REGRESSION

### Entering the data on an XY data table

Since two-way ANOVA didn't directly answer the experimental question, I switched to thinking about models and regression. Below are the same data entered into an XY data table. The control (sham-surgery) animals are assigned an X value of 0; the heart-failure animals are assigned an X value of 1.

### Analyze the data with linear regression

Here are the results of linear regression:

Since each line has only two X values, the best-fit lines go right through the mean value at each X. The Y intercepts are the Y values when X=0. Since X=0 is the code for sham-operated animals, the Y intercepts equal the mean heart weight of sham-operated animals. The slope equals the difference between the mean weight of the hearts of animals with heart-failure surgery and the mean weight of hearts of rats given sham surgery, divided by the difference in X values. Since the two X values are 0 and 1, that denominator equals 1. Each slope, therefore, equals the mean difference in heart weight, comparing animals given heart-failure surgery to those given sham surgery.

The experimental question is whether the increase in heart weight was reduced in animals given the experimental drug. This is equivalent to asking: Do the two slopes differ?

Prism doesn't compute a confidence interval for that difference as part of linear regression analysis, but it can compute a P value. Check the option in the Prism linear regression dialog to compare slopes and it will report these results:

    Are the slopes equal?
    F = […]   DFn = 1   DFd = 35   P = […]

Note that this F ratio and P value are identical to those reported by two-way ANOVA in testing the null hypothesis of no interaction.

Here, linear regression really wasn't a step forward. The value we care the most about, the difference between the two slopes, was not computed by linear regression. Nor was its confidence interval.

Many people find it hard to grasp the concept of interaction in two-way ANOVA. This example shows that testing for interaction in two-way ANOVA (with two rows and two columns) is equivalent to testing whether two slopes are identical. I think the idea of interaction is easier to understand in this context.
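The claim that a line fit to X values of 0 and 1 passes through the two group means can be checked with a few lines of Python. This is only a sketch: the heart weights below are hypothetical numbers invented for illustration (they are not the study's data), and `fit_line` is a hand-written ordinary least-squares helper, not anything Prism exposes.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical heart weights (NOT the study's data).
sham = [0.30, 0.32, 0.28, 0.31]      # coded X = 0
failure = [0.55, 0.62, 0.48, 0.66]   # coded X = 1

x = [0] * len(sham) + [1] * len(failure)
y = sham + failure
intercept, slope = fit_line(x, y)

# The intercept matches the sham mean; the slope matches the mean difference.
print(intercept, mean(sham))
print(slope, mean(failure) - mean(sham))
```

With only two distinct X values, the least-squares line has no choice but to pass through both group means, which is why the slope carries exactly the same information as the difference between means.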
### Analyze the data with nonlinear regression with equal weighting

Prism's nonlinear regression analysis is far more versatile than its linear regression analysis. Let's switch to nonlinear regression, which can also fit straight lines.

Nonlinear regression lets you fit to user-defined models, so we can write a model that fits exactly what we want to know: the difference between the two slopes. Here is the user-defined equation to be entered into Prism's nonlinear regression analysis:

    <A>Slope = ControlSlope
    <B>Slope = ControlSlope - DrugEffect
    Y = Intercept + Slope*X

The last line is the familiar equation for a straight line. The two lines above it define the parameter Slope for each data set. For the first data set (control animals given vehicle) we call the slope ControlSlope. For the second data set, we define the slope to equal that control slope minus a parameter we call DrugEffect.

In the Constrain tab, define ControlSlope to be shared between the two data sets, so that Prism only fits one value that applies to both data sets. This step is essential.

With this model, Prism fits the value we want to know, DrugEffect. It is the difference between the two slopes, which is equivalent to the difference between the effect of heart failure in the controls (vehicle) and the effect in the drug-treated animals.

Prism's results: The best-fit value of DrugEffect is 0.121, with a 95% confidence interval ranging from […] to […]. This is identical to the interaction confidence interval shown on page 5.

What about a P value? We can compare the fit of this model to the fit of a model where the DrugEffect is forced to equal zero. Do this using the Compare tab of Prism's nonlinear regression dialog.
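The shared-slope model can be sketched outside Prism. The code below is an illustration under stated assumptions, not Prism's fitting routine: the data are hypothetical, the parameter names `InterceptA` and `InterceptB` are invented here (assuming each data set gets its own intercept), and the best-fit values are written in closed form, which works only because each line has just two X values and therefore passes through the group means.

```python
from statistics import mean

# Hypothetical values (NOT the study's data), keyed by (data set, X).
data = {
    ("A", 0): [0.30, 0.32, 0.28],   # vehicle, sham surgery
    ("A", 1): [0.55, 0.62, 0.48],   # vehicle, heart failure
    ("B", 0): [0.34, 0.36, 0.35],   # drug, sham surgery
    ("B", 1): [0.47, 0.52, 0.45],   # drug, heart failure
}

def predict(dataset, x, p):
    """Y = Intercept + Slope*X, with Slope defined per data set."""
    if dataset == "A":
        slope = p["ControlSlope"]
    else:
        slope = p["ControlSlope"] - p["DrugEffect"]
    return p["Intercept" + dataset] + slope * x

def sum_of_squares(p):
    """Unweighted sum of squared residuals over both data sets."""
    return sum((y - predict(ds, x, p)) ** 2
               for (ds, x), ys in data.items() for y in ys)

# Closed-form least-squares solution from the four group means.
m = {key: mean(values) for key, values in data.items()}
best = {
    "InterceptA": m[("A", 0)],
    "InterceptB": m[("B", 0)],
    "ControlSlope": m[("A", 1)] - m[("A", 0)],
}
best["DrugEffect"] = best["ControlSlope"] - (m[("B", 1)] - m[("B", 0)])

print(best["DrugEffect"])
# Forcing DrugEffect to zero can only increase the sum of squares.
print(sum_of_squares(best), sum_of_squares({**best, "DrugEffect": 0.0}))
```

Comparing the two sums of squares in the last line is the raw material for the F test that compares the full model with the model in which DrugEffect is fixed at zero.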

The resulting P value is identical to what was obtained by ANOVA:

The F and P values are identical to the interaction results of two-way ANOVA. But there are three advantages to using this model-fitting approach instead of ANOVA:

- It is easier to understand the results as a difference between two slopes than as an interaction confidence interval.
- The confidence interval is computed as part of the nonlinear regression, so it doesn't require any manual calculations.
- As you'll see below, the model-fitting approach (via nonlinear regression) gives us more options.

### Regression with differential weighting

ANOVA, as well as the regression analysis of the prior section, assumes that all four groups of data were sampled from populations with identical standard deviations. This assumption is unlikely to be true. Here are the sample standard deviations of the four groups:

| Group | Mean | SD | %CV |
|---|---|---|---|
| Sham surgery & Vehicle | […] | […] | […]% |
| Sham surgery & Drug | […] | […] | […]% |
| Heart failure & Vehicle | […] | […] | […]% |
| Heart failure & Drug | […] | […] | […]% |

The group of animals with heart failure has a larger standard deviation than the sham-surgery group. This isn't too surprising. The surgery to create heart failure is difficult, and tiny differences in the surgery would result in different degrees of heart failure and different heart weights.

One way to cope with this situation is to apply weighting factors in the nonlinear regression. Standard (unweighted) regression minimizes the sum of the squares of the difference between each observed value and the value predicted by the model. The most common form of weighted regression effectively minimizes the sum of the squares of the relative distance (the difference between the observed and predicted value divided by the predicted value). This kind of weighting makes sense when the standard deviations are not consistent, but the coefficients of variation (CV, which equals the SD divided by the mean) are consistent.

With these data (see table above), the coefficients of variation are more consistent than the standard deviations. But the CV is higher in the animals with heart failure than those without.

The assumption about consistent standard deviations (unweighted regression) or consistent coefficient of variation (weighted regression) applies to the underlying populations. All we have here are the data from one experiment. Ideally, we would want more data (from a number of experiments) before deciding how to best weight the data. And, in addition to more data, it would be nice to have some theory about where the variation is coming from.

For this analysis, I'll assume that the CV is consistent (and that the variations we observed in CV values are simply due to random sampling). Therefore, I chose "Weight by 1/Y²" (minimize the relative distances squared) on the Weights tab of Prism's nonlinear regression dialog.
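The difference between the two objective functions is easy to see in code. The following is a minimal sketch with invented numbers; the function names are illustrative and do not correspond to any Prism feature. It computes the coefficient of variation as defined above and contrasts the unweighted sum of squares with the relative (1/Y²-weighted) one.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV = sample SD divided by the mean."""
    return stdev(values) / mean(values)

def unweighted_ss(observed, predicted):
    """Sum of squared absolute distances (standard regression)."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))

def relative_ss(observed, predicted):
    """Sum of squared relative distances (distance / predicted value)."""
    return sum(((y - p) / p) ** 2 for y, p in zip(observed, predicted))

# Hypothetical points: both observations sit 10% above their predictions.
predicted = [0.30, 0.60]
observed = [0.33, 0.66]

print(unweighted_ss(observed, predicted))  # dominated by the larger value
print(relative_ss(observed, predicted))    # both points contribute equally
```

Under equal weighting the point with the larger predicted value contributes four times as much to the sum of squares, even though both points miss by the same percentage. Relative weighting treats them alike, which is the behavior wanted when the CV, rather than the SD, is roughly constant.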

Since there are only two points on each line, the best-fit lines don't change with different weighting. But using differential weighting does affect the P value and confidence interval. The P value testing the null hypothesis that there is no drug effect equals […], giving us strong evidence to reject that null hypothesis. The confidence interval for the drug effect ranges from […] to 0.186, a bit narrower than it was with equal weighting.

### Fitting a ratio rather than a difference

Expressing the effect of a drug as a difference in heart weight seems a bit awkward. Instead, let's fit the relative change in heart weight. The revised model is:

    <A>Slope = ControlSlope
    <B>Slope = ControlSlope * DrugEffectRatio
    Y = Intercept + Slope*X

The last line is the familiar equation for a straight line. The two lines above it define the parameter Slope for the two data sets. For the first data set (control animals given vehicle) we call the slope ControlSlope. For the second data set, we define the slope to equal the control slope multiplied by a parameter we call DrugEffectRatio.

In the Constrain tab, define ControlSlope to be shared between the two data sets. This instructs Prism to use global fitting to find only one best-fit value for that parameter (not one for each line). This step is essential.

With this model, Prism fits the value we really want to know, here called the DrugEffectRatio. It is the ratio of the two slopes, which is equivalent to the ratio of the effect of heart failure in the control and drug-treated animals.

Again, we'll use relative weighting. The P value testing the null hypothesis that the DrugEffectRatio is actually equal to 1.0 is […], the same as before. The best-fit value of the ratio is 0.488. In heart-failure-induced animals, the increase in heart weight in the drug-treated group equals 48.8 percent of the increase in heart weight in the control (vehicle) group.
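Because each fitted line passes through its two group means, the best-fit DrugEffectRatio of the model above reduces to a ratio of two mean differences. The sketch below uses hypothetical group means (not the study's values), and `drug_effect_ratio` is a name invented for this illustration.

```python
def drug_effect_ratio(sham_vehicle, failure_vehicle, sham_drug, failure_drug):
    """Ratio of the drug-group slope to the control-group slope.

    Each argument is a group mean; with X coded 0/1 each slope is simply
    the heart-failure mean minus the sham mean for that treatment.
    """
    control_slope = failure_vehicle - sham_vehicle
    drug_slope = failure_drug - sham_drug
    return drug_slope / control_slope

# Hypothetical group means (NOT the study's values).
ratio = drug_effect_ratio(sham_vehicle=0.30, failure_vehicle=0.55,
                          sham_drug=0.35, failure_drug=0.48)
print(f"Increase with drug is {ratio:.1%} of the increase in controls")
```

A ratio of 1.0 would mean the drug leaves the surgery-induced increase unchanged, which is why the null hypothesis for this model is DrugEffectRatio = 1.0 rather than 0.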
The 95% confidence interval ranges from 0.278 to 0.698.

Expressed as a ratio, the results are easier to understand than when they were expressed as a difference. In heart-failure animals, the presence of drug blunts the increase in heart weight to 48.8% of the increase seen in control animals. The 95% confidence interval for that effect ranges from 27.8% to 69.8%. The P value testing the null hypothesis that there is no drug effect is […].

## A SIMPLER APPROACH: UNPAIRED t TEST

### Problems with the ANOVA and regression approaches

Both the ANOVA and regression approaches analyze the data from all four groups of rats. Data were collected from all four groups, so it seems reasonable to analyze all of the data. But analyzing all of the data leads to two unavoidable problems:

The analyses are based upon an invalid assumption. ANOVA, and the simple approaches to regression, assume that all four groups of data are sampled from populations with identical standard deviations. A glance at the data (see graph on page 1) shows that this assumption is violated. Weighted regression made an alternative assumption, but it is not at all clear that this alternative assumption is valid.

The drug has two effects. It decreases the heart weight in the animals with heart failure, but it also increases the heart weights of the animals subjected to sham surgery. This complicates the analysis, as both effects contribute to the outcome we are quantifying: the difference in heart weight with heart failure, comparing animals treated with the experimental drug versus controls.

### An alternative approach

At this point, I decided to use an alternative approach to reframe the experimental question, and simply ask if the hearts of animals with heart failure are smaller when the animals are treated with the drug. This analysis requires working with only two data sets:

An unpaired t test is based on the assumptions that both groups of data are sampled from populations that follow a Gaussian distribution, and that the SDs of both of these populations are identical (even if the means are different). Both of these assumptions seem quite reasonable with these data. Therefore, the two means can be compared with an unpaired t test.

The results are expressed both as a confidence interval and as a P value:

- The difference between the two means is […], with a 95% confidence interval that extends from […] to […].
- The P value (two-tailed, testing the null hypothesis that the two populations have identical means) equals 0.030. If the two populations really had the same mean, there is a 3.0% chance that random sampling would result in a difference as large as, or larger than, that observed in this study.

The P value and confidence interval are consistent. Because the 95% confidence interval does not include 0.0, the P value must be less than 0.05.

### Expressing the results as a ratio

The results were expressed as a difference in weights (expressed as a percent of body weight), which is not easy to think about. Dividing all the results by the mean weight of the control (vehicle) animals converts the results into ratios that are easier to think about:

1. The difference between the mean heart weight of control and drug-treated is […]
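The unpaired t test used in this section can be written out in a few lines. This is a sketch under assumptions: the two samples are hypothetical (five animals each, not the study's data), and the critical value 2.306 is the standard two-tailed 95% value for 8 degrees of freedom, supplied by hand because Python's standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(a, b, t_crit):
    """Equal-variance unpaired t test: difference, SE, t ratio, df, and CI.

    t_crit is the critical t value for the desired confidence level and
    len(a) + len(b) - 2 degrees of freedom; the caller must supply it.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance assumes both populations share the same SD.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    diff = mean(a) - mean(b)
    return {"diff": diff, "se": se, "t": diff / se, "df": df,
            "ci": (diff - t_crit * se, diff + t_crit * se)}

# Hypothetical heart weights for heart-failure animals (NOT the study's data).
vehicle = [0.55, 0.62, 0.48, 0.66, 0.59]
drug = [0.47, 0.52, 0.45, 0.50, 0.41]

result = unpaired_t(vehicle, drug, t_crit=2.306)  # 95% CI, df = 8
print(result)
```

If the resulting confidence interval excludes zero, the two-tailed P value is below 0.05, which is the consistency between interval and P value noted above.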

## APPENDIX. CALCULATING THE INTERACTION CONFIDENCE INTERVAL

Here again is the table from page 6.

| | Vehicle (no drug) | Experimental Drug |
|---|---|---|
| Heart-Failure Surgery | […] | […] |
| Sham Surgery | […] | […] |
| Difference | […] | […] |

Diff. of differences: 0.121. 95% CI: […] to […].

The confidence interval is computed by the following steps:

1. Calculate the pooled SD. Two-way ANOVA doesn't report this value, but it's easy to compute. The residual (or error) mean square reported on the ANOVA table is essentially a variance. The residual mean square is […]. Its square root equals […], which is the pooled SD.
2. Compute the standard error of that difference by multiplying the pooled SD by the square root of the sum of the reciprocals of the sample sizes:

   $$SE = \text{Pooled SD} \times \sqrt{\frac{1}{n_a} + \frac{1}{n_b} + \frac{1}{n_c} + \frac{1}{n_d}}$$

3. Calculate the number of degrees of freedom (df) as the total number of values (39 rats) minus the number of groups (4). So df = 35.
4. Find the critical value from the t distribution for 95% confidence and 35 df. This value can be computed using the free GraphPad QuickCalc, using a table found in most statistics books (including Appendix D of Intuitive Biostatistics, 2nd ed.), or by using this Excel formula: =TINV(0.05, 35). The value is 2.0301.
5. Calculate the margin of error of the confidence interval. It equals the standard error ([…]) times the value from the t distribution (2.0301). The margin of error equals […].
6. Add and subtract the margin of error from the observed difference (0.121, see table at the top of this page) to obtain the confidence interval, which ranges from […] to […].
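The steps above can be sketched as a short function. The observed difference (0.121) and the critical t value (2.0301) come from the text; the residual mean square used below is a hypothetical placeholder, because the actual value is not preserved here, and the group sizes (10, 10, 10, 9) are an assumption consistent with 39 rats in four groups with one missing value. The output is therefore illustrative only.

```python
from math import sqrt

def interaction_ci(diff_of_diffs, residual_ms, group_sizes, t_crit):
    """Confidence interval for a difference of differences.

    residual_ms : residual (error) mean square from the ANOVA table
    group_sizes : number of values in each of the four groups
    t_crit      : critical t value for the chosen confidence and df
    """
    pooled_sd = sqrt(residual_ms)                            # step 1
    se = pooled_sd * sqrt(sum(1 / n for n in group_sizes))   # step 2
    margin = t_crit * se                                     # step 5
    return diff_of_diffs - margin, diff_of_diffs + margin    # step 6

group_sizes = (10, 10, 10, 9)     # assumed split of the 39 rats
df = sum(group_sizes) - len(group_sizes)   # step 3: 39 - 4 = 35

low, high = interaction_ci(
    diff_of_diffs=0.121,          # from the text
    residual_ms=0.005,            # HYPOTHETICAL placeholder value
    group_sizes=group_sizes,
    t_crit=2.0301,                # from the text (95%, 35 df)
)
print(df, low, high)
```

The standard error grows with the pooled SD and shrinks as the groups get larger, so the same function also shows how much narrower the interval would be with more animals per group.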

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### The F distribution and the basic principle behind ANOVAs. Situating ANOVAs in the world of statistical tests

Tutorial The F distribution and the basic principle behind ANOVAs Bodo Winter 1 Updates: September 21, 2011; January 23, 2014; April 24, 2014; March 2, 2015 This tutorial focuses on understanding rather

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

Version 5.0 Statistics Guide Harvey Motulsky President, GraphPad Software Inc. All rights reserved. This Statistics Guide is a companion to GraphPad Prism 5. Available for both Mac and Windows, Prism makes

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### The InStat guide to choosing and interpreting statistical tests

Version 3.0 The InStat guide to choosing and interpreting statistical tests Harvey Motulsky 1990-2003, GraphPad Software, Inc. All rights reserved. Program design, manual and help screens: Programming:

### Chapter 7. One-way ANOVA

Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

### Analyzing Dose-Response Data 1

Version 4. Step-by-Step Examples Analyzing Dose-Response Data 1 A dose-response curve describes the relationship between response to drug treatment and drug dose or concentration. Sensitivity to a drug

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### Using Excel for Statistics Tips and Warnings

Using Excel for Statistics Tips and Warnings November 2000 University of Reading Statistical Services Centre Biometrics Advisory and Support Service to DFID Contents 1. Introduction 3 1.1 Data Entry and

### A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

### An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### Using Excel s Analysis ToolPak Add-In

Using Excel s Analysis ToolPak Add-In S. Christian Albright, September 2013 Introduction This document illustrates the use of Excel s Analysis ToolPak add-in for data analysis. The document is aimed at

### Two-Way ANOVA with Post Tests 1

Version 4.0 Step-by-Step Examples Two-Way ANOVA with Post Tests 1 Two-way analysis of variance may be used to examine the effects of two variables (factors), both individually and together, on an experimental

### Experimental Designs (revisited)

Introduction to ANOVA Copyright 2000, 2011, J. Toby Mordkoff Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described

Version 5.0 Regression Guide Harvey Motulsky President, GraphPad Software Inc. All rights reserved. This Regression Guide is a companion to 5. Available for both Mac and Windows, Prism makes it very easy

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### 2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### Experiment #1, Analyze Data using Excel, Calculator and Graphs.

Physics 182 - Fall 2014 - Experiment #1 1 Experiment #1, Analyze Data using Excel, Calculator and Graphs. 1 Purpose (5 Points, Including Title. Points apply to your lab report.) Before we start measuring

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression

Chapter 9 Simple Linear Regression An analysis appropriate for a quantitative outcome and a single quantitative explanatory variable. 9.1 The model behind linear regression When we are examining the relationship

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### International Statistical Institute, 56th Session, 2007: Phil Everson

Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

### Come scegliere un test statistico

Come scegliere un test statistico Estratto dal Capitolo 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc. (disponibile in Iinternet) Table

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Chapter Study Guide. Chapter 11 Confidence Intervals and Hypothesis Testing for Means

OPRE504 Chapter Study Guide Chapter 11 Confidence Intervals and Hypothesis Testing for Means I. Calculate Probability for A Sample Mean When Population σ Is Known 1. First of all, we need to find out the

### AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Prism 6 Step-by-Step Example Linear Standard Curves Interpolating from a standard curve is a common way of quantifying the concentration of a sample.

Prism 6 Step-by-Step Example Linear Standard Curves Interpolating from a standard curve is a common way of quantifying the concentration of a sample. Step 1 is to construct a standard curve that defines

### Statistics in Medicine Research Lecture Series CSMC Fall 2014

Catherine Bresee, MS Senior Biostatistician Biostatistics & Bioinformatics Research Institute Statistics in Medicine Research Lecture Series CSMC Fall 2014 Overview Review concept of statistical power

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We've used the t-test to compare the means from two independent groups. Now we've come to the final topic of the course: how to compare means from more than two populations.

### Using Excel for Statistical Analysis

Using Excel for Statistical Analysis You don't have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

### Getting Correct Results from PROC REG

Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS's implementation of linear regression, is often used to fit a line without checking

### Two-sample inference: Continuous data

Two-sample inference: Continuous data Patrick Breheny April 5 STA 580: Biostatistics I Introduction Our next two lectures will deal with two-sample inference for continuous data As

### 0 Introduction to Data Analysis Using an Excel Spreadsheet

Experiment 0 Introduction to Data Analysis Using an Excel Spreadsheet I. Purpose The purpose of this introductory lab is to teach you a few basic things about how to use an EXCEL 2010 spreadsheet to do

### CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

### Stata Walkthrough 4: Regression, Prediction, and Forecasting

Stata Walkthrough 4: Regression, Prediction, and Forecasting Over drinks the other evening, my neighbor told me about his 25-year-old nephew, who is dating a 35-year-old woman. God, I can't see them getting

### CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

### Introduction to Statistics with GraphPad Prism (5.01) Version 1.1

Babraham Bioinformatics Introduction to Statistics with GraphPad Prism (5.01) Version 1.1 Introduction to Statistics with GraphPad Prism 2 Licence This manual is © 2010-11, Anne Segonds-Pichon. This manual

### Notes on Factoring. MA 206 Kurt Bryan

The General Approach Notes on Factoring MA 206 Kurt Bryan Suppose I hand you n, a 2 digit integer and tell you that n is composite, with smallest prime factor around 5 digits. Finding a nontrivial factor

### Confidence Intervals on Effect Size David C. Howell University of Vermont

Confidence Intervals on Effect Size David C. Howell University of Vermont Recent years have seen a large increase in the use of confidence intervals and effect size measures such as Cohen's d in reporting

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

### Introduction to Linear Regression

14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression

### Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Fitting Models to Biological Data using Linear and Nonlinear Regression

Version 4.0 Fitting Models to Biological Data using Linear and Nonlinear Regression A practical guide to curve fitting Harvey Motulsky & Arthur Christopoulos Copyright 2003 GraphPad Software, Inc. All

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations. Excel is a powerful tool and can make your life easier if you are proficient in using it. You will need to use Excel to complete most of your

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

### 1. The parameters to be estimated in the simple linear regression model Y = α + βx + ε, ε ~ N(0, σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### Forecasting in STATA: Tools and Tricks

Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be

### Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information.

Excel Tutorial Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information. Working with Data Entering and Formatting Data Before entering data

### Assignment objectives:

Assignment objectives: Regression Pivot table Exercise #1- Simple Linear Regression Often the relationship between two variables, Y and X, can be adequately represented by a simple linear equation of the

### Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R² and the coefficient of correlation

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we've covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### Hypothesis testing - Steps

Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β1 ≠ 0: 1. Set up the hypotheses: H0: β1 = 0, Ha: β1 ≠ 0. 2. Compute the test statistic: t = (b1 − 0) / (Std. error of b1) =

### A Quick Algebra Review

1. Simplifying Expressions 2. Solving Equations 3. Problem Solving 4. Inequalities 5. Absolute Values 6. Linear Equations 7. Systems of Equations 8. Laws of Exponents 9. Quadratics 10. Rationals 11. Radicals

### 17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

### Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Pristine's Day Trading Journal...with Strategy Tester and Curve Generator

Pristine's Day Trading Journal...with Strategy Tester and Curve Generator User Guide Important Note: Pristine's Day Trading Journal uses macros in an Excel file. Macros are an embedded computer code within

### Simple Linear Regression

STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašić University of Delaware Sarajevo Graduate School of Business Often our interest in data analysis

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate