# Regression Analysis: A Complete Example

Size: px
Start display at page:

Transcription

1 Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty Images A random sample of eight drivers insured with a company and having similar auto insurance policies was selected. The following table lists their driving experiences (in years) and monthly auto insurance premiums. Driving Experience (years) Monthly Auto Insurance Premium 5 \$ a. Does the insurance premium depend on the driving experience or does the driving experience depend on the insurance premium? Do you expect a positive or a negative relationship between these two variables? b. Compute SS xx, SS yy, and SS xy. c. Find the least squares regression line by choosing appropriate dependent and independent variables based on your answer in part a. d. Interpret the meaning of the values of a and b calculated in part c. e. Plot the scatter diagram and the regression line. f. Calculate r and r 2 and explain what they mean. g. Predict the monthly auto insurance premium for a driver with 10 years of driving experience. h. Compute the standard deviation of errors. i. Construct a 90% confidence interval for B. j. Test at the 5% significance level whether B is negative. k. Using α =.05, test whether ρ is different from zero.

2 Solution a. Based on theory and intuition, we expect the insurance premium to depend on driving experience. Consequently, the insurance premium is a dependent variable and driving experience is an independent variable in the regression model. A new driver is considered a high risk by the insurance companies, and he or she has to pay a higher premium for auto insurance. On average, the insurance premium is expected to decrease with an increase in the years of driving experience. Therefore, we expect a negative relationship between these two variables. In other words, both the population correlation coefficient ρ and the population regression slope B are expected to be negative. b. Table 13.5 shows the calculation of Σx, Σy, Σxy, Σx 2, and Σy 2. Experience x Table 13.5 Premium y xy x 2 y Σx = 90 Σy = 474 Σxy = 4739 Σx 2 = 1396 Σy 2 = 29,642 The values of x and y are The values of SS xy, SS xx, and SS yy are computed as follows: c. To find the regression line, we calculate a and b as follows: Thus, our estimated regression line ŷ = a + bx is

3 d. The value of a = gives the value of ŷ for x = 0; that is, it gives the monthly auto insurance premium for a driver with no driving experience. However, as mentioned earlier in this chapter, we should not attach much importance to this statement because the sample contains drivers with only two or more years of experience. The value of b gives the change in ŷ due to a change of one unit in x. Thus, b = indicates that, on average, for every extra year of driving experience, the monthly auto insurance premium decreases by \$1.55. Note that when b is negative, y decreases as x increases. e. Figure shows the scatter diagram and the regression line for the data on eight auto drivers. Note that the regression line slopes downward from left to right. This result is consistent with the negative relationship we anticipated between driving experience and insurance premium. Figure Scatter diagram and the regression line. f. The values of r and r 2 are computed as follows: The value of r =.77 indicates that the driving experience and the monthly auto insurance premium are negatively related. The (linear) relationship is strong but not very strong. The value of r 2 =.59 states that 59% of the total variation in insurance premiums is explained by years of driving experience and 41% is not. The low value of r 2 indicates that there may be many other important variables that contribute to the determination of auto insurance premiums. For example, the premium is expected to depend on the driving record of a driver and the type and age of the car. g. Using the estimated regression line, we find the predicted value of y for x = 10 is Thus, we expect the monthly auto insurance premium of a driver with 10 years of driving experience to be \$ h. The standard deviation of errors is i. To construct a 90% confidence interval for B, first we calculate the standard deviation of b:

4 For a 90% confidence level, the area in each tail of the t distribution is The degrees of freedom are From the t distribution table, the t value for.05 area in the right tail of the t distribution and 6 df is The 90% confidence interval for B is Thus, we can state with 90% confidence that B lies in the interval 2.57 to.52. That is, on average, the monthly auto insurance premium of a driver decreases by an amount between \$.52 and \$2.57 for every extra year of driving experience. j. We perform the following five steps to test the hypothesis about B. Step 1. State the null and alternative hypotheses. The null and alternative hypotheses are written as follows: Note that the null hypothesis can also be written as H 0 : B 0. Step 2. Select the distribution to use. Because σ is not known, we use the t distribution to make the hypothesis test. Step 3. Determine the rejection and nonrejection regions. The significance level is.05. The < sign in the alternative hypothesis indicates that it is a lefttailed test. From the t distribution table, the critical value of t for.05 area in the left tail of the t distribution and 6 df is 1.943, as shown in Figure Figure Step 4. Calculate the value of the test statistic. The value of the test statistic t for b is calculated as follows: Step 5. Make a decision.

5 The value of the test statistic t = falls in the rejection region. Hence, we reject the null hypothesis and conclude that B is negative. That is, the monthly auto insurance premium decreases with an increase in years of driving experience. Using the p-value to Make a Decision We can find the range for the p-value from the t distribution table (Table V of Appendix C) and make a decision by comparing that p-value with the significance level. For this example, df = 6 and the observed value of t is From Table V (the t distribution table) in the row of df = 6, is between and The corresponding areas in the right tail of the t distribution are.025 and.01. But our test is left-tailed and the observed value of t is negative. Thus, t = lies between and The corresponding areas in the left tail of the t distribution are.025 and.01. Therefore the range of the p-value is.01 < p-value<.025 Thus, we can state that for any α equal to or greater than.025 (the upper limit of the p-value range), we will reject the null hypothesis. For our example, α =.05, which is greater than the upper limit of the p-value of.025. As a result, we reject the null hypothesis. Note that if we use technology to find this p-value, we will obtain a p-value of.013. Then we can reject the null hypothesis for any α >.013. k. We perform the following five steps to test the hypothesis about the linear correlation coefficient ρ. Step 1. State the null and alternative hypotheses. The null and alternative hypotheses are: Step 2. Select the distribution to use. Assuming that variables x and y are normally distributed, we will use the t distribution to perform this test about the linear correlation coefficient. Step 3. Determine the rejection and nonrejection regions. The significance level is 5%. From the alternative hypothesis we know that the test is two-tailed. Hence, From the t distribution table, Table V of Appendix C, the critical values of t are and The rejection and nonrejection regions for this test are shown in Figure Figure Step 4. Calculate the value of the test statistic. The value of the test statistic t for r is calculated as follows:

6 Step 5. Make a decision. The value of the test statistic t = falls in the rejection region. Hence, we reject the null hypothesis and conclude that the linear correlation coefficient between driving experience and auto insurance premium is different from zero. Using the p-value to Make a Decision We can find the range for the p-value from the t distribution table and make a decision by comparing that p-value with the significance level. For this example, df = 6 and the observed value of t is From Table V (the t distribution table) in the row of df = 6, t = is between and The corresponding areas in the right tail of the t distribution curve are.025 and.01. Since the test is two tailed, the range of the p-value is Thus, we can state that for any α equal to or greater than.05 (the upper limit of the p-value range), we will reject the null hypothesis. For our example, α =.05, which is equal to the upper limit of the p-value. As a result, we reject the null hypothesis. Welcome to Minitab, press F1 for help. Regression Analysis: Premium y versus Experience x The regression equation is Premium y = Experience x Predictor Coef SE Coef T P Constant Experience x S = R-Sq = 59.0% R-Sq(adj) = 52.1% Analysis of Variance Source DF SS MS F P Regression Residual Error Total Predicted Values for New Observations New Obs Fit SE Fit 95% CI 95% PI (52.11, 70.26) (34.35, 88.02) Values of Predictors for New Observations New Obs Experience x

7 The following table gives information on ages and cholesterol levels for a random sample of 10 men. Age Cholesterol level a. Taking age as an independent variable and cholesterol level as a dependent variable, compute SS xx, SS yy, and SS xy. b. Find the regression of cholesterol level on age. c. Briefly explain the meaning of the values of a and b calculated in part b. d. Calculate r and r 2 and explain what they mean. e. Plot the scatter diagram and the regression line. f. Predict the cholesterol level of a 60-year-old man. g. Compute the standard deviation of errors. h. Construct a 95% confidence interval for B. i. Test at the 5% significance level if B is positive. j. Using α =.025, can you conclude that the linear correlation coefficient is positive? Answer: a. SS xx = ; SS yy = ; SS xy = b. ŷ = x d. r =.41; r 2 =.17 f g. s e = h..53 to 1.83 i. H 0 : B = 0; H 1 : B > 0; critical value: t = 1.860; test statistic: t = 1.265; do not reject H 0 j. H 0 : ρ = 0; H 1 : ρ > 0; critical value: t = 2.306; test statistic: t = 1.271; do not reject H 0

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### 17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

### Hypothesis testing - Steps

Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

### August 2012 EXAMINATIONS Solution Part I

August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002

### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### 2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

### MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

### Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

### Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

### TIME SERIES ANALYSIS & FORECASTING

CHAPTER 19 TIME SERIES ANALYSIS & FORECASTING Basic Concepts 1. Time Series Analysis BASIC CONCEPTS AND FORMULA The term Time Series means a set of observations concurring any activity against different

### 1.1. Simple Regression in Excel (Excel 2010).

.. Simple Regression in Excel (Excel 200). To get the Data Analysis tool, first click on File > Options > Add-Ins > Go > Select Data Analysis Toolpack & Toolpack VBA. Data Analysis is now available under

### Regression and Correlation

Regression and Correlation Topics Covered: Dependent and independent variables. Scatter diagram. Correlation coefficient. Linear Regression line. by Dr.I.Namestnikova 1 Introduction Regression analysis

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### 1 Simple Linear Regression I Least Squares Estimation

Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

### A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

CHAPTER 5. A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 5.1 Concepts When a number of animals or plots are exposed to a certain treatment, we usually estimate the effect of the treatment

### The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### c 2015, Jeffrey S. Simonoff 1

Modeling Lowe s sales Forecasting sales is obviously of crucial importance to businesses. Revenue streams are random, of course, but in some industries general economic factors would be expected to have

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### Hypothesis Testing --- One Mean

Hypothesis Testing --- One Mean A hypothesis is simply a statement that something is true. Typically, there are two hypotheses in a hypothesis test: the null, and the alternative. Null Hypothesis The hypothesis

### Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### The Volatility Index Stefan Iacono University System of Maryland Foundation

1 The Volatility Index Stefan Iacono University System of Maryland Foundation 28 May, 2014 Mr. Joe Rinaldi 2 The Volatility Index Introduction The CBOE s VIX, often called the market fear gauge, measures

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

### An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice

### 2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.\$ and Sales \$: 1. Prepare a scatter plot of these data. The scatter plots for Adv.\$ versus Sales, and Month versus

### CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

Using Your TI-83/84/89 Calculator: Linear Correlation and Regression Dr. Laura Schultz Statistics I This handout describes how to use your calculator for various linear correlation and regression applications.

### Solution Let us regress percentage of games versus total payroll.

Assignment 3, MATH 2560, Due November 16th Question 1: all graphs and calculations have to be done using the computer The following table gives the 1999 payroll (rounded to the nearest million dolars)

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### Online 12 - Sections 9.1 and 9.2-Doug Ensley

Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

### Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

### A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

### MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### The correlation coefficient

The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

### Using R for Linear Regression

Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

### Math 108 Exam 3 Solutions Spring 00

Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the ph in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8

### Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

### Chapter 2 Probability Topics SPSS T tests

Chapter 2 Probability Topics SPSS T tests Data file used: gss.sav In the lecture about chapter 2, only the One-Sample T test has been explained. In this handout, we also give the SPSS methods to perform

### Copyright 2010-2011 PEOPLECERT Int. Ltd and IASSC

PEOPLECERT - Personnel Certification Body 3 Korai st., 105 64 Athens, Greece, Tel.: +30 210 372 9100, Fax: +30 210 372 9101, e-mail: info@peoplecert.org, www.peoplecert.org Copyright 2010-2011 PEOPLECERT

### Example: Boats and Manatees

Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

### Stata Walkthrough 4: Regression, Prediction, and Forecasting

Stata Walkthrough 4: Regression, Prediction, and Forecasting Over drinks the other evening, my neighbor told me about his 25-year-old nephew, who is dating a 35-year-old woman. God, I can t see them getting

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

### Correlation and Simple Linear Regression

Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Interaction between quantitative predictors

Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

### Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

Section 7.1 Introduction to Hypothesis Testing Schrodinger s cat quantum mechanics thought experiment (1935) Statistical Hypotheses A statistical hypothesis is a claim about a population. Null hypothesis

### HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

### Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!

Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!) Part A - Multiple Choice Indicate the best choice

### Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY

Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

### General Regression Formulae ) (N-2) (1 - r 2 YX

General Regression Formulae Single Predictor Standardized Parameter Model: Z Yi = β Z Xi + ε i Single Predictor Standardized Statistical Model: Z Yi = β Z Xi Estimate of Beta (Beta-hat: β = r YX (1 Standard

### General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1.

General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

### Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

### 5/31/2013. Chapter 8 Hypothesis Testing. Hypothesis Testing. Hypothesis Testing. Outline. Objectives. Objectives

C H 8A P T E R Outline 8 1 Steps in Traditional Method 8 2 z Test for a Mean 8 3 t Test for a Mean 8 4 z Test for a Proportion 8 6 Confidence Intervals and Copyright 2013 The McGraw Hill Companies, Inc.

### DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS Nađa DRECA International University of Sarajevo nadja.dreca@students.ius.edu.ba Abstract The analysis of a data set of observation for 10

### Module 5: Statistical Analysis

Module 5: Statistical Analysis To answer more complex questions using your data, or in statistical terms, to test your hypothesis, you need to use more advanced statistical tests. This module reviews the

### AP Statistics. Chapter 4 Review

Name AP Statistics Chapter 4 Review 1. In a study of the link between high blood pressure and cardiovascular disease, a group of white males aged 35 to 64 was followed for 5 years. At the beginning of

### Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu

Submission for ARCH, October 31, 2006 Jinadasa Gamage, Professor of Mathematics, Illinois State University, Normal, IL, e- mail: jina@ilstu.edu Jed L. Linfield, FSA, MAAA, Health Actuary, Kaiser Permanente,

### Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

### Simple Linear Regression, Scatterplots, and Bivariate Correlation

1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

### One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

### Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this