STAT 350 Review Set (#3) Make sure to study Lab 8 software output and questions.

1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX, and separate files contain figures and tables. For the previous edition of the textbook, the number of pages in the LaTeX files can easily be determined, as well as the number of pages in the final version of the textbook. Here are the data:

[Data table and scatter plot not reproduced]

              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)   -6.20176   5.71233      -1.086    0.301
latex          1.20810   0.09828      xxx       0
Multiple R-squared: 0.9321, Adjusted R-squared: 0.926

a) What is the explanatory variable? What is the response variable? Explanatory: number of LaTeX pages. Response: number of textbook pages.
b) Describe the overall pattern of the scatter plot above. Strong, positive, linear relationship between textbook pages and LaTeX pages.
c) Find the equation of the least-squares regression line based on the software output. TextPages = -6.20 + 1.21 LaTeXPages
d) Interpret the slope of the regression line. For each additional LaTeX page, the predicted number of textbook pages increases by about 1.21.
e) Find the predicted number of pages for the next edition if the number of LaTeX pages for a chapter is 62. TextPages = -6.20 + 1.21 * 62 = 68.82, or about 69 pages.
f) What proportion of the variation in text pages is explained by LaTeX pages? Multiple R-squared: r^2 = 0.9321. About 93.2% of the variation in text pages is explained by LaTeX pages.
g) What is the value of the correlation coefficient r? r = sqrt(0.9321) = 0.97 (positive, because the slope is positive).
h) Find a 90% confidence interval for the slope β1. df = 13 - 2 = 11, t* = 1.796. 1.2081 +/- 1.796 * 0.09828 = (1.03, 1.38)
i) Suppose there are 80 LaTeX pages in a chapter for the new edition. Raj obtained a 95% confidence interval for the mean pages and a 95% prediction interval for the pages of the final version, but did not label them. Which of the following two is the prediction interval? Prediction interval: (75, 106). Confidence interval: (84, 97). The prediction interval is the wider of the two.
j) Which interval should the editor use to plan for the next edition of the statistics textbook? Prediction interval.
k) One new chapter has 200 LaTeX pages. Discuss whether it is appropriate to use the regression line obtained in part (c) to predict the final textbook pages. Not really. We need to be careful about extrapolation.
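The arithmetic in problem 1 can be checked with a short script. This is only a sketch: the numbers are copied from the software output above, the critical value 1.796 is the one used in the worked answer, and the variable and function names are made up for illustration. It uses the unrounded coefficients, so the prediction for 62 LaTeX pages comes out near 68.7 instead of 68.82, which still rounds to 69 pages. It also shows how the masked t value for the slope would be obtained, as Estimate divided by Std. Error.

```python
import math

# Values copied from the software output in problem 1.
intercept = -6.20176
slope = 1.20810
se_slope = 0.09828
r_squared = 0.9321

n = 13              # sample size implied by "df = 13 - 2" in the worked answer
df = n - 2
t_star_90 = 1.796   # t critical value used in the worked answer (90%, df = 11)


def predict_text_pages(latex_pages):
    """Predicted textbook pages from the least-squares line."""
    return intercept + slope * latex_pages


# t statistic for the slope: estimate divided by its standard error.
t_slope = slope / se_slope

# r has the same sign as the slope and magnitude sqrt(R-squared).
r = math.copysign(math.sqrt(r_squared), slope)

# 90% confidence interval for the slope: estimate +/- t* x SE.
margin = t_star_90 * se_slope
ci_slope = (slope - margin, slope + margin)

print(f"predicted pages at 62 LaTeX pages: {predict_text_pages(62):.2f}")
print(f"t for slope: {t_slope:.2f}, df = {df}")
print(f"r = {r:.2f}")
print(f"90% CI for slope: ({ci_slope[0]:.2f}, {ci_slope[1]:.2f})")
```

Calling predict_text_pages(200) would also return a number, but 200 LaTeX pages is the extrapolation case the last part of the problem warns about.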

2. A study shows that there is a positive correlation between the size of a hospital (measured by its number of beds x) and the median number of days y that patients remain in the hospital. Does this mean that you can shorten a hospital stay by choosing a small hospital? Why? No. Correlation does not imply causation. Sicker patients tend to go to larger hospitals and stay longer.

3. Data show that married, divorced, and widowed men earn quite a bit more than men the same age who have never been married. This does not mean that a man can raise his income by getting married, because men who have never been married are different from married men in many ways other than marital status. Suggest several lurking variables that might help explain the association between marital status and income. Age.

4. (10.44) Can a pretest on mathematics skills predict success in a statistics course? The 62 students in an introductory statistics class took a pretest at the beginning of the semester. The least-squares regression line for predicting the score y on the final exam from the pretest score x was ŷ = 13.8 + 0.81x. The standard error of b1 was 0.43.
(a) Test the null hypothesis that there is no linear relationship between the pretest score and the score on the final exam against the two-sided alternative.
1) H0: slope = 0 vs. H1: slope ≠ 0
2) Test statistic: t = 0.81/0.43 = 1.88, with df = 62 - 2 = 60
3) The one-sided tail area is between 0.025 and 0.05, so the two-sided P-value is in (0.025*2, 0.05*2) = (0.05, 0.10).
4) At significance level 0.05, we fail to reject the null hypothesis. The data do not provide sufficient evidence of a linear relationship between the two scores.
(b) Would you reject this null hypothesis versus the one-sided alternative that the slope is positive? Explain your answer. Yes, we would reject the null hypothesis for the one-sided alternative at significance level 0.05, because the one-sided P-value is between 0.025 and 0.05.
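The test in problem 4 can be traced step by step with a small sketch. The slope, standard error, and sample size come from the problem. The two critical values for df = 60 (1.671 and 2.000) are an assumption taken from a standard t table and do not appear in the original. No exact P-value is computed; the script only brackets it the way the worked answer does.

```python
# Values from problem 4.
b1 = 0.81        # estimated slope
se_b1 = 0.43     # standard error of the slope
n = 62           # number of students
alpha = 0.05

df = n - 2
t_stat = b1 / se_b1

# Upper-tail critical values for df = 60, taken from a standard t table
# (assumed here; they are not given in the original problem).
T_CRIT_DF60 = {0.05: 1.671, 0.025: 2.000}

# t lies between the two critical values, so the one-sided tail area
# lies between 0.025 and 0.05.
t_in_bracket = T_CRIT_DF60[0.05] < t_stat < T_CRIT_DF60[0.025]
p_one_sided = (0.025, 0.05)
p_two_sided = (2 * p_one_sided[0], 2 * p_one_sided[1])

# Reject only when the whole bracket for the P-value is at or below alpha.
reject_one_sided = p_one_sided[1] <= alpha
reject_two_sided = p_two_sided[1] <= alpha

print(f"t = {t_stat:.2f}, df = {df}")
print(f"two-sided P-value between {p_two_sided[0]:.2f} and {p_two_sided[1]:.2f}")
print(f"reject (two-sided): {reject_two_sided}, reject (one-sided): {reject_one_sided}")
```

The same t statistic leads to different decisions only because the two-sided P-value doubles the tail area.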

5. (10.39, 10.40, 10.41) The Leaning Tower of Pisa is an architectural wonder. Engineers concerned about the tower's stability have done extensive studies of its increasing tilt. Measurements of the lean of the tower over time provide much useful information. The following table gives measurements for the years 1975 to 1987. The variable lean represents the difference between where a point on the tower would be if the tower were straight and where it actually is. The data are coded as tenths of a millimeter in excess of 2.9 meters, so that the 1975 lean, which was 2.9642 meters, appears in the table as 642. Only the last two digits of the year were entered into the computer.

[Table of year and lean measurements not reproduced]

A. Does the trend in lean over time appear to be linear? Yes.
B. What is the equation of the least-squares line? What percent of the variation in lean is explained by this line? Lean = -61.12 + 9.32 Year. From the software output, R-squared = 0.988, so about 98.8% of the variation in lean is explained by the line.
C. Give a 99% confidence interval for the average rate of change (tenths of a millimeter per year) of the lean. df = 13 - 2 = 11, t* = 3.1058. 9.3187 +/- 3.1058 x 0.3099 = (8.36, 10.28)
D. In 1918 the lean was 2.9071 meters. (The coded value is 71.) Using the least-squares equation for the years 1975 to 1987, calculate a predicted value for the lean in 1918. (Note that you must use the coded value 18 for year.) Lean(18) = 107, that is, 107 tenths of a millimeter (10.7 mm) beyond 2.9 m, or 2.9107 m.
E. Although the least-squares line gives an excellent fit to the data for 1975 to 1987, this pattern did not extend back to 1918. Write a short statement explaining why this conclusion follows from the information available. Use numerical and graphical summaries to support your explanation. This is an example of extrapolation. The year 1918 lies far outside the 1975 to 1987 range used to fit the line, and the predicted coded lean of 107 is well above the observed value of 71.
F. How would you code the explanatory variable for the year 2013? 2013 - 1900 = 113
G. The engineers working on the Leaning Tower of Pisa were most interested in how much the tower would lean if no corrective action was taken. Use the least-squares equation to predict the tower's lean in the year 2013. (Note: The tower was renovated in 2001 to make sure it does not fall down.) Lean(113) = 992, that is, 992 tenths of a millimeter (99.2 mm) beyond 2.9 m, or 2.9992 m.
H. To give a margin of error for the lean in 2013, would you use a confidence interval for a mean response or a prediction interval? Explain your choice. Prediction interval, since we are predicting for one specific year.

Coefficients:
              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)   -61.1209   25.1298      xxxx      0.0333 *
year            9.3187    0.3099      xxxx      6.5e-12 ***
s = 4.181, R-squared = 0.988
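The Pisa calculations depend on the coding of both variables, so a short sketch may help keep the units straight. The estimates and standard errors come from the output above, and t* = 3.1058 is the value used in part C. The helper names are hypothetical. One coded unit is a tenth of a millimeter, which is 0.0001 m, so a coded lean of 642 maps back to 2.9642 m as stated in the problem. The masked t values are recovered as Estimate divided by Std. Error.

```python
# Values copied from the software output for the Pisa data.
intercept = -61.1209
se_intercept = 25.1298
slope = 9.3187
se_slope = 0.3099
t_star_99 = 3.1058   # t critical value used in part C (99%, df = 11)


def code_year(year):
    """Code a calendar year as years since 1900 (1975 -> 75, 2013 -> 113)."""
    return year - 1900


def predicted_lean_coded(year):
    """Predicted lean in coded units (tenths of a millimeter beyond 2.9 m)."""
    return intercept + slope * code_year(year)


def coded_to_meters(coded_lean):
    """Convert coded lean to meters: one coded unit is 0.0001 m."""
    return 2.9 + coded_lean * 1e-4


# t statistics are estimate divided by standard error.
t_intercept = intercept / se_intercept
t_slope = slope / se_slope

# 99% confidence interval for the yearly rate of change of the lean.
margin = t_star_99 * se_slope
ci_slope = (slope - margin, slope + margin)

for year in (1918, 2013):
    coded = round(predicted_lean_coded(year))
    print(f"{year}: coded lean {coded}, about {coded_to_meters(coded):.4f} m")
print(f"99% CI for slope: ({ci_slope[0]:.2f}, {ci_slope[1]:.2f})")
print(f"t values: intercept {t_intercept:.2f}, year {t_slope:.2f}")
```

Both 1918 and 2013 fall outside the 1975 to 1987 range, so these predictions illustrate the formula and are not reliable forecasts.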