Stat 301 Exam 1 October 1, 2013

Name:

INSTRUCTIONS: Read the questions carefully and completely. Answer each question and show work in the space provided. Partial credit will not be given if work is not shown. Use the JMP output. It is not necessary to calculate something by hand that JMP has already calculated for you. When asked to explain, describe, or comment, do so within the context of the problem and support statements with statistical summaries. Be sure to include units of measurement when discussing quantitative variables.

A person's percentage body fat is determined from the person's density. The density is obtained from the displacement of water in a large tub. In this exam we will look at men's percentage body fat.

1. [27 pts] The American Council on Exercise (ACE) has a chart that describes different levels of percentage body fat. For male athletes the range of body fat is 6 to 13%. A random sample of 44 men has their percentage body fat determined by water displacement. The JMP analysis of the data is given below.

[Figure: histogram (Count) and box plot of Percentage Body Fat]

Quantiles
  100.0%  maximum   40.1
   75.0%  quartile  2.0
   50.0%  median    17.1
   25.0%  quartile  11.7
    0.0%  minimum   3.7

Summary Statistics
  Mean            18.78
  Std Dev         9.318
  Std Err Mean    1.405
  Upper 95% Mean  21.61
  Lower 95% Mean  15.95
  N               44

Test Mean
  Hypothesized Value  13
  DF                  43

  t Test
  Test Statistic  4.114
  Prob > |t|      0.0002*
  Prob > t        <.0001*
  Prob < t        0.9999

a) [4] Looking at the histogram, describe the distribution of percentage body fat. Be sure to comment on shape, center and variability.

b) [3] Looking at the box plot, are there any potential outliers? How do you know this? If so, what is (are) the associated percentage body fat(s)?

c) [8] Could this sample be from a population of athletes? Test the hypothesis that the population mean percentage body fat is 13% versus an alternative that the population mean percentage body fat is greater than 13%. Be sure to give the null and alternative hypotheses using appropriate statistical notation, value of the test statistic, P-value, decision and reason for reaching that decision, and a conclusion in the context of the problem. For this problem use the usual significance level of 0.05.

d) [4] Give the values for the 95% confidence interval for the population mean percentage body fat. Explain briefly why this confidence interval is consistent with the test of hypothesis you did in c).

e) [4] Construct a 95% prediction interval for the percentage body fat of a randomly selected man. Note: the appropriate value of t* is 2.0167.

f) [4] What is the difference in interpretation between the confidence interval in d) and the prediction interval in e)?

2. [33 pts] The random sample of 44 men includes 24 men who are under 40 years of age and 20 men who are 40 to 55 years of age. The JMP analysis appears below.

[Figure: box plots of BodyFat by Age Group (40 to 55, under 40)]

Summary of Fit
  Rsquare                     0.264098
  Adj Rsquare                 0.246576
  Root Mean Square Error      8.087709
  Mean of Response            18.7795
  Observations (or Sum Wgts)  44

t Test, Assuming equal variances
  Difference    9.507     t Ratio     3.88237
  Std Err Dif   2.449     DF          42
  Upper CL Dif  14.448    Prob > |t|  0.0004*
  Lower CL Dif  4.565     Prob > t    0.0002*
  Confidence    0.95      Prob < t    0.9998

Means and Std Deviations
  Level     Number  Mean     Std Dev  Std Err Mean  Lower 95%  Upper 95%
  40 to 55  20      23.9650  10.2158  2.2843        19.184     28.746
  under 40  24      14.4583  5.7648   1.1767        12.024     16.893

a) [5] Compare the percentage body fat of 40 to 55 year old men to that of the men under 40. Be sure to compare centers and variability, and mention if there are any potential outliers in either group.

b) [3] What is the value of s_p, the pooled estimate of the common standard deviation, σ?

c) [8] Test the hypothesis that there is no difference between the population mean percentage body fat of 40 to 55 year old men and the population mean percentage body fat of men under 40 years old. Be sure to give the null and alternative hypotheses using clearly understood statistical symbols, value of the test statistic, P-value, decision and reason for reaching that decision, and a conclusion in the context of the problem.

d) [5] Give the 95% confidence interval for the difference in population mean percentage body fat. What does this say about how much the population mean percentage body fat of men 40 to 55 years old differs from that of men under 40 years old?

e) [4] Is the condition of equal population standard deviations, σ, satisfied for these data? Support your answer.

Below is the JMP output for the distribution of the two-sample residuals.

[Figure: normal quantile plot, box plot, and histogram (Count) of Residual]

f) [3] Looking at the normal quantile plot, describe what you see and what this tells you about the condition that random errors are normally distributed.

g) [3] Looking at the box plot, compare the median to the mean. What does this comparison indicate about the shape of the distribution of residuals?

h) [2] Looking at the histogram, where is the mound? How would you describe the shape of the distribution of residuals?
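The t procedures in questions 1 and 2 can be reproduced from summary statistics alone. The following Python sketch is an illustration, not part of the exam or of JMP. The function names are hypothetical. The one-sample inputs (mean 18.78, standard deviation 9.318, n = 44, hypothesized value 13, t* = 2.0167) are read from the question 1 output. The group summaries used for question 2 (means 23.965 and 14.4583, standard deviations 10.2158 and 5.7648) are assumed values, chosen to agree with the reported standard errors and root mean square error.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """Return (standard error of the mean, t statistic) for H0: mu = mu0."""
    se = sd / math.sqrt(n)
    return se, (mean - mu0) / se


def mean_ci(mean, sd, n, t_star):
    """Confidence interval for the population mean."""
    half = t_star * sd / math.sqrt(n)
    return mean - half, mean + half


def prediction_interval(mean, sd, n, t_star):
    """Prediction interval for a single new observation from the population."""
    half = t_star * sd * math.sqrt(1 + 1 / n)
    return mean - half, mean + half


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (pooled sd, std error of the difference, t ratio), equal variances assumed."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    se_diff = sp * math.sqrt(1 / n1 + 1 / n2)
    return sp, se_diff, (mean1 - mean2) / se_diff


# Question 1: values read from the JMP output.
se, t_stat = one_sample_t(18.78, 9.318, 44, 13)
ci = mean_ci(18.78, 9.318, 44, 2.0167)
pi = prediction_interval(18.78, 9.318, 44, 2.0167)

# Question 2: group summaries are assumptions (see the note above).
sp, se_diff, t_ratio = pooled_t(23.965, 10.2158, 20, 14.4583, 5.7648, 24)

print(f"one-sample: SE = {se:.3f}, t = {t_stat:.3f}")
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% prediction interval: ({pi[0]:.2f}, {pi[1]:.2f})")
print(f"two-sample: s_p = {sp:.3f}, SE diff = {se_diff:.3f}, t = {t_ratio:.3f}")
```

The extra 1 under the square root in `prediction_interval` accounts for the variability of a single new observation, which is why that interval is wider than the confidence interval for the mean.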

3. [40 pts] Measuring percentage body fat by displacement of water is a time-consuming process that requires the individual to be naked. Could a less time-consuming and less invasive measurement, like the circumference of a man's abdomen (cm), be used to predict the percentage body fat? Below is JMP output looking at the relationship between abdomen circumference and percentage body fat.

[Figure: plot of BodyFat versus Abdomen]

Summary of Fit
  RSquare                     0.706645
  RSquare Adj                 0.699660
  Root Mean Square Error      5.10637
  Mean of Response            18.7795
  Observations (or Sum Wgts)  44

Parameter Estimates
  Term          Estimate  Std Error  t Ratio  Prob>|t|  Lower 95%  Upper 95%
  Intercept     -41.19    6.012      -6.85    <.0001*   -53.32     -29.06
  Abdomen (cm)  0.648     0.0644     10.06    <.0001*   0.518      0.778

a) [3] Describe the general relationship between abdomen circumference and percentage body fat. Use complete sentences and say something about direction, form, strength and unusual values.

b) [3] Give the equation of the least squares regression line relating percentage body fat to the abdomen circumference.

c) [2] Use the equation in b) to predict the percentage body fat for a man with abdomen circumference of 1 cm.

d) [5] Calculate a 95% prediction interval for a man with abdomen circumference of 1 cm. Note: t* = 2.0181. Note: the sample mean abdomen circumference is 92.6 cm and the sample variance of abdomen circumference is 146.21 cm².

e) [5] Give an interpretation of the estimated slope coefficient within the context of the problem.

f) [3] Why doesn't the intercept have an interpretation within the context of the problem?

g) [3] Give the value of R² and an interpretation of that value within the context of the problem.

h) [2] Give the value of the estimate of the random error standard deviation, σ.

i) [6] Report the 95% confidence interval for the slope. Use this interval to test for a statistically significant relationship between percentage body fat and abdomen circumference.

j) [4] Describe what you see in the plot of residuals versus predicted body fat. What does this plot tell you about the adequacy of the linear model?

[Figure: plot of BodyFat Residual versus BodyFat Predicted]

k) [4] Comment on the condition of normally distributed random errors. Be sure to support your comments by referring to the normal quantile plot of residuals.

[Figure: normal quantile plot of BodyFat Residual]
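The regression calculations in question 3 follow from the parameter estimates and a few summary values. The sketch below is an illustration under stated assumptions, not the exam's solution. The function names are hypothetical. The intercept is taken as -41.19 (negative, so that the fitted line passes near the sample means). t* = 2.0181 is the 97.5th percentile of the t distribution with 42 degrees of freedom. The residual standard deviation is computed from R² = 0.70664 and the body fat standard deviation 9.318 rather than read from the output. The abdomen value x0 = 100 cm is made up for the example and is not the value asked about in the exam.

```python
import math


def rmse_from_r2(sd_y, r2, n):
    """Residual standard deviation of a simple linear regression, from R^2 and sd of y."""
    sse = (1 - r2) * (n - 1) * sd_y ** 2
    return math.sqrt(sse / (n - 2))


def predict(b0, b1, x0):
    """Predicted response on the least squares line."""
    return b0 + b1 * x0


def regression_prediction_interval(b0, b1, x0, s, n, x_mean, x_var, t_star):
    """Prediction interval for one new response at x0."""
    y_hat = predict(b0, b1, x0)
    # (n - 1) * x_var is the sum of squared deviations of x about its mean.
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_mean) ** 2 / ((n - 1) * x_var))
    half = t_star * se_pred
    return y_hat - half, y_hat + half


def slope_ci(b1, se_b1, t_star):
    """Confidence interval for the slope."""
    half = t_star * se_b1
    return b1 - half, b1 + half


N = 44
T_STAR = 2.0181            # assumed: t(0.975, df = 42)
B0, B1 = -41.19, 0.648     # intercept sign is an assumption
X0 = 100.0                 # hypothetical abdomen circumference (cm)

s = rmse_from_r2(9.318, 0.70664, N)
y_hat = predict(B0, B1, X0)
lo, hi = regression_prediction_interval(B0, B1, X0, s, N, 92.6, 146.21, T_STAR)
slope_lo, slope_hi = slope_ci(B1, 0.0644, T_STAR)

print(f"s = {s:.3f}")
print(f"predicted body fat at {X0:.0f} cm: {y_hat:.2f}%")
print(f"95% prediction interval: ({lo:.2f}, {hi:.2f})")
print(f"95% CI for slope: ({slope_lo:.3f}, {slope_hi:.3f})")
```

Because the slope interval computed this way excludes zero, it points to the same conclusion as the small P-value in the Parameter Estimates table.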