TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics


UNIVERSITY OF DUBLIN
TRINITY COLLEGE
Faculty of Engineering, Mathematics and Science
School of Computer Science & Statistics

BA (Mod) Enter Course Title
Trinity Term 2013
Junior/Senior Sophister

ST7002 Multiple Linear Regression
Prof Haslett

Date Venue Time

Instructions to Candidates:
Answer all questions. All carry equal marks. In all questions, extra marks will be awarded for imaginative answers, including those that go beyond the question as posed when explaining and illustrating the ideas discussed.

Materials permitted for this examination: Calculator, Log tables

Materials omitted from the front page of an examination paper will not be permitted during an examination. Questions should start on page 2 only.
Q1 Write short notes on EIGHT of the following topics. You should illustrate your notes by referring to examples. You may draw on other questions in this exam paper or on examples discussed in class. However, additional credit will be given for the use of other examples. All topics carry equal marks.

a) Lessons from my project
b) Objectives in regression
c) Linear regression does not necessarily mean straight lines
d) The role of the Normal distribution in regression
e) Transformations
f) Variance Inflation Factors
g) Critical analysis of rows and columns/cases and variables
h) Extra and Sequential Sums of Squares
i) Sampling Distributions and Standard Errors in Regression
j) Interactions

Q2 In a clinical experiment, 64 patients (31/33, Male/Female) were administered a drug. The response (mg/l, based on a blood analysis 12 hours later) was noted. Three drug levels were used (200, 400, 600); the gender and weights (kg) of the patients were noted. (Naturally, the average Male/Female weights differed.) Several analyses are reported below.

a) In a pair of preliminary analyses, the response by gender was analysed, as overleaf (Q2A). Explain how these two analyses should be interpreted. (6 marks)

b) Two further simple analyses are reported (Q2B). Interpret the analyses. Explain the different interpretations of the Confidence and Prediction intervals. (6 marks)
Q2A

Two-Sample T-Test: resp, Gender
Columns: Gender, N, Mean, StDev, SE Mean [table values not preserved]
Difference = mu (0) - mu (1)
Estimate for difference: [...]
CI for difference: (0.074, 0.660)
T-Test of difference = 0 (vs not =): T-Value = 2.51, P-Value = [...], DF = [...]

Regression of Response on Gender. M=1; F=0

[Fitted line plot: Resp vs Gender (M,F / 1,0); R-Sq 9.2%, R-Sq(adj) 7.7%]

Regression Analysis: resp versus Gender
resp = [...] + [...] Gender
Columns: Predictor, Coef, SE Coef, T, P; rows: Constant, Gender [table values not preserved]
S = [...], R-Sq = 9.2%

Q2B

Regression Analysis: resp versus dose
resp = [...] + [...] dose
Columns: Predictor, Coef, SE Coef, T, P; rows: Constant, dose [table values not preserved]
S = [...], R-Sq = 73.7%

Regression Analysis: resp versus wt
resp = [...] + [...] wt
Columns: Predictor, Coef, SE Coef, T, P; rows: Constant, wt [table values not preserved]
S = [...], R-Sq = 18.3%

[Fitted line plot: Resp vs Dose, with regression line, 95% CI and 95% PI; R-Sq 73.7%, R-Sq(adj) 73.2%]

[Fitted line plot: Resp vs Wt, with regression line, 95% CI and 95% PI; R-Sq 18.3%, R-Sq(adj) 16.9%]

Q2 Continues
c) Two multiple regression analyses are presented below (Q2C). The researcher is puzzled by their different interpretations as regards the apparent importance of dosage. The dose/wt variable is a derived variable, being the ratio of dose to weight. Provide her with an explanation. Use this to discuss ideas of correlated predictor variables and of direct and indirect relationships. (8 marks)

d) A further derived variable, (Gender*dose/wt), is created; this is the simple product of the binary Gender variable by dose/wt. The resulting regression is in Q2D. What is the purpose of including such a variable in such a regression analysis? What is the interpretation in this case? Contrast with the analysis above. Illustrate your discussion with rough sketches of the two simple regression lines (resp vs dose/wt) that are implicit in this model. (8 marks)

e) A final analysis leads to Q2E. Discuss the interpretation. Would you propose further analyses? (5 marks)

Q2C

Regression Analysis: resp versus wt, Gender, dose
resp = [...] + [...] wt + [...] Gender + [...] dose
Columns: Predictor, Coef, SE Coef, T, P; rows: Constant, wt, Gender, dose [table values not preserved]
S = [...], R-Sq = 89.6%

Regression Analysis: resp versus wt, Gender, dose, dose/wt
resp = [...] + [...] wt + [...] Gender + [...] dose + [...] dose/wt
Columns: Predictor, Coef, SE Coef, T, P, VIF; rows: Constant, wt, Gender, dose, dose/wt [table values not preserved]
S = [...], R-Sq = 90.6%

Q2 Continues
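Q2D

Regression Analysis: resp versus Gender, dose/wt, Gender*dose/wt
resp = [...] + [...] Gender + [...] dose/wt + [...] Gender*dose/wt
Columns: Predictor, Coef, SE Coef, T, P; rows: Constant, Gender, dose/wt, Gender*dose/wt [table values not preserved]
S = [...], R-Sq = 89.2%, R-Sq(adj) = 88.7%

Q2E

Regression Analysis: resp versus dose/wt, Gender, wt, dose, Gender*dose/wt
resp = [...] + [...] dose/wt + [...] Gender + [...] wt + [...] dose + [...] Gender*dose/wt
Columns: Predictor, Coef, SE Coef, T, P, VIF; rows: Constant, dose/wt, Gender, wt, dose, Gender*dose/wt [table values not preserved]
S = [...], R-Sq = 90.6%, R-Sq(adj) = 89.8%

Analysis of Variance
Columns: Source, DF, SS, MS, F, P; rows: Regression, Residual Error, Total [table values not preserved]

Sequential sums of squares
Columns: Source, DF, Seq SS; rows: dose/wt, Gender, wt, dose, Gender*dose/wt [table values not preserved]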
Q3 Data are available on the Weight (gms) and physical dimensions Length, Width and Height (cms) of 56 perch. All were caught from the same lake (Laengelmavesi) near Tampere in Finland. A matrix plot and various analyses are presented below. The interest lies in relating the dimensions to the weight.

[Figure: Matrix Plot of Weight, Length, Ht, width]

a) It is immediately apparent that separate linear regressions of Weight on Length, Width and Height will encounter difficulties. Discuss. (6 marks)

b) All variables were log transformed; the resulting multiple regression analysis is as in Q3A overleaf. Discuss the various aspects of this transformation and subsequent analysis. What are the implications of the VIF values? (6 marks)

c) In an attempt at a simpler model, the derived variable Vol = Length × Height × Width was formed. The Fitted Line plot in Q3C overleaf summarises the analysis; the SE for the slope is returned as [...]. Explain the features of this analysis and plot. (8 marks)

d) Use the models in (b) and (c) above to compute approximate 95% Prediction Intervals for the Weights of two fish with dimensions (Length, Height and Width) being respectively (14.7, 3.5 and 2.0) and (45.2, 11.9 and 7.3). Explain carefully the basis, in the fitted models, for your calculations. (8 marks)

e) The slope SE is [...]. What are the implications for possible further simplification? (3 marks)

f) It is remarked that although the last model (c) provides an excellent and simple fit, its interpretation differs, to an extent that seems to be statistically significant, from the details of the model fitted in (b). Discuss. (2 marks)
Q3A

Regression Analysis: logwt versus loglen, loght, logwidth
logwt = [...] + [...] loglen + [...] loght + [...] logwidth
Columns: Predictor, Coef, SE Coef, T, P, VIF; rows: Constant, loglen, loght, logwidth [table values not preserved]
S = [...], R-Sq = 99.4%, R-Sq(adj) = 99.4%

Unusual Observations
Columns: Obs, loglen, logwt, Fit, SE Fit, Residual, St Resid [table values not preserved; four observations are flagged R and two are flagged X]
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.

Q3B

[Figure: Residual Plots for logwt, using deleted residuals; panels: Normal Probability Plot, Versus Fits, Histogram, Versus Order]

Q3C

[Fitted line plot: Weight vs Vol; log10(weight) = [...] + [...] log10(vol); regression line with 95% PI; R-Sq 99.3%, R-Sq(adj) 99.3%]
Outline Solution

Q2

a) The two analyses are equivalent, though differently packaged. The t-test reports that the observed means and mean difference in response are 1.611, [...] and [...]. The regression reports the same information (to within rounding error): when Gender = 0 (F) the regression reports the expected value to be 1.61; when Gender = 1 (M) the regression reports [...]. The regression model treats Gender as an indicator variable; the plot, which interpolates to other values of Gender, is not interpretable at those values. If the remaining variation (other than that due to Gender) can be regarded as random, both analyses find that the t-ratio for the difference is 2.51 in absolute value, as in Q2A. Both report that this is statistically significant.

b) Resp vs dose reports an increase in the average response of [...] per unit of dose; this is 0.74 per 200 units. If the remaining error can be treated as random, this is hugely statistically significant. Resp vs wt shows an apparent reduction in response associated with weight, also statistically significant. Anticipating later analyses, response is often more naturally sensitive to dose per unit weight; the latter analysis is consistent with this. The prediction intervals as shown are effectively descriptive: most of the data lie within them. The confidence intervals qualify statements about the mean response (over very many patients with specified dose or weight). One interpretation is that regression lines that are statistically consistent with the data must lie within the CI band.
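The equivalence claimed in (a) can be checked numerically. The sketch below is illustrative only: the response values are made up (they are not the exam data), the helper names `pooled_t` and `slope_t` are hypothetical, and it assumes the pooled-variance form of the two-sample t-test. Under those assumptions the regression intercept is the Gender = 0 mean, the slope is the difference in means, and the two t-ratios agree up to sign.

```python
import math

def pooled_t(y0, y1):
    """Pooled-variance two-sample t statistic for mean(y0) - mean(y1)."""
    n0, n1 = len(y0), len(y1)
    m0, m1 = sum(y0) / n0, sum(y1) / n1
    ss = sum((v - m0) ** 2 for v in y0) + sum((v - m1) ** 2 for v in y1)
    s2 = ss / (n0 + n1 - 2)                      # pooled variance estimate
    se = math.sqrt(s2 * (1 / n0 + 1 / n1))       # SE of the difference in means
    return (m0 - m1) / se

def slope_t(x, y):
    """Least-squares fit of y on x; returns (intercept, slope, t-ratio of slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)          # SE of the slope
    return intercept, slope, slope / se

# Made-up responses for illustration (NOT the exam data).
female = [1.9, 1.4, 1.7, 1.5, 1.8]   # Gender = 0
male = [1.2, 1.5, 1.1, 1.3]          # Gender = 1

x = [0] * len(female) + [1] * len(male)
y = female + male

intercept, slope, t_reg = slope_t(x, y)
t_two = pooled_t(female, male)       # mu(0) - mu(1), as in Q2A

print(intercept, slope, t_reg, t_two)
```

The sign differs because the t-test is set up as mu(0) - mu(1) while the slope measures the change from Gender = 0 to Gender = 1.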
c) The first analysis suggests that weight, dose and gender are all important predictors of response. The second suggests that, when dose/wt is included as a predictor, neither dose nor weight contributes much extra information. We already know that weight and gender are interrelated. And by construction dose/wt is correlated with dose, and Gender with weight. This apparent confusion arises frequently when the x-variables (predictor variables) are themselves interrelated. The slope coefficients are, in such cases, not simply interpretable in terms of the corresponding bivariate correlations.

[Diagram linking Dose, Weight, Dose/wt and Resp]

The diagram (not a requirement, but worthy of marks if offered) provides one way to envision a possible set of direct and indirect relationships with response. The concept of direct and indirect relationships has been discussed in class.

d) The new derived variable simplifies the direct consideration and comparison of two simple models: Resp vs dose/wt, separately for M and F. Separately these may be written as resp = intercept + slope (dose/wt), with potentially different values of each for M and F. This can also be a way to investigate interaction. Here the fitted lines are

F: resp = [...] + [...] dose/wt
M: resp = ([...] + [...]) + ([...] + [...]) dose/wt

The lines, when sketched, are effectively parallel (they have almost the same slope). Since the two male-female difference terms are small compared to their SEs (T = 0.58, T = 0.00), we can conclude that a single regression relationship, for both M and F, is likely to be adequate. This in turn suggests that the Weight and Gender terms in the second model in c) should not be simply interpreted, as suggested there, as individually necessary, for the Gender term is correlated with Weight. Perhaps the inclusion of Weight requires the inclusion of Gender to counterbalance it.

e) The analysis confirms that the important variable is dose/wt. No other variables are significant. However, the very high VIF values for dose and dose/wt point to the fact that these variables are (naturally) highly interdependent. One of these is likely to be the most important. The choice should be guided by consideration of the way the drug interacts, biochemically, with the patient.
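The two simple regression lines implicit in the interaction model of (d) can be read directly off its four coefficients. The sketch below assumes the model form resp = b0 + b1 Gender + b2 (dose/wt) + b3 Gender × (dose/wt) with F = 0 and M = 1; the coefficient values and the helper names `implied_lines` and `predict` are hypothetical, chosen only to illustrate the bookkeeping, since the fitted values are not preserved here.

```python
def implied_lines(b0, b_gender, b_ratio, b_inter):
    """Return (intercept, slope) of resp vs dose/wt for F (Gender=0) and M (Gender=1)."""
    female = (b0, b_ratio)                          # Gender terms vanish when Gender = 0
    male = (b0 + b_gender, b_ratio + b_inter)       # Gender shifts both intercept and slope
    return female, male

def predict(b0, b_gender, b_ratio, b_inter, gender, ratio):
    """Fitted response from the full interaction model."""
    return b0 + b_gender * gender + b_ratio * ratio + b_inter * gender * ratio

# Hypothetical coefficients (NOT the exam's fitted values).
coef = (0.10, 0.05, 0.25, 0.00)

female, male = implied_lines(*coef)
print("F: resp = %.2f + %.2f (dose/wt)" % female)
print("M: resp = %.2f + %.2f (dose/wt)" % male)
```

When the interaction coefficient is close to zero relative to its SE, the two lines are effectively parallel; when the Gender coefficient is also small, they nearly coincide, which is the situation the outline solution describes.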
Q3

a) It is clear that the bivariate relationships (top row) are not linear. Additionally, there is clear evidence that the variance of weight increases with weight. There is also a great deal of correlation between the three x-variables, which is likely to cause problems if they are ever used together.

b) The model in the log scale shows that all coefficients are very significantly different from 0, though that was never in doubt. $R^2$ is high. The model can also be written as $\mathrm{Wt} = k \times \mathrm{Len}^{1.65} \times \mathrm{Ht}^{0.81} \times \mathrm{Width}^{0.55} \times \text{error}$, where $k$ is the antilog of the fitted constant. The nominal interpretation is that an increase of 1 in, e.g., loglen (i.e. a 10-fold increase in Len) will induce, on average, an increase of 1.65 in logwt (i.e. a $10^{1.65} \approx 45$-fold increase in Wt) if all other variables are held constant. But the VIF values suggest that the covariates are correlated, as anticipated, and that the SEs are therefore inflated. This was apparent also in the scatterplots. Effectively this means that some fish are large in respect of all three dimensions, and some are small. In these circumstances one option is to choose a single composite that reflects all of the variables. It is likely to be futile to choose just one of them.

There are however a number of unusual observations. Four of these exhibit large residuals, which merit attention: three are very large and positive, and one is large and negative. Two are influential, being far from the others in respect of (log) Len, Ht and Width. There is nothing wrong with this, necessarily.

c) The option followed was to choose a product named Vol. The Fitted Line plot has fitted Weight to Vol in the log scale, presenting the analysis in the back-transformed (antilog) scale. Alternatively, as Log(Vol) = Log(Length) + Log(Height) + Log(Width), the composite Log(Vol) variable is the sum of the three Log(covariates). The fitted model can be written as

$\mathrm{Wt} = 0.3 \times \mathrm{Vol}^{0.98} \times \text{error} = 0.3 \times \mathrm{Len}^{0.98} \times \mathrm{Ht}^{0.98} \times \mathrm{Width}^{0.98} \times \text{error}$.

An increase of 1 in LogVol (i.e. a 10-fold increase in Vol) will generate, on average, a $10^{0.98}$-fold (i.e. 9.55-fold) increase in Wt.

The constant 0.3 could be thought of as fish density, were fish to be cuboid. As it is, it is a combination of fish density and the ratio of actual fish volume to the volume of the corresponding cuboid. When $\mathrm{Vol}^{0.98}$ is large (i.e. when Wt is large) the absolute errors implicit in a $10^{\text{error}}$-fold variation are large. This produces the fan-like shape of the prediction intervals. Predicting the weight of a large fish is harder than that of a small fish in absolute terms. The issue is equivalent to describing the prediction error in percentage terms.
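The fold-change readings of the log-log coefficients quoted in (b) and (c) follow from raising the multiplier to the power of the coefficient. The short sketch below reproduces that arithmetic; the function name `fold_change` is hypothetical, and the coefficients 1.65 and 0.98 are the rounded values quoted in the outline solution.

```python
def fold_change(coef, factor=10.0):
    """Multiplicative change in Wt when one predictor is multiplied by `factor`,
    in a model that is linear in the logs (other predictors held fixed)."""
    return factor ** coef

len_effect = fold_change(1.65)        # 10-fold increase in Len, model (b): about 44.7
vol_effect = fold_change(0.98)        # 10-fold increase in Vol, model (c): about 9.55
doubling = fold_change(0.98, 2.0)     # doubling Vol, model (c): about 1.97

print(round(len_effect, 2), round(vol_effect, 2), round(doubling, 2))
```

Because the errors are also multiplicative on the original scale, the same percentage error corresponds to a larger absolute error for a heavier fish, which is the fan shape seen in the prediction band.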
The $R^2$ value is almost as high as in b). The value of s is [...]. This is in the log scale and compares with the value of [...] above. The model is not quite as tight-fitting, but it is simpler. No information is available on unusual observations.

d) The two prediction equations are:

(as in b) Pred logwt = const + 1.65 Log(Len) + 0.81 Log(Ht) + 0.55 Log(Width) ± 2(0.037)
(as in c) Pred logwt = const + 0.98 Log(Len × Ht × Width) ± 2(0.040)

each back-transformed as antilog(Pred log(Wt)), as below.

[Table: for each of the two fish, the dimensions (len, ht, width), Vol, their base-10 logs, the coefficients, constant and s of models (b) and (c), and the resulting Pred LogWt with lo and hi limits and their back-transforms; values not preserved]

Conclusions: the two sets of intervals are very similar. See f) below.

e) Model (c) has a slope of 0.98 ± 2(0.011). This interval includes 1. That is, the data are statistically consistent with a slope of 1; a null hypothesis that slope = 1 would not be rejected. A simpler version of this model would then be LogWt = const + LogVol. Equivalently, this is Wt = 0.3 Vol. This model is not unlike the Tree model discussed in class. Note that the SE (0.011) is very much smaller than the SEs for each of the dimensions in model (b). That is because their SEs have been inflated by, effectively, the lack of determinacy of the separate coefficients. Note however that the correlation between these dimensions has no implications at all for the usefulness of the prediction equation generated by model (b). It is simply the case that many different combinations of these coefficients are effectively equivalent to each other.

f) However, model (c) corresponds to giving coefficients of 1 to each (log) dimension. This is just about consistent with the fit for LogHt (0.81 ± 2(0.21)); it is not as consistent with LogLen (1.65 ± 2(0.22)) and LogWidth (0.55 ± 2(0.18)). The implications are that fish that are very long will be given low values of LogWt in model (c), and fish that are very wide will be given high values of LogWt in model (c). Fish are not cuboids. However it is moot whether the data would require a rejection (in model b) of the null hypothesis that all coefficients are equal to 1.
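As a rough sketch of the model (c) calculation in (d), the code below back-transforms an approximate 95% prediction interval for the two fish. It assumes the rounded values quoted in the outline solution (constant 0.3 on the original scale, slope 0.98, s = 0.040) and the simple "prediction ± 2s" rule on the log scale; the function name `predict_weight` is hypothetical, and the results will differ slightly from any calculation based on the unrounded Minitab output.

```python
import math

def predict_weight(length, height, width, const=0.3, slope=0.98, s=0.040):
    """Approximate 95% prediction interval for Wt under log10(Wt) = log10(const) + slope*log10(Vol).

    Returns (prediction, lower, upper) on the original weight scale.
    """
    vol = length * height * width
    log_wt = math.log10(const) + slope * math.log10(vol)
    lo, hi = log_wt - 2 * s, log_wt + 2 * s      # interval is symmetric on the log scale
    return 10 ** log_wt, 10 ** lo, 10 ** hi      # back-transform by taking antilogs

small = predict_weight(14.7, 3.5, 2.0)
big = predict_weight(45.2, 11.9, 7.3)

for name, (wt, lo, hi) in (("small fish", small), ("large fish", big)):
    print(name, round(wt, 1), round(lo, 1), round(hi, 1))

# t-ratio for the null hypothesis slope = 1, using the quoted slope and SE.
t_slope = (0.98 - 1.0) / 0.011
print(round(t_slope, 2))
```

On the original scale both intervals span the same ratio (upper/lower = 10^0.16), so the interval for the larger fish is much wider in grams. The t-ratio for slope = 1 is below 2 in absolute value, in line with the remark that the data are consistent with a slope of 1.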
More informationSimple Linear Regression Chapter 11
Simple Linear Regression Chapter 11 Rationale Frequently decisionmaking situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related
More informationAP Statistics 2001 Solutions and Scoring Guidelines
AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use
More informationSydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.
Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under
More informationCORRELATION AND SIMPLE REGRESSION ANALYSIS USING SAS IN DAIRY SCIENCE
CORRELATION AND SIMPLE REGRESSION ANALYSIS USING SAS IN DAIRY SCIENCE A. K. Gupta, Vipul Sharma and M. Manoj NDRI, Karnal132001 When analyzing farm records, simple descriptive statistics can reveal a
More informationSection I: Multiple Choice Select the best answer for each question.
Chapter 15 (Regression Inference) AP Statistics Practice Test (TPS 4 p796) Section I: Multiple Choice Select the best answer for each question. 1. Which of the following is not one of the conditions that
More informationYiming Peng, Department of Statistics. February 12, 2013
Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop
More informationLesson 4 Part 1. Relationships between. two numerical variables. Correlation Coefficient. Relationship between two
Lesson Part Relationships between two numerical variables Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear between two numerical variables Relationship
More informationThe scatterplot indicates a positive linear relationship between waist size and body fat percentage:
STAT E150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the
More informationElementary Statistics Sample Exam #3
Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to
More information2011 # AP Exam Solutions 2011 # # # #1 5/16/2011
2011 AP Exam Solutions 1. A professional sports team evaluates potential players for a certain position based on two main characteristics, speed and strength. (a) Speed is measured by the time required
More information11. Analysis of Casecontrol Studies Logistic Regression
Research methods II 113 11. Analysis of Casecontrol Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationData and Regression Analysis. Lecturer: Prof. Duane S. Boning. Rev 10
Data and Regression Analysis Lecturer: Prof. Duane S. Boning Rev 10 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance (ANOVA) 2. Multivariate Analysis of Variance Model forms 3.
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More informationModule 3: Multiple Regression Concepts
Contents Module 3: Multiple Regression Concepts Fiona Steele 1 Centre for Multilevel Modelling...4 What is Multiple Regression?... 4 Motivation... 4 Conditioning... 4 Data for multiple regression analysis...
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More information2013 MBA Jump Start Program. Statistics Module Part 3
2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just
More informationCorrelation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2
Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables
More informationChapter 10 Correlation and Regression. Overview. Section 102 Correlation Key Concept. Definition. Definition. Exploring the Data
Chapter 10 Correlation and Regression 101 Overview 102 Correlation 10 Regression Overview This chapter introduces important methods for making inferences about a correlation (or relationship) between
More informationSELFTEST: SIMPLE REGRESSION
ECO 22000 McRAE SELFTEST: SIMPLE REGRESSION Note: Those questions indicated with an (N) are unlikely to appear in this form on an inclass examination, but you should be able to describe the procedures
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More information0.1 Multiple Regression Models
0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different
More information, has mean A) 0.3. B) the smaller of 0.8 and 0.5. C) 0.15. D) which cannot be determined without knowing the sample results.
BA 275 Review Problems  Week 9 (11/20/0611/24/06) CD Lessons: 69, 70, 1620 Textbook: pp. 520528, 111124, 133141 An SRS of size 100 is taken from a population having proportion 0.8 of successes. An
More informationANOVA MULTIPLE CHOICE QUESTIONS. In the following multiplechoice questions, select the best answer.
ANOVA MULTIPLE CHOICE QUESTIONS In the following multiplechoice questions, select the best answer. 1. Analysis of variance is a statistical method of comparing the of several populations. a. standard
More informationThe importance of graphing the data: Anscombe s regression examples
The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 3031, 2008 B. Weaver, NHRC 2008 1 The Objective
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationMGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal
MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims
More informationFactors affecting online sales
Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4
More informationII. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
More informationPOLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.
Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression
More informationInference for Regression
Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about
More informationSimple Linear Regression
STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze
More informationPrediction and Confidence Intervals in Regression
Fall Semester, 2001 Statistics 621 Lecture 3 Robert Stine 1 Prediction and Confidence Intervals in Regression Preliminaries Teaching assistants See them in Room 3009 SHDH. Hours are detailed in the syllabus.
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationHomework 11. Part 1. Name: Score: / null
Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = 0.80 C. r = 0.10 D. There is
More informationVersions 1a Page 1 of 17
Note to Students: This practice exam is intended to give you an idea of the type of questions the instructor asks and the approximate length of the exam. It does NOT indicate the exact questions or the
More information