# Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )

Save this PDF as:

Size: px
Start display at page:

Download "Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )"

## Transcription

1 Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35

2 13 Examples of nonlinear regression model: growth from birth to maturity in human: nonlinear in nature dose-response relationships( 劑 量 反 應 關 係 ) Purpose: obtain estimates of parameters Neural network model: data mining applications ( 略 過 ) logistic regression models: Chap. 14 hsuhl (NUK) LR Chap 10 2 / 35

3 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models the general linear regression model (6.7): Y i = β 0 + β 1 X i1 + β 2 X i β p 1 X i,p 1 + ε i A polynomial regression model in one or more predictor variables is linear in the parameters. Ex: Y i = β 0 + β 1 X i1 + β 2 X 2 i1 + β 3 X i2 + β 4 X 2 i2 + β 5 X i1 X i2 + ε i log 10 Y i = β 0 + β 1 X i1 + β 2 exp(x i2 ) + ε i (transformed) hsuhl (NUK) LR Chap 10 3 / 35

4 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) In general, we can state a linear regression model as: nonlinear regression models Y i = f (X i, β) + ε i = X iβ + ε i where X i = [1 X i1 X i,p 1 ] the same basic form as that in (13.4): Y i = f (X i, γ) + ε i f (X i, γ): mean response given by the nonlinear response function f (X, γ) ε i : error term; assumed to have E{ε i } = 0, constant variance and to be uncorrelated γ: parameter vector hsuhl (NUK) LR Chap 10 4 / 35

5 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) exponential regression model: in growth( 生 長 ) studies; concentration( 濃 度 ) Y i = γ 0 exp(γ 1 X i ) + ε i, ε i indep. N(0, σ 2 ) hsuhl (NUK) LR Chap 10 5 / 35

6 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) logistic regression models( 邏 輯 式 迴 歸 模 型 ): in population studies( 人 口 研 究 ); Y i = γ γ 1 exp(γ 2 X i ) + ε i, ε i indep. N(0, σ 2 ) hsuhl (NUK) LR Chap 10 6 / 35

7 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) logistic regression models( 邏 輯 式 迴 歸 模 型 ): the response variable is qualitative (0,1): purchase a new car the error terms are not normally distributed with constant variance (Chap. 14) hsuhl (NUK) LR Chap 10 7 / 35

8 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) General Form of Nonlinear Regression Models: The error terms ε i are often assumed to be independent normal random variables with constant variance. Important difference: #{β i } is not necessarily directly related to #{X i } in the model linear regression: (p 1) X variables p regression coefficients nonlinear regression: p 1 Y i = β j X ji + ε i j=0 Y i = γ 0 exp(γ 1 X i ) + ε i (p = 2, q = 1) hsuhl (NUK) LR Chap 10 8 / 35

9 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) Nonlinear regression model q: #{X variables} p: #{regression parameters} X i : the observations on the X variables without the initial element 1 The general form of a nonlinear regression model: Y i = f (X i, γ) + ε i X i = q 1 X i1 X i2. γ = p 1 γ 0 γ 1. X iq γ p 1 hsuhl (NUK) LR Chap 10 9 / 35

10 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) intrinsically linear( 實 質 線 性 的 ) response functions: nonlinear response functions can be linearized by a transformation Ex: f (X, γ) = γ 0 [exp(γ 1 X)] g(x, γ) = log e f (X, γ) = β 0 + β 1 X β 0 = log γ 0, β 1 = γ 1 Just because a nonlinear response function is intrinsically linear does not necessarily imply that linear regression is appropriate. ( the error term in the linearized model will no longer be normal with constant variance) hsuhl (NUK) LR Chap / 35

11 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) Estimation of regression parameters: 1 least squares method 2 maximum likelihood method Also as in linear regression, both of these methods of estimation yield the same parameter estimates when the error terms in (13.12) are independent normal with constant variance. It is usually not possible to find analytical expression for LSE and MLE for nonlinear regression models. numerical search procedures must be used: require intensive computations hsuhl (NUK) LR Chap / 35

12 13.2 Least Squares Estimation in Nonlinear Regression LSE in nonlinear regression The concepts of LSE for linear regression also extend directly to nonlinear regression models. The least squares criterion: n Q = [Y i f (X i, γ)] 2 i=1 Q must be minimized with respect to γ 0,..., γ p 1 A difference from linear regression is that the solution of the normal equations usually requires an iterative numerical search procedure because analytical solutions generally cannot be found. hsuhl (NUK) LR Chap / 35

13 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations Y i = f (X i, γ) + ε i n Q = [Y i f (X i, γ)] 2 i=1 g = arg min γ Q (g : the vector of the LSE g k ) Q γ k = (partial derivative of Q with respect to γ k ) [ n f (X i, γ) 2[Y i f (X i, γ)] γ k i=1 ] set = 0 γk =g k hsuhl (NUK) LR Chap / 35

14 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations (cont.) The p normal equations: [ ] [ ] n f (X i, γ) n f (X i, γ) Y i f (X i, g) i=1 γ k γ=g i=1 γ k γ=g k = 0, 1,..., p 1 = 0, g: the vector of the least squares estimates g k g = [g 0,..., g p 1 ] p 1 nonlinear in the parameter estimates g k numerical search procedure are required multiple solution may be possible hsuhl (NUK) LR Chap / 35

15 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations (cont.) Example: Severely injured patients Y : Prognostic( 預 後 的 ) Index X: Days Hospitalized ( 住 院 治 療 ) Related earlier studies: the relationship between Y and X is exponential Y i = γ 0 exp(γ 1 X i ) + ε i hsuhl (NUK) LR Chap / 35

16 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations (cont.) f (X i, γ) γ 0 = exp(γ 1 X i ) f (X i, γ) γ 1 = γ 0 X i exp(γ 1 X i ) Q γ k g = 0 Y i exp(g 1 X i ) g 0 exp(2g1 X i ) = 0 Yi X i exp(g 1 X i ) g 0 Xi exp(2g 1 X i ) = 0 No closed-form solution exists for g = (g 0, g 1 ) T hsuhl (NUK) LR Chap / 35

17 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) linearization method: 1 use a Taylor series expansion to approximate the nonlinear regression model ( f (x) = f (a) + n=1 2 employ OLS to estimate the parameters f (n) (x a)n) n! hsuhl (NUK) LR Chap / 35

18 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) 圖 片 來 源 :Wiki: Taylor s theorem hsuhl (NUK) LR Chap / 35

19 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Gauss-Newton Method: 1 initial parameters γ 0,..., γ p 1 : g (0) 0,..., g (0) p 1 2 approximate the mean responses f (X i, γ) in the Taylor series expansion around g (0) k : f (X i, γ) f (X i, g (0) ) + p 1 k=0 [ ] f (X i, γ) γ k γ=g (0) (γ k g (0) k ) 3 obtain revised estimated regression coefficients g (1) k : (later) g (1) k = g (0) k + b (0) k hsuhl (NUK) LR Chap / 35

20 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) (Y (0) i =Y i f (0) i ) p 1 f (X i, γ) f (X i, g (0) [ ] f (Xi, γ) ) + (γ k g (0) k γ ) k γ=g (0) = f (0) p 1 i + k=0 k=0 D (0) ik β(0) k Y i f (0) p 1 i + D (0) ik β(0) k + ε i = Y (0) i p 1 k=0 k=0 D (0) ik β(0) k + ε i (no intercept) (13.24) The purpose of fitting the linear regression model approximation (13.24) is therefore to estimate β (0) k and use these estimates to adjust the initial starting estimates of the regression parameters. hsuhl (NUK) LR Chap / 35

21 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Matrix Form: Y (0) i p 1 k=0 D(0) ik β(0) k + ε i hsuhl (NUK) LR Chap / 35

22 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) the D matrix of partial derivative play the role of the X matrix (without a column of 1s for the intercept) Estimate β (0) by OLS: b (0) = (D (0) D (0) ) 1 D (0) Y (0) obtain revised estimated regression coefficients g (1) k : g (1) k = g (0) k + b (0) k hsuhl (NUK) LR Chap / 35

23 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Evaluated for g (0) by SSE (0) : n n SSE (0) = [Y i f (X i, g (0) )] 2 = [Y i f (0) i ] 2 i=1 i=1 After the end of the first iteration: n n SSE (1) = [Y i f (X i, g (1) )] 2 = [Y i f (1) i ] 2 i=1 i=1 If the Gauss-Newton method is working effectively in the first iteration, SSE (1) should be smaller than SSE (0). ( g (1) should be better estimates) hsuhl (NUK) LR Chap / 35

24 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) The Gauss-Newton method repeats the procedure with g (1) now used for the new starting values. Until g (s+1) g (s) and/or SSE (s+1) SSE (s) become negligible The Gauss-Newton method works effectively in many nonlinear regression applications. (Sometimes may require numerous iterations before converging.) hsuhl (NUK) LR Chap / 35

25 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Example: Severely injured patients Initial: Transformed Y (log γ 0 exp(γ 1 X) = log γ 0 + γ 1 X) Y i = β 0 + β 1 X i + ε i OLS b 0 = , b 1 = g (0) 0 = exp(b 0 ) = , g (0) 1 = b 1 = hsuhl (NUK) LR Chap / 35

26 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Ŷ = ( ) exp( X) hsuhl (NUK) LR Chap / 35

27 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) The choice of initial starting values: a poor choice may result in slow convergence, convergence to a local minimum, or even divergence Good starting values: result in faster convergence, will lead to a solution that is the global minimum rather than a local minimum A variety of methods for obtaining starting values: 1 related earlier studies 2 select p representative observations solve for p parameters, then used as the starting values 3 do a grid search in the parameter space using as the starting values that g for which Q is smallest hsuhl (NUK) LR Chap / 35

28 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Some properties that exist for linear regression least squares do not hold for nonlinear regression least squares. Ex: e i 0; SSR + SSE SSTO; R 2 is not a meaningful descriptive statistic for nonlinear regression. Two other direct search procedures: 1 The method of steepest descent searches 2 The Marquardt algorithm: seeks to utilize the best feature of the Gauss-Newton method and the method of steepest descent a middle ground( 折 衷 ; 妥 協 ) between these two method hsuhl (NUK) LR Chap / 35

29 13.3 Model Building and Diagnostics Model building and diagnostics The model-building process for nonlinear regression models often differs somewhat from that for linear regression models Validation of the selected nonlinear regression model can be performed in the same fashion as for linear regression models. Use of diagnostics tools to examine the appropriateness of a fitted model plays an important role in the process of building a nonlinear regression model. hsuhl (NUK) LR Chap / 35

30 13.3 Model Building and Diagnostics Model building and diagnostics (cont.) When replicate observations are available and the sample size is reasonably large, the appropriate of a nonlinear regression function can be tested formally by means of the lack of fit test for linear regression models. (the test is an approximate one) Plots: e i vs. t i, Ŷ i, X ik can be helpful in diagnosing departures from the assumed model Unequal variances WLS; transformations hsuhl (NUK) LR Chap / 35

31 13.4 Inferences about Nonlinear Regression Parameters Inferences Inferences about the regression parameters in nonlinear regression are usually based on large-sample theory. When n is large, LSE and MLE for nonlinear regression models with normal error terms are approximately normally distributed and almost unbiased and have almost minimum variance Estimate of Error Term Variance: MSE = SSE n p = [Yi f (X i, g)] 2 n p (Not unbiased estimator of σ 2 but the bias is small when n is large) hsuhl (NUK) LR Chap / 35

32 13.4 Inferences about Nonlinear Regression Parameters Inferences (cont.) Large-Sample Theory When the error terms ε i are independent N(0, σ 2 ) and the sample size n is reasonably large, the sampling distribution of g is approximately normal. The expected value of the mean vector is approximately: E{g} γ The approximate variance-covariance matrix of the regression coefficients is estimated by: s 2 {g} = MSE(D D) 1 hsuhl (NUK) LR Chap / 35

33 13.4 Inferences about Nonlinear Regression Parameters Inferences (cont.) No simple rule exists that tells us when it is appropriate to use the large-sample inference methods and when it is not appropriate. However, a number of guidelines have been developed that are helpful in assessing the appropriateness of using the large-sample inference procedures in a given application. When the diagnostics suggest that large-sample inference procedures are not appropriate in a particular instance, remedial measures should be explored. reparameterize the nonlinear regression model bootstrap estimated of precision and confidence intervals instead of the large-sample inferences hsuhl (NUK) LR Chap / 35

34 13.4 Inferences about Nonlinear Regression Parameters Interval Estimation Large-sample theorem: approximate result for a single γ k g k γ k t(n p), k = 0, 1,..., p 1 s{g k } g k ± t(1 α/2; n p)s{g k } Several γ k : m parameters to be estimated with approximate family confidence coefficient 1 α the Bonferroni confidence limits: g k ± Bs{g k } B = t(1 α/2m; n p) hsuhl (NUK) LR Chap / 35

35 13.4 Inferences about Nonlinear Regression Parameters Test Concerning a single γ k A large-sample test: H 0 : γ k = γ k0 vs. H a : γ k γ k0 t = g k γ k0 s{g k } If t t(1 α/2; n p), conclude H 0 If t > t(1 α/2; n p), conclude H a Test concerning several γ k F = approx SSE(R) SSE(F ) df R df F MSE(F ) F (df R df F, df F ) when H 0 holds hsuhl (NUK) LR Chap / 35

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### Generalized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)

Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

### Statistics in Geophysics: Linear Regression II

Statistics in Geophysics: Linear Regression II Steffen Unkel Department of Statistics Ludwig-Maximilians-University Munich, Germany Winter Term 2013/14 1/28 Model definition Suppose we have the following

Lecture 5: Linear least-squares Regression III: Advanced Methods William G. Jacoby Department of Political Science Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Simple Linear Regression

### Notes on Applied Linear Regression

Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### Inference in Regression Analysis

Yang Feng Inference in the Normal Error Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of

### Inference in Regression Analysis. Dr. Frank Wood

Inference in Regression Analysis Dr. Frank Wood Inference in the Normal Error Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters

### Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

### Bivariate Regression Analysis. The beginning of many types of regression

Bivariate Regression Analysis The beginning of many types of regression TOPICS Beyond Correlation Forecasting Two points to estimate the slope Meeting the BLUE criterion The OLS method Purpose of Regression

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Yiming Peng, Department of Statistics. February 12, 2013

Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

### Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features

### Basics of Statistical Machine Learning

CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

### Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes JunXuJ.ScottLong Indiana University August 22, 2005 The paper provides technical details on

### Experimental data analysis Lecture 3: Confidence intervals. Dodo Das

Experimental data analysis Lecture 3: Confidence intervals Dodo Das Review of lecture 2 Nonlinear regression - Iterative likelihood maximization Levenberg-Marquardt algorithm (Hybrid of steepest descent

### Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

### Week 5: Multiple Linear Regression

BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

### MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

### LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values

### IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

### Nonlinear Statistical Models

Nonlinear Statistical Models Earlier in the course, we considered the general linear statistical model (GLM): y i =β 0 +β 1 x 1i + +β k x ki +ε i, i= 1,, n, written in matrix form as: y 1 y n = 1 x 11

### Linear and Piecewise Linear Regressions

Tarigan Statistical Consulting & Coaching statistical-coaching.ch Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL Hands-on Data Analysis

### 1 Logit & Probit Models for Binary Response

ECON 370: Limited Dependent Variable 1 Limited Dependent Variable Econometric Methods, ECON 370 We had previously discussed the possibility of running regressions even when the dependent variable is dichotomous

### 1. χ 2 minimization 2. Fits in case of of systematic errors

Data fitting Volker Blobel University of Hamburg March 2005 1. χ 2 minimization 2. Fits in case of of systematic errors Keys during display: enter = next page; = next page; = previous page; home = first

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### Logistic Regression (1/24/13)

STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Poisson Models for Count Data

Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

### The Method of Least Squares

Hervé Abdi 1 1 Introduction The least square methods (LSM) is probably the most popular technique in statistics. This is due to several factors. First, most common estimators can be casted within this

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Comparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function

The Empirical Econometrics and Quantitative Economics Letters ISSN 2286 7147 EEQEL all rights reserved Volume 1, Number 4 (December 2012), pp. 89 106. Comparison of sales forecasting models for an innovative

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Chapter 11: Two Variable Regression Analysis

Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics

STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of

### Lecture 2: Simple Linear Regression

DMBA: Statistics Lecture 2: Simple Linear Regression Least Squares, SLR properties, Inference, and Forecasting Carlos Carvalho The University of Texas McCombs School of Business mccombs.utexas.edu/faculty/carlos.carvalho/teaching

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010

Simulation Methods Leonid Kogan MIT, Sloan 15.450, Fall 2010 c Leonid Kogan ( MIT, Sloan ) Simulation Methods 15.450, Fall 2010 1 / 35 Outline 1 Generating Random Numbers 2 Variance Reduction 3 Quasi-Monte

### Collinearity of independent variables. Collinearity is a condition in which some of the independent variables are highly correlated.

Collinearity of independent variables Collinearity is a condition in which some of the independent variables are highly correlated. Why is this a problem? Collinearity tends to inflate the variance of

### New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

### Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### A General Approach to Variance Estimation under Imputation for Missing Survey Data

A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey

### COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16. Lecture 2: Linear Regression Gradient Descent Non-linear basis functions

COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16 Lecture 2: Linear Regression Gradient Descent Non-linear basis functions LINEAR REGRESSION MOTIVATION Why Linear Regression? Regression

### Model Selection and Claim Frequency for Workers Compensation Insurance

Model Selection and Claim Frequency for Workers Compensation Insurance Jisheng Cui, David Pitt and Guoqi Qian Abstract We consider a set of workers compensation insurance claim data where the aggregate

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

### 5. Multiple regression

5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

### Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

### A Comparison of Nonlinear. Regression Codes

A Comparison of Nonlinear Regression Codes by Paul Fredrick Mondragon Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Mathematics with Operations Research and

### 3. Regression & Exponential Smoothing

3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a

### Nonlinear Regression:

Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference

### 2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.\$ and Sales \$: 1. Prepare a scatter plot of these data. The scatter plots for Adv.\$ versus Sales, and Month versus

### Coefficient of Determination

Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

### Standard errors of marginal effects in the heteroskedastic probit model

Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic

### Basic Statistcs Formula Sheet

Basic Statistcs Formula Sheet Steven W. ydick May 5, 0 This document is only intended to review basic concepts/formulas from an introduction to statistics course. Only mean-based procedures are reviewed,

### Designing a learning system

Lecture Designing a learning system Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x4-8845 http://.cs.pitt.edu/~milos/courses/cs750/ Design of a learning system (first vie) Application or Testing

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Logit Models for Binary Data

Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response

### Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. b 0, b 1 Q = n (Y i (b 0 + b 1 X i )) 2 i=1 Minimize this by maximizing

### I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

### Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### The Delta Method and Applications

Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and

### Questions and Answers on Hypothesis Testing and Confidence Intervals

Questions and Answers on Hypothesis Testing and Confidence Intervals L. Magee Fall, 2008 1. Using 25 observations and 5 regressors, including the constant term, a researcher estimates a linear regression

### Reject Inference in Credit Scoring. Jie-Men Mok

Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business

### The general form of the PROC GLM statement is

Linear Regression Analysis using PROC GLM Regression analysis is a statistical method of obtaining an equation that represents a linear relationship between two variables (simple linear regression), or

### Parallelization Strategies for Multicore Data Analysis

Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management

### Linear Models for Continuous Data

Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear

### Lecture 8 February 4

ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt

### Heteroskedasticity and Weighted Least Squares

Econ 507. Econometric Analysis. Spring 2009 April 14, 2009 The Classical Linear Model: 1 Linearity: Y = Xβ + u. 2 Strict exogeneity: E(u) = 0 3 No Multicollinearity: ρ(x) = K. 4 No heteroskedasticity/

### Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

### Part II. Multiple Linear Regression

Part II Multiple Linear Regression 86 Chapter 7 Multiple Regression A multiple linear regression model is a linear model that describes how a y-variable relates to two or more xvariables (or transformations

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Sales forecasting # 2

Sales forecasting # 2 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### GLM, insurance pricing & big data: paying attention to convergence issues.

GLM, insurance pricing & big data: paying attention to convergence issues. Michaël NOACK - michael.noack@addactis.com Senior consultant & Manager of ADDACTIS Pricing Copyright 2014 ADDACTIS Worldwide.

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Linear Classification. Volker Tresp Summer 2015

Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong

### Microeconometrics Blundell Lecture 1 Overview and Binary Response Models

Microeconometrics Blundell Lecture 1 Overview and Binary Response Models Richard Blundell http://www.ucl.ac.uk/~uctp39a/ University College London February-March 2016 Blundell (University College London)

### An Introduction to Time Series Regression

An Introduction to Time Series Regression Henry Thompson Auburn University An economic model suggests examining the effect of exogenous x t on endogenous y t with an exogenous control variable z t. In

### DEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests

DEPARTMENT OF ECONOMICS Unit ECON 11 Introduction to Econometrics Notes 4 R and F tests These notes provide a summary of the lectures. They are not a complete account of the unit material. You should also

### Chapter 14: Analyzing Relationships Between Variables

Chapter Outlines for: Frey, L., Botan, C., & Kreps, G. (1999). Investigating communication: An introduction to research methods. (2nd ed.) Boston: Allyn & Bacon. Chapter 14: Analyzing Relationships Between

### Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

### The Method of Least Squares

33 The Method of Least Squares KEY WORDS confidence interval, critical sum of squares, dependent variable, empirical model, experimental error, independent variable, joint confidence region, least squares,

### Multiple Regression - Selecting the Best Equation An Example Techniques for Selecting the "Best" Regression Equation

Multiple Regression - Selecting the Best Equation When fitting a multiple linear regression model, a researcher will likely include independent variables that are not important in predicting the dependent

### Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi

Neural Networks CAP5610 Machine Learning Instructor: Guo-Jun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector

### Section 1: Simple Linear Regression

Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

### CHAPTER 9: SERIAL CORRELATION

Serial correlation (or autocorrelation) is the violation of Assumption 4 (observations of the error term are uncorrelated with each other). Pure Serial Correlation This type of correlation tends to be