Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
|
|
- Linette Wilson
- 8 years ago
- Views:
Transcription
1 Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35
2 13 Examples of nonlinear regression model: growth from birth to maturity in human: nonlinear in nature dose-response relationships( 劑 量 反 應 關 係 ) Purpose: obtain estimates of parameters Neural network model: data mining applications ( 略 過 ) logistic regression models: Chap. 14 hsuhl (NUK) LR Chap 10 2 / 35
3 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models the general linear regression model (6.7): Y i = β 0 + β 1 X i1 + β 2 X i β p 1 X i,p 1 + ε i A polynomial regression model in one or more predictor variables is linear in the parameters. Ex: Y i = β 0 + β 1 X i1 + β 2 X 2 i1 + β 3 X i2 + β 4 X 2 i2 + β 5 X i1 X i2 + ε i log 10 Y i = β 0 + β 1 X i1 + β 2 exp(x i2 ) + ε i (transformed) hsuhl (NUK) LR Chap 10 3 / 35
4 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) In general, we can state a linear regression model as: nonlinear regression models Y i = f (X i, β) + ε i = X iβ + ε i where X i = [1 X i1 X i,p 1 ] the same basic form as that in (13.4): Y i = f (X i, γ) + ε i f (X i, γ): mean response given by the nonlinear response function f (X, γ) ε i : error term; assumed to have E{ε i } = 0, constant variance and to be uncorrelated γ: parameter vector hsuhl (NUK) LR Chap 10 4 / 35
5 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) exponential regression model: in growth( 生 長 ) studies; concentration( 濃 度 ) Y i = γ 0 exp(γ 1 X i ) + ε i, ε i indep. N(0, σ 2 ) hsuhl (NUK) LR Chap 10 5 / 35
6 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) logistic regression models( 邏 輯 式 迴 歸 模 型 ): in population studies( 人 口 研 究 ); Y i = γ γ 1 exp(γ 2 X i ) + ε i, ε i indep. N(0, σ 2 ) hsuhl (NUK) LR Chap 10 6 / 35
7 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) logistic regression models( 邏 輯 式 迴 歸 模 型 ): the response variable is qualitative (0,1): purchase a new car the error terms are not normally distributed with constant variance (Chap. 14) hsuhl (NUK) LR Chap 10 7 / 35
8 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) General Form of Nonlinear Regression Models: The error terms ε i are often assumed to be independent normal random variables with constant variance. Important difference: #{β i } is not necessarily directly related to #{X i } in the model linear regression: (p 1) X variables p regression coefficients nonlinear regression: p 1 Y i = β j X ji + ε i j=0 Y i = γ 0 exp(γ 1 X i ) + ε i (p = 2, q = 1) hsuhl (NUK) LR Chap 10 8 / 35
9 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) Nonlinear regression model q: #{X variables} p: #{regression parameters} X i : the observations on the X variables without the initial element 1 The general form of a nonlinear regression model: Y i = f (X i, γ) + ε i X i = q 1 X i1 X i2. γ = p 1 γ 0 γ 1. X iq γ p 1 hsuhl (NUK) LR Chap 10 9 / 35
10 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) intrinsically linear( 實 質 線 性 的 ) response functions: nonlinear response functions can be linearized by a transformation Ex: f (X, γ) = γ 0 [exp(γ 1 X)] g(x, γ) = log e f (X, γ) = β 0 + β 1 X β 0 = log γ 0, β 1 = γ 1 Just because a nonlinear response function is intrinsically linear does not necessarily imply that linear regression is appropriate. ( the error term in the linearized model will no longer be normal with constant variance) hsuhl (NUK) LR Chap / 35
11 13.1 Linear and Nonlinear Regression Models Linear and Nonlinear Regression Models (cont.) Estimation of regression parameters: 1 least squares method 2 maximum likelihood method Also as in linear regression, both of these methods of estimation yield the same parameter estimates when the error terms in (13.12) are independent normal with constant variance. It is usually not possible to find analytical expression for LSE and MLE for nonlinear regression models. numerical search procedures must be used: require intensive computations hsuhl (NUK) LR Chap / 35
12 13.2 Least Squares Estimation in Nonlinear Regression LSE in nonlinear regression The concepts of LSE for linear regression also extend directly to nonlinear regression models. The least squares criterion: n Q = [Y i f (X i, γ)] 2 i=1 Q must be minimized with respect to γ 0,..., γ p 1 A difference from linear regression is that the solution of the normal equations usually requires an iterative numerical search procedure because analytical solutions generally cannot be found. hsuhl (NUK) LR Chap / 35
13 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations Y i = f (X i, γ) + ε i n Q = [Y i f (X i, γ)] 2 i=1 g = arg min γ Q (g : the vector of the LSE g k ) Q γ k = (partial derivative of Q with respect to γ k ) [ n f (X i, γ) 2[Y i f (X i, γ)] γ k i=1 ] set = 0 γk =g k hsuhl (NUK) LR Chap / 35
14 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations (cont.) The p normal equations: [ ] [ ] n f (X i, γ) n f (X i, γ) Y i f (X i, g) i=1 γ k γ=g i=1 γ k γ=g k = 0, 1,..., p 1 = 0, g: the vector of the least squares estimates g k g = [g 0,..., g p 1 ] p 1 nonlinear in the parameter estimates g k numerical search procedure are required multiple solution may be possible hsuhl (NUK) LR Chap / 35
15 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations (cont.) Example: Severely injured patients Y : Prognostic( 預 後 的 ) Index X: Days Hospitalized ( 住 院 治 療 ) Related earlier studies: the relationship between Y and X is exponential Y i = γ 0 exp(γ 1 X i ) + ε i hsuhl (NUK) LR Chap / 35
16 13.2 Least Squares Estimation in Nonlinear Regression Solution of Normal Equations (cont.) f (X i, γ) γ 0 = exp(γ 1 X i ) f (X i, γ) γ 1 = γ 0 X i exp(γ 1 X i ) Q γ k g = 0 Y i exp(g 1 X i ) g 0 exp(2g1 X i ) = 0 Yi X i exp(g 1 X i ) g 0 Xi exp(2g 1 X i ) = 0 No closed-form solution exists for g = (g 0, g 1 ) T hsuhl (NUK) LR Chap / 35
17 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) linearization method: 1 use a Taylor series expansion to approximate the nonlinear regression model ( f (x) = f (a) + n=1 2 employ OLS to estimate the parameters f (n) (x a)n) n! hsuhl (NUK) LR Chap / 35
18 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) 圖 片 來 源 :Wiki: Taylor s theorem hsuhl (NUK) LR Chap / 35
19 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Gauss-Newton Method: 1 initial parameters γ 0,..., γ p 1 : g (0) 0,..., g (0) p 1 2 approximate the mean responses f (X i, γ) in the Taylor series expansion around g (0) k : f (X i, γ) f (X i, g (0) ) + p 1 k=0 [ ] f (X i, γ) γ k γ=g (0) (γ k g (0) k ) 3 obtain revised estimated regression coefficients g (1) k : (later) g (1) k = g (0) k + b (0) k hsuhl (NUK) LR Chap / 35
20 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) (Y (0) i =Y i f (0) i ) p 1 f (X i, γ) f (X i, g (0) [ ] f (Xi, γ) ) + (γ k g (0) k γ ) k γ=g (0) = f (0) p 1 i + k=0 k=0 D (0) ik β(0) k Y i f (0) p 1 i + D (0) ik β(0) k + ε i = Y (0) i p 1 k=0 k=0 D (0) ik β(0) k + ε i (no intercept) (13.24) The purpose of fitting the linear regression model approximation (13.24) is therefore to estimate β (0) k and use these estimates to adjust the initial starting estimates of the regression parameters. hsuhl (NUK) LR Chap / 35
21 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Matrix Form: Y (0) i p 1 k=0 D(0) ik β(0) k + ε i hsuhl (NUK) LR Chap / 35
22 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) the D matrix of partial derivative play the role of the X matrix (without a column of 1s for the intercept) Estimate β (0) by OLS: b (0) = (D (0) D (0) ) 1 D (0) Y (0) obtain revised estimated regression coefficients g (1) k : g (1) k = g (0) k + b (0) k hsuhl (NUK) LR Chap / 35
23 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Evaluated for g (0) by SSE (0) : n n SSE (0) = [Y i f (X i, g (0) )] 2 = [Y i f (0) i ] 2 i=1 i=1 After the end of the first iteration: n n SSE (1) = [Y i f (X i, g (1) )] 2 = [Y i f (1) i ] 2 i=1 i=1 If the Gauss-Newton method is working effectively in the first iteration, SSE (1) should be smaller than SSE (0). ( g (1) should be better estimates) hsuhl (NUK) LR Chap / 35
24 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) The Gauss-Newton method repeats the procedure with g (1) now used for the new starting values. Until g (s+1) g (s) and/or SSE (s+1) SSE (s) become negligible The Gauss-Newton method works effectively in many nonlinear regression applications. (Sometimes may require numerous iterations before converging.) hsuhl (NUK) LR Chap / 35
25 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Example: Severely injured patients Initial: Transformed Y (log γ 0 exp(γ 1 X) = log γ 0 + γ 1 X) Y i = β 0 + β 1 X i + ε i OLS b 0 = , b 1 = g (0) 0 = exp(b 0 ) = , g (0) 1 = b 1 = hsuhl (NUK) LR Chap / 35
26 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Ŷ = ( ) exp( X) hsuhl (NUK) LR Chap / 35
27 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) The choice of initial starting values: a poor choice may result in slow convergence, convergence to a local minimum, or even divergence Good starting values: result in faster convergence, will lead to a solution that is the global minimum rather than a local minimum A variety of methods for obtaining starting values: 1 related earlier studies 2 select p representative observations solve for p parameters, then used as the starting values 3 do a grid search in the parameter space using as the starting values that g for which Q is smallest hsuhl (NUK) LR Chap / 35
28 13.2 Least Squares Estimation in Nonlinear Regression Gauss-Newton Method( 高 斯 - 牛 頓 法 ) (cont.) Some properties that exist for linear regression least squares do not hold for nonlinear regression least squares. Ex: e i 0; SSR + SSE SSTO; R 2 is not a meaningful descriptive statistic for nonlinear regression. Two other direct search procedures: 1 The method of steepest descent searches 2 The Marquardt algorithm: seeks to utilize the best feature of the Gauss-Newton method and the method of steepest descent a middle ground( 折 衷 ; 妥 協 ) between these two method hsuhl (NUK) LR Chap / 35
29 13.3 Model Building and Diagnostics Model building and diagnostics The model-building process for nonlinear regression models often differs somewhat from that for linear regression models Validation of the selected nonlinear regression model can be performed in the same fashion as for linear regression models. Use of diagnostics tools to examine the appropriateness of a fitted model plays an important role in the process of building a nonlinear regression model. hsuhl (NUK) LR Chap / 35
30 13.3 Model Building and Diagnostics Model building and diagnostics (cont.) When replicate observations are available and the sample size is reasonably large, the appropriate of a nonlinear regression function can be tested formally by means of the lack of fit test for linear regression models. (the test is an approximate one) Plots: e i vs. t i, Ŷ i, X ik can be helpful in diagnosing departures from the assumed model Unequal variances WLS; transformations hsuhl (NUK) LR Chap / 35
31 13.4 Inferences about Nonlinear Regression Parameters Inferences Inferences about the regression parameters in nonlinear regression are usually based on large-sample theory. When n is large, LSE and MLE for nonlinear regression models with normal error terms are approximately normally distributed and almost unbiased and have almost minimum variance Estimate of Error Term Variance: MSE = SSE n p = [Yi f (X i, g)] 2 n p (Not unbiased estimator of σ 2 but the bias is small when n is large) hsuhl (NUK) LR Chap / 35
32 13.4 Inferences about Nonlinear Regression Parameters Inferences (cont.) Large-Sample Theory When the error terms ε i are independent N(0, σ 2 ) and the sample size n is reasonably large, the sampling distribution of g is approximately normal. The expected value of the mean vector is approximately: E{g} γ The approximate variance-covariance matrix of the regression coefficients is estimated by: s 2 {g} = MSE(D D) 1 hsuhl (NUK) LR Chap / 35
33 13.4 Inferences about Nonlinear Regression Parameters Inferences (cont.) No simple rule exists that tells us when it is appropriate to use the large-sample inference methods and when it is not appropriate. However, a number of guidelines have been developed that are helpful in assessing the appropriateness of using the large-sample inference procedures in a given application. When the diagnostics suggest that large-sample inference procedures are not appropriate in a particular instance, remedial measures should be explored. reparameterize the nonlinear regression model bootstrap estimated of precision and confidence intervals instead of the large-sample inferences hsuhl (NUK) LR Chap / 35
34 13.4 Inferences about Nonlinear Regression Parameters Interval Estimation Large-sample theorem: approximate result for a single γ k g k γ k t(n p), k = 0, 1,..., p 1 s{g k } g k ± t(1 α/2; n p)s{g k } Several γ k : m parameters to be estimated with approximate family confidence coefficient 1 α the Bonferroni confidence limits: g k ± Bs{g k } B = t(1 α/2m; n p) hsuhl (NUK) LR Chap / 35
35 13.4 Inferences about Nonlinear Regression Parameters Test Concerning a single γ k A large-sample test: H 0 : γ k = γ k0 vs. H a : γ k γ k0 t = g k γ k0 s{g k } If t t(1 α/2; n p), conclude H 0 If t > t(1 α/2; n p), conclude H a Test concerning several γ k F = approx SSE(R) SSE(F ) df R df F MSE(F ) F (df R df F, df F ) when H 0 holds hsuhl (NUK) LR Chap / 35
Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationEstimation of σ 2, the variance of ɛ
Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
More informationNotes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationAuxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationLogistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.
Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features
More informationUsing the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes
Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes JunXuJ.ScottLong Indiana University August 22, 2005 The paper provides technical details on
More informationBasics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More informationLOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as
LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values
More informationWeb-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More informationWeek 5: Multiple Linear Regression
BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School
More informationMATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
More informationIAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationThe Method of Least Squares
Hervé Abdi 1 1 Introduction The least square methods (LSM) is probably the most popular technique in statistics. This is due to several factors. First, most common estimators can be casted within this
More informationComparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function
The Empirical Econometrics and Quantitative Economics Letters ISSN 2286 7147 EEQEL all rights reserved Volume 1, Number 4 (December 2012), pp. 89 106. Comparison of sales forecasting models for an innovative
More informationChapter 6: Point Estimation. Fall 2011. - Probability & Statistics
STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationNonlinear Statistical Models
Nonlinear Statistical Models Earlier in the course, we considered the general linear statistical model (GLM): y i =β 0 +β 1 x 1i + +β k x ki +ε i, i= 1,, n, written in matrix form as: y 1 y n = 1 x 11
More informationGenerating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010
Simulation Methods Leonid Kogan MIT, Sloan 15.450, Fall 2010 c Leonid Kogan ( MIT, Sloan ) Simulation Methods 15.450, Fall 2010 1 / 35 Outline 1 Generating Random Numbers 2 Variance Reduction 3 Quasi-Monte
More informationCoefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
More informationLogistic Regression (1/24/13)
STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used
More informationModel Selection and Claim Frequency for Workers Compensation Insurance
Model Selection and Claim Frequency for Workers Compensation Insurance Jisheng Cui, David Pitt and Guoqi Qian Abstract We consider a set of workers compensation insurance claim data where the aggregate
More informationA General Approach to Variance Estimation under Imputation for Missing Survey Data
A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey
More informationChapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
More informationNew Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
More informationTwo Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
More information5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
More informationMultiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
More information5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
More information2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or
Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus
More information3. Regression & Exponential Smoothing
3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationMATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
More informationNonlinear Regression:
Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationLecture 8 February 4
ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt
More informationAPPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING
APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING Sulaimon Mutiu O. Department of Statistics & Mathematics Moshood Abiola Polytechnic, Abeokuta, Ogun State, Nigeria. Abstract
More informationAdequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection
Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics
More informationDesigning a learning system
Lecture Designing a learning system Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x4-8845 http://.cs.pitt.edu/~milos/courses/cs750/ Design of a learning system (first vie) Application or Testing
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
More informationGLM, insurance pricing & big data: paying attention to convergence issues.
GLM, insurance pricing & big data: paying attention to convergence issues. Michaël NOACK - michael.noack@addactis.com Senior consultant & Manager of ADDACTIS Pricing Copyright 2014 ADDACTIS Worldwide.
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationParallelization Strategies for Multicore Data Analysis
Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management
More informationSales forecasting # 2
Sales forecasting # 2 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting
More informationSUMAN DUVVURU STAT 567 PROJECT REPORT
SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.
More informationOverview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More information(Quasi-)Newton methods
(Quasi-)Newton methods 1 Introduction 1.1 Newton method Newton method is a method to find the zeros of a differentiable non-linear function g, x such that g(x) = 0, where g : R n R n. Given a starting
More informationProbabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur
Probabilistic Linear Classification: Logistic Regression Piyush Rai IIT Kanpur Probabilistic Machine Learning (CS772A) Jan 18, 2016 Probabilistic Machine Learning (CS772A) Probabilistic Linear Classification:
More informationANALYSIS OF FACTOR BASED DATA MINING TECHNIQUES
Advances in Information Mining ISSN: 0975 3265 & E-ISSN: 0975 9093, Vol. 3, Issue 1, 2011, pp-26-32 Available online at http://www.bioinfo.in/contents.php?id=32 ANALYSIS OF FACTOR BASED DATA MINING TECHNIQUES
More informationSection 1: Simple Linear Regression
Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction
More informationA Comparison of Nonlinear. Regression Codes
A Comparison of Nonlinear Regression Codes by Paul Fredrick Mondragon Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Mathematics with Operations Research and
More informationReject Inference in Credit Scoring. Jie-Men Mok
Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business
More informationLinear Models for Continuous Data
Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:
More informationMultinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More informationDemand Forecasting LEARNING OBJECTIVES IEEM 517. 1. Understand commonly used forecasting techniques. 2. Learn to evaluate forecasts
IEEM 57 Demand Forecasting LEARNING OBJECTIVES. Understand commonly used forecasting techniques. Learn to evaluate forecasts 3. Learn to choose appropriate forecasting techniques CONTENTS Motivation Forecast
More informationVI. Introduction to Logistic Regression
VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationEmpirical Model-Building and Response Surfaces
Empirical Model-Building and Response Surfaces GEORGE E. P. BOX NORMAN R. DRAPER Technische Universitat Darmstadt FACHBEREICH INFORMATIK BIBLIOTHEK Invortar-Nf.-. Sachgsbiete: Standort: New York John Wiley
More informationWes, Delaram, and Emily MA751. Exercise 4.5. 1 p(x; β) = [1 p(xi ; β)] = 1 p(x. y i [βx i ] log [1 + exp {βx i }].
Wes, Delaram, and Emily MA75 Exercise 4.5 Consider a two-class logistic regression problem with x R. Characterize the maximum-likelihood estimates of the slope and intercept parameter if the sample for
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More informationPenalized Logistic Regression and Classification of Microarray Data
Penalized Logistic Regression and Classification of Microarray Data Milan, May 2003 Anestis Antoniadis Laboratoire IMAG-LMC University Joseph Fourier Grenoble, France Penalized Logistic Regression andclassification
More informationDeflator Selection and Generalized Linear Modelling in Market-based Accounting Research
Deflator Selection and Generalized Linear Modelling in Market-based Accounting Research Changbao Wu and Bixia Xu 1 Abstract The scale factor refers to an unknown size variable which affects some or all
More informationService courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
More information1 Simple Linear Regression I Least Squares Estimation
Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and
More informationCurve Fitting. Before You Begin
Curve Fitting Chapter 16: Curve Fitting Before You Begin Selecting the Active Data Plot When performing linear or nonlinear fitting when the graph window is active, you must make the desired data plot
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
More informationCourse 4 Examination Questions And Illustrative Solutions. November 2000
Course 4 Examination Questions And Illustrative Solutions Novemer 000 1. You fit an invertile first-order moving average model to a time series. The lag-one sample autocorrelation coefficient is 0.35.
More informationWeb-based Supplementary Materials for. Modeling of Hormone Secretion-Generating. Mechanisms With Splines: A Pseudo-Likelihood.
Web-based Supplementary Materials for Modeling of Hormone Secretion-Generating Mechanisms With Splines: A Pseudo-Likelihood Approach by Anna Liu and Yuedong Wang Web Appendix A This appendix computes mean
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More information17. SIMPLE LINEAR REGRESSION II
17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationPolynomial Neural Network Discovery Client User Guide
Polynomial Neural Network Discovery Client User Guide Version 1.3 Table of contents Table of contents...2 1. Introduction...3 1.1 Overview...3 1.2 PNN algorithm principles...3 1.3 Additional criteria...3
More informationPOLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.
Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression
More informationMultiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
More informationSales forecasting # 1
Sales forecasting # 1 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting
More informationDepartment of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.
Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x
More informationVariance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers
Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications
More informationA Log-Robust Optimization Approach to Portfolio Management
A Log-Robust Optimization Approach to Portfolio Management Dr. Aurélie Thiele Lehigh University Joint work with Ban Kawas Research partially supported by the National Science Foundation Grant CMMI-0757983
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More information