Nonlinear Regression: A Powerful Tool With Considerable Complexity
Half-Day 1: Estimation and Standard Inference

Andreas Ruckstuhl
Institute of Data Analysis and Process Design (IDP), School of Engineering, Zurich University of Applied Sciences
(Institut für Datenanalyse und Prozessdesign, Zürcher Hochschule für Angewandte Wissenschaften)

Outline

Half-Day 1: Estimation and Standard Inference
- The Nonlinear Regression Model
- Iterative Estimation - Model Fitting
- Inference Based on Linear Approximations

Half-Day 2: Improved Inference and Visualisation
- Likelihood Based Inference
- Profile t Plot and Profile Traces
- Parameter Transformations

Half-Day 3: Bootstrap, Prediction and Calibration
- Bootstrap
- Prediction
- Calibration
- Outlook

1 The Nonlinear Regression Model

The regression model:
$$Y_i = h\big(x_i^{(1)}, \ldots, x_i^{(m)};\ \theta_1, \theta_2, \ldots, \theta_p\big) + E_i$$
with $E_i$ independent, $E_i \sim \mathcal{N}(0, \sigma^2)$.

In the case of the linear regression model,
$$h\big(x_i^{(1)}, \ldots, x_i^{(m)};\ \theta_1, \theta_2, \ldots, \theta_p\big) = \theta_1 x_i^{(1)} + \theta_2 x_i^{(2)} + \cdots + \theta_p x_i^{(p)}$$
(i.e., $m = p$).

Examples of nonlinear regression functions:
$$h(x_i; \theta) = \frac{\theta_1 x_i^{\theta_3}}{\theta_2 + x_i^{\theta_3}}$$
$$h(x_i; \theta) = \exp\Big(-\theta_1 \big(x_i^{(1)}\big)^{\theta_3} \exp\big(-\theta_2 / x_i^{(2)}\big)\Big)$$
$$h(x_i; \theta) = \theta_1 \exp(\theta_2 / x_i)$$

Example: Puromycin

The Michaelis-Menten model for enzyme kinetics relates the initial velocity of an enzymatic reaction to the substrate concentration.

[Figure: Velocity versus Concentration, for enzyme treated with Puromycin and not treated.]

$$Y_i = \frac{\theta_1 x_i}{\theta_2 + x_i} + E_i \quad \text{with } E_i \text{ i.i.d. } \mathcal{N}(0, \sigma^2) \quad \text{(Michaelis-Menten model)}$$

x: substrate concentration [ppm]
Y: initial velocity [(number/min)/min]

Example: Biochemical Oxygen Demand (BOD)

Biochemical oxygen demand of stream water.

[Figure: Oxygen demand (mg/l) versus Time (days).]

$$Y_i = \theta_1 \big(1 - e^{-\theta_2 x_i}\big) + E_i \quad \text{with } E_i \text{ i.i.d. } \mathcal{N}(0, \sigma^2),$$
where Y is the biochemical oxygen demand (BOD) [mg/l] and x the incubation time [days].

Example: Cellulose Membrane

Ratio of protonated to deprotonated carboxyl groups within the pores of a cellulose membrane versus the pH value x of the bulk solution.

[Figure, panels (a) and (b): y (= chemical shift) versus x (= pH).]

Theoretically, this relation is described by the Henderson-Hasselbalch equation,
$$Y_i = \frac{\theta_1 + \theta_2\, 10^{\theta_3 + \theta_4 x_i}}{1 + 10^{\theta_3 + \theta_4 x_i}} + E_i, \quad i = 1, \ldots, n,$$
with $E_i$ i.i.d. $\mathcal{N}(0, \sigma^2)$.

Transformably Linear Models

Example: $h(x, \theta) = \theta_1 \exp(\theta_2 / x)$

Applying the log-transformation, we obtain
$$\log h(x, \theta) = \log\big(\theta_1 \exp(\theta_2 / x)\big) = \log\theta_1 + \log\exp(\theta_2 / x) = \log\theta_1 + \theta_2 \cdot \frac{1}{x}.$$
Hence
$$\log h(x, \theta) = \vartheta_1 + \vartheta_2 \tilde{x} \quad \text{with } \vartheta_1 = \log\theta_1,\ \vartheta_2 = \theta_2,\ \tilde{x} = 1/x.$$

The complete transformably linear model is
$$\log Y_i = \vartheta_1 + \vartheta_2 \tilde{x}_i + E_i, \quad E_i \text{ i.i.d. } \mathcal{N}(0, \sigma^2).$$
The error term is additive on the log scale. In the original representation, the model transforms to
$$Y_i = \exp(\vartheta_1 + \vartheta_2 \tilde{x}_i + E_i) = \theta_1 \exp(\theta_2 / x_i) \cdot \tilde{E}_i \quad \text{with } \tilde{E}_i = \exp(E_i),$$
i.e., $\tilde{E}_i$ is log-normally distributed and the error is multiplicative.

Conclusion:
- Transform to a linear model only if required by the error structure.
- Check the assumptions on the error term by residual analysis.
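
The log-transformation above can be tried out numerically. The following is a minimal sketch in plain Python, assuming illustrative noise-free data generated from $h(x, \theta) = \theta_1 \exp(\theta_2 / x)$; the data values and the helper name `simple_linear_fit` are made up for this illustration and are not part of the course material.

```python
import math


def simple_linear_fit(u, v):
    """Ordinary least squares for v = b0 + b1 * u; returns (b0, b1)."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    b1 = sxy / sxx
    b0 = v_bar - b1 * u_bar
    return b0, b1


# Illustrative, noise-free data (assumption): theta1 = 2.0, theta2 = 0.5
theta1_true, theta2_true = 2.0, 0.5
x = [0.5, 1.0, 2.0, 4.0, 8.0]
y = [theta1_true * math.exp(theta2_true / xi) for xi in x]

# Transformed problem: log(y) = vartheta1 + vartheta2 * x_tilde, x_tilde = 1/x
x_tilde = [1.0 / xi for xi in x]
log_y = [math.log(yi) for yi in y]
vartheta1, vartheta2 = simple_linear_fit(x_tilde, log_y)

# Back-transformation to the original parameters
theta1_hat = math.exp(vartheta1)
theta2_hat = vartheta2
print(theta1_hat, theta2_hat)
```

With noisy data, the fit on the log scale is only appropriate if the error is multiplicative on the original scale, which is exactly the point of the slide.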

If there is a deterministic model $y = \theta_1 x^{\theta_2}$, the random component may be either additive or multiplicative:
- additive error: y = a * x^b + E, fitted with nls(y ~ a * x^b)
- multiplicative error: ln(y) = ln(a) + b*ln(x) + E, fitted with lm(log(y) ~ log(x))

The Tukey-Anscombe plot of the fitted model will show clearly which model is more adequate for the data.
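
A selection of transformably linear models

Regression function | Linearised form
$h(x, \theta) = 1/(\theta_1 + \theta_2 \exp(-x))$ | $1/h(x, \theta) = \theta_1 + \theta_2 \exp(-x)$
$h(x, \theta) = \theta_1 x/(\theta_2 + x)$ | $1/h(x, \theta) = 1/\theta_1 + (\theta_2/\theta_1) \cdot (1/x)$
$h(x, \theta) = \theta_1 x^{\theta_2}$ | $\ln h(x, \theta) = \ln\theta_1 + \theta_2 \ln x$
$h(x, \theta) = \theta_1 \exp(\theta_2\, g(x))$ | $\ln h(x, \theta) = \ln\theta_1 + \theta_2\, g(x)$
$h(x, \theta) = \exp\big(-\theta_1 x^{(1)} \exp(-\theta_2 / x^{(2)})\big)$ | $\ln\big(-\ln h(x, \theta)\big) = \ln\theta_1 + \ln x^{(1)} - \theta_2 / x^{(2)}$
$h(x, \theta) = \theta_1 \big(x^{(1)}\big)^{\theta_2} \big(x^{(2)}\big)^{\theta_3}$ | $\ln h(x, \theta) = \ln\theta_1 + \theta_2 \ln x^{(1)} + \theta_3 \ln x^{(2)}$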

2 Model Fitting Using an Iterative Algorithm

The method of least squares: find the minimum of
$$S(\theta) = \sum_{i=1}^{n} \big(y_i - \eta_i(\theta)\big)^2 \quad \text{with } \eta_i(\theta) = h(x_i; \theta).$$

Key steps for minimising:
- Approximate the surface $\eta(\theta)$ at the currently best value $\theta^{(l)}$ by a tangent plane, where $\eta(\theta^{(l)})$ is the point of contact.
- Search for the point on the plane that is closest to Y (this is a linear regression fitting problem).
- The new point lies on the plane but not on the surface. However, it defines a parameter vector $\theta^{(l+1)}$, which is used in the next iteration step.

Algebraically formulated

1. Linear approximation of $\eta(\theta)$ at $\theta^{(m)}$:
$$\eta(\theta) \approx \eta(\theta^{(m)}) + A^{(m)} \big(\theta - \theta^{(m)}\big),$$
where $A^{(m)} = A(\theta^{(m)})$ is the derivative matrix of $\eta(\theta)$ at $\theta^{(m)}$ in the m-th iteration step.
2. (Local) linear model:
$$\tilde{Y}^{(m)} \approx A^{(m)} \beta^{(m)} + E, \quad \text{where } \tilde{Y}^{(m)} = Y - \eta(\theta^{(m)}) \text{ and } \beta^{(m)} = \theta - \theta^{(m)}.$$
3. Least-squares estimation of $\beta^{(m)}$ gives $\hat\beta^{(m)}$. Set $\theta^{(m+1)} = \theta^{(m)} + \hat\beta^{(m)}$.
4. Repeat steps 1 to 3 until the procedure converges. Result: $\hat\theta = \theta^{(m+1)}$.
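
The four steps can be sketched in a few lines of plain Python. This is a minimal illustration for the two-parameter Michaelis-Menten function, assuming noise-free made-up data and no step-size control; real fitting routines such as nls add safeguards that are omitted here. The function names `mm`, `mm_gradient` and `gauss_newton` are hypothetical.

```python
def mm(x, theta):
    """Michaelis-Menten mean function h(x; theta) = theta1 * x / (theta2 + x)."""
    t1, t2 = theta
    return t1 * x / (t2 + x)


def mm_gradient(x, theta):
    """One row of the derivative matrix A: (dh/dtheta1, dh/dtheta2) at x."""
    t1, t2 = theta
    return (x / (t2 + x), -t1 * x / (t2 + x) ** 2)


def gauss_newton(x, y, theta, max_iter=50, tol=1e-10):
    """Plain Gauss-Newton iteration for two parameters (no step halving)."""
    for _ in range(max_iter):
        # Steps 1 and 2: local linear model with response Y - eta(theta)
        resid = [yi - mm(xi, theta) for xi, yi in zip(x, y)]
        rows = [mm_gradient(xi, theta) for xi in x]
        # Step 3: solve the 2x2 normal equations (A^T A) beta = A^T resid
        s11 = sum(a1 * a1 for a1, _ in rows)
        s12 = sum(a1 * a2 for a1, a2 in rows)
        s22 = sum(a2 * a2 for _, a2 in rows)
        g1 = sum(a1 * r for (a1, _), r in zip(rows, resid))
        g2 = sum(a2 * r for (_, a2), r in zip(rows, resid))
        det = s11 * s22 - s12 * s12
        b1 = (s22 * g1 - s12 * g2) / det
        b2 = (s11 * g2 - s12 * g1) / det
        theta = (theta[0] + b1, theta[1] + b2)
        # Step 4: stop once the increment is negligible
        if abs(b1) + abs(b2) < tol:
            break
    return theta


# Illustrative noise-free data (assumption), generated from theta = (200, 0.1)
x = [0.02, 0.06, 0.11, 0.22, 0.56, 1.10]
y = [mm(xi, (200.0, 0.1)) for xi in x]
theta_hat = gauss_newton(x, y, theta=(180.0, 0.08))
print(theta_hat)
```

Because the data are noise-free, the iteration should return the generating parameters up to rounding; with poor starting values the undamped iteration may diverge, which is why the next slides deal with starting values.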

Starting Values

- Interpret the behaviour of the regression function in terms of the parameters, analytically or graphically.
- Transform the regression function to obtain simpler, preferably linear, behaviour.
- Use your knowledge from previous or similar experiments.

Example Puromycin (2) - using transformation

$$y \approx h(x, \theta) = \frac{\theta_1 x}{\theta_2 + x}$$
Transform to linearity:
$$\tilde{y} = \frac{1}{y} \approx \frac{1}{h(x, \theta)} = \frac{\theta_2}{\theta_1} \cdot \frac{1}{x} + \frac{1}{\theta_1},$$
that is, $\tilde{y} \approx \beta_1 \tilde{x} + \beta_0$ with $\tilde{x} = 1/x$.

Linear regression: $\hat\beta = (0.005, \ldots)^T$

Starting values: $\theta_1^{(0)} = 1/\hat\beta_0 \approx 196$ and $\theta_2^{(0)} = \hat\beta_1 / \hat\beta_0$.
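
The reciprocal transformation can be packaged into a small helper. The sketch below is plain Python under the assumption of made-up noise-free data; the function name `michaelis_menten_start` is hypothetical and is not an existing library routine.

```python
def michaelis_menten_start(conc, rate):
    """Starting values (theta1, theta2) from regressing 1/rate on 1/conc."""
    inv_x = [1.0 / c for c in conc]
    inv_y = [1.0 / r for r in rate]
    n = len(conc)
    mean_x = sum(inv_x) / n
    mean_y = sum(inv_y) / n
    sxx = sum((u - mean_x) ** 2 for u in inv_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(inv_x, inv_y))
    beta1 = sxy / sxx                 # estimates theta2 / theta1
    beta0 = mean_y - beta1 * mean_x   # estimates 1 / theta1
    return 1.0 / beta0, beta1 / beta0


# Illustrative noise-free data (assumption), generated from theta = (200, 0.1)
conc = [0.02, 0.06, 0.11, 0.22, 0.56, 1.10]
rate = [200.0 * c / (0.1 + c) for c in conc]
theta1_0, theta2_0 = michaelis_menten_start(conc, rate)
print(theta1_0, theta2_0)
```

With real, noisy data the reciprocal scale distorts the error structure, so these values should serve only as starting values for the iterative fit, not as final estimates.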

Example Puromycin (3)

[Figure. Left: 1/Velocity versus 1/Concentration, with the regression line used for determining the starting values $\theta_1^{(0)}$ and $\theta_2^{(0)}$. Right: Velocity versus Concentration, with the regression function $h(x; \theta)$ based on the starting values $\theta = \theta^{(0)}$ and based on the least-squares estimate $\theta = \hat\theta$, respectively.]
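
Example: Cellulose membrane (2) - starting values

$$h(x; \theta) = \frac{\theta_1 + \theta_2\, 10^{\theta_3 + \theta_4 x}}{1 + 10^{\theta_3 + \theta_4 x}} \quad \text{with } \theta_4 < 0$$

We know:
- $h(x; \theta) \to \theta_1$ for $x \to \infty$
- $h(x; \theta) \to \theta_2$ for $x \to -\infty$

From the data, we obtain starting values $\theta_1^{(0)}$ and $\theta_2^{(0)}$.

Let
$$\tilde{y}_i = \log_{10}\left(\frac{\theta_1^{(0)} - y_i}{y_i - \theta_2^{(0)}}\right), \quad \text{hence } \tilde{y}_i \approx \theta_3 + \theta_4 x_i.$$
Simple linear regression results in starting values for both $\theta_3$ and $\theta_4$: $\theta_3^{(0)} = 1.83$ and $\theta_4^{(0)} = -0.36$.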
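
The same recipe can be sketched in plain Python. Everything specific here is an assumption made for illustration: the data are synthetic and noise-free, the upper plateau is taken to be $\theta_1$ (i.e. $\theta_1 > \theta_2$), and the plateaus are guessed as the data extremes shifted outwards by a small hypothetical `margin` so that the logarithm is defined for every observation.

```python
import math


def hh(x, theta):
    """Henderson-Hasselbalch-type mean function used on the slide."""
    t1, t2, t3, t4 = theta
    u = 10.0 ** (t3 + t4 * x)
    return (t1 + t2 * u) / (1.0 + u)


def hh_start(x, y, margin):
    """Starting values: plateaus from the data, then a straight-line fit."""
    t1_0 = max(y) + margin  # guess for the limit as x -> +infinity
    t2_0 = min(y) - margin  # guess for the limit as x -> -infinity
    y_tilde = [math.log10((t1_0 - yi) / (yi - t2_0)) for yi in y]
    n = len(x)
    x_bar = sum(x) / n
    yt_bar = sum(y_tilde) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - yt_bar) for xi, yi in zip(x, y_tilde))
    t4_0 = sxy / sxx
    t3_0 = yt_bar - t4_0 * x_bar
    return t1_0, t2_0, t3_0, t4_0


# Synthetic noise-free data (assumption), generated from theta = (10, 4, 2, -0.5)
x = [float(k) for k in range(9)]
y = [hh(xi, (10.0, 4.0, 2.0, -0.5)) for xi in x]
t1_0, t2_0, t3_0, t4_0 = hh_start(x, y, margin=0.05)
print(t1_0, t2_0, t3_0, t4_0)
```

The resulting values are only rough, because the guessed plateaus are not exact, but they are typically close enough for the iterative algorithm to converge.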

Example: Cellulose membrane (3)

[Figure. (a) Transformed response versus x (= pH), with the regression line used for determining the starting values $\theta_3^{(0)}$ and $\theta_4^{(0)}$. (b) y (= chemical shift) versus x (= pH), with the regression function $h(x; \theta)$ based on the starting values $\theta = \theta^{(0)}$ and based on the least-squares estimate $\theta = \hat\theta$, respectively.]

Self-Starter Function

For repeated use of the same nonlinear regression model, use an automated way of providing starting values. Basically, collect all the manual steps that are necessary to obtain the initial values for a nonlinear regression model into a function.

Self-starter functions are specific to a given mean function and calculate starting values for a given dataset.

If SSmicmen() (cf. next slide) is a self-starter function, then you can run the fitting process as

nls(rate ~ SSmicmen(conc, Vm, K), data=d.minor)

For how to write your own self-starter functions, see the R help or, e.g., Ritz & Streibig (2008), Sec. 3.2.

With the standard installation of R, the following self-starter functions are implemented:

Self-Starter Functions in the Standard Installation

Model | Mean function | Name of self-starter function
Biexponential | $A1\, e^{-x\, e^{lrc1}} + A2\, e^{-x\, e^{lrc2}}$ | SSbiexp(x, A1, lrc1, A2, lrc2)
Asymptotic regression | $Asym + (R0 - Asym)\, e^{-x\, e^{lrc}}$ | SSasymp(x, Asym, R0, lrc)
Asymptotic regression with offset | $Asym\, \big(1 - e^{-(x - c0)\, e^{lrc}}\big)$ | SSasympOff(x, Asym, lrc, c0)
Asymptotic regression (c0 = 0) | $Asym\, \big(1 - e^{-x\, e^{lrc}}\big)$ | SSasympOrig(x, Asym, lrc)
First-order compartment | $x1\, \dfrac{e^{lke + lka - lcl}}{e^{lka} - e^{lke}}\, \big(e^{-x2\, e^{lke}} - e^{-x2\, e^{lka}}\big)$ | SSfol(x1, x2, lke, lka, lcl)
Gompertz | $Asym\, e^{-b2\, b3^{x}}$ | SSgompertz(x, Asym, b2, b3)
Logistic | $A + \dfrac{B - A}{1 + e^{(xmid - x)/scal}}$ | SSfpl(x, A, B, xmid, scal)
Logistic (A = 0) | $\dfrac{Asym}{1 + e^{(xmid - x)/scal}}$ | SSlogis(x, Asym, xmid, scal)
Michaelis-Menten | $\dfrac{Vm\, x}{K + x}$ | SSmicmen(x, Vm, K)
Weibull | $Asym - Drop\, e^{-e^{lrc}\, x^{pwr}}$ | SSweibull(x, Asym, Drop, lrc, pwr)

3 Inference Based on Linear Approximations

A look at the summary output for the Cellulose Membrane example shows that it looks very similar to the summary output of a fitted linear regression model:

Formula: delta ~ (T1 + T2 * 10^(T3 + T4 * ph))/(10^(T3 + T4 * ph) + 1)

Parameters:
      Value   Std. Error   t value   Pr(>|t|)
θ1    …       …            …         < 2e-16 ***
θ2    …       …            …         < 2e-16 ***
θ3    …       …            …         …e-08 ***
θ4    …       …            …         …e-08 ***

Residual standard error: … on 35 degrees of freedom
Number of iterations to convergence: 7
Achieved convergence tolerance: 3.652e-06

The Asymptotic Properties

This approach is based on the local linearisation of the model (cf. the iterative estimation procedure),
$$Y = \eta(\theta) + A(\theta)\, \beta + E,$$
where $A(\theta)$ is the $n \times p$ matrix of partial derivatives. If the estimation procedure has converged, then $\hat\beta = 0$.

Asymptotic distribution of the least-squares estimator:
$$\hat\theta \overset{\text{as.}}{\sim} \mathcal{N}\big(\theta, V(\theta)\big)$$
with asymptotic covariance matrix
$$V(\theta) = \sigma^2 \big(A(\theta)^T A(\theta)\big)^{-1}.$$

Application in Practice

To determine the covariance matrix $V(\theta)$ explicitly, we plug in estimates instead of the true parameters:
- $A(\theta)$ is calculated using $\hat\theta$, which gives $\hat{A}$.
- For the error variance $\sigma^2$ we plug in the usual estimator.

Hence,
$$\hat{V} = \hat\sigma^2 \big(\hat{A}^T \hat{A}\big)^{-1},$$
where
$$\hat\sigma^2 = \frac{S(\hat\theta)}{n - p} = \frac{1}{n - p} \sum_{i=1}^{n} \big(y_i - \eta_i(\hat\theta)\big)^2 \quad \text{and} \quad \hat{A} = A(\hat\theta).$$
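
As a worked instance, the plug-in quantities can be written out for the Michaelis-Menten function $h(x; \theta) = \theta_1 x / (\theta_2 + x)$ of the Puromycin example, where $p = 2$. The derivation below follows directly from differentiating $h$ with respect to $\theta_1$ and $\theta_2$; it is a sketch of the general recipe, not output from the course material.

```latex
\[
\hat{A} =
\begin{pmatrix}
\dfrac{x_1}{\hat\theta_2 + x_1} & -\dfrac{\hat\theta_1 x_1}{(\hat\theta_2 + x_1)^2} \\
\vdots & \vdots \\
\dfrac{x_n}{\hat\theta_2 + x_n} & -\dfrac{\hat\theta_1 x_n}{(\hat\theta_2 + x_n)^2}
\end{pmatrix},
\qquad
\hat\sigma^2 = \frac{1}{n - 2} \sum_{i=1}^{n}
\left( y_i - \frac{\hat\theta_1 x_i}{\hat\theta_2 + x_i} \right)^2,
\qquad
\hat{V} = \hat\sigma^2 \big( \hat{A}^T \hat{A} \big)^{-1}.
\]
```

The first column of $\hat{A}$ is $\partial h / \partial \theta_1$ and the second is $\partial h / \partial \theta_2$, each evaluated at $\hat\theta$ and at the observed $x_i$.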

Approximate 95% confidence interval

Hence, an approximate 95% confidence interval for $\theta_k$ is
$$\hat\theta_k \pm \widehat{se}(\hat\theta_k) \cdot q_{0.975}^{t_{n-p}},$$
where $\widehat{se}(\hat\theta_k)$ is the square root of the k-th diagonal element of $\hat{V}$.

Example Cellulose Membrane

From the summary output

Parameters:
      Value   Std. Error   t value   Pr(>|t|)
θ1    …       …            …         < 2e-16 ***
θ2    …       …            …         < 2e-16 ***
θ3    …       …            …         …e-08 ***
θ4    …       …            …         …e-08 ***

Residual standard error: … on 35 degrees of freedom

we can calculate the 95% confidence interval for $\theta_1$:
$$\hat\theta_1 \pm 0.13 \cdot q_{0.975}^{t_{35}} = \hat\theta_1 \pm 0.26$$

Example: Puromycin - back to the initial data set

The Michaelis-Menten model for enzyme kinetics relates the initial velocity of an enzymatic reaction to the substrate concentration.

[Figure: Velocity versus Concentration, for enzyme treated with Puromycin and not treated.]

$$Y_i = \frac{\theta_1 x_i}{\theta_2 + x_i} + E_i \quad \text{with } E_i \text{ i.i.d. } \mathcal{N}(0, \sigma^2) \quad \text{(Michaelis-Menten model)}$$

x: substrate concentration [ppm]
Y: initial velocity [(number/min)/min]

Example: Puromycin (4)

Model:
$$Y_i = \frac{\theta_1 x_i}{\theta_2 + x_i} + E_i.$$

Model with and without treatment (all data):
$$Y_i = \frac{(\theta_1 + \theta_3 z_i)\, x_i}{\theta_2 + \theta_4 z_i + x_i} + E_i, \quad \text{where } z_i = \begin{cases} 1 & \text{with treatment} \\ 0 & \text{without treatment.} \end{cases}$$

Working hypothesis: only the asymptotic velocity $\theta_1$ is influenced by adding Puromycin. Hence the null hypothesis is $\theta_4 = 0$.

R output for the Puromycin example:

Parameters:
      Value   Std. Error   t value   Pr(>|t|)
θ1    …       …            …         …e-15
θ2    …       …            …         …e-05
θ3    …       …            …         …e-05
θ4    …       …            …         …

Residual standard error: 10.4 on 19 df

Since the P-value for $\theta_4$ is larger than 5%, the null hypothesis is not rejected at the 5% level.

95% confidence interval for $\theta_4$: $\hat\theta_4 \pm q_{0.975}^{t_{19}} \cdot \widehat{se}(\hat\theta_4)$
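
Inference for the expected value $E(Y \mid x_o) = h(x_o; \theta)$ at $x_o$

Linear regression:
- $h(x_o, \beta) = x_o^T \beta$ is estimated by $\hat\eta_o = x_o^T \hat\beta$.
- The $(1 - \alpha) \cdot 100\%$ confidence interval for $h(x_o, \beta)$ is
$$\hat\eta_o \pm q_{1-\alpha/2}^{t_{n-p}} \cdot se(\hat\eta_o) \quad \text{with } se(\hat\eta_o) = \hat\sigma \sqrt{x_o^T (X^T X)^{-1} x_o}.$$

Nonlinear regression:
- $h(x_o, \theta)$ is estimated by $\hat\eta_o = h(x_o, \hat\theta)$.
- The $(1 - \alpha) \cdot 100\%$ confidence interval for $h(x_o, \theta)$ is
$$h(x_o, \hat\theta) \pm q_{1-\alpha/2}^{t_{n-p}} \cdot se(\hat\eta_o) \quad \text{with } se(\hat\eta_o) = \hat\sigma \sqrt{\hat{a}_o^T \big(\hat{A}^T \hat{A}\big)^{-1} \hat{a}_o} \quad \text{and } \hat{a}_o = \left.\frac{\partial h(x_o, \theta)}{\partial \theta}\right|_{\theta = \hat\theta}.$$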
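
The standard error of the fitted value in the nonlinear case can be computed directly from the formula above. The following plain-Python sketch does this for the Michaelis-Menten function; the design points, the estimate `theta_hat` and the value of `sigma_hat` are made-up inputs for illustration, and the function names are hypothetical.

```python
import math


def mm_gradient(x, theta):
    """Gradient of h(x; theta) = theta1 * x / (theta2 + x) w.r.t. theta."""
    t1, t2 = theta
    return (x / (t2 + x), -t1 * x / (t2 + x) ** 2)


def fitted_value_se(x_design, theta_hat, sigma_hat, x0):
    """se(eta_o) = sigma_hat * sqrt(a_o^T (A^T A)^{-1} a_o) for two parameters."""
    rows = [mm_gradient(xi, theta_hat) for xi in x_design]
    s11 = sum(a1 * a1 for a1, _ in rows)
    s12 = sum(a1 * a2 for a1, a2 in rows)
    s22 = sum(a2 * a2 for _, a2 in rows)
    det = s11 * s22 - s12 * s12
    a1, a2 = mm_gradient(x0, theta_hat)
    # Quadratic form with the explicit inverse of the symmetric 2x2 matrix A^T A
    quad = (s22 * a1 * a1 - 2.0 * s12 * a1 * a2 + s11 * a2 * a2) / det
    return sigma_hat * math.sqrt(quad)


# Made-up inputs (assumptions): design, estimate and residual standard error
x_design = [0.02, 0.06, 0.11, 0.22, 0.56, 1.10]
theta_hat = (200.0, 0.1)
sigma_hat = 10.0
se_at_04 = fitted_value_se(x_design, theta_hat, sigma_hat, x0=0.4)
print(se_at_04)
```

To obtain the confidence interval, this standard error is multiplied by the t quantile with n - p degrees of freedom, which is not computed here because the Python standard library has no t distribution.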

Confidence Band

[Figure. Left: confidence band (i.e., pointwise confidence intervals) for a fitted straight line (linear regression model); log(PCB Concentration) versus Years^(1/3). Right: confidence band for the fitted curve $h(x, \hat\theta)$ of the example Biochemical Oxygen Demand; Oxygen Demand versus Days.]

Variable Selection

How about variable selection in nonlinear regression?

- There is no one-to-one correspondence between predictor variables and parameters as in linear regression. Hence, the number of variables may differ from the number of parameters.
- There are hardly ever problems in which some of the variables are in question (the model is derived from subject-matter theory).
- However, there are problems where a submodel (a submodel is nested within the full model) may be adequate to describe the data; cf. Example Puromycin, Slide 17, Half-Day 1.
- If we have a collection of candidate models which need not be submodels of each other, and the subject matter is indifferent to these models, but we want to find the most appropriate model for the data, we can use Akaike's information criterion (AIC) to select the best model (and/or run a residual analysis).
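
For least-squares fits with normally distributed errors, the AIC can be computed from the residual sum of squares alone. The sketch below assumes the Gaussian log-likelihood evaluated at the maximum-likelihood variance estimate RSS/n and counts σ as one additional estimated parameter; the function name `gaussian_aic` and the numbers in the usage example are made up for illustration.

```python
import math


def gaussian_aic(rss, n, p):
    """AIC of a least-squares fit with n observations and p mean parameters."""
    # Gaussian log-likelihood at the ML estimates (sigma^2 estimated by rss / n)
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    # p mean parameters plus the error standard deviation
    return -2.0 * log_lik + 2.0 * (p + 1)


# Made-up comparison (assumption): two candidate models fitted to the same data
n = 12
aic_small = gaussian_aic(rss=1200.0, n=n, p=2)
aic_large = gaussian_aic(rss=1150.0, n=n, p=3)
print(aic_small, aic_large)
```

In this made-up comparison the larger model reduces the residual sum of squares only slightly, so the penalty for the extra parameter outweighs the gain and the smaller model has the lower AIC. AIC values are only comparable between models fitted to the same response on the same scale.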

Take Home Message Half-Day 1

- In nonlinear regression, $Y_i = h(x_i, \theta) + E_i$, functions h are analysed which are not linear functions of the unknown parameters θ.
- Such models are often derived from subject-matter theory.
- The flexibility of this model class comes at the price of a more complex estimation and inference theory.
- Parameter estimation is done by an iterative procedure which needs appropriate starting values.
- Inference is based on an asymptotic theory. For finite sample sizes the results hold only approximately.
- Model assumptions are assessed as in linear regression modelling.
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
More informationPart II. Multiple Linear Regression
Part II Multiple Linear Regression 86 Chapter 7 Multiple Regression A multiple linear regression model is a linear model that describes how a y-variable relates to two or more xvariables (or transformations
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationIntroduction to Logistic Regression
OpenStax-CNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationHow Far is too Far? Statistical Outlier Detection
How Far is too Far? Statistical Outlier Detection Steven Walfish President, Statistical Outsourcing Services steven@statisticaloutsourcingservices.com 30-325-329 Outline What is an Outlier, and Why are
More informationExperimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test
Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationStatistics in Retail Finance. Chapter 6: Behavioural models
Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural
More information1 Simple Linear Regression I Least Squares Estimation
Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and
More informationTechnical report. in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE
Linear mixedeffects modeling in SPSS AN INTRODUCTION TO THE MIXED PROCEDURE Table of contents Introduction................................................................3 Data preparation for MIXED...................................................3
More informationChapter 4: Statistical Hypothesis Testing
Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin
More informationAUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
More informationAnalysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
More informationNew Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
More informationLinear Models for Classification
Linear Models for Classification Sumeet Agarwal, EEL709 (Most figures from Bishop, PRML) Approaches to classification Discriminant function: Directly assigns each data point x to a particular class Ci
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More informationPackage EstCRM. July 13, 2015
Version 1.4 Date 2015-7-11 Package EstCRM July 13, 2015 Title Calibrating Parameters for the Samejima's Continuous IRT Model Author Cengiz Zopluoglu Maintainer Cengiz Zopluoglu
More informationHURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009
HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationIntroduction to Path Analysis
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationInteraction between quantitative predictors
Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors
More informationLongitudinal Meta-analysis
Quality & Quantity 38: 381 389, 2004. 2004 Kluwer Academic Publishers. Printed in the Netherlands. 381 Longitudinal Meta-analysis CORA J. M. MAAS, JOOP J. HOX and GERTY J. L. M. LENSVELT-MULDERS Department
More informationDistribution (Weibull) Fitting
Chapter 550 Distribution (Weibull) Fitting Introduction This procedure estimates the parameters of the exponential, extreme value, logistic, log-logistic, lognormal, normal, and Weibull probability distributions
More informationTrend and Seasonal Components
Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing
More informationImpulse Response Functions
Impulse Response Functions Wouter J. Den Haan University of Amsterdam April 28, 2011 General definition IRFs The IRF gives the j th -period response when the system is shocked by a one-standard-deviation
More information