Nonlinear Statistical Models
Earlier in the course, we considered the general linear statistical model (GLM):

    y_i = β_0 + β_1 x_{1i} + ... + β_k x_{ki} + ε_i,   i = 1, ..., n,

written in matrix form as:

    [y_1]   [1  x_{11}  ...  x_{p-1,1}]  [β_0    ]   [ε_1]
    [y_2] = [1  x_{12}  ...  x_{p-1,2}]  [β_1    ] + [ε_2]
    [ ⋮ ]   [⋮     ⋮            ⋮     ]  [  ⋮    ]   [ ⋮ ]
    [y_n]   [1  x_{1n}  ...  x_{p-1,n}]  [β_{p-1}]   [ε_n]

or, more compactly, y = Xβ + ε.

Specifically, we discussed estimation of the model parameters, measures of uncertainty in these estimates and in the predicted values, and the use of confidence intervals or significance tests to assess the significance of parameters. Through the matrix formulation, we saw that in linear models (models that are linear functions of the parameters), the estimates of linear functions of the response values, and their standard errors, have simple closed-form expressions and can thus be computed analytically. Unfortunately, this will not be the case for nonlinear models.

Definition: As before, consider a set of n observed values of a response variable y assumed to depend on k explanatory variables x = (x_1, ..., x_k). This situation can be described through the following nonlinear statistical model:

    y_i = f(x_i, θ) + ε_i,   i = 1, ..., n,

where x_i = [x_{1i}, x_{2i}, ..., x_{ki}] is the row vector of observations on the k explanatory variables x_1, ..., x_k for the i-th observational unit, θ = (θ_1, ..., θ_p) is a vector of p parameters, and ε_i is the i-th error term. In a matrix setting, this model can be expressed as:

    [y_1]   [f(x_1, θ)]   [ε_1]
    [y_2] = [f(x_2, θ)] + [ε_2]
    [ ⋮ ]   [    ⋮    ]   [ ⋮ ]
    [y_n]   [f(x_n, θ)]   [ε_n]

that is, y = f(θ) + ε.

Typically, we assume ε_i ~ iid (0, σ²) and that the explanatory variables in x are fixed. If f(θ) = Xθ, where X is the design matrix as defined earlier, then the model reverts to a linear model.
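For contrast, recall the closed form available in the linear case: the least squares estimator and its variance matrix are

    β̂ = (X'X)^{-1} X'y,   Var(β̂) = σ² (X'X)^{-1}.

It is exactly this sort of closed-form expression that is unavailable when f is nonlinear in θ, which is why the iterative numerical methods described below are needed.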
As Leonid has argued, nonlinear models commonly arise as solutions to a system of differential equations that describe some dynamic process in a particular application. Some common examples of nonlinear models are given below.

1. Exponential Growth/Decay Model:

    y_i = α e^{β x_i} + ε_i,

where x is typically time. This arises as a solution of the differential equation:

    dE(y)/dx = β E(y).

2. Two-term Exponential Model:

    y_i = [θ_1/(θ_1 - θ_2)] (e^{-θ_2 x_i} - e^{-θ_1 x_i}) + ε_i.

This model results when two processes lead to exponential growth or decay. For example, if we are monitoring the concentration of a drug in the bloodstream, which enters the bloodstream from muscle tissue and is removed by the kidneys, then this model is a solution to the differential equation:

    dE(y)/dx = θ_1 E(y_m) - θ_2 E(y),

where y_m is the concentration of the drug in the muscle tissue and y is the concentration in the blood.

3. Mitscherlich Model:

    y_i = α[1 - e^{-β(x_i + δ)}] + ε_i.

As an example of where this model is used, let y be the yield of a crop, let x be the amount of some nutrient, and let α be the maximum attainable yield. Then, if we assume the increase in yield is proportional to the difference between α and the actual yield, we get the following differential equation, to which the model is a solution:

    dE(y)/dx = β[α - E(y)],

where δ is the equivalent nutrient value of the soil.

4. Logistic Growth Model:

    y_i = Mu / [u + (M - u) exp(-k t_i)] + ε_i.

This model results from a differential equation in which we assume the population growth rate decays as the population increases, so that the population approaches a carrying capacity M. The form of this differential equation was discussed in some detail in the past weeks, and is given by:

    dE(y)/dt = k E(y) [1 - E(y)/M],   E(y at t = 0) = u,

where M is the carrying capacity, u is the population size at time 0, and k controls the rate of growth. The form of this model is shown as a fit to the AIDS data below.

(Figure: logistic growth model fitted to the AIDS data.)
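For concreteness, these four mean functions can be written directly in MatLab as anonymous functions; the parameter values used below are arbitrary choices for illustration only, not estimates from any data set.

x = linspace(0,10,200);                           % Generic x (or time) grid
expgrow  = @(a,b,x) a*exp(b*x);                   % 1. Exponential growth/decay
twoterm  = @(t1,t2,x) t1/(t1-t2)*(exp(-t2*x)-exp(-t1*x));
                                                  % 2. Two-term exponential
mitscher = @(a,b,d,x) a*(1-exp(-b*(x+d)));        % 3. Mitscherlich
logist   = @(M,u,k,t) M*u./(u+(M-u)*exp(-k*t));   % 4. Logistic growth
plot(x,expgrow(1,0.3,x), x,twoterm(1.5,0.4,x), ...
     x,mitscher(5,0.5,1,x), x,logist(10,0.5,0.8,x));
legend('Exponential','Two-term exponential','Mitscherlich','Logistic');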
Fitting Nonlinear Models via Least Squares: In principle, fitting a nonlinear statistical model using least squares is no different than fitting a linear one. That is, for the model y_i = f(x_i, θ) + ε_i, we seek the parameter vector θ̂ that minimizes the sum of the squared errors:

    Q(θ; y, X) = Σ_{i=1}^n ε_i² = Σ_{i=1}^n [y_i - f(x_i, θ)]² = [y - f(θ)]' [y - f(θ)] = ε'ε,

where f(θ) is the n×1 vector of f(x_i, θ) values. Differentiating Q with respect to the θ_j (in an effort to minimize Q), setting the resulting equations to zero, and evaluating at θ = θ̂ gives:

    ∂Q(θ; y, X)/∂θ_j |_{θ = θ̂} = -2 Σ_{i=1}^n [y_i - f(x_i, θ̂)] ∂f(x_i, θ̂)/∂θ_j = 0,   j = 1, ..., p.

These are p nonlinear equations in the p parameters, which in general cannot be solved analytically to obtain explicit solutions for θ̂. As a result, numerical methods are used to solve these nonlinear systems. In MatLab, there are several numerical methods that can be used, although the default is a generalization of the standard Gauss-Newton method known as the Levenberg-Marquardt algorithm.
Gauss-Newton Method: This method, like others that may be used, is iterative in nature and requires starting values for the parameters. These starting values are then continually adjusted to home in on the least squares solution. The method is conducted via the following steps:

1. Consider taking a first-order Taylor expansion of f(θ) about some starting parameter vector θ⁰:

    f(θ) ≈ f(θ⁰) + F(θ⁰)(θ - θ⁰),   (1)

where F(θ⁰) is the n×p matrix of partial derivatives evaluated at θ⁰ and the n data values x_i, as given below:

    F(θ⁰) = [ ∂f(x_1, θ⁰)/∂θ_1   ∂f(x_1, θ⁰)/∂θ_2   ...   ∂f(x_1, θ⁰)/∂θ_p ]
            [ ∂f(x_2, θ⁰)/∂θ_1   ∂f(x_2, θ⁰)/∂θ_2   ...   ∂f(x_2, θ⁰)/∂θ_p ]
            [        ⋮                    ⋮                        ⋮        ]
            [ ∂f(x_n, θ⁰)/∂θ_1   ∂f(x_n, θ⁰)/∂θ_2   ...   ∂f(x_n, θ⁰)/∂θ_p ]

2. Calculate f(θ⁰) and F(θ⁰). Recognizing (1) as an approximately linearized model in θ, use linear least squares to update the starting values and obtain the solution at the next iteration.

3. Repeat steps 1 and 2 until the change in the parameters is below some tolerance.
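To make the update in step 2 concrete, a single Gauss-Newton iteration can be sketched in MatLab as below. Here y is the response vector, x the predictor, theta0 the current parameter vector, and fvec and Fmat are hypothetical helper functions (not part of the course code) returning the n×1 mean vector f(θ) and the n×p derivative matrix F(θ).

f0     = fvec(x,theta0);              % f(theta0) at the current parameter value
F0     = Fmat(x,theta0);              % F(theta0), the n x p derivative matrix
delta  = (F0'*F0)\(F0'*(y-f0));       % Linear least squares step for (theta - theta0)
theta1 = theta0 + delta;              % Updated parameter vector
                                      % Steps 1 and 2 are repeated with theta1 in place
                                      % of theta0 until norm(delta) is below a tolerance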
Some Key Results: Based on this same Taylor expansion, Gallant (1987) established the following asymptotic (large-sample) results:

1. If we assume ε ~ N(0, σ²I), then the least squares parameter vector θ̂ is approximately normally distributed with mean θ and variance matrix Var(θ̂) = (F'F)^{-1} σ², written:

    θ̂ ~ N[θ, (F'F)^{-1} σ²].

In practice, we compute F(θ̂) as an estimate of F(θ) and estimate σ² with MSE = SSE/(n - p), to give the asymptotic variance-covariance matrix for θ̂ as:

    s²(θ̂) = V̂ar(θ̂) = (F̂'F̂)^{-1} MSE,   where F̂ = F(θ̂).

This gives an approximate 100(1 - α)% confidence interval for a given parameter θ_i as:

    θ̂_i ± t_{1-α/2, n-p} s(θ̂_i) = θ̂_i ± t_{1-α/2, n-p} √[MSE (F̂'F̂)^{-1}_{ii}],

where (F̂'F̂)^{-1}_{ii} is the i-th diagonal entry of (F̂'F̂)^{-1}.

2. Again assuming the errors ε ~ N(0, σ²I),

    Var(ŷ) = F(F'F)^{-1}F' σ²,

which can be estimated as:

    s²(ŷ) = V̂ar(ŷ) = F̂(F̂'F̂)^{-1}F̂' MSE.

This technique of quantifying uncertainty through the use of a first-order Taylor series expansion is known as the Delta Method, and it was the most commonly used technique for finding approximate standard errors before the advent of modern computing. Gallant (1987) illustrated several cases where these approximations are poor, giving confidence intervals and hypothesis tests that are completely wrong.

Example: Data were collected on the number of bighorn sheep on the Mount Everts winter range in Yellowstone between 1965 and 1999 (obtained from Marty Kardos). Consider the time plot of these data below. The population suffered a die-off resulting from infectious keratoconjunctivitis in 1983. Marty indicates that this is fairly typical for bighorn sheep populations: logistic-like population growth up to a carrying capacity, followed by disease-related die-offs and subsequent recovery. For now, consider modeling the initial population growth from 1965 to 1982 using a logistic growth model. Here we will fit the model and use it to illustrate the calculation of standard errors for the parameter estimates from this nonlinear model. Some more modern techniques for estimating standard errors, and subsequently performing statistical inference in nonlinear models, are the method of bootstrapping and Markov chain Monte Carlo (MCMC) methods.
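The time plot referred to here can be produced with a few lines of MatLab, assuming the raw data are stored in the vectors sheepyear and sheepcount used in the fitting code below.

plot(sheepyear,sheepcount,'ko-');                % Time plot of the full 1965-1999 series
xlabel('Year','fontsize',14);
ylabel('Number of Bighorn Sheep','fontsize',14);
title('Bighorn Sheep Counts: 1965-1999','fontsize',14,'fontweight','b');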
(Figure: time plot of bighorn sheep counts on the Mount Everts winter range, 1965-1999.)

The logistic growth model is a 3-parameter nonlinear model of the form:

    y_i = Mu / [u + (M - u) exp(-k t_i)] + ε_i,

where:

    y_i = the number of bighorn sheep at time t_i,
    M   = the carrying capacity of the population,
    u   = the number of sheep at time 0 (1964 here),
    k   = a parameter representing the growth rate.

Fitting this model to the data with nonlinear least squares, using the nlinfit function in MatLab, gives the following fitted model:

    ŷ_i = M̂û / [û + (M̂ - û) exp(-k̂ t_i)],

where M̂ = , û = 27.4643, and k̂ = 0.3279, as computed using the Gauss-Newton method outlined earlier. The fitted model is shown in overlay below. Is this a good model?

(Figure: fitted logistic growth curve overlaid on the 1965-1982 sheep counts.)

The code for fitting the logistic model is given below and can be found on the course webpage in the files sheep.m and logistic.m. The first file is the driver program, and the logistic.m file defines the logistic function.
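A minimal sketch of what the logistic.m function file might look like is given below. This is an assumption (the actual course file may differ); it is consistent with the three-parameter form above and with nlinfit's convention of passing the parameter vector first and the predictor second.

function yhat = logistic(beta,t)
% LOGISTIC  Three-parameter logistic growth mean function
%   beta(1) = M (carrying capacity), beta(2) = u (size at time 0),
%   beta(3) = k (growth-rate parameter)
M = beta(1); u = beta(2); k = beta(3);
yhat = M*u./(u + (M-u)*exp(-k*t));

The driver program below then passes a handle to this function (@logistic) to nlinfit.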
% ============================================= %
% Fits a logistic growth model to the           %
% sheep data, and plots the fitted model        %
% ============================================= %
sheep82year = sheepyear(sheepyear<=1982);        % Take only years before 1983
sheep82count = sheepcount(sheepyear<=1982);      % Take only cases before 1983
time = sheep82year-1964;                         % Defines the year-1964 variable
beta = [ ];                                      % Parameter starting values
[betahat,resid,j] = nlinfit(time,sheep82count,@logistic,beta);
                                                 % Performs NLS, returning the parameter
                                                 % estimates, residuals, & Jacobian
yhat = sheep82count-resid;                       % Computes predicted counts
plot(sheep82year,sheep82count,'ko',sheep82year,yhat);
                                                 % Plots sheep counts vs year w/ fitted curve
xlabel('Year','fontsize',14);
ylabel('Number of Bighorn Sheep','fontsize',14);
title('Bighorn Sheep Counts: 1965-1982','fontsize',14,'fontweight','b');

I tried a few different things to try to improve the fit, including a Weibull-like modification to the logistic growth model, where I added a parameter in the exponent of the model, yielding a generalized logistic growth model:

    y_i = Mu / {u + (M - u) exp[-k t_i^γ]} + ε_i,

where γ is a parameter controlling the degree of curvature in the logistic shape. If γ = 1, this reverts to the usual logistic growth model. This change is Weibull-like in the sense that we have added a parameter in the exponent of an exponential function, much like the addition of the γ parameter in going from an exponential growth model to a Weibull growth model. This Weibull-like curvature adjustment to the logistic model has a corresponding differential equation to which it is the solution. This equation has the form:

    dE(y)/dt = γ k t^{γ-1} E(y) [1 - E(y)/M],
where M is the carrying capacity, u is the population size at time 0, k controls the rate of growth, and γ is this curvature piece. When parameterized this way, it turns out that the resulting parameter estimates for k and γ are unstable because they are highly correlated. An alternative, but equivalent, way to parameterize this model is as follows:

    y_i = Mu / {u + (M - u) exp[-(k t_i)^γ]} + ε_i.

This is the form of the model we will use for the remainder of our treatment of this generalized logistic model. Essentially, γ controls the steepness of the S-shape in the logistic model, where larger values of γ correspond to steeper curves. Since our logistic model for the sheep data appears to not be steep enough, we would expect this parameter to be larger than 1. Fitting this generalized logistic model produced the fitted curve and parameter estimates given on the next page. Does this seem like a better fit?
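The generalized logistic mean function can be coded analogously to the logistic function; a minimal sketch of what a genlogistic.m file might contain, using the (k t_i)^γ parameterization just adopted, is given below (an assumption; the actual course file may differ).

function yhat = genlogistic(beta,t)
% GENLOGISTIC  Four-parameter generalized logistic growth mean function,
% using the (k*t)^gamma parameterization
%   beta(1) = M, beta(2) = u, beta(3) = k, beta(4) = gamma
M = beta(1); u = beta(2); k = beta(3); g = beta(4);
yhat = M*u./(u + (M-u)*exp(-(k*t).^g));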
The MatLab code used to fit this generalized logistic model is shown below:

% ===================================== %
% Fits a generalized logistic growth    %
% model to the sheep data,              %
% and plots the fitted model in overlay %
% ===================================== %
beta2 = [ ];                                     % Parameter starting values
[betahat2,resid2,j2] = nlinfit(time,sheep82count,@genlogistic,beta2);
                                                 % Performs NLS, returning the parameter
                                                 % estimates, residuals, & Jacobian
yhat2 = sheep82count-resid2;                     % Computes predicted counts
plot(sheep82year,sheep82count,'ko',sheep82year,yhat2);
                                                 % Plots the sheep counts vs year with
                                                 % fitted curve in overlay
xlabel('Year','fontsize',14);
ylabel('Number of Bighorn Sheep','fontsize',14);
title('Bighorn Sheep Counts (1965-1982) with Generalized Logistic Fit','fontsize',14,'fontweight','b');

Which of these parameters do you expect to be the least stable (i.e., which do you expect to have the highest variability)? To get the standard errors of these parameters in θ = (M, u, k, γ), we need to compute the Delta Method approximate variance matrix, given as:

    s²(θ̂) = V̂ar(θ̂) = (F̂'F̂)^{-1} MSE.

To compute the F-matrix, we first compute the partial derivatives of

    f_i(θ) = f(x_i, θ) = Mu / {u + (M - u) exp[-(k t_i)^γ]}

with respect to M, u, k, and γ, respectively. Doing so with repeated use of the product (quotient) rule gives:

    ∂f_i(θ)/∂M = u² {1 - exp[-(k t_i)^γ]} / {u + (M - u) exp[-(k t_i)^γ]}²,

    ∂f_i(θ)/∂u = M² exp[-(k t_i)^γ] / {u + (M - u) exp[-(k t_i)^γ]}²,

    ∂f_i(θ)/∂k = Mu(M - u) γ k^{γ-1} t_i^γ exp[-(k t_i)^γ] / {u + (M - u) exp[-(k t_i)^γ]}²,

    ∂f_i(θ)/∂γ = Mu(M - u) (k t_i)^γ exp[-(k t_i)^γ] log(k t_i) / {u + (M - u) exp[-(k t_i)^γ]}².
Plugging the least squares estimates θ̂ = (M̂, û, k̂, γ̂) into these derivatives yields the F-matrix shown below:

    F̂ = [ ∂f(t_1, θ̂)/∂M    ∂f(t_1, θ̂)/∂u    ∂f(t_1, θ̂)/∂k    ∂f(t_1, θ̂)/∂γ  ]
        [ ∂f(t_2, θ̂)/∂M    ∂f(t_2, θ̂)/∂u    ∂f(t_2, θ̂)/∂k    ∂f(t_2, θ̂)/∂γ  ]
        [        ⋮                 ⋮                 ⋮                 ⋮       ]
        [ ∂f(t_17, θ̂)/∂M   ∂f(t_17, θ̂)/∂u   ∂f(t_17, θ̂)/∂k   ∂f(t_17, θ̂)/∂γ ]

where the actual numeric values were computed in MatLab. Finally, computing MSE (F̂'F̂)^{-1} to get the variance-covariance matrix of the parameter estimates, extracting the diagonal entries of this 4x4 matrix, and taking square roots yields the standard errors of the parameter estimates. The MatLab code to do this is given below:

% ============================================== %
% Computes the Delta Method parameter estimate   %
% standard errors for a 4-parameter generalized  %
% logistic model                                 %
% ============================================== %
n = length(sheep82year);                         % Sample size
p = length(beta2);                               % Number of model parameters
mse = sum(resid2.^2)/(n-p);                      % Computes the mean squared error (MSE)
Mhat = betahat2(1); uhat = betahat2(2);
khat = betahat2(3); ghat = betahat2(4);          % Renames the four parameter estimates
den = (uhat+(Mhat-uhat)*exp(-(khat*time).^ghat)).^2;
                                                 % Denominator of first-order derivatives
dfdm = uhat^2*(1-exp(-(khat*time).^ghat))./den;  % Computes df/dM (nx1 vector)
dfdu = Mhat^2*exp(-(khat*time).^ghat)./den;      % Computes df/du (nx1 vector)
dfdk = Mhat*uhat*(Mhat-uhat)*(time.^ghat)*ghat*khat^(ghat-1).*exp(-(khat*time).^ghat)./den;
                                                 % Computes df/dk (nx1 vector)
dfdg = Mhat*uhat*(Mhat-uhat)*((khat*time).^ghat).*log(khat*time).*exp(-(khat*time).^ghat)./den;
                                                 % Computes df/dg (nx1 vector)
Fhat = [dfdm dfdu dfdk dfdg];                    % Combines derivatives into nx4 matrix
varmat = inv(Fhat'*Fhat)*mse;                    % Computes inv(F'F)*MSE, the estimated
                                                 % variance-covariance matrix
separ = sqrt(diag(varmat));                      % Standard errors are square roots of the
                                                 % diagonal elements of varmat
tstar = tinv(1-.025/p,n-p);                      % Computes Bonferroni t-value
ci = [betahat2-tstar*separ betahat2+tstar*separ];
                                                 % Computes Bonferroni CIs for the four parameters

This provides approximate Bonferroni-adjusted simultaneous 95% confidence intervals for the four parameters (using t_{1-.025/4}(n - p) = t_{.99375}(17 - 4) = tinv(.99375,13) = 2.8961) as:

    M̂ ± t_{.99375}(n - p) SE(M̂) = M̂ ± 2.8961(4.178)  = (195.54, 219.74)
    û ± t_{.99375}(n - p) SE(û) = û ± 2.8961(7.166)  = (47.22, 88.73)
    k̂ ± t_{.99375}(n - p) SE(k̂) = k̂ ± 2.8961(0.0091) = (0.1116, 0.1645)
    γ̂ ± t_{.99375}(n - p) SE(γ̂) = γ̂ ± 2.8961(0.8093) = (1.11, 5.80)

In MatLab, the Delta Method of computing standard errors for parameter estimates is the default technique
and can be easily obtained after calling the nlinfit function, by typing:

ci2 = nlparci(betahat2,resid2,j2)                % Computes NLS Delta Method CIs

This returns approximate 95% individual confidence intervals for the four model parameters based on the Delta Method standard errors. We will next explore the use of bootstrapping as an alternative to using Delta Method-based standard errors. Both techniques have their problems, and both consider parameter uncertainty for each parameter separately rather than jointly. Later, we will briefly explore the use of Markov chain Monte Carlo (MCMC) methods, which account for the dependence among model parameters in measuring uncertainty.
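As a preview of the bootstrap idea, the sketch below resamples residuals and refits the generalized logistic model to each resampled data set. This is only an illustrative residual-bootstrap sketch, not necessarily the particular bootstrap scheme developed later in the course.

B = 1000;                                        % Number of bootstrap resamples
nobs = length(resid2);                           % Number of observations
bootest = zeros(B,4);                            % Storage for bootstrap estimates
for b = 1:B
    idx = randi(nobs,nobs,1);                    % Resample residual indices with replacement
    ystar = yhat2 + resid2(idx);                 % Bootstrap response: fitted values plus
                                                 % resampled residuals
    bhat = nlinfit(time,ystar,@genlogistic,betahat2);
                                                 % Refit, starting from the original estimates
    bootest(b,:) = bhat(:)';                     % Store the estimates as a row
end
separ_boot = std(bootest)';                      % Bootstrap standard errors for (M, u, k, gamma)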
