Fitting Subject-specific Curves to Grouped Longitudinal Data
Djeundje, Viani. Heriot-Watt University, Department of Actuarial Mathematics & Statistics, Edinburgh, EH14 4AS, UK
Currie, Iain. Heriot-Watt University, Department of Actuarial Mathematics & Statistics, Edinburgh, EH14 4AS, UK

1 A motivating example

In longitudinal studies, a response variable measured over time may vary across subgroups. A typical example is illustrated by the first three panels of Figure 1; these display simulated growth data (based on the model in Durban et al., 2005) for 197 children who suffer from acute lymphoblastic leukaemia and have received three different treatments. In some cases, a parametric mixed model is sufficient to summarize this type of data. In Figure 1, however, a parametric approach does not seem appropriate, so smoothing is incorporated into the modelling process in order to extract the correct patterns from the data. In this setting, the mixed model representation of truncated polynomials is widely used, with a well-known covariance structure for the random effects. In this paper, we outline this approach, demonstrate its limitations, and describe a more appropriate approach via penalty arguments.

2 The standard model

We consider n subjects classified into m groups. We denote by g(i) the group to which subject i belongs, by $t_{ij}$ the time of the jth observation on subject i, and by $y_{ij}$ the value of the response variable on subject i at time $t_{ij}$. We suppose that the data are entered in group order and denote by $y_i$ and $t_i$ the response and time vectors for subject i. Our aim is to extract both the group and subject effects from these data in a smooth fashion. A convenient model for this purpose can be written as

(1)  $y_{ij} = S_{g(i)}(t_{ij}) + \breve S_i(t_{ij}) + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim N(0, \sigma^2)$,

where $S_{g(i)}(\cdot)$ quantifies the effect of the group to which subject i belongs, and $\breve S_i(\cdot)$ measures the departure of the ith subject from its group effect.
A common approach consists of expressing these functions in terms of truncated lines, i.e.,

(2)  $S_{g(i)}(t) = a_{g(i),0} + a_{g(i),1}t + \sum_{k=1}^{q} u_{g(i),k}(t - \tau_k)_+, \quad \breve S_i(t) = \breve a_{i,0} + \breve a_{i,1}t + \sum_{l=1}^{\breve q} \breve u_{i,l}(t - \breve\tau_l)_+$,

where $x_+ = \max\{x, 0\}$, and $\tau = \{\tau_1, \ldots, \tau_q\}$ and $\breve\tau = \{\breve\tau_1, \ldots, \breve\tau_{\breve q}\}$ are sets of internal knots at the group and subject levels respectively. If $a_{g(i)} = (a_{g(i),0}, a_{g(i),1})'$, $u_{g(i)} = (u_{g(i),1}, \ldots, u_{g(i),q})'$, $\breve a_i = (\breve a_{i,0}, \breve a_{i,1})'$ and $\breve u_i = (\breve u_{i,1}, \ldots, \breve u_{i,\breve q})'$, then model (1) can be expressed in matrix form as

(3)  $y_i = S_{g(i)}(t_i) + \breve S_i(t_i) + \varepsilon_i, \quad i = 1, \ldots, n$,

where

(4)  $S_{g(i)}(t_i) = X_i a_{g(i)} + T_i u_{g(i)}, \quad \breve S_i(t_i) = \breve X_i \breve a_i + \breve T_i \breve u_i$,
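The design blocks in (4) are straightforward to construct. The following sketch (illustrative only, not the authors' software; the helper name `truncated_line_basis` is hypothetical) builds the matrix $[1 : t_i]$ and the truncated-line block whose kth column holds $(t - \tau_k)_+$:

```python
import numpy as np

def truncated_line_basis(t, knots):
    """Build [1 : t] and the truncated-line block for one subject.

    Returns X (intercept and slope columns) and T, whose k-th column
    holds (t - tau_k)_+ for internal knot tau_k, as in equation (2).
    """
    t = np.asarray(t, dtype=float)
    X = np.column_stack([np.ones_like(t), t])                 # [1 : t]
    T = np.maximum(t[:, None] - np.asarray(knots)[None, :], 0.0)
    return X, T

# Example: five observation times, three internal knots on [0, 1].
X, T = truncated_line_basis([0.0, 0.25, 0.5, 0.75, 1.0], [0.25, 0.5, 0.75])
```

The same construction with the subject-level knots $\breve\tau$ gives $\breve T_i$; the backward basis of Scenario 2 below simply swaps the subtraction.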
and $\varepsilon_i$ is the error vector for subject i; here $X_i = \breve X_i = [1 : t_i]$, and $T_i$ and $\breve T_i$ are made up of truncated lines for subject i at the group and subject levels respectively. Two issues need to be addressed: smoothness and identifiability. A standard approach uses a rich set of knots and then deals simultaneously with smoothness and identifiability by applying the following normal constraints:

(5)  $u_{g(i)} \sim N(0, \sigma_P^2 I_q), \quad \breve a_i \sim N(0, \Sigma), \quad \breve u_i \sim N(0, \sigma_S^2 I_{\breve q})$,

where $\Sigma$ is a $2 \times 2$ symmetric positive definite matrix, and $I_q$ and $I_{\breve q}$ denote identity matrices of sizes q and $\breve q$ respectively; see Coull et al. (2001), Ruppert et al. (2003) and Durban et al. (2005). We refer to (5) as the standard covariance assumption and to the model defined by (4) and (5) as the standard model. One advantage of this model is that it can be fitted with the lme function in the R package nlme, as described by Durban et al. (2005). However, the ability of this approach to separate appropriately the two inter-connected effects (i.e., the group and subject effects) has received little attention. In a recent paper, Djeundje & Currie (2010) illustrated some of its problems. In this section we describe these problems and explain why the approach fails, especially as far as the identifiability of the group and subject effects is concerned.

First, we consider the growth data introduced earlier; this data set contains 197 children, and the number of observations per child varies from 1 to 21. We follow Durban et al. (2005) and use q = 40 and $\breve q$ = 10 knots at the treatment and subject levels. To illustrate our point, we consider two scenarios:

Scenario 1: We use the representation (4); we refer to this representation as the forward basis.

Scenario 2: We consider the representation (4), but replace the basis functions $(t - \tau_k)_+$ and $(t - \breve\tau_l)_+$ by $(\tau_k - t)_+$ and $(\breve\tau_l - t)_+$ respectively; we refer to this representation as the backward basis.
The fitted subject means for the children under both these scenarios (obtained by adding the fitted child effects to their group effects) are nearly identical, and sit appropriately on the data. It is tempting to suppose that this goodness of fit at the global level implies that the fitted group and subject effects are appropriately extracted. However, the lower right panel in Figure 1 shows a fitted group effect under these two scenarios; we see that this fitted group effect, together with the corresponding confidence bands, depends on whether the forward or backward basis is used.

Second, we perform a simulation exercise to clarify this point further. For this purpose we consider n = 60 subjects divided into m = 3 groups of sizes $n_1 = 20$, $n_2 = 15$ and $n_3 = 25$. We consider the special case of balanced data with M = 25 observations made on each subject at equally-spaced time points on [0, 1]; we denote this common time vector by t. We generate the response data for a single simulation as follows:

- We define the true group effects by the quadratic functions $S_k(t) = (t - k/4)^2$, k = 1, 2, 3.
- We define the true subject curves by $\breve S_i(t) = A_i \sin(\phi_i t + \varphi_i)$, where $A_i \sim N(0, \sigma_A^2)$ and $\phi_i, \varphi_i \sim U[0, \theta]$, i = 1, ..., n. The results presented in this paper use $(\sigma_A^2, \theta) = (0.25^2, 2\pi)$, but we note that large values of $\theta$ increase the flexibility of these subject curves.
- Finally, we simulate response data according to model (1) with a fixed error variance $\sigma^2$.

An illustration of such simulated data for group 1 is shown in the upper left panel of Figure 2. We want to evaluate the standard model in terms of its ability to recover the true underlying group curves, as well as the behaviour of the related confidence bands. The upper right panel of Figure 2 displays the fitted group effect corresponding to the data in the left panel. Specifically, this graphic shows, for the effect of group 1: (i) the true effect (dashed black line), (ii) the fitted effect
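The simulation scheme above can be sketched as follows. This is an illustrative reconstruction, not the authors' code; in particular the error standard deviation `sigma` is an assumed placeholder, since the simulated value of $\sigma^2$ is not stated here.

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = [20, 15, 25]                        # group sizes n1, n2, n3
M = 25
t = np.linspace(0.0, 1.0, M)                # common equally-spaced time vector
sigma_A2, theta = 0.25**2, 2 * np.pi        # (sigma_A^2, theta) from the text
sigma = 0.1                                 # assumed error SD (placeholder)

groups = np.repeat([1, 2, 3], sizes)        # group label for each subject
Y = np.empty((groups.size, M))
for i, k in enumerate(groups):
    group_effect = (t - k / 4.0) ** 2                   # true S_k(t)
    A = rng.normal(0.0, np.sqrt(sigma_A2))
    phi, varphi = rng.uniform(0.0, theta, size=2)
    subject_curve = A * np.sin(phi * t + varphi)        # true subject departure
    Y[i] = group_effect + subject_curve + rng.normal(0.0, sigma, size=M)
```

Each row of `Y` is one subject's response vector; the first 20 rows correspond to group 1, matching the upper left panel of Figure 2.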
using the forward basis (red), (iii) the fitted effect using the backward basis (green), and (iv) the fitted effect using the penalty approach (blue, described in the next section). The bias in the fitted effect and the associated confidence bands arising from the standard model (with either the forward or the backward basis) is worrying.

We now evaluate this approach over a set of simulations as follows. We perform and store N = 100 simulations. For each simulation r, $1 \le r \le N$, we:

- fit the standard model and estimate the group effect $S_k(t)$ as $\hat S_k^{(r)}(t)$, k = 1, 2, 3,
- compute the standard deviation $SD_k^{(r)}(t)$ about $\hat S_k^{(r)}(t)$, k = 1, 2, 3,
- compute the average mean square error as $\mathrm{MSE}^{(r)} = \sum_{k=1}^{m} \|\hat S_k^{(r)}(t) - S_k(t)\|^2 / (mM)$, with m = 3 and M = 25.

Finally, we compute the mean standard deviation as $\overline{SD}_k(t) = \frac{1}{N} \sum_{r=1}^{N} SD_k^{(r)}(t)$, k = 1, 2, 3.

In the lower left panel of Figure 2, the red and green dots show the $\mathrm{MSE}^{(r)}$ obtained under the forward and backward bases respectively, while the blue dots refer to the penalty approach (presented in the next section). The lower right panel shows the mean standard deviation $\overline{SD}_k(t)$ with the same colour specification (each group is denoted by a different line style). We note that there is a close connection between the upper and lower right panels: the upper panel shows the widening fan effect of the confidence bands (for group one) arising from fitting the standard model (with the forward or backward basis) to a single simulated sample. This widening suggests that the standard deviation will increase or decrease with time, depending on the basis orientation. The lower panel does indeed confirm this widening (increasing or decreasing) under the standard model. We have conducted an extensive investigation into the effect of different values of the parameters $(\sigma_A^2, \theta, \sigma^2)$.
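Given stored fits, the two summary quantities above reduce to simple array reductions. The sketch below assumes the N fitted group curves and pointwise standard deviations are held in arrays of shape (N, m, M); the function names are hypothetical.

```python
import numpy as np

def mse_per_simulation(S_hat, S_true):
    """MSE^(r) = sum_k ||S_hat_k^(r) - S_k||^2 / (m*M), one value per simulation.

    S_hat has shape (N, m, M); S_true has shape (m, M).
    """
    N, m, M = S_hat.shape
    return ((S_hat - S_true[None]) ** 2).sum(axis=(1, 2)) / (m * M)

def mean_sd(SD):
    """Average the pointwise standard-deviation curves over the N simulations.

    SD has shape (N, m, M); returns the m mean curves, shape (m, M).
    """
    return SD.mean(axis=0)
```

`mse_per_simulation` yields the N dots of the lower left panel; `mean_sd` yields the three curves of the lower right panel.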
One important finding was that the discrepancy between the fitted effects and the true underlying effects under the standard model increases as the underlying subject curves become more flexible, i.e., as $\theta$ increases; with small $\theta$ the subject curves are close to linear and the bias is relatively small.

The reason for the bias with the standard model is the covariance specification (5). In (5), the parameters $\sigma_P^2$ and $\sigma_S^2$ determine the smoothness of the group and subject effects respectively; the intention is that their identification be controlled by the covariance component $\Sigma$. However, $\Sigma$ can only separate the linear components of these two effects. There are two cases. First, if the subject effects are not too flexible (i.e., close to linear), then the variance parameter $\sigma_S^2$ will tend to be small and the standard model can successfully recover the two effects. Second, if the subject effects are sufficiently non-linear (as in the growth data example or in the simulated data in the upper left panel of Figure 2), then the non-linear components $T_i u_{g(i)}$ and $\breve T_i \breve u_i$ in (4) become substantial; there is nothing in the standard covariance (5) that enables the appropriate separation of the two effects. In the next section, we derive a more appropriate covariance structure via a penalty argument; this provides a solution to the problem.

3 Penalty approach

From now on, we express the group and subject effects in terms of B-splines, and we smooth using the P-spline system of Eilers & Marx (1996). A more general approach which includes both B-splines and/or truncated polynomials is described in Djeundje & Currie (2010). In terms of B-spline bases, the components of model (1) can be expressed in matrix form as

(6)  $S_{g(i)}(t_i) = B_i \alpha_{g(i)}, \quad \breve S_i(t_i) = \breve B_i \breve\alpha_i$,
where $B_i$ and $\breve B_i$ are B-spline bases for subject i at the group and subject levels, and $\alpha_{g(i)}$ and $\breve\alpha_i$ are regression coefficients. Following Eilers & Marx (1996), we achieve smoothness by penalizing the roughness of the B-spline coefficients at both levels and, for identifiability, we follow Djeundje & Currie (2010) and shrink the subjects' B-spline coefficients towards 0. This yields the following constraints:

(7)  $\|D \alpha_{g(i)}\|^2 < \rho, \quad \|\breve D \breve\alpha_i\|^2 < \rho_1, \quad \|\breve\alpha_i\|^2 < \rho_2$,

for some well chosen constants $(\rho, \rho_1, \rho_2)$; here and below, $D$ and $\breve D$ represent second-order difference operators of appropriate size. Using Lagrange arguments, the penalized residual sum of squares, PRSS, of (1) & (6) under the constraints (7) can be expressed as

(8)  $\mathrm{PRSS} = \sum_{i=1}^{n} \|y_i - B_i \alpha_{g(i)} - \breve B_i \breve\alpha_i\|^2 + \sum_{k=1}^{m} \alpha_k' P_k \alpha_k + \sum_{i=1}^{n} \breve\alpha_i' \breve P_i \breve\alpha_i$,

where

(9)  $P_k = \lambda D'D, \quad \breve P_i = \breve\lambda \breve D'\breve D + \delta I_{\breve c}$

represent the penalty matrices at the group and subject levels respectively, and $\breve c$ is the number of columns in $\breve B_i$. In these expressions, $\lambda$ and $\breve\lambda$ are the smoothing parameters at the group and subject levels, while $\delta$ is the shrinkage/identifiability parameter; these three parameters play (inversely) roles equivalent to those of $\rho$, $\rho_1$ and $\rho_2$ in (7). We can now fit the model in the least squares sense by minimizing the PRSS with respect to the regression coefficients, and use criteria such as AIC, BIC or GCV to estimate the smoothing/identifiability parameters, as described in Djeundje & Currie (2010). Nowadays, smooth models are often expressed as mixed models, since estimates of the variance/smoothing parameters obtained from the mixed model representation and restricted likelihood tend to behave well (Reiss and Ogden, 2009). Further, our simulation study uses this mixed model structure.
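Minimizing the PRSS is ordinary penalized least squares: with the coefficients stacked into one vector, the minimizer solves a single linear system. The sketch below shows the two building blocks of (9) for a single coefficient vector; it is a simplified illustration under assumed names, not the authors' software.

```python
import numpy as np

def second_diff_penalty(c, lam):
    """Return lam * D'D, where D is the second-order difference matrix (c-2, c)."""
    D = np.diff(np.eye(c), n=2, axis=0)
    return lam * (D.T @ D)

def fit_penalized(B, y, P):
    """Minimize ||y - B a||^2 + a' P a; the minimizer is a = (B'B + P)^{-1} B'y."""
    return np.linalg.solve(B.T @ B + P, B.T @ y)
```

For the full model (8), `B` would be the stacked design holding the $B_i$ and $\breve B_i$ blocks and `P` the block-diagonal matrix of the $P_k$ and $\breve P_i$; the ridge term $\delta I_{\breve c}$ in $\breve P_i$ is what shrinks the subject coefficients towards zero and restores identifiability.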
Using the singular value decomposition of the penalty component $D'D$, we re-parameterize the PRSS as

(10)  $\mathrm{PRSS} = \sum_{i=1}^{n} \|y_i - X_i \beta_{g(i)} - Z_i b_{g(i)} - \breve B_i \breve\alpha_i\|^2 + \lambda \sum_{k=1}^{m} b_k' b_k + \sum_{i=1}^{n} \breve\alpha_i' \breve P_i \breve\alpha_i$,

with matrices $X_i$ and $Z_i$ defined as

(11)  $X_i = [1 : t_i], \quad Z_i = B_i U_i \Lambda_i^{-1/2}$;

here $\Lambda_i$ is the diagonal matrix formed by the positive eigenvalues of $D'D$, and $U_i$ is the matrix whose columns are the corresponding eigenvectors. Setting $y = \mathrm{vec}(y_1, \ldots, y_n)$, $\beta = \mathrm{vec}(\beta_1, \ldots, \beta_m)$ and $u = \mathrm{vec}(b_1, \ldots, b_m, \breve\alpha_1, \ldots, \breve\alpha_n)$, we can show that the minimization of the PRSS in (10) with respect to the regression coefficients corresponds to the maximization of the log-likelihood of $(y, u)$ arising from the mixed model representation

(12)  $y \mid u \sim N(X\beta + Zu, \sigma^2 I), \quad u \sim N(0, \sigma^2 P^{-1})$,

where

(13)  $P = \mathrm{blockdiag}(\underbrace{\lambda I_{c-2}, \ldots, \lambda I_{c-2}}_{m\ \text{times}}, \breve P_1, \ldots, \breve P_n)$

is the full penalty matrix, $\beta$ is the fixed effect, $u$ is the random effect, and $X$ and $Z$ are appropriate regression matrices; here c is the number of columns in $B_i$. We note that in the balanced case (as in our simulation) we can write $X_i = X_g$, $Z_i = Z_g$ and $\breve B_i = \breve B$, $i = 1, \ldots, n$, and then in (12) we have $X = \mathrm{blockdiag}(1_{n_1} \otimes X_g, \ldots, 1_{n_m} \otimes X_g)$ and $Z = [\mathrm{blockdiag}(1_{n_1} \otimes Z_g, \ldots, 1_{n_m} \otimes Z_g) : I_n \otimes \breve B]$.
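The mechanics of the re-parameterization in (10)-(11) can be sketched numerically: eigen-decompose the penalty component $D'D$, keep the positive eigenvalues, and rescale the basis. In this illustrative sketch a polynomial basis stands in for the B-spline basis $B_i$, so it shows the transformation only, not the paper's actual fit.

```python
import numpy as np

c, lam = 8, 1.0
D = np.diff(np.eye(c), n=2, axis=0)     # second-order difference operator, (c-2, c)
PtP = D.T @ D                           # penalty component D'D

# Eigen-decomposition: c-2 positive eigenvalues, 2 (numerically) zero ones.
w, U = np.linalg.eigh(PtP)
pos = w > 1e-10
Lam, U_pos = w[pos], U[:, pos]

t = np.linspace(0.0, 1.0, 30)
B = np.vander(t, c, increasing=True)    # polynomial stand-in for the B-spline basis

X = np.column_stack([np.ones_like(t), t])   # fixed part [1 : t]
Z = B @ U_pos @ np.diag(Lam ** -0.5)        # random part; penalty reduces to lam * b'b
```

In the transformed coordinates the roughness penalty on the $c-2$ random coefficients becomes the identity scaled by $\lambda$, which is exactly the $\lambda I_{c-2}$ block appearing in (13).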
An illustration of the effectiveness of the penalty approach fitted via the mixed model representation is shown in Figure 2: (a) the top right panel shows the fitted group effect (blue lines); (b) the lower left panel compares the $\mathrm{MSE}^{(r)}$ of the penalty approach (blue dots) with the $\mathrm{MSE}^{(r)}$ of the standard approach; (c) the lower right panel compares the mean standard deviation for the three groups under the two approaches. These graphics demonstrate that the penalty approach is successful in removing both the bias in the estimates and the fanning effect in the confidence intervals seen with the standard model (at least in the cases discussed here).

4 Conclusion

In this short paper we have emphasized the importance of dealing appropriately with both smoothness and identifiability in the analysis of grouped longitudinal data. We have described a penalty approach which allows us to address both issues directly. The penalty approach yields a model which can be reformulated as a mixed model, and we have demonstrated that the estimates of the group effects from this model are free both of bias and of fanning of the associated confidence intervals. A fuller description of this work is in preparation, together with software to implement our approach.

Figure 1: Graphics related to the height measurements of 197 children suffering from acute lymphoblastic leukaemia and receiving three different treatments. The first three panels show the data for Treatments 1, 2 and 3. The last panel shows the fitted treatment effect (for group one) from the standard approach with the forward basis (red) and the backward basis (green).

REFERENCES

Coull, B. A., Ruppert, D. and Wand, M. P. (2001). Simple incorporation of interactions into additive models. Biometrics, 57.
Figure 2: Some graphics related to the simulated data. Top left: a sample of the simulated data. Top right: the fitted group effect for the data in the top left panel. Lower left: comparison of the mean square error. Lower right: comparison of the standard deviation; each group is denoted by a different line style.

Djeundje, V. A. B. and Currie, I. D. (2010). Appropriate covariance-specification via penalties for penalized splines in mixed models for longitudinal data. Electronic Journal of Statistics, 4.

Durban, M., Harezlak, J., Wand, M. P. and Carroll, R. J. (2005). Simple fitting of subject-specific curves for longitudinal data. Statistics in Medicine, 24.

Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11.

Reiss, P. T. and Ogden, R. T. (2009). Smoothing parameter selection for a class of semiparametric linear models. Journal of the Royal Statistical Society, Series B, 71.

Ruppert, D., Wand, M. P. and Carroll, R. J. (2003). Semiparametric Regression. Cambridge: Cambridge University Press.

ABSTRACT

The fitting of subject-specific curves (also known as factor-by-curve interactions) is important in the analysis of longitudinal data. One approach is to use a mixed model with the curves described in terms of truncated lines and the randomness expressed in terms of normal distributions. This approach gives simple fitting with standard mixed model software. We show that these normal assumptions can lead to biased estimates of, and inappropriate confidence intervals for, the population effects. We describe an alternative approach which uses a penalty argument to derive an appropriate covariance structure for a mixed model for such data. The effectiveness of our methods is demonstrated with the analysis of some growth curve data and a simulation study.
Multivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
State of Stress at Point
State of Stress at Point Einstein Notation The basic idea of Einstein notation is that a covector and a vector can form a scalar: This is typically written as an explicit sum: According to this convention,
v w is orthogonal to both v and w. the three vectors v, w and v w form a right-handed set of vectors.
3. Cross product Definition 3.1. Let v and w be two vectors in R 3. The cross product of v and w, denoted v w, is the vector defined as follows: the length of v w is the area of the parallelogram with
Several Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
Introduction to Fixed Effects Methods
Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed
Multiple group discriminant analysis: Robustness and error rate
Institut f. Statistik u. Wahrscheinlichkeitstheorie Multiple group discriminant analysis: Robustness and error rate P. Filzmoser, K. Joossens, and C. Croux Forschungsbericht CS-006- Jänner 006 040 Wien,
MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
ASEN 3112 - Structures. MDOF Dynamic Systems. ASEN 3112 Lecture 1 Slide 1
19 MDOF Dynamic Systems ASEN 3112 Lecture 1 Slide 1 A Two-DOF Mass-Spring-Dashpot Dynamic System Consider the lumped-parameter, mass-spring-dashpot dynamic system shown in the Figure. It has two point
Inner products on R n, and more
Inner products on R n, and more Peyam Ryan Tabrizian Friday, April 12th, 2013 1 Introduction You might be wondering: Are there inner products on R n that are not the usual dot product x y = x 1 y 1 + +
PEST - Beyond Basic Model Calibration. Presented by Jon Traum
PEST - Beyond Basic Model Calibration Presented by Jon Traum Purpose of Presentation Present advance techniques available in PEST for model calibration High level overview Inspire more people to use PEST!
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
Nonlinear Iterative Partial Least Squares Method
Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for
On Marginal Effects in Semiparametric Censored Regression Models
On Marginal Effects in Semiparametric Censored Regression Models Bo E. Honoré September 3, 2008 Introduction It is often argued that estimation of semiparametric censored regression models such as the
Average Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation
Average Redistributional Effects IFAI/IZA Conference on Labor Market Policy Evaluation Geert Ridder, Department of Economics, University of Southern California. October 10, 2006 1 Motivation Most papers
Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software
STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used
Syntax Menu Description Options Remarks and examples Stored results Methods and formulas References Also see. Description
Title stata.com lpoly Kernel-weighted local polynomial smoothing Syntax Menu Description Options Remarks and examples Stored results Methods and formulas References Also see Syntax lpoly yvar xvar [ if
1 Teaching notes on GMM 1.
Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in
Econometric Methods for Panel Data
Based on the books by Baltagi: Econometric Analysis of Panel Data and by Hsiao: Analysis of Panel Data Robert M. Kunst [email protected] University of Vienna and Institute for Advanced Studies
Nonlinear Regression:
Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference
Scatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
Forecasting methods applied to engineering management
Forecasting methods applied to engineering management Áron Szász-Gábor Abstract. This paper presents arguments for the usefulness of a simple forecasting application package for sustaining operational
Towards Online Recognition of Handwritten Mathematics
Towards Online Recognition of Handwritten Mathematics Vadim Mazalov, joint work with Oleg Golubitsky and Stephen M. Watt Ontario Research Centre for Computer Algebra Department of Computer Science Western
A short course in Longitudinal Data Analysis ESRC Research Methods and Short Course Material for Practicals with the joiner package.
A short course in Longitudinal Data Analysis ESRC Research Methods and Short Course Material for Practicals with the joiner package. Lab 2 - June, 2008 1 jointdata objects To analyse longitudinal data
THE DYING FIBONACCI TREE. 1. Introduction. Consider a tree with two types of nodes, say A and B, and the following properties:
THE DYING FIBONACCI TREE BERNHARD GITTENBERGER 1. Introduction Consider a tree with two types of nodes, say A and B, and the following properties: 1. Let the root be of type A.. Each node of type A produces
Trend and Seasonal Components
Chapter 2 Trend and Seasonal Components If the plot of a TS reveals an increase of the seasonal and noise fluctuations with the level of the process then some transformation may be necessary before doing
Topic 5: Stochastic Growth and Real Business Cycles
Topic 5: Stochastic Growth and Real Business Cycles Yulei Luo SEF of HKU October 1, 2015 Luo, Y. (SEF of HKU) Macro Theory October 1, 2015 1 / 45 Lag Operators The lag operator (L) is de ned as Similar
Java Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop
Music and Machine Learning (IFT6080 Winter 08) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher
Comparison of Estimation Methods for Complex Survey Data Analysis
Comparison of Estimation Methods for Complex Survey Data Analysis Tihomir Asparouhov 1 Muthen & Muthen Bengt Muthen 2 UCLA 1 Tihomir Asparouhov, Muthen & Muthen, 3463 Stoner Ave. Los Angeles, CA 90066.
Notes on Symmetric Matrices
CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.
