# A Semi-parametric Approach for Decomposition of Absorption Spectra in the Presence of Unknown Components


Payman Sadegh (1,2), Henrik Aalborg Nielsen (1), and Henrik Madsen (1)

**Abstract.** Decomposition of absorption spectra using linear regression has been proposed for calculating concentrations of mixture compounds. The method is based on projecting the observed mixture spectrum onto the linear space generated by the reference spectra that correspond to the individual components comprising the mixture. The computed coefficients are then used as estimates for the concentrations of the components that comprise the mixture. The existence of unknown components in the mixture, however, introduces bias on the obtained concentration estimates. We extend the usual linear regression model to an additive semi-parametric model to take the unknown component into account, estimate the absorption profile of the unknown component, and obtain concentration estimates of the known compounds. A standard back-fitting method as well as a mean weighted least squares criterion are applied. The techniques are illustrated on simulated absorption spectra.

**Keywords:** parameter estimation, non-parametric methods, unbiased estimates, chemometry, absorption spectra, additive models, semi-parametric models, mean weighted least squares, back-fitting.

(1) Department of Mathematical Modelling, Technical University of Denmark, DK-2800 Lyngby, Denmark

## 1 Introduction

Chemometric spectroscopy is a simple way to examine gases and liquids. UV examination of wastewater, for instance, has been proposed for quality control purposes (Thomas, Theraulaz & Suryani 1996). The technique is based on the analysis of the absorption spectrum obtained from the sample of interest. Depending on the concentrations of the comprising compounds, the spectral absorption of the mixture varies at different wavelengths. This information may in principle be used to infer the concentration of an existing compound, given information about the absorption pattern of the compound of interest.

The functional dependency of absorption spectra upon the concentrations and absorption spectra of the comprising compounds is in general unknown. Several simple models have been proposed, most notably a linear regression model (Gallot & Thomas 1993). In this model, it is assumed that the absorption spectrum of a mixture is a linear combination of the absorption spectra of the comprising compounds, where each coefficient determines the concentration of the corresponding compound. Hence, if spectral absorption measurements are performed at a minimum of $p$ wavelengths, where $p$ is the number of existing compounds, the concentrations may be estimated by the least squares technique (Gallot & Thomas 1993).

The technique fails when unknown compounds are present. Even though it is not of interest to estimate the concentration of the unknown components, their presence will introduce bias on the concentration estimates for the known ones, unless the spectrum of the unknown component is orthogonal to the spectra of the known ones. Such a situation is unlikely to occur for most decomposition problems of interest in chemometry.

We propose a semi-parametric model to account for the presence of unknown compounds. We apply both a standard back-fitting method for estimation in additive models and a novel technique based on a mean weighted least squares (MWLS) criterion. MWLS provides a promising and easy way to embed prior information into kernel estimation schemes. While kernel estimation of a function at $N$ data points may be regarded as $N$ independent weighted least squares problems, MWLS combines the $N$ optimization problems into one. The advantage is that any global information about the behavior of the process may be imposed as hard or soft constraints in the resulting single optimization criterion.

The rest of the paper is organized as follows. In Section 2, we present the semi-parametric formulation of the spectral absorption decomposition problem, review techniques from the theory of generalized additive models, and introduce the MWLS technique. In Section 3, we present a simple numerical study based on simulated data, and finally, Section 4 presents concluding remarks.

## 2 Problem formulation

Consider the problem of decomposing the observed function $f(t)$, $t \in T$, into known functions $f_i(t)$, $i = 1, \ldots, p$, by estimating the parameter $\theta = [\theta_1, \ldots, \theta_p]'$ of the linear regression

$$f(t) = \sum_{i=1}^{p} \theta_i f_i(t) + R(t) + e(t), \qquad (1)$$

where $R(t)$ accounts for the superposition of all unknown components comprising $f(t)$, and $\{e(t)\}$ is a sequence of zero-mean independently distributed random variables representing the measurement noise. Disregarding $R(t)$, the usual least squares estimate of $\theta$ is given by

$$\arg\min_{\theta} \sum_{t \in T} \Big[ f(t) - \sum_{i=1}^{p} \theta_i f_i(t) \Big]^2, \qquad (2)$$

where $T = \{t_1, \ldots, t_N\}$ is the set of $N$ sampled observations. Inserting $f(t)$ from (1) in the solution to (2) shows that the bias on the least squares estimate is given by

$$(X'X)^{-1} X' \begin{bmatrix} R(t_1) \\ \vdots \\ R(t_N) \end{bmatrix}, \quad \text{where} \quad X = \begin{bmatrix} f_1(t_1) & \cdots & f_p(t_1) \\ f_1(t_2) & \cdots & f_p(t_2) \\ \vdots & & \vdots \\ f_1(t_N) & \cdots & f_p(t_N) \end{bmatrix}.$$

Hence, depending on $R(t)$, the bias might be arbitrarily large. For chemometric data, the function $f(t)$ is the measured absorption spectrum at the various wavelengths $t$, and $f_i(t)$ is the known absorption spectrum for component $i$. The presence of unknown components introduces the remainder term $R(t)$, which should be estimated from data simultaneously with the coefficients $\theta_i$.

The back-fitting algorithm (Hastie & Tibshirani 1990) can be applied under such circumstances. Back-fitting is an iterative method for decomposition of a number of unknown functions in an additive model. Starting from an initial guess, the algorithm iteratively estimates each one of the functions, fixing all others at their corresponding latest updated values. The algorithm has an explicit solution for problems of the type (1); see (Hastie & Tibshirani 1990), page 118. The solution involves a smoother function for estimation of $R(\cdot)$ and a weighted least squares criterion to estimate $\theta$. Only in the case where the smoother is a spline smoother can back-fitting be explicitly related to an optimization criterion (Hastie & Tibshirani 1990).

Another approach, which is novel to the best of our knowledge, is based on a mean weighted least squares criterion. The approach is particularly appealing since its solution is explicitly related to an optimization criterion, which is a missing element for back-fitting using smoothers other than splines. The MWLS approach is as follows. Consider the optimization

$$\min_{\theta,\, \{\phi(\tau)\}} \sum_{\tau \in T} \sum_{t \in T} w_h(|t-\tau|) \Big[ f(t) - \sum_{i=1}^{p} \theta_i f_i(t) - q(t-\tau; \phi(\tau)) \Big]^2, \qquad (3)$$

where $q(t-\tau; \phi(\tau))$ and $\phi(\tau)$ respectively denote a local approximation to $R(\cdot)$ around $\tau$ and its corresponding ($\tau$-dependent) parameter, and $\{w_h(|d|)\}$ is a weight sequence that falls monotonically with $|d|$. One typical choice for $q(t-\tau; \phi(\tau))$ is a low-order polynomial in $t-\tau$. The weight sequence is obtained from a kernel $K_h(|d|)$ according to

$$w_h(|d|) = \frac{K_h(|d|)}{\sum K_h(|d|)}.$$

Some typical selections for the kernel $K_h(|d|)$ are the Gaussian kernel

$$K_h(|d|) = \frac{1}{h\sqrt{2\pi}} \exp\!\Big( -\frac{d^2}{2h^2} \Big),$$

and the Epanechnikov kernel

$$K_h(|d|) = \frac{3}{4h} \Big( 1 - \frac{d^2}{h^2} \Big) I(|d| \le h),$$

where $I(|d| \le h) = 1$ if $|d| \le h$ and zero otherwise.
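The kernel weights just defined are straightforward to compute. Below is a minimal sketch in Python with NumPy (our illustration, not code from the paper), normalizing the kernel over the sampled wavelengths so that the weights at each centre $\tau$ sum to one:

```python
import numpy as np

def gaussian_kernel(d, h):
    # K_h(|d|) = exp(-d^2 / (2 h^2)) / (h * sqrt(2 * pi))
    return np.exp(-d**2 / (2.0 * h**2)) / (h * np.sqrt(2.0 * np.pi))

def epanechnikov_kernel(d, h):
    # K_h(|d|) = 3/(4h) * (1 - d^2 / h^2) for |d| <= h, zero otherwise
    return np.where(np.abs(d) <= h, 0.75 / h * (1.0 - d**2 / h**2), 0.0)

def weights(tau, t, h, kernel=gaussian_kernel):
    # Normalized weights w_h(|t - tau|) over the sampled wavelengths t,
    # so that the weights sum to one for each centre tau.
    k = kernel(t - tau, h)
    return k / k.sum()

# Example: weights centred at tau = 5 on a grid of 101 sampled wavelengths
t = np.linspace(0.0, 10.0, 101)
w = weights(5.0, t, h=1.0)
```

Either kernel can be passed to `weights`; the Epanechnikov kernel simply assigns zero weight outside the bandwidth window.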

The criterion (3) may be explained as follows. The optimization problem obtained by considering the inner sum in (3) as the cost function, i.e.

$$\min_{\theta,\, \phi(\tau)} \sum_{t \in T} w_h(|t-\tau|) \Big[ f(t) - \sum_{i=1}^{p} \theta_i f_i(t) - q(t-\tau; \phi(\tau)) \Big]^2, \qquad (4)$$

provides the usual weighted least squares problem for non-parametric estimation of $R(\tau)$, $\tau \in T$, based on the local approximation $R(t) \approx q(t-\tau; \phi(\tau))$ around $\tau$. Hence a non-parametric estimate for $R(\tau)$ is obtained by inserting the optimal value of $\phi(\tau)$ in $q(0; \phi(\tau))$. In connection with estimating $\theta$, on the other hand, (4) is of no immediate use since the obtained estimates of $\theta$ vary with $\tau$. This dependency is eliminated in (3) by the outer summation over $\tau$, which may be thought of as restricting the solutions of the independent optimization problems (4) to yield a common estimate for $\theta$.

Now assume that the local approximation $q(t-\tau; \phi(\tau))$ is linear in $\phi(\tau)$, i.e.

$$q(t-\tau; \phi(\tau)) = \sum_{i=1}^{m} \phi_i(\tau)\, g_i(t-\tau), \qquad (5)$$

where $\phi(\tau) = [\phi_1(\tau), \ldots, \phi_m(\tau)]'$. Let $W_\tau$ denote a diagonal $N \times N$ matrix whose $(i,i)$ element equals $w_h(|t_i - \tau|)$. Further denote

$$X_\tau = \begin{bmatrix} g_1(t_1-\tau) & \cdots & g_m(t_1-\tau) \\ g_1(t_2-\tau) & \cdots & g_m(t_2-\tau) \\ \vdots & & \vdots \\ g_1(t_N-\tau) & \cdots & g_m(t_N-\tau) \end{bmatrix},$$

and finally $Y = [f(t_1), \ldots, f(t_N)]'$.

**Proposition 1.** Assume that the local approximation $q(t-\tau; \phi(\tau))$ is linear in $\phi(\tau)$ (see (5)). The optimal value of $\theta$ according to (3) is equivalent to the solution to the weighted least squares problem

$$\min_{\theta}\; (Y - X\theta)'\, W\, (Y - X\theta),$$

where

$$W = \sum_{\tau \in T} \Big( W_\tau - W_\tau X_\tau (X_\tau' W_\tau X_\tau)^{-1} X_\tau' W_\tau \Big).$$
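Proposition 1 lends itself to a direct implementation. The sketch below (Python with NumPy; synthetic Gaussian-bump spectra of our own making, not the paper's data) accumulates the weight matrix $W$ with a Gaussian kernel and second-order local polynomials, then compares the resulting estimate with ordinary least squares when a smooth unknown component $R(t)$ is present. Note that $W$ annihilates anything in the span of the local polynomial basis, so for a globally quadratic $R(t)$ the MWLS estimate is exact in this synthetic case, while plain least squares is biased by $(X'X)^{-1}X'R$:

```python
import numpy as np

def mwls_theta(Y, X, t, h=1.0, degree=2):
    """MWLS estimate of theta per Proposition 1: minimize
    (Y - X theta)' W (Y - X theta) with W accumulated over all
    centres tau (Gaussian kernel, local polynomial basis)."""
    N = len(t)
    W = np.zeros((N, N))
    for tau in t:
        d = t - tau
        w = np.exp(-d**2 / (2.0 * h**2))
        w /= w.sum()                      # normalized kernel weights
        W_tau = np.diag(w)
        # Local polynomial basis g_i(t - tau) = (t - tau)^i, i = 0..degree
        X_tau = np.vander(d, degree + 1, increasing=True)
        A = X_tau.T @ W_tau @ X_tau
        W += W_tau - W_tau @ X_tau @ np.linalg.solve(A, X_tau.T @ W_tau)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

# Synthetic example: two narrow reference spectra and a smooth
# quadratic "unknown component" R(t) (illustrative values only).
t = np.linspace(0.0, 10.0, 101)
f1 = np.exp(-0.5 * ((t - 3.0) / 0.4) ** 2)
f2 = np.exp(-0.5 * ((t - 7.0) / 0.4) ** 2)
X = np.column_stack([f1, f2])
R = 0.5 + 0.2 * t - 0.02 * t**2
theta_true = np.array([1.0, 2.0])
Y = X @ theta_true + R

theta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)  # biased by (X'X)^{-1} X'R
theta_mwls = mwls_theta(Y, X, t)                  # bias removed
```

The bandwidth and polynomial degree play the same roles as in the paper's numerical example (unit-bandwidth Gaussian kernel, second-order local approximators); with noisy data both would need care, as discussed in Section 4.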

*Proof:* Since $\phi(\tau)$ varies freely with $\tau$ in (3), $\phi(\tau)$ may simply be computed by finding the optimal value of $\phi(\tau)$ in (4) as a function of $\theta$. Inserting the optimal values for $\phi(\tau)$ in (3) and collecting terms yields the desired result.

## 3 Numerical example

In this section, we apply the techniques discussed earlier to a spectral decomposition problem. Consider two absorption spectra $f_1(t)$ and $f_2(t)$ as given in Figure 1. These spectra contain COD, TOC, TSS, and BOD with concentrations 63, 0, 15, and 15 for $f_1$ and 36, 125, 0, 115 for $f_2$. The unknown component is assumed to consist of nitrates with spectrum $R(t)$ as illustrated in Figure 2. The spectra of Figure 1 and Figure 2 are taken experimentally from real samples.

*Figure 1: Reference absorption spectra $f_1(t)$ and $f_2(t)$ (absorption versus wavelength).*

We simulate the spectrum illustrated in Figure 3 by the linear combination $f(t) = f_1(t) + f_2(t) + R(t)$. The least squares solution yields coefficient estimates 1.35 and 1.81 for $f_1$ and $f_2$, and an estimate for $R(t)$ as illustrated in Figure 2. These results clearly indicate the inappropriateness of the least squares solution.

*Figure 2: Spectrum of the unknown component. The solid, dashed, and dotted curves respectively represent the true spectrum, the estimated spectrum using the mean weighted least squares criterion, and the estimated spectrum using regular least squares.*

*Figure 3: Simulated sample $f(t)$ (absorption versus wavelength).*

We apply the result presented in Proposition 1, where the local approximators are second-order polynomials and the weights are computed according to a unit-bandwidth Gaussian kernel. The coefficients of $f_1$ and $f_2$ are estimated at 0.91 and 0.93, respectively. We further apply the back-fitting solution to the above estimation problem. The best result is obtained by applying a spline smoother with a large number of degrees of freedom, yielding estimates of 1.25 and 0.73 for the coefficients of $f_1$ and $f_2$, respectively. These estimates are noticeably more biased than the MWLS solution.

Finally, we investigate the effect of measurement noise. We simulate 25 independent samples according to $f(t) = f_1(t) + f_2(t) + R(t) + e(t)$, where $\{e(t)\}$ is a sequence of i.i.d. $N(0, \sigma^2)$ random variables, and empirically compute the mean $E\{\hat\theta\}$ and covariance $\mathrm{COV}\{\hat\theta\}$ of the concentration estimates $\hat\theta$, where $E\{\cdot\}$ and $\mathrm{COV}\{\cdot\}$ as usual denote the mean value and the covariance. Numerical experimentation indicates that the estimation procedure fails for noise variances above a certain threshold.

## 4 Discussion and conclusion

We have proposed a solution to the decomposition of absorption spectra in the presence of correlated error (e.g. due to the existence of unknown components). The underlying assumption throughout the paper is a linear additive model. We have applied both the back-fitting solution and a mean weighted least squares criterion. Numerical experience shows that back-fitting iterations fail for typical chemometric spectra due to high correlation among the data. The back-fitting method yields reasonable estimates only if the explicit end solution of the iterations, which exists for a model linear in the parameters, is applied. Contrary to back-fitting, the mean weighted least squares solution is tied not to an iterative algorithm but to a well defined optimization problem.

The mean weighted least squares solution performs reasonably well for decomposition of absorption spectra in the presence of unknown components. The solution is rather sensitive to measurement noise, which is due to high correlation among the reference spectra on the one hand, and high correlation between the unmodelled error spectrum and the reference spectra on the other.

To elaborate further, Figure 4 shows a scatter-plot matrix of the wavelength and five typical absorption spectra $f_i$, $i = 1, \ldots, 5$. The actual spectra are shown in the left column and, with the axes swapped, in the bottom row. From these it is seen that $f_2(t)$ and $f_3(t)$ are very similar; correspondingly, the plot of $f_2(t)$ against $f_3(t)$ shows an almost linear relation. Consequently, concentrations of substances corresponding to these spectra will be difficult to distinguish, i.e. the estimated concentrations will be highly correlated. For data simulated according to some arbitrary linear combination of the illustrated spectra and some typical spectrum for the unknown component $R(t)$ (simulated as a Gaussian bell curve of some bandwidth), the R-squared value is very close to one when omitting $R(t)$ from the model and replacing it with an intercept term. This indicates that the simulated spectrum $f(t)$ lies almost entirely in the space spanned by the reference spectra, making estimation of $R(t)$ difficult if $f(t)$ is measured with noise. Consequently, to reduce the bias on the concentration estimates, $R(t)$ must, to an extent determined by the noise level, lie outside the space spanned by the reference spectra.

The above considerations indicate that, if possible, the reference spectra should be chosen so that (1) they all explain different aspects of the unknown spectra (as opposed to $f_2(t)$ and $f_3(t)$ above), and (2) the unknown $R(t)$ lies, to some extent, outside the space spanned by the reference spectra. Furthermore, measurement noise should be reduced as much as possible, e.g. by performing several measurements on the sample of interest and averaging.

Finally, the mean weighted least squares criterion introduced in this paper has application potential far beyond the scope of the present paper. The approach provides a simple yet powerful tool for embedding global information about the process of interest into local estimation techniques, hence combining the noise reduction and interpretability of global models with the generality, minimal model reliance, and convenience of non-parametric methods. Contrast this with the usual ways of embedding prior information in non-parametric methods, which concern local or smoothness properties such as the selection of a suitable kernel, bandwidth, and degree of the local approximators. This poses an interesting direction for future research and forthcoming publications.

*Figure 4: Scatter-plot matrix of the wavelength $t$ and reference spectra $f_1(t), \ldots, f_5(t)$.*
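As one illustration of embedding global information in the MWLS criterion (our sketch, not an example developed in the paper): because the criterion reduces to a single weighted least squares problem in $\theta$ (Proposition 1), a soft constraint such as "the concentrations should stay near nominal values $\theta_0$" can be added as a quadratic penalty $\lambda \|\theta - \theta_0\|^2$, giving the closed form $\hat\theta = (X'WX + \lambda I)^{-1}(X'WY + \lambda\theta_0)$. The names `mwls_W` and `mwls_soft`, the synthetic spectra, and the choice of a ridge-type penalty are all our own:

```python
import numpy as np

def mwls_W(t, h=1.0, degree=2):
    # Accumulated MWLS weight matrix of Proposition 1
    # (Gaussian kernel, local polynomial basis).
    N = len(t)
    W = np.zeros((N, N))
    for tau in t:
        d = t - tau
        w = np.exp(-d**2 / (2.0 * h**2))
        w /= w.sum()
        Wt = np.diag(w)
        Xt = np.vander(d, degree + 1, increasing=True)
        W += Wt - Wt @ Xt @ np.linalg.solve(Xt.T @ Wt @ Xt, Xt.T @ Wt)
    return W

def mwls_soft(Y, X, t, theta0, lam, h=1.0, degree=2):
    # Soft constraint: add lam * ||theta - theta0||^2 to the criterion,
    # shrinking the estimate toward the nominal concentrations theta0.
    W = mwls_W(t, h, degree)
    p = X.shape[1]
    return np.linalg.solve(X.T @ W @ X + lam * np.eye(p),
                           X.T @ W @ Y + lam * theta0)

# Synthetic data as before: two narrow reference spectra plus a
# smooth quadratic unknown component (illustrative values only).
t = np.linspace(0.0, 10.0, 101)
X = np.column_stack([np.exp(-0.5 * ((t - 3.0) / 0.4) ** 2),
                     np.exp(-0.5 * ((t - 7.0) / 0.4) ** 2)])
Y = X @ np.array([1.0, 2.0]) + 0.5 + 0.2 * t - 0.02 * t**2

theta_hat = mwls_soft(Y, X, t, theta0=np.array([1.0, 1.0]), lam=0.1)
```

With $\lambda = 0$ the estimate reduces to the unconstrained MWLS solution, and as $\lambda \to \infty$ it is pulled toward $\theta_0$; a hard equality constraint on some components could be imposed in the same single criterion.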

## References

Gallot, S. & Thomas, O. (1993), Fast and easy interpretation of a set of absorption spectra: theory and qualitative applications for UV examination of waters and wastewaters, *Fresenius Journal of Analytical Chemistry* 346.

Hastie, T. J. & Tibshirani, R. J. (1990), *Generalized Additive Models*, Chapman & Hall.

Thomas, O., Theraulaz, F. & Suryani, S. (1996), Advanced UV examination of wastewaters, *Environmental Technology* 17.


### Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics

### Analysis of Bayesian Dynamic Linear Models

Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Variance Reduction. Pricing American Options. Monte Carlo Option Pricing. Delta and Common Random Numbers

Variance Reduction The statistical efficiency of Monte Carlo simulation can be measured by the variance of its output If this variance can be lowered without changing the expected value, fewer replications

### Supporting Information

S1 Supporting Information GFT NMR, a New Approach to Rapidly Obtain Precise High Dimensional NMR Spectral Information Seho Kim and Thomas Szyperski * Department of Chemistry, University at Buffalo, The

### Estimation and Inference in Cointegration Models Economics 582

Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there

### http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions

A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.

### State Space Time Series Analysis

State Space Time Series Analysis p. 1 State Space Time Series Analysis Siem Jan Koopman http://staff.feweb.vu.nl/koopman Department of Econometrics VU University Amsterdam Tinbergen Institute 2011 State

### Canonical Correlation

Chapter 400 Introduction Canonical correlation analysis is the study of the linear relations between two sets of variables. It is the multivariate extension of correlation analysis. Although we will present

### 1 Solving LPs: The Simplex Algorithm of George Dantzig

Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.

### South Carolina College- and Career-Ready (SCCCR) Probability and Statistics

South Carolina College- and Career-Ready (SCCCR) Probability and Statistics South Carolina College- and Career-Ready Mathematical Process Standards The South Carolina College- and Career-Ready (SCCCR)

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### San Jose State University Engineering 10 1

KY San Jose State University Engineering 10 1 Select Insert from the main menu Plotting in Excel Select All Chart Types San Jose State University Engineering 10 2 Definition: A chart that consists of multiple

### Estimation with Minimum Mean Square Error

C H A P T E R 8 Estimation with Minimum Mean Square Error INTRODUCTION A recurring theme in this text and in much of communication, control and signal processing is that of making systematic estimates,

### Uses of Derivative Spectroscopy

Uses of Derivative Spectroscopy Application Note UV-Visible Spectroscopy Anthony J. Owen Derivative spectroscopy uses first or higher derivatives of absorbance with respect to wavelength for qualitative

### Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

### Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

### Cofactor Expansion: Cramer s Rule

Cofactor Expansion: Cramer s Rule MATH 322, Linear Algebra I J. Robert Buchanan Department of Mathematics Spring 2015 Introduction Today we will focus on developing: an efficient method for calculating

### Notes on Symmetric Matrices

CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.

### Introduction to Kernel Smoothing

Introduction to Kernel Smoothing Density 0.000 0.002 0.004 0.006 700 800 900 1000 1100 1200 1300 Wilcoxon score M. P. Wand & M. C. Jones Kernel Smoothing Monographs on Statistics and Applied Probability

### Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002 359 Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel Lizhong Zheng, Student

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### Random Vectors and the Variance Covariance Matrix

Random Vectors and the Variance Covariance Matrix Definition 1. A random vector X is a vector (X 1, X 2,..., X p ) of jointly distributed random variables. As is customary in linear algebra, we will write

### Demand Forecasting When a product is produced for a market, the demand occurs in the future. The production planning cannot be accomplished unless

Demand Forecasting When a product is produced for a market, the demand occurs in the future. The production planning cannot be accomplished unless the volume of the demand known. The success of the business

### The KaleidaGraph Guide to Curve Fitting

The KaleidaGraph Guide to Curve Fitting Contents Chapter 1 Curve Fitting Overview 1.1 Purpose of Curve Fitting... 5 1.2 Types of Curve Fits... 5 Least Squares Curve Fits... 5 Nonlinear Curve Fits... 6

### Risk Decomposition of Investment Portfolios. Dan dibartolomeo Northfield Webinar January 2014

Risk Decomposition of Investment Portfolios Dan dibartolomeo Northfield Webinar January 2014 Main Concepts for Today Investment practitioners rely on a decomposition of portfolio risk into factors to guide

### USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

### Getting Correct Results from PROC REG

Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking