# Manifold Learning Examples PCA, LLE and ISOMAP


## Transcription

Dan Ventura

October 14, 2008

**Abstract.** We give a concrete example that demonstrates how to use PCA, LLE and Isomap, attempt to provide some intuition as to how and why they work, and compare and contrast the three techniques.

### 1 Introduction

Suppose we are given the following five data points in $\mathbb{R}^2$ (see Figure 1):

[coordinates of the five points lost in transcription]

These points all live on a nonlinear 1D manifold described by the equation $f(x) = (x/10)^3$. We would like to represent this data in $\mathbb{R}^1$. What mapping is appropriate? PCA, LLE and Isomap all have something slightly different to say, and we will try to see how they compare by using this simple example.

Figure 1: Sample data points on the manifold defined by $f(x) = (x/10)^3$

### 2 PCA

We begin by computing the covariance matrix. (In general we would have to preprocess the data by subtracting out the mean, but in this clever example the mean is zero.)

$C = $ [matrix entries garbled in transcription; surviving fragment: "2 68"]

Next, we calculate the eigenvectors and eigenvalues of the covariance matrix. How do we do this, you might ask? MATLAB or some other math package will do it nicely (note that I am making liberal use of rounding throughout this example):

Eigenvalues: 223 and 2.6

Eigenvectors: [entries lost in transcription]

Notice that one eigenvalue is much larger than the other, indicating a significant difference in the variance in the two derived dimensions. Even though this data is nonlinear, we can get a pretty decent linear fit in 1D, and the best single linear axis for the data is represented by the eigenvector associated with the large eigenvalue (see Figure 2). Remember, if we think of the covariance matrix as an operator, the eigenvectors of that operator define the basis; if we limit ourselves to thinking about unit vectors, they are the axes that are invariant under the application of the operator.

So, our new major (and only, since we are going to 1D) axis is defined by the first eigenvector above, and we transform our data into the new 1D coordinate system by taking the dot product of this eigenvector with each of our data vectors:

[the five dot products are garbled in transcription; the only surviving value, 9.83, appears twice]
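The PCA steps above (center, form the covariance matrix, take the eigenvector of the largest eigenvalue, project) can be sketched in a few lines of NumPy. This is only an illustration: the sample points are an assumption, since the originals are not preserved here. They were chosen to lie on $f(x) = (x/10)^3$ and to agree with the surviving covariance entry 68 and the eigenvalues 223 and 2.6. The function name `pca_1d` is hypothetical.

```python
import numpy as np

# ASSUMPTION: hypothetical sample points on y = (x/10)**3. The original
# coordinates are not preserved in the text; these are merely consistent
# with the numbers that do survive (covariance entry 68, eigenvalues 223, 2.6).
X = np.array([[-20.0, -8.0], [-10.0, -1.0], [0.0, 0.0],
              [10.0, 1.0], [20.0, 8.0]])


def pca_1d(points):
    """Return covariance, eigenvalues (descending), major axis, 1D coordinates."""
    centered = points - points.mean(axis=0)       # subtract the mean
    cov = centered.T @ centered / len(centered)   # covariance, dividing by N
    evals, evecs = np.linalg.eigh(cov)            # eigh returns ascending order
    axis = evecs[:, -1]                           # eigenvector of largest eigenvalue
    if axis[0] < 0:                               # eigenvector sign is arbitrary
        axis = -axis
    return cov, evals[::-1], axis, centered @ axis


cov, evals, axis, y = pca_1d(X)
print(np.round(cov, 1))
print(np.round(evals, 1))
print(np.round(y, 2))
```

Under these assumed points the two points nearest the origin project to about ±9.78. Projecting with the eigenvector rounded to (0.95, 0.33) instead gives ±9.83, which agrees with the value that survives in the text.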

Figure 2: Major axis discovered by PCA

### 3 Isomap

Isomap uses the same basic idea as PCA, the difference being that linearity is only preserved locally (via small neighborhoods). The global geometry of the discovered axes is nonlinear because these small neighborhoods are stitched together without trying to maintain linearity. The result is a nonlinear axis or axes that define a manifold. The steps of the algorithm are basically:

1. define neighbors for each data point
2. find interpoint distances (graph represented as a matrix $M$)
3. find eigenvectors of $M$

In PCA we try to preserve covariance. Here we try to do the same thing in a nonlinear way, and our approach is to try to preserve interpoint distance on the manifold. The matrix $M$ acts just as the matrix $C$ in PCA. In fact, it is really the same thing in a higher dimensional space. $M$ can actually be thought of as the covariance matrix for the space whose dimensions are defined by the data points (if we have $N$ data points, $M$ is an $N \times N$ matrix). Since in this $N$-dimensional space the dimensions are the data points, the covariance for a particular pair of dimensions is simply the distance between the data points that define those dimensions.

Pretty much everything we do here is linear, just as it is in PCA. The source of the nonlinearity is the way in which we calculate our interpoint distances. We do not, in general, use the Euclidean distances between the points. Rather, we use those distances only for points considered neighbors. The rest of the interpoint distances are calculated by finding the shortest path through the graph on the manifold.

Starting with the original data points as shown in Figure 1 and setting the number of neighbors $K = 2$, we can begin to calculate the matrix $M$. Here we have filled in only the actual Euclidean distances of neighbors (we index the matrix with the data in Figure 1 from left to right; rows represent data points and columns represent neighbors):

$M_{\text{neighbors}} = $ [matrix entries lost in transcription; the entries not yet known were marked "?"]

Next, we can fill in the remaining entries in $M$ using Floyd's algorithm:

$M_{\text{Floyd}} = $ [matrix entries lost in transcription]

Notice that $M$ is not quite symmetric. Can you see why?

Now, we lied just a little in step 3 above: we don't actually calculate the eigenvectors of $M$. What we do instead is calculate the eigenvectors of $\tau(M)$, where $\tau(M) = -HSH/2$, $S_{ij} = (M_{ij})^2$ (that is, $S$ is the matrix of squared distances) and $H_{ij} = \delta_{ij} - 1/N$. Recall that $N$ is the number of data points and $\delta$ is the Kronecker delta function. I believe the point of the $\tau$ operator is analogous to subtracting the mean in the PCA case. So,

$\tau(M_{\text{Floyd}}) = $ [matrix entries lost in transcription]

Now, remember, this is just like the covariance matrix $C$ in the PCA example. What we have done is taken 2D data and exploded it into 5D. This interpoint distance matrix is the covariance matrix for the 5D space. So, we now find the eigenvectors and eigenvalues for $\tau(M)$:

Eigenvalues: [values lost in transcription]

Eigenvectors: [entries lost in transcription]

Just as in the PCA example, the first eigenvalue dominates the others, though in this case the difference is even greater. Why?

Our final step is to transform our original data into the new coordinate system defined by the major eigenvector. Let us re-emphasize that this new axis is linear in the 5D space but is nonlinear in the way it preserves interpoint distances in the lower dimensional space that we are really interested in. Because of this high dimensional/low dimensional game we are playing, the data transformation is a little murkier than it was for PCA. We have a 5D eigenvector but we need a transform for 2D data points. The trick is, and I quote from the paper:

> Let $\lambda_p$ be the $p$-th eigenvalue (in decreasing order) of the matrix $\tau(D_G)$, and $v_p^i$ be the $i$-th component of the $p$-th eigenvector. Then set the $p$-th component of the $d$-dimensional coordinate vector $y_i$ equal to $\sqrt{\lambda_p}\, v_p^i$.

We can rephrase this in terms of our example, for which we are only interested in the largest eigenvalue:

> Let $\lambda_1$ be the 1st eigenvalue (in decreasing order) of the matrix $\tau(M)$, and $v_1^i$ be the $i$-th component of the 1st eigenvector. Then set the 1st component of the 1-dimensional coordinate vector $y_i$ equal to $\sqrt{\lambda_1}\, v_1^i$.

Here, $i$ ranges from 1 to 5 because there are 5 components to the eigenvector, because there are 5 data points. So, our transformed data points are:

[transformed data points lost in transcription]

Since the stated goal of Isomap is to preserve geodesic distances rather than Euclidean distances, let's end this section by comparing the original geodesic and Euclidean distances in 2D with the interpoint distances of the final 1D data. Here are the original Euclidean distances:

$D_{\text{Euclidean}} = $ [matrix entries lost in transcription]

The original geodesic distances, $D_{\text{geodesic}}$, are calculated by moving along the gross approximation of the manifold given by the five points and linear interpolation between them.
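The Isomap recipe described in this section (K nearest neighbors, shortest paths, the $\tau$ operator, and coordinates $\sqrt{\lambda_1}\, v_1^i$) can be sketched as follows. This is a rough illustration under stated assumptions, not the computation from the original text. The sample points are hypothetical, because the originals are not preserved. The neighbor graph is also symmetrized here (an edge exists if either point lists the other as a neighbor), so that $\tau(M)$ is symmetric and `np.linalg.eigh` applies; the walkthrough above instead keeps a directed graph, which is why its $M$ is not quite symmetric. The name `isomap_1d` is made up for this sketch.

```python
import numpy as np

# ASSUMPTION: hypothetical sample points on y = (x/10)**3 (originals not preserved).
X = np.array([[-20.0, -8.0], [-10.0, -1.0], [0.0, 0.0],
              [10.0, 1.0], [20.0, 8.0]])


def isomap_1d(points, k=2):
    """Return the graph-distance matrix, eigenvalues of tau(M), and 1D coordinates."""
    n = len(points)
    # Step 1: Euclidean distances and k nearest neighbors of every point.
    eucl = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    M = np.full((n, n), np.inf)
    np.fill_diagonal(M, 0.0)
    for i in range(n):
        for j in np.argsort(eucl[i])[1:k + 1]:    # skip index 0, the point itself
            M[i, j] = M[j, i] = eucl[i, j]        # ASSUMPTION: symmetrized edges
    # Step 2: Floyd's algorithm fills in the remaining shortest-path distances.
    for m in range(n):
        M = np.minimum(M, M[:, [m]] + M[[m], :])
    # Step 3: eigenvectors of tau(M) = -H S H / 2, with S the squared distances.
    H = np.eye(n) - 1.0 / n
    tau = -H @ (M ** 2) @ H / 2.0
    evals, evecs = np.linalg.eigh(tau)            # ascending order
    y = np.sqrt(evals[-1]) * evecs[:, -1]         # y_i = sqrt(lambda_1) * v_1^i
    if y[0] > 0:                                  # eigenvector sign is arbitrary
        y = -y
    return M, evals[::-1], y


geo, tau_evals, y_iso = isomap_1d(X)
print(np.round(geo, 2))
print(np.round(tau_evals, 2))
print(np.round(y_iso, 2))
```

With the symmetrized graph the distance between the two end points is the sum of the two long neighbor edges, and the recovered 1D coordinates keep the left-to-right order of the points.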

$D_{\text{geodesic}} = $ [matrix entries lost in transcription]

Now, here are the interpoint distances for the newly transformed data:

$D_{\text{new}} = $ [matrix entries lost in transcription]

To emphasize the point, let's look at the $L_2$ matrix norm $\sqrt{\sum_{i,j} A_{ij}^2}$ of the difference between the Euclidean distance matrix and the new distance matrix,

$$E_1 = \lVert D_{\text{Euclidean}} - D_{\text{new}} \rVert_{L_2} = 2.5$$

and the $L_2$ matrix norm of the difference between the geodesic distance matrix and the new distance matrix,

$$E_2 = \lVert D_{\text{geodesic}} - D_{\text{new}} \rVert_{L_2} = 2.14$$

[both values as transcribed; digits may be missing]

Hmmm. If we are really preserving geodesic distance rather than Euclidean, we should expect $E_1 > E_2$, but that is not what we find. Can you figure out why?

### 4 LLE

Locally Linear Embedding (LLE) does the same basic thing as Isomap: it finds a nonlinear manifold by stitching together small linear neighborhoods. The difference between the two algorithms is in how they do the stitching. As we saw in the previous section, Isomap does this by doing a graph traversal. In contrast, LLE does it by finding a set of weights that perform local linear interpolations that closely approximate the data. This time the steps are:

1. define neighbors for each data point
2. find weights that allow neighbors to interpolate original data accurately (represented as a matrix $W$)
3. given those weights, find new data points that minimize interpolation error in lower dimensional space

We again consider the data of Figure 1 and again set the number of neighbors $K = 2$. We must find a weight matrix $W$ that minimizes the cost function

$$\varepsilon(W) = \sum_i \Bigl| X_i - \sum_j W_{ij} X_j \Bigr|^2$$

subject to the constraints that $W_{ij} = 0$ if $X_j$ is not a neighbor of $X_i$ and that the rows of the weight matrix sum to one: $\sum_j W_{ij} = 1$. These constraints are reasonable and guarantee some nice invariance properties that make $W$ useful for mapping into a lower dimensional space. In general, finding the matrix $W$ involves solving a least-squares problem, and there is not much detail in the paper about how to do that; however, there is more detail available at roweis/lle/algorithm.html. Following the recipe there results in the following weight matrix:

$W = $ [matrix entries lost in transcription]

If you really try this, you will discover that following the recipe given for calculating the weights results in a singular covariance matrix for row three (there are an infinite number of solutions). Given our particular data and constraints, it is clear that the weights we want are 0.5 and 0.5, but I would like something better than that. Figure 3 shows the reconstructed points using $W$ compared with the original data.

Figure 3: LLE reconstruction for K=2 compared with original data

So, now we have completed steps 1 and 2 and all we have left to do is step 3. Since our original data is 2D and we are interested in moving to 1D, step 3 involves finding points in $\mathbb{R}^1$ that minimize the embedding cost function

$$\Phi(Y) = \sum_i \Bigl| Y_i - \sum_j W_{ij} Y_j \Bigr|^2$$

Once again this is done by solving an eigenvalue problem. In fact, this is quite intuitive here, if we realize that $W$ supposedly preserves important local geometry independent of what dimensional space we work in. So, what we are looking for is a vector $V$ of data points (each one being one of the $d$-dimensional vectors $Y_i$) such that $V = WV$. In other words, we are looking for a set of points in $d$-dimensional space that don't move when $W$ is applied, or a set of points that are all perfectly reconstructed by their neighbors and $W$, or a set of points that perfectly fit the local geometry described by $W$.

In any case, we have to find these points by solving for eigenvectors again, and it would be extra intuitive if we could use $W$. Of course, that would be a little too intuitive, and we have to do some processing to $W$ first, analogous to what we did to $M$ with $\tau$ in the Isomap example. Digging through the notes of the paper, or visiting the website above, will reveal that the way to do this is to compute the matrix $M = (I - W)^T (I - W)$ (where $I$ is the appropriately sized identity matrix) and then find its eigenvectors and eigenvalues. So,

$M = $ [matrix entries lost in transcription]

Eigenvalues: [values lost in transcription]

Eigenvectors: [entries lost in transcription]

This time we are looking for small eigenvalues, rather than large, and we discard the smallest one, which will always be 0, with an accompanying eigenvector whose components are all equal. We take the $d$ next smallest and use them as the $d$ dimensions of our transformed data (the smallest gives the first dimension for each of our points, the next smallest the second dimension for each point, etc.). Since we are moving to 1D, we will use only the second smallest eigenvalue/vector.

Both the paper and the website imply that having found the eigenvectors you already have the transformed data, but that isn't exactly true. In fact, you have to divide each eigenvector by the square root of its eigenvalue to get the magnitudes correct (it took me a long time to figure this out, and I have sent a note to Sam Roweis seeking confirmation). In this particular case, since the phase of the discarded eigenvector is $-1$, we also have to multiply all our eigenvectors by $-1$ to make the signs all work out correctly.

Doing both of these things gives us our final transformation:

[transformed data points lost in transcription]
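The three LLE steps above can be sketched as follows. Several things here are assumptions rather than the method used in the text. The sample points are hypothetical (the originals are not preserved). The small `reg` term added to each local Gram matrix is a common conditioning trick, included so that the singular system for the middle point has a unique solution. The sketch returns the unit-norm second-smallest eigenvector and leaves out the rescaling by the square root of the eigenvalue that the text describes. The name `lle_1d` is made up for this sketch.

```python
import numpy as np

# ASSUMPTION: hypothetical sample points on y = (x/10)**3 (originals not preserved).
X = np.array([[-20.0, -8.0], [-10.0, -1.0], [0.0, 0.0],
              [10.0, 1.0], [20.0, 8.0]])


def lle_1d(points, k=2, reg=1e-3):
    """Return the weight matrix W, eigenvalues of (I-W)^T (I-W), and 1D coordinates."""
    n = len(points)
    eucl = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(eucl[i])[1:k + 1]        # step 1: k nearest neighbors
        Z = points[nbrs] - points[i]               # neighbors relative to X_i
        G = Z @ Z.T                                # local Gram ("covariance") matrix
        G = G + reg * np.trace(G) * np.eye(k)      # ASSUMPTION: regularize if singular
        w = np.linalg.solve(G, np.ones(k))         # step 2: solve G w = 1
        W[i, nbrs] = w / w.sum()                   # enforce rows summing to one
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W                    # step 3: embedding cost matrix
    evals, evecs = np.linalg.eigh(M)               # ascending order
    y = evecs[:, 1]                                # discard the smallest, take the next
    if y[0] > 0:                                   # eigenvector sign is arbitrary
        y = -y
    return W, evals, y


W, lle_evals, y_lle = lle_1d(X)
print(np.round(W, 3))
print(np.round(lle_evals, 4))
print(np.round(y_lle, 3))
```

With regularization the middle point gets weights of 0.5 and 0.5 on its two neighbors, which are the weights the text says we want. The smallest eigenvalue is zero up to rounding error, because every row of $W$ sums to one and so $(I - W)$ annihilates the constant vector.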

The big win for LLE, supposedly, is that even though it requires solving an $N \times N$ eigenvalue problem just like Isomap does, the matrix for LLE is very sparse (because many weights are 0), whereas the matrix for Isomap has 0s only down the diagonal.

Some annoying unanswered questions for LLE:

1. Is the intuition for eigenvectors here right? In this example $WV \neq V$, so it is not a perfect fit anyway.
2. Why are we using the smallest eigenvectors this time?

### 5 Conclusion

As a final note, in these examples we assumed a target $d$ for the number of dimensions to which we wanted to reduce. These algorithms can somewhat automatically discover the best $d$. PCA and Isomap do this in an iterative fashion, trying increasing values for $d$ and computing some sort of residual variance. Plotting this residual against $d$ will allow us to find an inflection point that indicates a good value for $d$. LLE estimates $d$ by analyzing a reciprocal cost function in which reconstruction weights derived from the lower dimensional data are applied to the original high dimensional data (again, we would iterate on $d$).
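For PCA, one simple reading of "residual variance" is the fraction of total variance left unexplained after keeping the first $d$ principal components. The sketch below takes that reading, which is an assumption (the text does not define the residual precisely), and again uses hypothetical sample points. The function name `pca_residual_variance` is made up for this sketch.

```python
import numpy as np

# ASSUMPTION: hypothetical sample points on y = (x/10)**3 (originals not preserved).
X = np.array([[-20.0, -8.0], [-10.0, -1.0], [0.0, 0.0],
              [10.0, 1.0], [20.0, 8.0]])


def pca_residual_variance(points):
    """Fraction of variance left unexplained for d = 1, 2, ..., D components."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(centered)
    evals = np.linalg.eigvalsh(cov)[::-1]          # eigenvalues, largest first
    evals = np.clip(evals, 0.0, None)              # guard against tiny negatives
    return 1.0 - np.cumsum(evals) / evals.sum()


residual = pca_residual_variance(X)
for d, r in enumerate(residual, start=1):
    print(f"d = {d}: residual variance = {r:.4f}")
```

For these assumed points the residual is already close to zero at $d = 1$, which fits the earlier observation that one eigenvalue is much larger than the other.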

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Nonlinear Dimensionality Reduction

Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the

### CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on

CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #18: Dimensionality Reduc7on Dimensionality Reduc=on Assump=on: Data lies on or near a low d- dimensional subspace Axes of this subspace

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Section 1.1. Introduction to R n

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

### [1] Diagonal factorization

8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

### Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

### MATH 551 - APPLIED MATRIX THEORY

MATH 55 - APPLIED MATRIX THEORY FINAL TEST: SAMPLE with SOLUTIONS (25 points NAME: PROBLEM (3 points A web of 5 pages is described by a directed graph whose matrix is given by A Do the following ( points

### Spectral Methods for Dimensionality Reduction

Spectral Methods for Dimensionality Reduction Prof. Lawrence Saul Dept of Computer & Information Science University of Pennsylvania UCLA IPAM Tutorial, July 11-14, 2005 Dimensionality reduction Question

### Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

### So which is the best?

Manifold Learning Techniques: So which is the best? Todd Wittman Math 8600: Geometric Data Analysis Instructor: Gilad Lerman Spring 2005 Note: This presentation does not contain information on LTSA, which

### Nonlinear Iterative Partial Least Squares Method

Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

### Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

### Principal Component Analysis Application to images

Principal Component Analysis Application to images Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception http://cmp.felk.cvut.cz/

### We call this set an n-dimensional parallelogram (with one vertex 0). We also refer to the vectors x 1,..., x n as the edges of P.

Volumes of parallelograms 1 Chapter 8 Volumes of parallelograms In the present short chapter we are going to discuss the elementary geometrical objects which we call parallelograms. These are going to

### A Introduction to Matrix Algebra and Principal Components Analysis

A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra

### Introduction to Matrix Algebra

Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

### Dimensionality Reduction A Short Tutorial

Dimensionality Reduction A Short Tutorial Ali Ghodsi Department of Statistics and Actuarial Science University of Waterloo Waterloo, Ontario, Canada, 2006 c Ali Ghodsi, 2006 Contents 1 An Introduction

### Virtual Landmarks for the Internet

Virtual Landmarks for the Internet Liying Tang Mark Crovella Boston University Computer Science Internet Distance Matters! Useful for configuring Content delivery networks Peer to peer applications Multiuser

### CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In

### Principal components analysis

CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some k-dimension subspace, where k

### Lecture 5: Singular Value Decomposition SVD (1)

EEM3L1: Numerical and Analytical Techniques Lecture 5: Singular Value Decomposition SVD (1) EE3L1, slide 1, Version 4: 25-Sep-02 Motivation for SVD (1) SVD = Singular Value Decomposition Consider the system

### Dimension Reduction. Wei-Ta Chu 2014/10/22. Multimedia Content Analysis, CSIE, CCU

1 Dimension Reduction Wei-Ta Chu 2014/10/22 2 1.1 Principal Component Analysis (PCA) Widely used in dimensionality reduction, lossy data compression, feature extraction, and data visualization Also known

### by the matrix A results in a vector which is a reflection of the given

Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that

### Linear Algebra Review. Vectors

Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

### 3. INNER PRODUCT SPACES

. INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.

### November 28 th, Carlos Guestrin. Lower dimensional projections

PCA Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 28 th, 2007 Lower dimensional projections Rather than picking a subset of the features, we can new features that are combinations

### Orthogonal Projections

Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors

### Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### Chapter 6. Orthogonality

6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

### An Introduction to Locally Linear Embedding

An Introduction to Locally Linear Embedding Lawrence K. Saul AT&T Labs Research 180 Park Ave, Florham Park, NJ 07932 SA lsaul@research.att.com Sam T. Roweis Gatsby Computational Neuroscience nit, CL 17

### Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

### Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

### Introduction to Principal Components and FactorAnalysis

Introduction to Principal Components and FactorAnalysis Multivariate Analysis often starts out with data involving a substantial number of correlated variables. Principal Component Analysis (PCA) is a

### 13 MATH FACTS 101. 2 a = 1. 7. The elements of a vector have a graphical interpretation, which is particularly easy to see in two or three dimensions.

3 MATH FACTS 0 3 MATH FACTS 3. Vectors 3.. Definition We use the overhead arrow to denote a column vector, i.e., a linear segment with a direction. For example, in three-space, we write a vector in terms

### Lecture 9: Shape Description (Regions)

Lecture 9: Shape Description (Regions) c Bryan S. Morse, Brigham Young University, 1998 2000 Last modified on February 16, 2000 at 4:00 PM Contents 9.1 What Are Descriptors?.........................................

### 10.3 POWER METHOD FOR APPROXIMATING EIGENVALUES

55 CHAPTER NUMERICAL METHODS. POWER METHOD FOR APPROXIMATING EIGENVALUES In Chapter 7 we saw that the eigenvalues of an n n matrix A are obtained by solving its characteristic equation n c n n c n n...

### A tutorial on Principal Components Analysis

A tutorial on Principal Components Analysis Lindsay I Smith February 26, 2002 Chapter 1 Introduction This tutorial is designed to give the reader an understanding of Principal Components Analysis (PCA).

### Eigenvalues and Eigenvectors

Chapter 6 Eigenvalues and Eigenvectors 6. Introduction to Eigenvalues Linear equations Ax D b come from steady state problems. Eigenvalues have their greatest importance in dynamic problems. The solution

### 8. Linear least-squares

8. Linear least-squares EE13 (Fall 211-12) definition examples and applications solution of a least-squares problem, normal equations 8-1 Definition overdetermined linear equations if b range(a), cannot

### Using the Singular Value Decomposition

Using the Singular Value Decomposition Emmett J. Ientilucci Chester F. Carlson Center for Imaging Science Rochester Institute of Technology emmett@cis.rit.edu May 9, 003 Abstract This report introduces

### Vector and Matrix Norms

Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

### 10.3 POWER METHOD FOR APPROXIMATING EIGENVALUES

58 CHAPTER NUMERICAL METHODS. POWER METHOD FOR APPROXIMATING EIGENVALUES In Chapter 7 you saw that the eigenvalues of an n n matrix A are obtained by solving its characteristic equation n c nn c nn...

### Applications to Data Smoothing and Image Processing I

Applications to Data Smoothing and Image Processing I MA 348 Kurt Bryan Signals and Images Let t denote time and consider a signal a(t) on some time interval, say t. We ll assume that the signal a(t) is

### MATH 304 Linear Algebra Lecture 4: Matrix multiplication. Diagonal matrices. Inverse matrix.

MATH 304 Linear Algebra Lecture 4: Matrix multiplication. Diagonal matrices. Inverse matrix. Matrices Definition. An m-by-n matrix is a rectangular array of numbers that has m rows and n columns: a 11

### CSE 494 CSE/CBS 598 (Fall 2007): Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye

CSE 494 CSE/CBS 598 Fall 2007: Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye 1 Introduction One important method for data compression and classification is to organize

### Orthogonal Diagonalization of Symmetric Matrices

MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

### CHAPTER 17. Linear Programming: Simplex Method

CHAPTER 17 Linear Programming: Simplex Method CONTENTS 17.1 AN ALGEBRAIC OVERVIEW OF THE SIMPLEX METHOD Algebraic Properties of the Simplex Method Determining a Basic Solution Basic Feasible Solution 17.2

### Similar matrices and Jordan form

Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive

### Supervised Feature Selection & Unsupervised Dimensionality Reduction

Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or

### Lecture 5 - Triangular Factorizations & Operation Counts

LU Factorization Lecture 5 - Triangular Factorizations & Operation Counts We have seen that the process of GE essentially factors a matrix A into LU Now we want to see how this factorization allows us

### MATH 240 Fall, Chapter 1: Linear Equations and Matrices

MATH 240 Fall, 2007 Chapter Summaries for Kolman / Hill, Elementary Linear Algebra, 9th Ed. written by Prof. J. Beachy Sections 1.1 1.5, 2.1 2.3, 4.2 4.9, 3.1 3.5, 5.3 5.5, 6.1 6.3, 6.5, 7.1 7.3 DEFINITIONS

### STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarificationof zonationprocedure described onpp. 38-39 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

### Linear Programming. March 14, 2014

Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1

### Math 215 HW #6 Solutions

Math 5 HW #6 Solutions Problem 34 Show that x y is orthogonal to x + y if and only if x = y Proof First, suppose x y is orthogonal to x + y Then since x, y = y, x In other words, = x y, x + y = (x y) T

### Advanced Techniques for Mobile Robotics Compact Course on Linear Algebra. Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz

Advanced Techniques for Mobile Robotics Compact Course on Linear Algebra Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Vectors Arrays of numbers Vectors represent a point in a n dimensional

### Steven M. Ho!and. Department of Geology, University of Georgia, Athens, GA 30602-2501

PRINCIPAL COMPONENTS ANALYSIS (PCA) Steven M. Ho!and Department of Geology, University of Georgia, Athens, GA 30602-2501 May 2008 Introduction Suppose we had measured two variables, length and width, and

### 1 Review of Least Squares Solutions to Overdetermined Systems

cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares

### Linear Least Squares

Linear Least Squares Suppose we are given a set of data points {(x i,f i )}, i = 1,...,n. These could be measurements from an experiment or obtained simply by evaluating a function at some points. One

### Solution of Linear Systems

Chapter 3 Solution of Linear Systems In this chapter we study algorithms for possibly the most commonly occurring problem in scientific computing, the solution of linear systems of equations. We start

### Vector algebra Christian Miller CS Fall 2011

Vector algebra Christian Miller CS 354 - Fall 2011 Vector algebra A system commonly used to describe space Vectors, linear operators, tensors, etc. Used to build classical physics and the vast majority

### Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

### Notes on Factoring. MA 206 Kurt Bryan

The General Approach Notes on Factoring MA 26 Kurt Bryan Suppose I hand you n, a 2 digit integer and tell you that n is composite, with smallest prime factor around 5 digits. Finding a nontrivial factor

### 4.3 Least Squares Approximations

18 Chapter. Orthogonality.3 Least Squares Approximations It often happens that Ax D b has no solution. The usual reason is: too many equations. The matrix has more rows than columns. There are more equations

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### Inner product. Definition of inner product

Math 20F Linear Algebra Lecture 25 1 Inner product Review: Definition of inner product. Slide 1 Norm and distance. Orthogonal vectors. Orthogonal complement. Orthogonal basis. Definition of inner product

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Summary of week 8 (Lectures 22, 23 and 24)

WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry

### Ranking on Data Manifolds

Ranking on Data Manifolds Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany {firstname.secondname

### MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix.

MATH 304 Linear Algebra Lecture 8: Inverse matrix (continued). Elementary matrices. Transpose of a matrix. Inverse matrix Definition. Let A be an n n matrix. The inverse of A is an n n matrix, denoted

### 1 Introduction. 2 Matrices: Definition. Matrix Algebra. Hervé Abdi Lynne J. Williams

In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 2010 Matrix Algebra Hervé Abdi Lynne J. Williams Introduction Sylvester developed the modern concept of matrices in the 19th

### DISSERTATION EXPLOITING GEOMETRY, TOPOLOGY, AND OPTIMIZATION FOR KNOWLEDGE DISCOVERY IN BIG DATA. Submitted by. Lori Beth Ziegelmeier

DISSERTATION EXPLOITING GEOMETRY, TOPOLOGY, AND OPTIMIZATION FOR KNOWLEDGE DISCOVERY IN BIG DATA Submitted by Lori Beth Ziegelmeier Department of Mathematics In partial fulfillment of the requirements

### Visualization of General Defined Space Data

International Journal of Computer Graphics & Animation (IJCGA) Vol.3, No.4, October 2013 Visualization of General Defined Space Data John R Rankin La Trobe University, Australia Abstract A new algorithm

### Quick Reference Guide to Linear Algebra in Quantum Mechanics

Quick Reference Guide to Linear Algebra in Quantum Mechanics Scott N. Walck September 2, 2014 Contents 1 Complex Numbers 1.1 Introduction 1.2 Real Numbers

### THE UNIVERSITY OF AUCKLAND

COMPSCI 369 THE UNIVERSITY OF AUCKLAND FIRST SEMESTER, 2011 MID-SEMESTER TEST Campus: City COMPUTER SCIENCE Computational Science (Time allowed: 50 minutes) NOTE: Attempt all questions Use of calculators

### CS3220 Lecture Notes: QR factorization and orthogonal transformations

CS3220 Lecture Notes: QR factorization and orthogonal transformations Steve Marschner Cornell University 11 March 2009 In this lecture I'll talk about orthogonal matrices and their properties, discuss

### We shall turn our attention to solving linear systems of equations. Ax = b

59 Linear Algebra We shall turn our attention to solving linear systems of equations Ax = b where $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^{n}$, and $b \in \mathbb{R}^{m}$. We already saw examples of methods that required the solution of a linear system

### Regression III: Advanced Methods

Lecture 5: Linear least-squares Regression III: Advanced Methods William G. Jacoby Department of Political Science Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Simple Linear Regression

### December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation

### Factoring Algorithms

Factoring Algorithms The p − 1 Method and Quadratic Sieve November 17, 2008 Fermat's factoring method Fermat made the observation that if n has two factors

### EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

### Factor analysis. Angela Montanari

Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

### Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications

Data Mining In Modern Astronomy Sky Surveys: Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications Ching-Wa Yip cwyip@pha.jhu.edu; Bloomberg 518 Human are Great Pattern Recognizers

### Bindel, Fall 2012 Matrix Computations (CS 6210) Week 8: Friday, Oct 12

Why eigenvalues? Week 8: Friday, Oct 12 I spend a lot of time thinking about eigenvalue problems. In part, this is because I look for problems that can be solved via eigenvalues. But I might have fewer

### Situation 23: Simultaneous Equations Prepared at the University of Georgia EMAT 6500 class Date last revised: July 22 nd, 2013 Nicolina Scarpelli

Situation 23: Simultaneous Equations Prepared at the University of Georgia EMAT 6500 class Date last revised: July 22 nd, 2013 Nicolina Scarpelli Prompt: A mentor teacher and student teacher are discussing

### 1 Spectral Methods for Dimensionality

1 Spectral Methods for Dimensionality Reduction Lawrence K. Saul Kilian Q. Weinberger Fei Sha Jihun Ham Daniel D. Lee How can we search for low dimensional structure in high dimensional data? If the data

### the points are called control points approximating curve

Chapter 4 Spline Curves A spline curve is a mathematical representation for which it is easy to build an interface that will allow a user to design and control the shape of complex curves and surfaces.

### Basic Math Refresher A tutorial and assessment of basic math skills for students in PUBP704.

Basic Math Refresher A tutorial and assessment of basic math skills for students in PUBP704. The purpose of this Basic Math Refresher is to review basic math concepts so that students enrolled in PUBP704:

### Linear algebra and the geometry of quadratic equations. Similarity transformations and orthogonal matrices

MATH 30 Differential Equations Spring 2006 Linear algebra and the geometry of quadratic equations Similarity transformations and orthogonal matrices First, some things to recall from linear algebra Two

### Chapter 7. Lyapunov Exponents. 7.1 Maps

Chapter 7 Lyapunov Exponents Lyapunov exponents tell us the rate of divergence of nearby trajectories, a key component of chaotic dynamics. For one dimensional maps the exponent is simply the average

### Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013

Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013 These notes summarize the main properties and uses of orthogonal and symmetric matrices. We covered quite a bit of material regarding these topics,

### Practice Math 110 Final. Instructions: Work all of problems 1 through 5, and work any 5 of problems 10 through 16.

Practice Math 110 Final Instructions: Work all of problems 1 through 5, and work any 5 of problems 10 through 16. 1. Let A = 3 1 1 3 3 2. 6 6 5 a. Use Gauss elimination to reduce A to an upper triangular

### Torgerson's Classical MDS derivation: 1: Determining Coordinates from Euclidean Distances

Torgerson's Classical MDS derivation: 1: Determining Coordinates from Euclidean Distances It is possible to construct a matrix X of Cartesian coordinates of points in Euclidean space when we know the Euclidean

### State of Stress at Point

State of Stress at Point Einstein Notation The basic idea of Einstein notation is that a covector and a vector can form a scalar: This is typically written as an explicit sum: According to this convention,

### LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA September 23, 2010 Contents 0.1 LU-decomposition 0.2 Inverses and Transposes 0.3 Column Spaces and NullSpaces

### Solving Quadratic Equations by Completing the Square

9. Solving Quadratic Equations by Completing the Square 9. OBJECTIVES 1. Solve a quadratic equation by the square root method. 2. Solve a quadratic equation by completing the square. 3. Solve a geometric application

### row row row 4

13 Matrices The following notes came from Foundation mathematics (MATH 123). Although matrices are not part of what would normally be considered foundation mathematics, they are one of the first topics