November 28 th, Carlos Guestrin. Lower dimensional projections

Save this PDF as:
 WORD  PNG  TXT  JPG

Size: px
Start display at page:

Download "November 28 th, Carlos Guestrin. Lower dimensional projections"

Transcription

1 PCA Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 28 th, 2007 Lower dimensional projections Rather than picking a subset of the features, we can new features that are combinations of existing features Let s see this in the unsupervised setting just X, but no Y 2

2 Linear projection and reconstruction x 2 project into -dimension z x reconstruction: only know z, what was (x,x 2 ) 3 Linear projections, a review Project a point into a (lower dimensional) space: point: x = (x,,x n ) select a basis set of basis vectors (u,,u k ) we consider orthonormal basis: u i u i =, and u i u j =0 for i j select a center defines offset of space best coordinates in lower dimensional space defined by dot-products: (z,,z k ), z i = (x-x) u i minimum squared error 4

3 PCA finds projection that minimizes reconstruction error Given m data points: x i = (x i,,x ni ), i= m Will represent each point as a projection: where: and PCA: Given k n, find (u,,u k ) minimizing reconstruction error: x 2 x 5 Understanding the reconstruction error Note that x i can be represented exactly by n-dimensional projection: Given k n, find (u,,u k ) minimizing reconstruction error: Rewriting error: 6

4 Reconstruction error and covariance matrix 7 Minimizing reconstruction error and eigen vectors Minimizing reconstruction error equivalent to picking orthonormal basis (u,,u n ) minimizing: Eigen vector: Minimizing reconstruction error equivalent to picking (u k+,,u n ) to be eigen vectors with smallest eigen values 8

5 Basic PCA algoritm Start from m by n data matrix X Recenter: subtract mean from each row of X X c X X Compute covariance matrix: Σ /m X c T X c Find eigen vectors and values of Σ Principal components: k eigen vectors with highest eigen values 9 PCA example 0

6 PCA example reconstruction only used first principal component Eigenfaces [Turk, Pentland 9] Input images: Principal components: 2

7 Eigenfaces reconstruction Each image corresponds to adding 8 principal components: 3 Scaling up Covariance matrix can be really big! Σ is n by n 0000 features! Σ finding eigenvectors is very slow Use singular value decomposition (SVD) finds to k eigenvectors great implementations available, e.g., Matlab svd 4

8 SVD Write X = W S V T X data matri one row per datapoint W weight matri one row per datapoint coordinate of x i in eigenspace S singular value matri diagonal matrix in our setting each entry is eigenvalue λ j V T singular vector matrix in our setting each row is eigenvector v j 5 PCA using SVD algoritm Start from m by n data matrix X Recenter: subtract mean from each row of X X c X X Call SVD algorithm on X c ask for k singular vectors Principal components: k singular vectors with highest singular values (rows of V T ) Coefficients become: 6

9 What you need to know Dimensionality reduction why and when it s important Simple feature selection Principal component analysis minimizing reconstruction error relationship to covariance matrix and eigenvectors using SVD 7 Announcements University Course Assessments Please, please, please, please, please, please, please, please, please, please, please, please, please, please, please, please Last lecture: Thursday, 9, 4:40-6:30pm, Wean

10 Markov Decision Processes (MDPs) Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 28 th, Thus far this semester Regression: Classification: Density estimation: 20

11 Learning to act [Ng et al. 05] Reinforcement learning An agent Makes sensor observations Must select action Receives rewards positive for good states negative for bad states 2 Learning to play backgammon [Tesauro 95] Combines reinforcement learning with neural networks Played 300,000 games against itself Achieved grandmaster level! 22

12 Roadmap to learning about reinforcement learning When we learned about Bayes nets: First talked about formal framework: representation inference Then learning for BNs For reinforcement learning: Formal framework Markov decision processes Then learning 23 peasant footman building Real-time Strategy Game Peasants collect resources and build Footmen attack enemies Buildings train peasants and footmen 24

13 States and actions State space: Joint state x of entire system Action space: Joint action a= {a,, a n } for all agents 25 States change over time Like an HMM, state changes over time Next state depends on current state and action selected e.g., action= build castle likely to lead to a state where you have a castle Transition model: Dynamics of the entire system P(x 26

14 Some states and actions are better than others Each state x is associated with a reward positive reward for successful attack negative for loss Reward function: Total reward R(x) 27 Markov Decision Process (MDP) Representation State space: Joint state x of entire system Action space: Joint action a= {a,, a n } for all agents Reward function: Total reward R( sometimes reward can depend on action Transition model: Dynamics of the entire system P(x 28

15 Discounted Rewards An assistant professor gets paid, say, 20K per year. How much, in total, will the A.P. earn in their life? = Infinity $ $ What s wrong with this argument? 29 Discounted Rewards A reward (payment) in the future is not worth quite as much as a reward now. Because of chance of obliteration Because of inflation Example: Being promised $0,000 next year is worth only 90% as much as receiving $0,000 right now. Assuming payment n years in future is worth only (0.9) n of payment now, what is the AP s Future Discounted Sum of Rewards? 30

16 Discount Factors People in economics and probabilistic decision-making do this all the time. The Discounted sum of future rewards using discount factor γ is (reward now) + γ (reward in time step) + γ 2 (reward in 2 time steps) + γ 3 (reward in 3 time steps) + : : (infinite sum) 3 The Academic Life Assume Discount Factor γ = A. Assistant Prof B. Assoc. Prof T. Tenured Prof 400 Define: 0.2 S. 0.2 On the Street D. Dead 0 V A = Expected discounted future rewards starting in state A V B = Expected discounted future rewards starting in state B V T = T V S = S V D = D How do we compute V A, V B, V T, V S, V D?

17 Computing the Future Rewards of an Academic 0.6 A. Assistant Prof Assume Discount Factor γ = 0.9 B. Assoc. Prof S. On the Street D. Dead T. Tenured Prof Policy Policy: π(x) = a At state action a for all agents x 0 π(x 0 ) = both peasants get wood x π(x ) = one peasant builds barrack, other gets gold x 2 π(x 2 ) = peasants get gold, footmen attack 34

18 Value of Policy Value: V π (x) Expected longterm reward starting from x Start from x 0 x 0 R(x 0 ) π(x 0 ) x R(x ) V π (x 0 ) = E π [R(x 0 ) + γ R(x ) + γ 2 R(x 2 ) + γ 3 R(x 3 ) + γ 4 R(x 4 ) + L] π(x ) x 2 π(x 2 ) Future rewards discounted by γ 2 [0,) x R(x ) π(x ) x R(x ) π(x ) R(x 2 ) x 3 R(x 3 ) π(x 3 ) x 4 R(x 4 ) 35 Computing the value of a policy Discounted value of a state: V π (x 0 ) = E π [R(x 0 ) + γ R(x ) + γ 2 R(x 2 ) + γ 3 R(x 3 ) + γ 4 R(x 4 ) + L] value of starting from x 0 and continuing with policy π from then on A recursion! 36

19 Simple approach for computing the value of a policy: Iteratively Can solve using a simple convergent iterative approach: (a.k.a. dynamic programming) Start with some guess V 0 Iteratively say: V t+ = R + γ P π V t Stop when V t+ -V t ε means that V π -V t+ ε/(-γ) 37 But we want to learn a Policy So far, told you how good a policy is But how can we choose the best policy??? x 0 Policy: π(x) = a At state action a for all agents π(x 0 ) = both peasants get wood Suppose there was only one time step: world is about to end!!! select action that maximizes reward! x x 2 π(x ) = one peasant builds barrack, other gets gold π(x 2 ) = peasants get gold, footmen attack 38

20 Unrolling the recursion Choose actions that lead to best value in the long run Optimal value policy achieves optimal value V * 39 Bellman equation Evaluating policy π: Computing the optimal value V * - Bellman equation " V ( x) = max R( + # a! x' P( x' V " ( x' ) 40

21 Optimal Long-term Plan Optimal value function V * (x) Optimal Policy: π * (x) " # (x) = argmax a Optimal policy: R( + $ % x' P(x' V # (x') 4 Interesting fact Unique value " V ( x) = max R( + # a! Slightly surprising fact: There is only one V * that solves Bellman equation! there may be many optimal policies that achieve V * Surprising fact: optimal policies are good everywhere!!! x' P( x' V " ( x' ) 42

22 Solve Bellman equation Solving an MDP Optimal value V * (x) " V ( x) = max R( + # a! Many algorithms solve the Bellman equations: x' P( x' V Bellman equation is non-linear!!! Optimal policy π*(x) " ( x' ) Policy iteration [Howard 60, Bellman 57] Value iteration [Bellman 57] Linear programming [Manne 60] 43 Value iteration (a.k.a. dynamic programming) the simplest of all " V ( x) = max R( + # a! x' P( x' V " ( x' ) Start with some guess V 0 Iteratively say: Stop when V t+ -V t ε means that V -V t+ ε/(-γ)! V t+ ( x) = max R( + " P( x' Vt ( x' ) a x' 44

23 A simple example You run a startup company. In every state you must choose between Saving money or Advertising. Poor & Unknown +0 S S A Rich & Unknown +0 A S Poor & Famous +0 S A Rich & Famous +0 γ = 0.9 A 45 Let s compute V t (x) for our example Poor & Unknown +0 S S A Rich & Unknown +0 A S Poor & Famous +0 S A Rich & Famous +0 γ = 0.9 A t V t (PU) V t (PF) V t (RU) V t (RF)! V t+ ( x) = max R( + " P( x' Vt ( x' ) a x' 46

24 Let s compute V t (x) for our example Poor & Unknown +0 S S A Rich & Unknown +0 A S Poor & Famous +0 S A Rich & Famous +0 γ = 0.9 A t V t (PU) V t (PF) V t (RU) V t (RF) ! V t+ ( x) = max R( + " P( x' Vt ( x' ) a x' 47 What you need to know What s a Markov decision process state, actions, transitions, rewards a policy value function for a policy computing V π Optimal value function and optimal policy Bellman equation Solving Bellman equation with value iteration, policy iteration and linear programming 48

25 Acknowledgment This lecture contains some material from Andrew Moore s excellent collection of ML tutorials: 49

Face Recognition using Principle Component Analysis

Face Recognition using Principle Component Analysis Face Recognition using Principle Component Analysis Kyungnam Kim Department of Computer Science University of Maryland, College Park MD 20742, USA Summary This is the summary of the basic idea about PCA

More information

Linear Algebra Review. Vectors

Linear Algebra Review. Vectors Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

More information

Manifold Learning Examples PCA, LLE and ISOMAP

Manifold Learning Examples PCA, LLE and ISOMAP Manifold Learning Examples PCA, LLE and ISOMAP Dan Ventura October 14, 28 Abstract We try to give a helpful concrete example that demonstrates how to use PCA, LLE and Isomap, attempts to provide some intuition

More information

Orthogonal Diagonalization of Symmetric Matrices

Orthogonal Diagonalization of Symmetric Matrices MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

More information

Object Recognition and Template Matching

Object Recognition and Template Matching Object Recognition and Template Matching Template Matching A template is a small image (sub-image) The goal is to find occurrences of this template in a larger image That is, you want to find matches of

More information

Similarity and Diagonalization. Similar Matrices

Similarity and Diagonalization. Similar Matrices MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

More information

Presentation 3: Eigenvalues and Eigenvectors of a Matrix

Presentation 3: Eigenvalues and Eigenvectors of a Matrix Colleen Kirksey, Beth Van Schoyck, Dennis Bowers MATH 280: Problem Solving November 18, 2011 Presentation 3: Eigenvalues and Eigenvectors of a Matrix Order of Presentation: 1. Definitions of Eigenvalues

More information

Lecture 5: Singular Value Decomposition SVD (1)

Lecture 5: Singular Value Decomposition SVD (1) EEM3L1: Numerical and Analytical Techniques Lecture 5: Singular Value Decomposition SVD (1) EE3L1, slide 1, Version 4: 25-Sep-02 Motivation for SVD (1) SVD = Singular Value Decomposition Consider the system

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

More information

Dimension Reduction. Wei-Ta Chu 2014/10/22. Multimedia Content Analysis, CSIE, CCU

Dimension Reduction. Wei-Ta Chu 2014/10/22. Multimedia Content Analysis, CSIE, CCU 1 Dimension Reduction Wei-Ta Chu 2014/10/22 2 1.1 Principal Component Analysis (PCA) Widely used in dimensionality reduction, lossy data compression, feature extraction, and data visualization Also known

More information

1 Eigenvalues and Eigenvectors

1 Eigenvalues and Eigenvectors Math 20 Chapter 5 Eigenvalues and Eigenvectors Eigenvalues and Eigenvectors. Definition: A scalar λ is called an eigenvalue of the n n matrix A is there is a nontrivial solution x of Ax = λx. Such an x

More information

by the matrix A results in a vector which is a reflection of the given

by the matrix A results in a vector which is a reflection of the given Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

PCA to Eigenfaces. CS 510 Lecture #16 March 23 th A 9 dimensional PCA example

PCA to Eigenfaces. CS 510 Lecture #16 March 23 th A 9 dimensional PCA example PCA to Eigenfaces CS 510 Lecture #16 March 23 th 2015 A 9 dimensional PCA example is dark around the edges and bright in the middle. is light with dark vertical bars. is light with dark horizontal bars.

More information

Principal components analysis

Principal components analysis CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some k-dimension subspace, where k

More information

10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html 10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html

More information

Nonlinear Iterative Partial Least Squares Method

Nonlinear Iterative Partial Least Squares Method Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

More information

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features

More information

Notes on Jordan Canonical Form

Notes on Jordan Canonical Form Notes on Jordan Canonical Form Eric Klavins University of Washington 8 Jordan blocks and Jordan form A Jordan Block of size m and value λ is a matrix J m (λ) having the value λ repeated along the main

More information

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

More information

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.

Au = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively. Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

More information

L1 vs. L2 Regularization and feature selection.

L1 vs. L2 Regularization and feature selection. L1 vs. L2 Regularization and feature selection. Paper by Andrew Ng (2004) Presentation by Afshin Rostami Main Topics Covering Numbers Definition Convergence Bounds L1 regularized logistic regression L1

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Vector and Matrix Norms

Vector and Matrix Norms Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

More information

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate

More information

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel

More information

The basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23

The basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23 (copyright by Scott M Lynch, February 2003) Brief Matrix Algebra Review (Soc 504) Matrix algebra is a form of mathematics that allows compact notation for, and mathematical manipulation of, high-dimensional

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov

Data Clustering. Dec 2nd, 2013 Kyrylo Bessonov Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

More information

Factor analysis. Angela Montanari

Factor analysis. Angela Montanari Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

More information

Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree

More information

Lecture 11: Graphical Models for Inference

Lecture 11: Graphical Models for Inference Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference - the Bayesian network and the Join tree. These two both represent the same joint probability

More information

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

More information

CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen

CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major

More information

K-nearest-neighbor: an introduction to machine learning

K-nearest-neighbor: an introduction to machine learning K-nearest-neighbor: an introduction to machine learning Xiaojin Zhu jerryzhu@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison slide 1 Outline Types of learning Classification:

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning.  CS 2750 Machine Learning. Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x-5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

Linear Algebraic Equations, SVD, and the Pseudo-Inverse

Linear Algebraic Equations, SVD, and the Pseudo-Inverse Linear Algebraic Equations, SVD, and the Pseudo-Inverse Philip N. Sabes October, 21 1 A Little Background 1.1 Singular values and matrix inversion For non-smmetric matrices, the eigenvalues and singular

More information

1 Introduction to Matrices

1 Introduction to Matrices 1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

More information

Final Project Report

Final Project Report CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

More information

Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and Non-Square Systems

Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and Non-Square Systems Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and Non-Square Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001,

More information

Self Organizing Maps: Fundamentals

Self Organizing Maps: Fundamentals Self Organizing Maps: Fundamentals Introduction to Neural Networks : Lecture 16 John A. Bullinaria, 2004 1. What is a Self Organizing Map? 2. Topographic Maps 3. Setting up a Self Organizing Map 4. Kohonen

More information

Chapter 6. Orthogonality

Chapter 6. Orthogonality 6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

More information

CHAPTER 2. Eigenvalue Problems (EVP s) for ODE s

CHAPTER 2. Eigenvalue Problems (EVP s) for ODE s A SERIES OF CLASS NOTES FOR 005-006 TO INTRODUCE LINEAR AND NONLINEAR PROBLEMS TO ENGINEERS, SCIENTISTS, AND APPLIED MATHEMATICIANS DE CLASS NOTES 4 A COLLECTION OF HANDOUTS ON PARTIAL DIFFERENTIAL EQUATIONS

More information

Adaptive Face Recognition System from Myanmar NRC Card

Adaptive Face Recognition System from Myanmar NRC Card Adaptive Face Recognition System from Myanmar NRC Card Ei Phyo Wai University of Computer Studies, Yangon, Myanmar Myint Myint Sein University of Computer Studies, Yangon, Myanmar ABSTRACT Biometrics is

More information

Solutions to Linear Algebra Practice Problems

Solutions to Linear Algebra Practice Problems Solutions to Linear Algebra Practice Problems. Find all solutions to the following systems of linear equations. (a) x x + x 5 x x x + x + x 5 (b) x + x + x x + x + x x + x + 8x Answer: (a) We create the

More information

Section 2.1. Section 2.2. Exercise 6: We have to compute the product AB in two ways, where , B =. 2 1 3 5 A =

Section 2.1. Section 2.2. Exercise 6: We have to compute the product AB in two ways, where , B =. 2 1 3 5 A = Section 2.1 Exercise 6: We have to compute the product AB in two ways, where 4 2 A = 3 0 1 3, B =. 2 1 3 5 Solution 1. Let b 1 = (1, 2) and b 2 = (3, 1) be the columns of B. Then Ab 1 = (0, 3, 13) and

More information

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #18: Dimensionality Reduc7on Dimensionality Reduc=on Assump=on: Data lies on or near a low d- dimensional subspace Axes of this subspace

More information

Machine Learning for Data Science (CS4786) Lecture 1

Machine Learning for Data Science (CS4786) Lecture 1 Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:

More information

203.4770: Introduction to Machine Learning Dr. Rita Osadchy

203.4770: Introduction to Machine Learning Dr. Rita Osadchy 203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:

More information

8/29/2015. Data Mining and Machine Learning. Erich Seamon University of Idaho

8/29/2015. Data Mining and Machine Learning. Erich Seamon University of Idaho Data Mining and Machine Learning Erich Seamon University of Idaho www.webpages.uidaho.edu/erichs erichs@uidaho.edu 1 I am NOMAD 2 1 Data Mining and Machine Learning Outline Outlining the data mining and

More information

Inverse problems. Nikolai Piskunov 2014. Regularization: Example Lecture 4

Inverse problems. Nikolai Piskunov 2014. Regularization: Example Lecture 4 Inverse problems Nikolai Piskunov 2014 Regularization: Example Lecture 4 Now that we know the theory, let s try an application: Problem: Optimal filtering of 1D and 2D data Solution: formulate an inverse

More information

SYSTEMS OF EQUATIONS

SYSTEMS OF EQUATIONS SYSTEMS OF EQUATIONS 1. Examples of systems of equations Here are some examples of systems of equations. Each system has a number of equations and a number (not necessarily the same) of variables for which

More information

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C. CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In

More information

Partial Least Squares (PLS) Regression.

Partial Least Squares (PLS) Regression. Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component

More information

Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones

Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation

More information

Math 202-0 Quizzes Winter 2009

Math 202-0 Quizzes Winter 2009 Quiz : Basic Probability Ten Scrabble tiles are placed in a bag Four of the tiles have the letter printed on them, and there are two tiles each with the letters B, C and D on them (a) Suppose one tile

More information

Linear Programming Notes V Problem Transformations

Linear Programming Notes V Problem Transformations Linear Programming Notes V Problem Transformations 1 Introduction Any linear programming problem can be rewritten in either of two standard forms. In the first form, the objective is to maximize, the material

More information

C19 Machine Learning

C19 Machine Learning C9 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks

More information

Linear Programming. March 14, 2014

Linear Programming. March 14, 2014 Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1

More information

Singular Value Decomposition Tutorial

Singular Value Decomposition Tutorial Singular Value Decomposition Tutorial Kirk Baker March 9, 5 (Revised January 4, 3) Contents Acknowledgments Introduction 3 Points and Space 4 Vectors 3 5 Matrices 4 5. Matrix Notation..................................

More information

Theta Functions. Lukas Lewark. Seminar on Modular Forms, 31. Januar 2007

Theta Functions. Lukas Lewark. Seminar on Modular Forms, 31. Januar 2007 Theta Functions Lukas Lewark Seminar on Modular Forms, 31. Januar 007 Abstract Theta functions are introduced, associated to lattices or quadratic forms. Their transformation property is proven and the

More information

1 Introduction. 2 Matrices: Definition. Matrix Algebra. Hervé Abdi Lynne J. Williams

1 Introduction. 2 Matrices: Definition. Matrix Algebra. Hervé Abdi Lynne J. Williams In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 00 Matrix Algebra Hervé Abdi Lynne J. Williams Introduction Sylvester developed the modern concept of matrices in the 9th

More information

Matrix Norms. Tom Lyche. September 28, Centre of Mathematics for Applications, Department of Informatics, University of Oslo

Matrix Norms. Tom Lyche. September 28, Centre of Mathematics for Applications, Department of Informatics, University of Oslo Matrix Norms Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo September 28, 2009 Matrix Norms We consider matrix norms on (C m,n, C). All results holds for

More information

Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications

Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications Data Mining In Modern Astronomy Sky Surveys: Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications Ching-Wa Yip cwyip@pha.jhu.edu; Bloomberg 518 Human are Great Pattern Recognizers

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

1 2 3 1 1 2 x = + x 2 + x 4 1 0 1

1 2 3 1 1 2 x = + x 2 + x 4 1 0 1 (d) If the vector b is the sum of the four columns of A, write down the complete solution to Ax = b. 1 2 3 1 1 2 x = + x 2 + x 4 1 0 0 1 0 1 2. (11 points) This problem finds the curve y = C + D 2 t which

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS

THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC- TERISTICS O.U. Sezerman 1, R. Islamaj 2, E. Alpaydin 2 1 Laborotory of Computational Biology, Sabancı University, Istanbul, Turkey. 2 Computer Engineering

More information

(January 14, 2009) End k (V ) End k (V/W )

(January 14, 2009) End k (V ) End k (V/W ) (January 14, 29) [16.1] Let p be the smallest prime dividing the order of a finite group G. Show that a subgroup H of G of index p is necessarily normal. Let G act on cosets gh of H by left multiplication.

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar

More information

C 1 x(t) = e ta C = e C n. 2! A2 + t3

C 1 x(t) = e ta C = e C n. 2! A2 + t3 Matrix Exponential Fundamental Matrix Solution Objective: Solve dt A x with an n n constant coefficient matrix A x (t) Here the unknown is the vector function x(t) x n (t) General Solution Formula in Matrix

More information

Matrix Calculations: Applications of Eigenvalues and Eigenvectors; Inner Products

Matrix Calculations: Applications of Eigenvalues and Eigenvectors; Inner Products Matrix Calculations: Applications of Eigenvalues and Eigenvectors; Inner Products H. Geuvers Institute for Computing and Information Sciences Intelligent Systems Version: spring 2015 H. Geuvers Version:

More information

CS229 Lecture notes. Andrew Ng

CS229 Lecture notes. Andrew Ng CS229 Lecture notes Andrew Ng Part X Factor analysis Whenwehavedatax (i) R n thatcomesfromamixtureofseveral Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually

More information

Supervised Feature Selection & Unsupervised Dimensionality Reduction

Supervised Feature Selection & Unsupervised Dimensionality Reduction Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or

More information

Tetris: Experiments with the LP Approach to Approximate DP

Tetris: Experiments with the LP Approach to Approximate DP Tetris: Experiments with the LP Approach to Approximate DP Vivek F. Farias Electrical Engineering Stanford University Stanford, CA 94403 vivekf@stanford.edu Benjamin Van Roy Management Science and Engineering

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 1 Joint Probability Distributions 1 1.1 Two Discrete

More information

L10: Probability, statistics, and estimation theory

L10: Probability, statistics, and estimation theory L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian

More information

1 Review of Newton Polynomials

1 Review of Newton Polynomials cs: introduction to numerical analysis 0/0/0 Lecture 8: Polynomial Interpolation: Using Newton Polynomials and Error Analysis Instructor: Professor Amos Ron Scribes: Giordano Fusco, Mark Cowlishaw, Nathanael

More information

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

More information

Lecture 20: Clustering

Lecture 20: Clustering Lecture 20: Clustering Wrap-up of neural nets (from last lecture Introduction to unsupervised learning K-means clustering COMP-424, Lecture 20 - April 3, 2013 1 Unsupervised learning In supervised learning,

More information

Component Ordering in Independent Component Analysis Based on Data Power

Component Ordering in Independent Component Analysis Based on Data Power Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

Lecture Notes 2: Matrices as Systems of Linear Equations

Lecture Notes 2: Matrices as Systems of Linear Equations 2: Matrices as Systems of Linear Equations 33A Linear Algebra, Puck Rombach Last updated: April 13, 2016 Systems of Linear Equations Systems of linear equations can represent many things You have probably

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Factor Analysis. Chapter 420. Introduction

Factor Analysis. Chapter 420. Introduction Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.

More information

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors Chapter 6 Eigenvalues and Eigenvectors 6. Introduction to Eigenvalues Linear equations Ax D b come from steady state problems. Eigenvalues have their greatest importance in dynamic problems. The solution

More information

Regression Using Support Vector Machines: Basic Foundations

Regression Using Support Vector Machines: Basic Foundations Regression Using Support Vector Machines: Basic Foundations Technical Report December 2004 Aly Farag and Refaat M Mohamed Computer Vision and Image Processing Laboratory Electrical and Computer Engineering

More information

1D 3D 1D 3D. is called eigenstate or state function. When an operator act on a state, it can be written as

1D 3D 1D 3D. is called eigenstate or state function. When an operator act on a state, it can be written as Chapter 3 (Lecture 4-5) Postulates of Quantum Mechanics Now we turn to an application of the preceding material, and move into the foundations of quantum mechanics. Quantum mechanics is based on a series

More information

The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression

The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression The SVD is the most generally applicable of the orthogonal-diagonal-orthogonal type matrix decompositions Every

More information

Temporal Difference Learning in the Tetris Game

Temporal Difference Learning in the Tetris Game Temporal Difference Learning in the Tetris Game Hans Pirnay, Slava Arabagi February 6, 2009 1 Introduction Learning to play the game Tetris has been a common challenge on a few past machine learning competitions.

More information

Linear Dependence Tests

Linear Dependence Tests Linear Dependence Tests The book omits a few key tests for checking the linear dependence of vectors. These short notes discuss these tests, as well as the reasoning behind them. Our first test checks

More information

1 Review of Least Squares Solutions to Overdetermined Systems

1 Review of Least Squares Solutions to Overdetermined Systems cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares

More information

ADVANTAGE UPDATING. Leemon C. Baird III. Technical Report WL-TR Wright Laboratory. Wright-Patterson Air Force Base, OH

ADVANTAGE UPDATING. Leemon C. Baird III. Technical Report WL-TR Wright Laboratory. Wright-Patterson Air Force Base, OH ADVANTAGE UPDATING Leemon C. Baird III Technical Report WL-TR-93-1146 Wright Laboratory Wright-Patterson Air Force Base, OH 45433-7301 Address: WL/AAAT, Bldg 635 2185 Avionics Circle Wright-Patterson Air

More information

Collaborative Filtering. Radek Pelánek

Collaborative Filtering. Radek Pelánek Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains

More information

Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

More information

HT2015: SC4 Statistical Data Mining and Machine Learning

HT2015: SC4 Statistical Data Mining and Machine Learning HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric

More information