November 28 th, Carlos Guestrin. Lower dimensional projections


 Nora Russell
 2 years ago
 Views:
Transcription
1 PCA Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 28 th, 2007 Lower dimensional projections Rather than picking a subset of the features, we can new features that are combinations of existing features Let s see this in the unsupervised setting just X, but no Y 2
2 Linear projection and reconstruction x 2 project into dimension z x reconstruction: only know z, what was (x,x 2 ) 3 Linear projections, a review Project a point into a (lower dimensional) space: point: x = (x,,x n ) select a basis set of basis vectors (u,,u k ) we consider orthonormal basis: u i u i =, and u i u j =0 for i j select a center defines offset of space best coordinates in lower dimensional space defined by dotproducts: (z,,z k ), z i = (xx) u i minimum squared error 4
3 PCA finds projection that minimizes reconstruction error Given m data points: x i = (x i,,x ni ), i= m Will represent each point as a projection: where: and PCA: Given k n, find (u,,u k ) minimizing reconstruction error: x 2 x 5 Understanding the reconstruction error Note that x i can be represented exactly by ndimensional projection: Given k n, find (u,,u k ) minimizing reconstruction error: Rewriting error: 6
4 Reconstruction error and covariance matrix 7 Minimizing reconstruction error and eigen vectors Minimizing reconstruction error equivalent to picking orthonormal basis (u,,u n ) minimizing: Eigen vector: Minimizing reconstruction error equivalent to picking (u k+,,u n ) to be eigen vectors with smallest eigen values 8
5 Basic PCA algoritm Start from m by n data matrix X Recenter: subtract mean from each row of X X c X X Compute covariance matrix: Σ /m X c T X c Find eigen vectors and values of Σ Principal components: k eigen vectors with highest eigen values 9 PCA example 0
6 PCA example reconstruction only used first principal component Eigenfaces [Turk, Pentland 9] Input images: Principal components: 2
7 Eigenfaces reconstruction Each image corresponds to adding 8 principal components: 3 Scaling up Covariance matrix can be really big! Σ is n by n 0000 features! Σ finding eigenvectors is very slow Use singular value decomposition (SVD) finds to k eigenvectors great implementations available, e.g., Matlab svd 4
8 SVD Write X = W S V T X data matri one row per datapoint W weight matri one row per datapoint coordinate of x i in eigenspace S singular value matri diagonal matrix in our setting each entry is eigenvalue λ j V T singular vector matrix in our setting each row is eigenvector v j 5 PCA using SVD algoritm Start from m by n data matrix X Recenter: subtract mean from each row of X X c X X Call SVD algorithm on X c ask for k singular vectors Principal components: k singular vectors with highest singular values (rows of V T ) Coefficients become: 6
9 What you need to know Dimensionality reduction why and when it s important Simple feature selection Principal component analysis minimizing reconstruction error relationship to covariance matrix and eigenvectors using SVD 7 Announcements University Course Assessments Please, please, please, please, please, please, please, please, please, please, please, please, please, please, please, please Last lecture: Thursday, 9, 4:406:30pm, Wean
10 Markov Decision Processes (MDPs) Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 28 th, Thus far this semester Regression: Classification: Density estimation: 20
11 Learning to act [Ng et al. 05] Reinforcement learning An agent Makes sensor observations Must select action Receives rewards positive for good states negative for bad states 2 Learning to play backgammon [Tesauro 95] Combines reinforcement learning with neural networks Played 300,000 games against itself Achieved grandmaster level! 22
12 Roadmap to learning about reinforcement learning When we learned about Bayes nets: First talked about formal framework: representation inference Then learning for BNs For reinforcement learning: Formal framework Markov decision processes Then learning 23 peasant footman building Realtime Strategy Game Peasants collect resources and build Footmen attack enemies Buildings train peasants and footmen 24
13 States and actions State space: Joint state x of entire system Action space: Joint action a= {a,, a n } for all agents 25 States change over time Like an HMM, state changes over time Next state depends on current state and action selected e.g., action= build castle likely to lead to a state where you have a castle Transition model: Dynamics of the entire system P(x 26
14 Some states and actions are better than others Each state x is associated with a reward positive reward for successful attack negative for loss Reward function: Total reward R(x) 27 Markov Decision Process (MDP) Representation State space: Joint state x of entire system Action space: Joint action a= {a,, a n } for all agents Reward function: Total reward R( sometimes reward can depend on action Transition model: Dynamics of the entire system P(x 28
15 Discounted Rewards An assistant professor gets paid, say, 20K per year. How much, in total, will the A.P. earn in their life? = Infinity $ $ What s wrong with this argument? 29 Discounted Rewards A reward (payment) in the future is not worth quite as much as a reward now. Because of chance of obliteration Because of inflation Example: Being promised $0,000 next year is worth only 90% as much as receiving $0,000 right now. Assuming payment n years in future is worth only (0.9) n of payment now, what is the AP s Future Discounted Sum of Rewards? 30
16 Discount Factors People in economics and probabilistic decisionmaking do this all the time. The Discounted sum of future rewards using discount factor γ is (reward now) + γ (reward in time step) + γ 2 (reward in 2 time steps) + γ 3 (reward in 3 time steps) + : : (infinite sum) 3 The Academic Life Assume Discount Factor γ = A. Assistant Prof B. Assoc. Prof T. Tenured Prof 400 Define: 0.2 S. 0.2 On the Street D. Dead 0 V A = Expected discounted future rewards starting in state A V B = Expected discounted future rewards starting in state B V T = T V S = S V D = D How do we compute V A, V B, V T, V S, V D?
17 Computing the Future Rewards of an Academic 0.6 A. Assistant Prof Assume Discount Factor γ = 0.9 B. Assoc. Prof S. On the Street D. Dead T. Tenured Prof Policy Policy: π(x) = a At state action a for all agents x 0 π(x 0 ) = both peasants get wood x π(x ) = one peasant builds barrack, other gets gold x 2 π(x 2 ) = peasants get gold, footmen attack 34
18 Value of Policy Value: V π (x) Expected longterm reward starting from x Start from x 0 x 0 R(x 0 ) π(x 0 ) x R(x ) V π (x 0 ) = E π [R(x 0 ) + γ R(x ) + γ 2 R(x 2 ) + γ 3 R(x 3 ) + γ 4 R(x 4 ) + L] π(x ) x 2 π(x 2 ) Future rewards discounted by γ 2 [0,) x R(x ) π(x ) x R(x ) π(x ) R(x 2 ) x 3 R(x 3 ) π(x 3 ) x 4 R(x 4 ) 35 Computing the value of a policy Discounted value of a state: V π (x 0 ) = E π [R(x 0 ) + γ R(x ) + γ 2 R(x 2 ) + γ 3 R(x 3 ) + γ 4 R(x 4 ) + L] value of starting from x 0 and continuing with policy π from then on A recursion! 36
19 Simple approach for computing the value of a policy: Iteratively Can solve using a simple convergent iterative approach: (a.k.a. dynamic programming) Start with some guess V 0 Iteratively say: V t+ = R + γ P π V t Stop when V t+ V t ε means that V π V t+ ε/(γ) 37 But we want to learn a Policy So far, told you how good a policy is But how can we choose the best policy??? x 0 Policy: π(x) = a At state action a for all agents π(x 0 ) = both peasants get wood Suppose there was only one time step: world is about to end!!! select action that maximizes reward! x x 2 π(x ) = one peasant builds barrack, other gets gold π(x 2 ) = peasants get gold, footmen attack 38
20 Unrolling the recursion Choose actions that lead to best value in the long run Optimal value policy achieves optimal value V * 39 Bellman equation Evaluating policy π: Computing the optimal value V *  Bellman equation " V ( x) = max R( + # a! x' P( x' V " ( x' ) 40
21 Optimal Longterm Plan Optimal value function V * (x) Optimal Policy: π * (x) " # (x) = argmax a Optimal policy: R( + $ % x' P(x' V # (x') 4 Interesting fact Unique value " V ( x) = max R( + # a! Slightly surprising fact: There is only one V * that solves Bellman equation! there may be many optimal policies that achieve V * Surprising fact: optimal policies are good everywhere!!! x' P( x' V " ( x' ) 42
22 Solve Bellman equation Solving an MDP Optimal value V * (x) " V ( x) = max R( + # a! Many algorithms solve the Bellman equations: x' P( x' V Bellman equation is nonlinear!!! Optimal policy π*(x) " ( x' ) Policy iteration [Howard 60, Bellman 57] Value iteration [Bellman 57] Linear programming [Manne 60] 43 Value iteration (a.k.a. dynamic programming) the simplest of all " V ( x) = max R( + # a! x' P( x' V " ( x' ) Start with some guess V 0 Iteratively say: Stop when V t+ V t ε means that V V t+ ε/(γ)! V t+ ( x) = max R( + " P( x' Vt ( x' ) a x' 44
23 A simple example You run a startup company. In every state you must choose between Saving money or Advertising. Poor & Unknown +0 S S A Rich & Unknown +0 A S Poor & Famous +0 S A Rich & Famous +0 γ = 0.9 A 45 Let s compute V t (x) for our example Poor & Unknown +0 S S A Rich & Unknown +0 A S Poor & Famous +0 S A Rich & Famous +0 γ = 0.9 A t V t (PU) V t (PF) V t (RU) V t (RF)! V t+ ( x) = max R( + " P( x' Vt ( x' ) a x' 46
24 Let s compute V t (x) for our example Poor & Unknown +0 S S A Rich & Unknown +0 A S Poor & Famous +0 S A Rich & Famous +0 γ = 0.9 A t V t (PU) V t (PF) V t (RU) V t (RF) ! V t+ ( x) = max R( + " P( x' Vt ( x' ) a x' 47 What you need to know What s a Markov decision process state, actions, transitions, rewards a policy value function for a policy computing V π Optimal value function and optimal policy Bellman equation Solving Bellman equation with value iteration, policy iteration and linear programming 48
25 Acknowledgment This lecture contains some material from Andrew Moore s excellent collection of ML tutorials: 49
Face Recognition using Principle Component Analysis
Face Recognition using Principle Component Analysis Kyungnam Kim Department of Computer Science University of Maryland, College Park MD 20742, USA Summary This is the summary of the basic idea about PCA
More informationLinear Algebra Review. Vectors
Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length
More informationManifold Learning Examples PCA, LLE and ISOMAP
Manifold Learning Examples PCA, LLE and ISOMAP Dan Ventura October 14, 28 Abstract We try to give a helpful concrete example that demonstrates how to use PCA, LLE and Isomap, attempts to provide some intuition
More informationOrthogonal Diagonalization of Symmetric Matrices
MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding
More informationObject Recognition and Template Matching
Object Recognition and Template Matching Template Matching A template is a small image (subimage) The goal is to find occurrences of this template in a larger image That is, you want to find matches of
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that
More informationPresentation 3: Eigenvalues and Eigenvectors of a Matrix
Colleen Kirksey, Beth Van Schoyck, Dennis Bowers MATH 280: Problem Solving November 18, 2011 Presentation 3: Eigenvalues and Eigenvectors of a Matrix Order of Presentation: 1. Definitions of Eigenvalues
More informationLecture 5: Singular Value Decomposition SVD (1)
EEM3L1: Numerical and Analytical Techniques Lecture 5: Singular Value Decomposition SVD (1) EE3L1, slide 1, Version 4: 25Sep02 Motivation for SVD (1) SVD = Singular Value Decomposition Consider the system
More informationReview Jeopardy. Blue vs. Orange. Review Jeopardy
Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 03 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationMachine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
More informationDimension Reduction. WeiTa Chu 2014/10/22. Multimedia Content Analysis, CSIE, CCU
1 Dimension Reduction WeiTa Chu 2014/10/22 2 1.1 Principal Component Analysis (PCA) Widely used in dimensionality reduction, lossy data compression, feature extraction, and data visualization Also known
More information1 Eigenvalues and Eigenvectors
Math 20 Chapter 5 Eigenvalues and Eigenvectors Eigenvalues and Eigenvectors. Definition: A scalar λ is called an eigenvalue of the n n matrix A is there is a nontrivial solution x of Ax = λx. Such an x
More informationby the matrix A results in a vector which is a reflection of the given
Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the yaxis We observe that
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationPCA to Eigenfaces. CS 510 Lecture #16 March 23 th A 9 dimensional PCA example
PCA to Eigenfaces CS 510 Lecture #16 March 23 th 2015 A 9 dimensional PCA example is dark around the edges and bright in the middle. is light with dark vertical bars. is light with dark horizontal bars.
More informationPrincipal components analysis
CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some kdimension subspace, where k
More information10601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601f10/index.html
10601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601f10/index.html Course data All uptodate info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601f10/index.html
More informationNonlinear Iterative Partial Least Squares Method
Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., RichardPlouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for
More informationLogistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.
Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features
More informationNotes on Jordan Canonical Form
Notes on Jordan Canonical Form Eric Klavins University of Washington 8 Jordan blocks and Jordan form A Jordan Block of size m and value λ is a matrix J m (λ) having the value λ repeated along the main
More informationAPPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
More informationAu = = = 3u. Aw = = = 2w. so the action of A on u and w is very easy to picture: it simply amounts to a stretching by 3 and 2, respectively.
Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry
More informationL1 vs. L2 Regularization and feature selection.
L1 vs. L2 Regularization and feature selection. Paper by Andrew Ng (2004) Presentation by Afshin Rostami Main Topics Covering Numbers Definition Convergence Bounds L1 regularized logistic regression L1
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationVector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a nonempty
More informationCS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 RealTime Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
More informationMachine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machinelearning Logistics Lectures M 9:3011:30 am Room 4419 Personnel
More informationThe basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23
(copyright by Scott M Lynch, February 2003) Brief Matrix Algebra Review (Soc 504) Matrix algebra is a form of mathematics that allows compact notation for, and mathematical manipulation of, highdimensional
More informationIntroduction to Matrix Algebra
Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra  1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary
More informationData Clustering. Dec 2nd, 2013 Kyrylo Bessonov
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms kmeans Hierarchical Main
More informationFactor analysis. Angela Montanari
Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number
More informationMehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics
INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree
More informationLecture 11: Graphical Models for Inference
Lecture 11: Graphical Models for Inference So far we have seen two graphical models that are used for inference  the Bayesian network and the Join tree. These two both represent the same joint probability
More informationBayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
More informationCS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major
More informationKnearestneighbor: an introduction to machine learning
Knearestneighbor: an introduction to machine learning Xiaojin Zhu jerryzhu@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison slide 1 Outline Types of learning Classification:
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.
Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationLinear Algebraic Equations, SVD, and the PseudoInverse
Linear Algebraic Equations, SVD, and the PseudoInverse Philip N. Sabes October, 21 1 A Little Background 1.1 Singular values and matrix inversion For nonsmmetric matrices, the eigenvalues and singular
More information1 Introduction to Matrices
1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns
More informationFinal Project Report
CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes
More informationNumerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and NonSquare Systems
Numerical Methods I Solving Linear Systems: Sparse Matrices, Iterative Methods and NonSquare Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420001,
More informationSelf Organizing Maps: Fundamentals
Self Organizing Maps: Fundamentals Introduction to Neural Networks : Lecture 16 John A. Bullinaria, 2004 1. What is a Self Organizing Map? 2. Topographic Maps 3. Setting up a Self Organizing Map 4. Kohonen
More informationChapter 6. Orthogonality
6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be
More informationCHAPTER 2. Eigenvalue Problems (EVP s) for ODE s
A SERIES OF CLASS NOTES FOR 005006 TO INTRODUCE LINEAR AND NONLINEAR PROBLEMS TO ENGINEERS, SCIENTISTS, AND APPLIED MATHEMATICIANS DE CLASS NOTES 4 A COLLECTION OF HANDOUTS ON PARTIAL DIFFERENTIAL EQUATIONS
More informationAdaptive Face Recognition System from Myanmar NRC Card
Adaptive Face Recognition System from Myanmar NRC Card Ei Phyo Wai University of Computer Studies, Yangon, Myanmar Myint Myint Sein University of Computer Studies, Yangon, Myanmar ABSTRACT Biometrics is
More informationSolutions to Linear Algebra Practice Problems
Solutions to Linear Algebra Practice Problems. Find all solutions to the following systems of linear equations. (a) x x + x 5 x x x + x + x 5 (b) x + x + x x + x + x x + x + 8x Answer: (a) We create the
More informationSection 2.1. Section 2.2. Exercise 6: We have to compute the product AB in two ways, where , B =. 2 1 3 5 A =
Section 2.1 Exercise 6: We have to compute the product AB in two ways, where 4 2 A = 3 0 1 3, B =. 2 1 3 5 Solution 1. Let b 1 = (1, 2) and b 2 = (3, 1) be the columns of B. Then Ab 1 = (0, 3, 13) and
More informationCS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on
CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #18: Dimensionality Reduc7on Dimensionality Reduc=on Assump=on: Data lies on or near a low d dimensional subspace Axes of this subspace
More informationMachine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 TuTh 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More information8/29/2015. Data Mining and Machine Learning. Erich Seamon University of Idaho
Data Mining and Machine Learning Erich Seamon University of Idaho www.webpages.uidaho.edu/erichs erichs@uidaho.edu 1 I am NOMAD 2 1 Data Mining and Machine Learning Outline Outlining the data mining and
More informationInverse problems. Nikolai Piskunov 2014. Regularization: Example Lecture 4
Inverse problems Nikolai Piskunov 2014 Regularization: Example Lecture 4 Now that we know the theory, let s try an application: Problem: Optimal filtering of 1D and 2D data Solution: formulate an inverse
More informationSYSTEMS OF EQUATIONS
SYSTEMS OF EQUATIONS 1. Examples of systems of equations Here are some examples of systems of equations. Each system has a number of equations and a number (not necessarily the same) of variables for which
More informationCHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.
CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In
More informationPartial Least Squares (PLS) Regression.
Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component
More informationIntroduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones
Introduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation
More informationMath 2020 Quizzes Winter 2009
Quiz : Basic Probability Ten Scrabble tiles are placed in a bag Four of the tiles have the letter printed on them, and there are two tiles each with the letters B, C and D on them (a) Suppose one tile
More informationLinear Programming Notes V Problem Transformations
Linear Programming Notes V Problem Transformations 1 Introduction Any linear programming problem can be rewritten in either of two standard forms. In the first form, the objective is to maximize, the material
More informationC19 Machine Learning
C9 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks
More informationLinear Programming. March 14, 2014
Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1
More informationSingular Value Decomposition Tutorial
Singular Value Decomposition Tutorial Kirk Baker March 9, 5 (Revised January 4, 3) Contents Acknowledgments Introduction 3 Points and Space 4 Vectors 3 5 Matrices 4 5. Matrix Notation..................................
More informationTheta Functions. Lukas Lewark. Seminar on Modular Forms, 31. Januar 2007
Theta Functions Lukas Lewark Seminar on Modular Forms, 31. Januar 007 Abstract Theta functions are introduced, associated to lattices or quadratic forms. Their transformation property is proven and the
More information1 Introduction. 2 Matrices: Definition. Matrix Algebra. Hervé Abdi Lynne J. Williams
In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 00 Matrix Algebra Hervé Abdi Lynne J. Williams Introduction Sylvester developed the modern concept of matrices in the 9th
More informationMatrix Norms. Tom Lyche. September 28, Centre of Mathematics for Applications, Department of Informatics, University of Oslo
Matrix Norms Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo September 28, 2009 Matrix Norms We consider matrix norms on (C m,n, C). All results holds for
More informationConcepts in Machine Learning, Unsupervised Learning & Astronomy Applications
Data Mining In Modern Astronomy Sky Surveys: Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications ChingWa Yip cwyip@pha.jhu.edu; Bloomberg 518 Human are Great Pattern Recognizers
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More information1 2 3 1 1 2 x = + x 2 + x 4 1 0 1
(d) If the vector b is the sum of the four columns of A, write down the complete solution to Ax = b. 1 2 3 1 1 2 x = + x 2 + x 4 1 0 0 1 0 1 2. (11 points) This problem finds the curve y = C + D 2 t which
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationTHREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC TERISTICS
THREE DIMENSIONAL REPRESENTATION OF AMINO ACID CHARAC TERISTICS O.U. Sezerman 1, R. Islamaj 2, E. Alpaydin 2 1 Laborotory of Computational Biology, Sabancı University, Istanbul, Turkey. 2 Computer Engineering
More information(January 14, 2009) End k (V ) End k (V/W )
(January 14, 29) [16.1] Let p be the smallest prime dividing the order of a finite group G. Show that a subgroup H of G of index p is necessarily normal. Let G act on cosets gh of H by left multiplication.
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationMATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.
MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar
More informationC 1 x(t) = e ta C = e C n. 2! A2 + t3
Matrix Exponential Fundamental Matrix Solution Objective: Solve dt A x with an n n constant coefficient matrix A x (t) Here the unknown is the vector function x(t) x n (t) General Solution Formula in Matrix
More informationMatrix Calculations: Applications of Eigenvalues and Eigenvectors; Inner Products
Matrix Calculations: Applications of Eigenvalues and Eigenvectors; Inner Products H. Geuvers Institute for Computing and Information Sciences Intelligent Systems Version: spring 2015 H. Geuvers Version:
More informationCS229 Lecture notes. Andrew Ng
CS229 Lecture notes Andrew Ng Part X Factor analysis Whenwehavedatax (i) R n thatcomesfromamixtureofseveral Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually
More informationSupervised Feature Selection & Unsupervised Dimensionality Reduction
Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or
More informationTetris: Experiments with the LP Approach to Approximate DP
Tetris: Experiments with the LP Approach to Approximate DP Vivek F. Farias Electrical Engineering Stanford University Stanford, CA 94403 vivekf@stanford.edu Benjamin Van Roy Management Science and Engineering
More informationJoint Probability Distributions and Random Samples (Devore Chapter Five)
Joint Probability Distributions and Random Samples (Devore Chapter Five) 101634501 Probability and Statistics for Engineers Winter 20102011 Contents 1 Joint Probability Distributions 1 1.1 Two Discrete
More informationL10: Probability, statistics, and estimation theory
L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian
More information1 Review of Newton Polynomials
cs: introduction to numerical analysis 0/0/0 Lecture 8: Polynomial Interpolation: Using Newton Polynomials and Error Analysis Instructor: Professor Amos Ron Scribes: Giordano Fusco, Mark Cowlishaw, Nathanael
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationLecture 20: Clustering
Lecture 20: Clustering Wrapup of neural nets (from last lecture Introduction to unsupervised learning Kmeans clustering COMP424, Lecture 20  April 3, 2013 1 Unsupervised learning In supervised learning,
More informationComponent Ordering in Independent Component Analysis Based on Data Power
Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +
More informationLecture Notes 2: Matrices as Systems of Linear Equations
2: Matrices as Systems of Linear Equations 33A Linear Algebra, Puck Rombach Last updated: April 13, 2016 Systems of Linear Equations Systems of linear equations can represent many things You have probably
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationFactor Analysis. Chapter 420. Introduction
Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.
More informationPrinciple Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate
More informationEigenvalues and Eigenvectors
Chapter 6 Eigenvalues and Eigenvectors 6. Introduction to Eigenvalues Linear equations Ax D b come from steady state problems. Eigenvalues have their greatest importance in dynamic problems. The solution
More informationRegression Using Support Vector Machines: Basic Foundations
Regression Using Support Vector Machines: Basic Foundations Technical Report December 2004 Aly Farag and Refaat M Mohamed Computer Vision and Image Processing Laboratory Electrical and Computer Engineering
More information1D 3D 1D 3D. is called eigenstate or state function. When an operator act on a state, it can be written as
Chapter 3 (Lecture 45) Postulates of Quantum Mechanics Now we turn to an application of the preceding material, and move into the foundations of quantum mechanics. Quantum mechanics is based on a series
More informationThe Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression
The Singular Value Decomposition in Symmetric (Löwdin) Orthogonalization and Data Compression The SVD is the most generally applicable of the orthogonaldiagonalorthogonal type matrix decompositions Every
More informationTemporal Difference Learning in the Tetris Game
Temporal Difference Learning in the Tetris Game Hans Pirnay, Slava Arabagi February 6, 2009 1 Introduction Learning to play the game Tetris has been a common challenge on a few past machine learning competitions.
More informationLinear Dependence Tests
Linear Dependence Tests The book omits a few key tests for checking the linear dependence of vectors. These short notes discuss these tests, as well as the reasoning behind them. Our first test checks
More information1 Review of Least Squares Solutions to Overdetermined Systems
cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares
More informationADVANTAGE UPDATING. Leemon C. Baird III. Technical Report WLTR Wright Laboratory. WrightPatterson Air Force Base, OH
ADVANTAGE UPDATING Leemon C. Baird III Technical Report WLTR931146 Wright Laboratory WrightPatterson Air Force Base, OH 454337301 Address: WL/AAAT, Bldg 635 2185 Avionics Circle WrightPatterson Air
More informationCollaborative Filtering. Radek Pelánek
Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains
More informationDimensionality Reduction: Principal Components Analysis
Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely
More informationHT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
More information