COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16. Lecture 2: Linear Regression, Gradient Descent, Nonlinear basis functions


 Sherman Casey
1 COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16. Lecture 2: Linear Regression, Gradient Descent, Nonlinear basis functions
2 LINEAR REGRESSION MOTIVATION
3 Why Linear Regression? Regression = prediction of real-valued outputs. Linear regression is the simplest regression algorithm: easy, fast, and a common benchmark. Mathematical concepts introduced: data format and matrix notation; minimizing a cost function (gradient descent); nonlinear features and basis functions.
4 Examples: (linear) regression applications. Social science: relationships between data. Brain-computer interfaces. Neuroprosthetic control.
6 LINEAR REGRESSION WITH ONE INPUT
7 Linear regression with one input. [Diagram: a training set (body height vs. knee height) is passed to a learning algorithm, which produces a hypothesis h; a test input x is passed through h to give a prediction. Open questions: which learning algorithm, and which parameters?]
8 A regression problem. We want to learn to predict a person's height based on his/her knee height and/or arm span. This is useful for patients who are bed bound or in a wheelchair and cannot stand to take an accurate measurement of their height. [Table columns: Knee Height [cm], Arm span [cm], Height [cm]]
9 Example Data. m = 30 data points. [Table columns: Knee height [cm], Arm span [cm], Height [cm]] [Plots: body height vs. knee height; body height vs. arm span]
10 Example Data. [Table columns: Knee Height [cm], Arm span [cm], Height [cm]] [Plot: body height over arm span and knee height]
11 Linear regression with one input. Which hypothesis is better? In what sense is it better? The hypothesis is a straight line; which parameters should it have? [Table columns: Knee Height [cm], Height [cm]] [Plot: body height vs. knee height]
12 Formalization of problem. Given m training examples (here m = 30 data points, with knee height as input and body height as output). Goal: learn parameters such that the hypothesis output is close to the target for all training examples i. [Plot: body height vs. knee height]
13 Least Squares Objective. Minimize the error between prediction and target. [Plot: body height vs. knee height]
14 Least Squares Objective. Minimize the error. Cost function: mean squared error. [Plot: body height vs. knee height]
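As a rough illustration of the least squares objective, the sketch below computes a mean squared error cost for a straight-line hypothesis. The names `predict`, `mse_cost`, `theta0` and `theta1`, the conventional extra factor 1/2, and the four data points are assumptions made for this example; none of them are taken from the slides.

```python
import numpy as np

def predict(theta0, theta1, x):
    """Straight-line hypothesis: offset plus slope times input."""
    return theta0 + theta1 * x

def mse_cost(theta0, theta1, x, y):
    """Mean squared error with an extra factor 1/2 (an assumed convention)."""
    errors = predict(theta0, theta1, x) - y   # prediction minus target
    return np.mean(errors ** 2) / 2.0

# Made-up knee heights and body heights in cm (NOT the lecture's data).
knee = np.array([45.0, 50.0, 55.0, 60.0])
height = np.array([155.0, 165.0, 175.0, 185.0])

cost_good = mse_cost(65.0, 2.0, knee, height)    # this line passes through every point
cost_bad = mse_cost(100.0, 1.0, knee, height)    # a visibly worse line
```

A lower cost means a better hypothesis in the least squares sense, which is one way to answer "in what sense is it better?".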
16 Cost function illustrated. Properties of the cost function: quadratic; convex (bowl-shaped); unique local and global minimum (under regular conditions). [Plots: body height vs. knee height]
17 Minimizing the cost. Two ways to find the parameters that minimize the cost: (1) gradient descent; (2) direct analytical solution (setting the derivatives to 0).
18 EXCURSUS: GRADIENT DESCENT
19 Descending in the steepest direction: gradient descent on some arbitrary cost function.
20 Gradient descent algorithm. Repeat until convergence, updating all parameters simultaneously: each parameter is decreased by the learning rate η (eta) times the partial derivative of the cost with respect to that parameter. The negative gradient gives the descent direction.
21 Gradient is orthogonal to contour lines. A contour line is a line along which the cost is constant.
22 Potential issues with gradient descent. May get stuck in local minima. Learning rate too small: slow convergence. Learning rate too large: oscillations, divergence. [Plots: learning rate too small; learning rate too large]
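The update rule can be sketched for an arbitrary differentiable cost as follows. This is a minimal sketch, not the lecture's code: the function names, the fixed number of steps (used here in place of a convergence test) and the example cost J = (theta0 - 3)^2 + 2 (theta1 + 1)^2 are all assumptions chosen for illustration.

```python
import numpy as np

def gradient_descent(grad, theta_init, eta=0.1, n_steps=200):
    """Repeat theta <- theta - eta * grad(theta) for a fixed number of steps."""
    theta = np.asarray(theta_init, dtype=float)
    for _ in range(n_steps):
        # The whole gradient is evaluated at the old theta before any entry
        # changes, so all parameters are updated simultaneously.
        theta = theta - eta * grad(theta)
    return theta

def example_grad(theta):
    """Gradient of the made-up cost J = (theta0 - 3)^2 + 2 * (theta1 + 1)^2."""
    return np.array([2.0 * (theta[0] - 3.0), 4.0 * (theta[1] + 1.0)])

theta_min = gradient_descent(example_grad, [0.0, 0.0])
```

For this bowl-shaped example the iterates approach the minimum at (3, -1).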
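The effect of the learning rate can be seen on a one-dimensional toy cost. The sketch below assumes J(theta) = theta^2 (gradient 2 * theta) and three hand-picked learning rates; both the cost and the values are illustrative assumptions.

```python
def run(eta, n_steps=20, theta=1.0):
    """Gradient descent on J(theta) = theta**2, returning the visited points."""
    path = [theta]
    for _ in range(n_steps):
        theta = theta - eta * 2.0 * theta   # each step multiplies theta by (1 - 2 * eta)
        path.append(theta)
    return path

too_small = run(eta=0.01)    # factor 0.98 per step: converges, but slowly
reasonable = run(eta=0.3)    # factor 0.4 per step: converges quickly
too_large = run(eta=1.1)     # factor -1.2 per step: sign flips and the iterates grow
```

With the large learning rate the iterate jumps across the minimum on every step and moves further away each time, which is the oscillation and divergence named on the slide.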
23 LINEAR REGRESSION WITH GRADIENT DESCENT (ONE INPUT)
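24 Application of gradient descent to the linear regression cost. Gradient descent (simultaneous update): each parameter is changed by the learning rate times the product of error and input, accumulated over the training examples. For the offset parameter the input is 1.
25 Predicting height from knee height: optimal fit to training data. [Plot: body height vs. knee height]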
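A minimal sketch of gradient descent for linear regression with one input is given below. It assumes a mean squared error cost with a factor 1/2 (so the gradient is the mean of error times input), parameter names `theta0` and `theta1`, and small made-up data; raw centimetre values would typically need a much smaller learning rate or rescaled inputs.

```python
import numpy as np

def fit_line_gd(x, y, eta=0.1, n_steps=2000):
    """Fit y ~ theta0 + theta1 * x by gradient descent on the squared error."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(n_steps):
        error = theta0 + theta1 * x - y    # prediction minus target
        grad0 = np.mean(error)             # input for the offset is 1
        grad1 = np.mean(error * x)         # error times input
        # Simultaneous update: both gradients were computed before either change.
        theta0, theta1 = theta0 - eta * grad0, theta1 - eta * grad1
    return theta0, theta1

# Made-up data lying exactly on the line y = 1 + 2 * x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta0_hat, theta1_hat = fit_line_gd(x, y)
```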
26 LINEAR REGRESSION MORE GENERAL FORMULATION: MULTIPLE FEATURES
27 Multiple inputs (features). [Table columns: Knee Height x1, Arm span x2, Age x3, Height y] Notation: $m$ = number of training examples; $n$ = number of features (here $n = 3$); $x^{(i)}$ = input features of the i-th training example (vector-valued); $x_j^{(i)}$ = value of feature j in the i-th training example.
28 Linear hypothesis. Hypothesis (one input): an offset plus a slope times the input. Hypothesis (multiple input features): an offset plus a weighted sum of the features. Example: h(x) = *kneeheight + 0.3*armspan + 0.1*age. More compact notation: introduce $x_0 = 1$. Why? Notational convenience!
29 Multiple inputs (features) revisited. [Table columns: x0, Knee Height x1, Arm span x2, Age x3, Height y] Notation: $m$ = number of training examples; $n$ = number of features (here $n = 3$); $x_0 = 1$ for every training example; $x^{(i)}$ = input features of the i-th training example (vector-valued); $x_j^{(i)}$ = value of feature j in the i-th training example.
30 Matrix and vector notation. [Table columns: x0, Knee Height x1, Arm span x2, Age x3, Height y] Features of the i-th training example: vector of size (n+1) × 1. Design matrix: size m × (n+1). Output/target vector: size m × 1.
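The matrix notation can be made concrete with a small sketch. The variable names `X` and `y` and all numbers below are assumptions for illustration (the values are made up, not the lecture's data); each row of `X` holds one training example, with a leading 1 as the constant feature x0.

```python
import numpy as np

# Made-up measurements for m = 5 people (NOT the lecture's data).
knee = np.array([48.0, 52.0, 55.0, 50.0, 46.0])        # x1, in cm
arm = np.array([160.0, 172.0, 181.0, 168.0, 155.0])    # x2, in cm
age = np.array([34.0, 51.0, 27.0, 45.0, 62.0])         # x3, in years
height = np.array([162.0, 174.0, 182.0, 169.0, 157.0]) # y, in cm

# Design matrix: one row per training example, columns x0 = 1, x1, x2, x3.
X = np.column_stack([np.ones_like(knee), knee, arm, age])
y = height

m, n_plus_1 = X.shape   # m examples, n + 1 columns (n = 3 features plus x0)
```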
31 LINEAR REGRESSION WITH GRADIENT DESCENT (GENERAL FORMULATION)
32 Linear regression problem statement. Hypothesis: linear in the input features. Cost function: a high-dimensional quadratic (bowl-shaped) function of the parameters. Goal: find the parameters that minimize the cost.
33 Gradient descent (multiple features). With one input feature: two simultaneous updates, each combining learning rate, error and input. With n input features: the same update (learning rate, error, input) for every parameter, applied simultaneously for j = 0, ..., n. For j = 0: define $x_0 = 1$ for convenience.
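With the design matrix in hand, the update for all parameters j = 0, ..., n can be written as one vector operation. The sketch below assumes a squared error cost scaled by 1/(2m), so that the gradient is X^T (X theta - y) / m; the symbol names, the learning rate and the synthetic, noise-free data are assumptions for illustration.

```python
import numpy as np

def gradient_descent_linreg(X, y, eta=0.1, n_steps=2000):
    """Vectorized gradient descent for linear regression with n features."""
    m, n_params = X.shape
    theta = np.zeros(n_params)
    for _ in range(n_steps):
        error = X @ theta - y            # one error per training example
        gradient = X.T @ error / m       # entry j: mean of error times feature j
        theta = theta - eta * gradient   # simultaneous update of all theta_j
    return theta

# Synthetic data: two standardized features, targets generated without noise.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 2))
X = np.column_stack([np.ones(50), features])   # first column is x0 = 1
true_theta = np.array([1.0, 2.0, -3.0])
y = X @ true_theta

theta_hat = gradient_descent_linreg(X, y)
```

Because the column x0 = 1 is part of `X`, the offset needs no special case in the update.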
34 LINEAR REGRESSION ANALYTICAL SOLUTION
35 Analytical solution. Set all partial derivatives of the cost function to 0. Solving the resulting system of linear equations yields the parameters as the Moore-Penrose pseudoinverse of the design matrix, applied to the output/target vector. Note: this analytical solution requires that the columns of the design matrix are linearly independent (regular conditions).
36 Example: analytical solution applied to the problem with one input. [Table columns: Knee Height [cm], Height [cm]] [Plot: body height vs. knee height]
37 Example: analytical solution applied to the problem with one input. [Table columns: Knee Height [cm], Height [cm]]
38 Predicting height from knee height. [Plot: body height vs. knee height]
39 Gradient descent vs. analytical solution. Gradient descent: need to choose the learning rate; iterative algorithm (needs many iterations to converge); works well even when the number of input features is large. Analytical solution: no need to choose a learning rate; direct solution (no iteration); slow if the number of features n is too large (inverting an n×n matrix).
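A minimal sketch of the analytical solution, assuming the design matrix is called `X` and the target vector `y` (names and data are made up for this example). When the columns of `X` are linearly independent, the pseudoinverse equals (X^T X)^{-1} X^T, so both routes below should give the same parameters.

```python
import numpy as np

# Made-up one-input data set (NOT the lecture's data).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
X = np.column_stack([np.ones_like(x), x])   # columns: x0 = 1, x1 = x

# Route 1: Moore-Penrose pseudoinverse of the design matrix.
theta_pinv = np.linalg.pinv(X) @ y

# Route 2: solve the normal equations (X^T X) theta = X^T y directly.
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# At the solution all partial derivatives of the squared error vanish.
gradient_at_solution = X.T @ (X @ theta_pinv - y)
```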
40 NONLINEAR FEATURES (NONLINEAR BASIS FUNCTIONS)
41 Nonlinear trends in data. How can we learn nonlinear hypotheses? [Table columns: x, y]
42 Linear fit to this nonlinear data. [Slide shows: table of x and y, the standard design matrix, the hypothesis, and the optimal parameters]
43 Linear fit to this nonlinear data
44 Nonlinear (quadratic) fit. [Slide shows: table of x and y, the design matrix with nonlinear features, the hypothesis, and the optimal parameters]
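The difference between the standard design matrix and one with a nonlinear feature can be sketched as follows. The data below are made up (y = x^2 + 1 exactly) and the names `X_linear` and `X_quadratic` are assumptions; the slide's own table is not reproduced here.

```python
import numpy as np

# Made-up nonlinear data: y = x^2 + 1 (NOT the lecture's table).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2 + 1.0

# Standard design matrix: columns 1 and x (straight-line hypothesis).
X_linear = np.column_stack([np.ones_like(x), x])
theta_linear = np.linalg.pinv(X_linear) @ y

# Design matrix with a nonlinear feature: columns 1, x and x^2.
X_quadratic = np.column_stack([np.ones_like(x), x, x ** 2])
theta_quadratic = np.linalg.pinv(X_quadratic) @ y

mse_linear = np.mean((X_linear @ theta_linear - y) ** 2)
mse_quadratic = np.mean((X_quadratic @ theta_quadratic - y) ** 2)
```

The model stays linear in the parameters, so the same pseudoinverse solution applies; only the columns of the design matrix change.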
45 Nonlinear (quadratic) fit
46 Nonlinear (sinusoid) fit. [Slide shows: table of x and y, the design matrix with nonlinear features, the hypothesis, and the optimal parameters]
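A sinusoidal feature works the same way. The sketch below assumes a design matrix with columns 1 and sin(x) and made-up data generated as y = 0.5 + 2 sin(x); which sinusoidal features the slide actually uses is not recoverable from the transcript.

```python
import numpy as np

# Made-up data following y = 0.5 + 2 * sin(x) (NOT the lecture's table).
x = np.linspace(0.0, 2.0 * np.pi, 9)
y = 0.5 + 2.0 * np.sin(x)

# Design matrix with a nonlinear feature: columns 1 and sin(x).
X_sin = np.column_stack([np.ones_like(x), np.sin(x)])
theta_sin = np.linalg.pinv(X_sin) @ y
```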
47 Nonlinear (sinusoid) fit
48 Image: JPEG = cosine basis. Each block of 8x8 pixels is represented in a Fourier basis of cosine filters. Better representation of edges and corners. Allows for compression.
49 Audio: cosine or wavelet basis. Good signal representations make a compromise between time and frequency.
50 Nonlinear input features (in general). In the design matrix, a column holds one feature (e.g. feature 2) of all training examples, and a row holds all features of one training example (e.g. the 1st). Feature 2 for each training example i is computed by applying a nonlinear basis function to the input. This allows learning a variety of nonlinear functions with the same technique(s): gradient descent or the analytical solution.
51 Polynomial regression. Features are powers of x; n = degree of the polynomial to be learned. [Plots: fits for n = 0, n = 1, n = 3, n = 9] What happened here? Next lecture.
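The idea of representing an 8x8 block in a cosine basis can be sketched with an orthonormal DCT-II matrix built by hand. This is only an illustration of the basis change: the test block is made up, and a real JPEG codec adds further steps (such as quantization and entropy coding) that are not shown.

```python
import numpy as np

N = 8
k = np.arange(N).reshape(-1, 1)   # frequency index, one per row
i = np.arange(N).reshape(1, -1)   # pixel index, one per column

# Orthonormal DCT-II matrix: each row is one cosine basis vector.
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * i + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)        # the constant (zero-frequency) basis vector

# Made-up 8x8 block: brightness increases smoothly from top to bottom.
block = np.outer(np.arange(N), np.ones(N)) * 10.0

coeffs = C @ block @ C.T          # cosine coefficients of the block
restored = C.T @ coeffs @ C       # inverse transform recovers the block

# Crude compression: keep only the 4x4 lowest-frequency coefficients.
kept = np.zeros_like(coeffs)
kept[:4, :4] = coeffs[:4, :4]
approx = C.T @ kept @ C
relative_error = np.linalg.norm(approx - block) / np.linalg.norm(block)
```

For a smooth block like this one, a quarter of the coefficients already reproduces the pixels closely, which is what makes the representation useful for compression.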
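Polynomial regression can be sketched by using powers of x as the columns of the design matrix. The data below (ten noisy samples of a sine curve), the helper name `fit_polynomial` and the use of `np.linalg.lstsq` are assumptions for illustration; the slide's own data are not available in the transcript.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least squares fit with features 1, x, x^2, ..., x^degree."""
    X = np.vander(x, degree + 1, increasing=True)   # column p holds x**p
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    training_mse = np.mean((X @ theta - y) ** 2)
    return theta, training_mse

# Made-up data: ten noisy samples of a sine curve (NOT the lecture's data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.normal(size=x.size)

training_errors = {n: fit_polynomial(x, y, n)[1] for n in (0, 1, 3, 9)}
```

With ten points, the degree-9 polynomial has as many parameters as there are data points and passes through all of them, so its training error is essentially zero. Whether that is desirable is a separate question.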
52 Radial basis functions. Gaussian-shaped RBFs: each basis function j has a center in the input space; the width of the basis functions is set by a separate width parameter. [Plot: basis functions over x]
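A Gaussian-shaped radial basis function can be sketched as below. The exact formula on the slide is not preserved, so the common form exp(-(x - center)^2 / (2 * sigma^2)) is assumed here, along with the names `gaussian_rbf`, `centers` and `sigma`.

```python
import numpy as np

def gaussian_rbf(x, centers, sigma):
    """Evaluate one Gaussian bump per center; returns shape (len(x), len(centers))."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)               # inputs as a column
    centers = np.asarray(centers, dtype=float).reshape(1, -1)   # centers as a row
    return np.exp(-((x - centers) ** 2) / (2.0 * sigma ** 2))

# Three inputs, two basis functions centered at 0 and 2, assumed width sigma = 1.
Phi = gaussian_rbf([0.0, 1.0, 2.0], centers=[0.0, 2.0], sigma=1.0)
```

Each basis function equals 1 at its own center and decays with distance; a larger `sigma` gives a wider bump.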
55 Fitting a single RBF to data RBF with
56 Fitting RBFs to data RBFs with
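Fitting several RBFs to data is again a linear least squares problem once the basis functions are used as columns of the design matrix. The sketch below assumes Gaussian RBFs of the form exp(-(x - center)^2 / (2 * sigma^2)), five evenly spaced centers, a width of 0.2 and made-up sine data; none of these values come from the slides.

```python
import numpy as np

def rbf_design_matrix(x, centers, sigma):
    """Design matrix with a constant column followed by one Gaussian RBF per center."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    centers = np.asarray(centers, dtype=float).reshape(1, -1)
    rbf_columns = np.exp(-((x - centers) ** 2) / (2.0 * sigma ** 2))
    return np.column_stack([np.ones(x.shape[0]), rbf_columns])

# Made-up data: one period of a sine curve (NOT the lecture's data).
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x)

# RBF model: assumed centers and width.
X_rbf = rbf_design_matrix(x, centers=np.linspace(0.0, 1.0, 5), sigma=0.2)
theta_rbf, *_ = np.linalg.lstsq(X_rbf, y, rcond=None)
mse_rbf = np.mean((X_rbf @ theta_rbf - y) ** 2)

# Straight-line model on the same data, for comparison.
X_line = np.column_stack([np.ones_like(x), x])
theta_line, *_ = np.linalg.lstsq(X_line, y, rcond=None)
mse_line = np.mean((X_line @ theta_line - y) ** 2)
```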
57 SUMMARY (QUESTIONS)
58 Some questions. Hypothesis for linear regression = ? Cost function for linear regression = ? How many local minima may the cost function for linear regression have (under regular conditions)? Name two ways to minimize the cost function. General gradient descent formula? Linear regression with gradient descent formula? What issues can arise during gradient descent? What is the design matrix? What are its dimensions? Analytical solution for linear regression = ? What are the components of the solution? Pros and cons of gradient descent vs. analytical solution? How can one learn nonlinear hypotheses with linear regression? What is polynomial regression? What are radial basis functions?
59 What is next? Classification with logistic regression. Gradient descent tricks and more advanced optimization techniques. Underfitting and overfitting. Model selection (training, validation and test sets).
More informationImproving Generalization
Improving Generalization Introduction to Neural Networks : Lecture 10 John A. Bullinaria, 2004 1. Improving Generalization 2. Training, Validation and Testing Data Sets 3. CrossValidation 4. Weight Restriction
More informationFeedForward mapping networks KAIST 바이오및뇌공학과 정재승
FeedForward mapping networks KAIST 바이오및뇌공학과 정재승 How much energy do we need for brain functions? Information processing: Tradeoff between energy consumption and wiring cost Tradeoff between energy consumption
More informationEarly defect identification of semiconductor processes using machine learning
STANFORD UNIVERISTY MACHINE LEARNING CS229 Early defect identification of semiconductor processes using machine learning Friday, December 16, 2011 Authors: Saul ROSA Anton VLADIMIROV Professor: Dr. Andrew
More informationEmpirical ModelBuilding and Response Surfaces
Empirical ModelBuilding and Response Surfaces GEORGE E. P. BOX NORMAN R. DRAPER Technische Universitat Darmstadt FACHBEREICH INFORMATIK BIBLIOTHEK InvortarNf.. Sachgsbiete: Standort: New York John Wiley
More informationNonnegative Matrix Factorization (NMF) in Semisupervised Learning Reducing Dimension and Maintaining Meaning
Nonnegative Matrix Factorization (NMF) in Semisupervised Learning Reducing Dimension and Maintaining Meaning SAMSI 10 May 2013 Outline Introduction to NMF Applications Motivations NMF as a middle step
More information3. Interpolation. Closing the Gaps of Discretization... Beyond Polynomials
3. Interpolation Closing the Gaps of Discretization... Beyond Polynomials Closing the Gaps of Discretization... Beyond Polynomials, December 19, 2012 1 3.3. Polynomial Splines Idea of Polynomial Splines
More informationNEURAL NETWORKS A Comprehensive Foundation
NEURAL NETWORKS A Comprehensive Foundation Second Edition Simon Haykin McMaster University Hamilton, Ontario, Canada Prentice Hall Prentice Hall Upper Saddle River; New Jersey 07458 Preface xii Acknowledgments
More informationPartial Least Squares (PLS) Regression.
Partial Least Squares (PLS) Regression. Hervé Abdi 1 The University of Texas at Dallas Introduction Pls regression is a recent technique that generalizes and combines features from principal component
More information