CSI:FLORIDA. Section 4.4: Logistic Regression


1 CSI:FLORIDA Section 4.4: Logistic Regression

2 Revisit Masked Class Problem: We can generalize this problem to a two-class problem as well!

3 Revisit Masked Class Problem: What is the actual problem here?

4 Revisit Masked Class Problem: What is the actual problem here? - No single line can separate the blue class from the other data points! Where has this problem been seen before?

5 Revisit Masked Class Problem: What is the actual problem here? - No single line can separate the blue class from the other data points! Where has this problem been seen before? - The single-layer perceptron problem! The XOR problem.

6 Linear Regression in Feature Space: Can classify the green class with no problem!

7 Linear Regression in Feature Space: Can classify the black class with no problem!

8 Linear Regression in Feature Space: Problems when we try to classify the blue class!

9 Revisit Masked Class Problem: Are linear methods completely useless on this data?
- No, we can perform a non-linear transformation on the data via fixed basis functions!
- Many times when we perform this transformation, features that were not linearly separable in the original feature space become linearly separable in the transformed feature space.

10 Basis Functions Overview
- Basic linear regression models are linear combinations of the input variables: $y(x, w) = w_0 + w_1 x_1 + \dots + w_D x_D$, where $w_0$ is the bias parameter.
- Models can be extended by using fixed basis functions, which allows for linear combinations of nonlinear functions of the input variables: $y(x, w) = \sum_{j=0}^{M} w_j \phi_j(x) = w^T \phi(x)$
- Gaussian or RBF basis function: $\phi_j(x) = \exp\left(-\frac{(x - \mu_j)^2}{2 s^2}\right)$
- Basis vector: $\phi = (\phi_0, \dots, \phi_M)^T$
- Dummy basis function used for the bias parameter: $\phi_0(x) = 1$
- The basis function center $\mu_j$ governs location in input space.
- The scale parameter $s$ determines spatial scale.
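
The effect of a Gaussian basis transformation can be sketched on the XOR problem mentioned earlier. This is only an illustration under assumptions that are not in the slides: the four XOR points, the two centers, the scale s = 1, and the hand-picked weights are all hypothetical choices.

```python
import math

def gaussian_basis(x, mu, s):
    """Gaussian (RBF) basis function: exp(-||x - mu||^2 / (2 s^2))."""
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-sq_dist / (2.0 * s ** 2))

# Assumed toy data: the four XOR points and their class labels.
points = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [1, 1, 0, 0]

# Assumed centers and scale, chosen by hand for illustration.
centers = [(0.0, 0.0), (1.0, 1.0)]
scale = 1.0

def basis_vector(x):
    """Basis vector (phi_0, phi_1, phi_2); phi_0 = 1 is the dummy bias basis."""
    return [1.0] + [gaussian_basis(x, mu, scale) for mu in centers]

# Hand-picked linear rule in the transformed space: phi_1 + phi_2 > 1.29.
weights = [-1.29, 1.0, 1.0]

def predict(x):
    """Classify by the sign of the linear function w^T phi(x)."""
    phi = basis_vector(x)
    return 1 if sum(w * p for w, p in zip(weights, phi)) > 0 else 0

predictions = [predict(x) for x in points]
for x, phi_x, pred in zip(points, map(basis_vector, points), predictions):
    print(x, [round(v, 4) for v in phi_x], pred)
```

No single line separates the two classes in the original coordinates, but a single linear rule does so in the (phi_1, phi_2) coordinates.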

11 Linear Regression in Transformed Feature Space: Again, can classify the green class with no problem!

12 Linear Regression in Transformed Feature Space: Again, can classify the black class with no problem!

13 Linear Regression in Transformed Feature Space: Now we can classify the blue class with no problem!

14 Features in Transformed Space are Linearly Separable [Figure: data plotted in the transformed feature space; both axes labeled theta]

15 More on basis functions and kernel space in later sections of the book. Now that we have introduced basis functions and the basis vector, we can discuss logistic regression in these terms!

16 Logistic Regression Motivations
- Desire for a linear model to estimate the posterior probabilities of K classes; to be a probability the model must ensure that:
  - the posterior probabilities sum to one
  - the posterior probabilities lie in [0,1]
- Build a model with properties desired for a classification task versus regression:
  - no extreme numbers; constrain the model outputs to lie within the [0,1] interval
  - create a model that is robust to outliers
- Desire a model with fewer parameters.

17 Logistic Regression Model Formulation (The Elements of Statistical Learning)
The model is formulated as K-1 log-odds or logit transformations:
$$\ln \frac{p(C_1 \mid \phi)}{p(C_K \mid \phi)} = w_{1,0}\phi_0 + w_{1,1}\phi_1 + \dots + w_{1,M}\phi_M = w_1^T \phi$$
$$\ln \frac{p(C_2 \mid \phi)}{p(C_K \mid \phi)} = w_{2,0}\phi_0 + w_{2,1}\phi_1 + \dots + w_{2,M}\phi_M = w_2^T \phi$$
$$\vdots$$
$$\ln \frac{p(C_{K-1} \mid \phi)}{p(C_K \mid \phi)} = w_{K-1,0}\phi_0 + w_{K-1,1}\phi_1 + \dots + w_{K-1,M}\phi_M = w_{K-1}^T \phi$$
A logit function or log-odds is the log ratio of the probabilities for two classes; in our model we arbitrarily choose the Kth class for our ratio denominator.
*NOTE: The logits are constructed with linear form but do not require the Gaussian assumptions; we will estimate the weights via IRLS.
**As previously shown: this linear model can be derived from LDA under the assumption of Gaussian distributed classes with a shared covariance matrix.

18 Logistic Regression Model Formulation (The Elements of Statistical Learning)
The class posterior estimations are:
$$p(C_k \mid \phi) = \frac{e^{w_k^T \phi}}{1 + \sum_{j=1}^{K-1} e^{w_j^T \phi}}, \quad k = 1, \dots, K-1$$
$$p(C_K \mid \phi) = \frac{1}{1 + \sum_{j=1}^{K-1} e^{w_j^T \phi}}$$
The class distributions will sum to 1 and produce an output within [0,1]; the two-class variant is an even simpler model with only a single linear function. Simple enough, but why do they call it LOGISTIC regression?
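
The posterior formulas above can be checked numerically with a small sketch. The function name `class_posteriors` and the example logit values are hypothetical; the logits stand in for the K-1 linear functions $w_k^T \phi$.

```python
import math

def class_posteriors(logits):
    """Map K-1 logits (w_k^T phi, k = 1..K-1) to K class posteriors.

    The last entry of the result is the reference class K, whose
    numerator is 1 because it is the denominator of every logit.
    """
    exps = [math.exp(a) for a in logits]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps] + [1.0 / denom]

# Assumed example: K = 3 classes, so two logits.
example_logits = [0.5, -1.2]
posteriors = class_posteriors(example_logits)

print(posteriors)
print("sum:", sum(posteriors))
# Recover the first logit as the log-odds against class K.
print("log-odds of class 1 vs class K:", math.log(posteriors[0] / posteriors[-1]))
```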

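19 Logistic Regression Model Formulation (Pattern Recognition and Machine Learning, the Bishop book)
Instead of starting with the multi-class version, let's start with the two-class case:
$$p(C_1 \mid x) = \frac{p(x \mid C_1)\,p(C_1)}{p(x \mid C_1)\,p(C_1) + p(x \mid C_2)\,p(C_2)} = \frac{1}{1 + e^{-a}} = \sigma(a)$$
where we have defined
$$a = \ln \frac{p(x \mid C_1)\,p(C_1)}{p(x \mid C_2)\,p(C_2)}$$
and $\sigma(a)$ is the logistic sigmoid defined as
$$\sigma(a) = \frac{1}{1 + e^{-a}}$$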

20 Logistic Regression Model Formulation (Pattern Recognition and Machine Learning, the Bishop book)
[Figure: plot of the logistic sigmoid; vertical axis "sigmoid output", horizontal axis "values of a"]
The term sigmoid means S-shaped. It can also be referred to as a squashing function.

21 Logistic Regression Model Formulation (Pattern Recognition and Machine Learning, the Bishop book)
The inverse of the logistic sigmoid $\sigma(a) = \frac{1}{1 + e^{-a}}$ is:
$$a = \ln \frac{\sigma}{1 - \sigma} = \ln \frac{p(C_1 \mid x)}{p(C_2 \mid x)}$$
This function is known as the logit function or log-odds!
For the case when K > 2 classes are present we can use a multi-class generalization of the logistic sigmoid known as the normalized exponential, also known as the softmax function:
$$p(C_k \mid x) = \frac{p(x \mid C_k)\,p(C_k)}{\sum_{j=1}^{K} p(x \mid C_j)\,p(C_j)} = \frac{e^{a_k}}{\sum_{j=1}^{K} e^{a_j}}, \quad \text{where } a_k = w_k^T \phi$$
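
As a rough sketch of how the sigmoid, the logit, and the softmax relate, the following checks that the logit inverts the sigmoid and that a two-class softmax reduces to a sigmoid of the activation difference. The function names and test values are illustrative assumptions; subtracting the maximum inside `softmax` is a common numerical-stability choice, not something taken from the slides.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def logit(p):
    """Inverse of the sigmoid: the log-odds ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def softmax(activations):
    """Normalized exponential over activations a_k."""
    m = max(activations)  # shift for numerical stability; result is unchanged
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

a = 0.7                      # assumed activation value
roundtrip = logit(sigmoid(a))

a1, a2 = 1.3, -0.4           # assumed two-class activations
two_class = softmax([a1, a2])

print("logit(sigmoid(a)) =", roundtrip)
print("softmax first entry =", two_class[0], " sigmoid(a1 - a2) =", sigmoid(a1 - a2))
```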

22 Logistic Regression Model Formulation (Pattern Recognition and Machine Learning, the Bishop book)
Thus for a two-class logistic regression model we have: $p(C_1 \mid \phi) = \sigma(w^T \phi)$
Now how do we learn the weights? Use a least squares method known as Iterative Reweighted Least Squares (IRLS).
Why can we not simply use the standard least squares solution?

23 Logistic Regression Model Formulation (Pattern Recognition and Machine Learning, the Bishop book)
Thus for a two-class logistic regression model we have: $p(C_1 \mid \phi) = \sigma(w^T \phi)$
Now how do we learn the weights? Use a least squares method known as Iterative Reweighted Least Squares (IRLS).
Why can we not simply use the standard least squares solution? Because our log-likelihood function is not quadratic in the weights, and thus the derivative is not linear in the weights. This means we do NOT have a closed-form solution and must perform an iterative method.

24 Iterative Reweighted Least Squares (IRLS)
IRLS is derived similarly when using and not using the sigmoid function. Derivation of IRLS is made straightforward when using the sigmoid because its derivative can be expressed in terms of itself:
$$\frac{d\sigma}{da} = \sigma(1 - \sigma)$$
NOTE: The remainder of the IRLS discussion will be done from the class book; however, I will point out some differences between the derivation in the class book and Bishop's book.
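
The derivative identity can be sanity-checked numerically. This sketch compares a central finite difference against $\sigma(1 - \sigma)$; the step size and the test points are arbitrary assumptions.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def sigmoid_derivative(a):
    """Analytic derivative expressed through the sigmoid itself."""
    s = sigmoid(a)
    return s * (1.0 - s)

def central_difference(f, a, h=1e-5):
    """Numerical derivative of f at a using a symmetric difference."""
    return (f(a + h) - f(a - h)) / (2.0 * h)

test_points = [-2.0, 0.0, 1.5]   # assumed test values
max_gap = max(abs(central_difference(sigmoid, a) - sigmoid_derivative(a))
              for a in test_points)

print("largest gap between numeric and analytic derivative:", max_gap)
```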

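25 Iterative Reweighted Least Squares (IRLS)
Since we are using the two-class case we use the binomial distribution to model the class probability. We represent our class labels as a 0/1 response $y_i$, with $p_1(x; \theta) = p(x; \theta)$ and $p_2(x; \theta) = 1 - p(x; \theta)$:
$$\ell(\beta) = \sum_{i=1}^{N} \left\{ y_i \ln p(x_i; \beta) + (1 - y_i) \ln\left(1 - p(x_i; \beta)\right) \right\} = \sum_{i=1}^{N} \left\{ y_i \beta^T x_i - \ln\left(1 + e^{\beta^T x_i}\right) \right\}$$
We want to maximize the log-likelihood; Bishop instead minimizes the error function given by the negative log-likelihood. Setting the derivative to zero gives:
$$\frac{\partial \ell(\beta)}{\partial \beta} = \sum_{i=1}^{N} x_i \left( y_i - p(x_i; \beta) \right) = 0$$
As we can see, the equations are nonlinear in $\beta$.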

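26 Iterative Reweighted Least Squares (IRLS)
Solve the equations for $\beta$ using the Newton-Raphson algorithm:
$$\beta^{new} = \beta^{old} - \left( \frac{\partial^2 \ell(\beta)}{\partial \beta\, \partial \beta^T} \right)^{-1} \frac{\partial \ell(\beta)}{\partial \beta}$$
The second derivative or Hessian of our log-likelihood is:
$$\frac{\partial^2 \ell(\beta)}{\partial \beta\, \partial \beta^T} = -\sum_{i=1}^{N} x_i x_i^T\, p(x_i; \beta)\left(1 - p(x_i; \beta)\right)$$
If we express our data and labels by the matrix $X$ and vector $y$, our probabilities by the vector $p$, and the weighting matrix by $W$, we can show the above in matrix form:
$$\beta^{new} = \beta^{old} + (X^T W X)^{-1} X^T (y - p)$$
The weighting matrix is a diagonal matrix with the ith diagonal entry $p(x_i; \beta^{old})\left(1 - p(x_i; \beta^{old})\right)$.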

27 Iterative Reweighted Least Squares (IRLS)
Since the log-likelihood is concave, the algorithm typically converges, although overshooting can occur. We can rearrange the Newton step to express the algorithm as a weighted least squares step:
$$\beta^{new} = (X^T W X)^{-1} X^T W z$$
with the adjusted response:
$$z = X \beta^{old} + W^{-1} (y - p)$$
See the class book for more properties inherent in the IRLS adjusted response.
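
The weighted least squares form of the Newton step can be sketched in plain Python. This is a minimal illustration rather than the class book's implementation: the tiny one-feature dataset (deliberately not linearly separable, so the maximum-likelihood estimate is finite), the fixed iteration count, the zero starting point, and the helper names `solve`, `irls`, and `score` are all assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def irls(X, y, n_iter=25):
    """Repeat beta = (X^T W X)^{-1} X^T W z, starting from beta = 0."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(n_iter):
        eta = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
        p = [sigmoid(e) for e in eta]
        w = [pi * (1.0 - pi) for pi in p]          # diagonal of W
        # Adjusted response z = X beta + W^{-1} (y - p)
        z = [e + (yi - pi) / wi for e, yi, pi, wi in zip(eta, y, p, w)]
        XtWX = [[sum(wi * xi[a] * xi[b] for wi, xi in zip(w, X))
                 for b in range(d)] for a in range(d)]
        XtWz = [sum(wi * xi[a] * zi for wi, xi, zi in zip(w, X, z))
                for a in range(d)]
        beta = solve(XtWX, XtWz)
    return beta

def score(X, y, beta):
    """Gradient of the log-likelihood: sum_i x_i (y_i - p_i)."""
    p = [sigmoid(sum(b * v for b, v in zip(beta, xi))) for xi in X]
    return [sum(xi[a] * (yi - pi) for xi, yi, pi in zip(X, y, p))
            for a in range(len(beta))]

# Assumed toy data: an intercept column plus one feature x = 0..5.
X = [[1.0, float(v)] for v in range(6)]
y = [0, 0, 1, 0, 1, 1]

beta_hat = irls(X, y)
print("beta:", beta_hat)
print("score at beta:", score(X, y, beta_hat))
```

At the fitted weights the score vector should be numerically zero, which is the condition the Newton iteration is solving for.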

28 L1 Regularized Logistic Regression
As in LASSO, an L1 penalty can be used for variable selection and shrinkage. This is done by replacing our log-likelihood function with a regularized form and maximizing it:
$$\ell(\beta) = \sum_{i=1}^{N} \left[ y_i \left( \beta_0 + \beta^T x_i \right) - \ln\left( 1 + e^{\beta_0 + \beta^T x_i} \right) \right] - \lambda \sum_{j=1}^{P} |\beta_j|$$
NOTE: As before, we do not penalize the intercept and so must express it separately.
This function is concave and can be solved via nonlinear programming methods or by repeated application of the weighted LASSO algorithm.
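
The penalized objective can be written out directly. This sketch only evaluates the function; it does not maximize it. The function name, the data, and the coefficient values are hypothetical.

```python
import math

def l1_penalized_loglik(beta0, beta, X, y, lam):
    """Log-likelihood minus lam * sum_j |beta_j|; the intercept beta0 is not penalized."""
    loglik = 0.0
    for xi, yi in zip(X, y):
        eta = beta0 + sum(b * v for b, v in zip(beta, xi))
        loglik += yi * eta - math.log(1.0 + math.exp(eta))
    penalty = lam * sum(abs(b) for b in beta)
    return loglik - penalty

# Assumed toy data with two features, and assumed coefficients.
X = [[0.2, 1.0], [1.5, -0.3], [-0.7, 0.8], [2.0, 0.1]]
y = [0, 1, 0, 1]
beta0, beta = 0.3, [0.5, -1.5]

unpenalized = l1_penalized_loglik(beta0, beta, X, y, lam=0.0)
penalized = l1_penalized_loglik(beta0, beta, X, y, lam=2.0)

print("lambda = 0:", unpenalized)
print("lambda = 2:", penalized)
print("difference:", unpenalized - penalized)
```

The difference between the two values is exactly the penalty term, which depends on the slope coefficients only.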

29 Conclusions
Logistic Regression and LDA have similar forms:
- LDA: $$\ln \frac{p(G = k \mid X = x)}{p(G = K \mid X = x)} = \ln \frac{\pi_k}{\pi_K} - \frac{1}{2} (\mu_k + \mu_K)^T \Sigma^{-1} (\mu_k - \mu_K) + x^T \Sigma^{-1} (\mu_k - \mu_K) = \alpha_{k0} + \alpha_k^T x$$
  In LDA this linearity results from our Gaussian assumption.
- Logistic Regression: $$\ln \frac{p(G = k \mid X = x)}{p(G = K \mid X = x)} = \beta_{k0} + \beta_k^T x$$
  Logistic regression has linear logits by construction.
However, their coefficients are estimated differently. The logistic regression model makes fewer assumptions and is therefore more general. To illustrate, look at the joint density of X and G:
$$p(X, G = k) = p(X)\, p(G = k \mid X)$$
- Both models have the logit-linear form for the right term.
- The logistic regression model basically ignores the marginal density of X and fits the parameters by maximizing the conditional likelihood.
- The LDA model maximizes the full likelihood based on the joint density.
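
The claim that the Gaussian assumption with a shared covariance yields a linear logit can be checked in one dimension. In this sketch the priors, means, and shared variance are made-up values, and the one-dimensional setting is a simplifying assumption; it compares the log posterior ratio computed from the Gaussian densities with the linear form $\alpha_{k0} + \alpha_k x$.

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of a one-dimensional Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def lda_logit_coefficients(pi_k, pi_K, mu_k, mu_K, var):
    """One-dimensional version of the LDA intercept and slope."""
    alpha0 = math.log(pi_k / pi_K) - 0.5 * (mu_k + mu_K) * (mu_k - mu_K) / var
    alpha = (mu_k - mu_K) / var
    return alpha0, alpha

def log_posterior_ratio(x, pi_k, pi_K, mu_k, mu_K, var):
    """ln of p(G=k|x) / p(G=K|x); the marginal p(x) cancels in the ratio."""
    return (math.log(pi_k * gaussian_pdf(x, mu_k, var))
            - math.log(pi_K * gaussian_pdf(x, mu_K, var)))

# Assumed parameters: priors, class means, and the shared variance.
pi_k, pi_K, mu_k, mu_K, var = 0.3, 0.7, 1.0, -0.5, 2.0
alpha0, alpha = lda_logit_coefficients(pi_k, pi_K, mu_k, mu_K, var)

xs = [-3.0, -1.0, 0.0, 0.5, 2.0]
gaps = [abs(log_posterior_ratio(x, pi_k, pi_K, mu_k, mu_K, var) - (alpha0 + alpha * x))
        for x in xs]
print("alpha0 =", alpha0, "alpha =", alpha)
print("largest gap:", max(gaps))
```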

30 Conclusions
What does this mean for LDA?
- If the Gaussian assumptions are accurate, then we have more information about the model parameters and thus can estimate them more efficiently.
- In addition, we can use unlabeled points to help us in estimating model and distribution parameters.
- LDA is less robust to outliers, i.e. data points far from the decision boundary play a role in estimating the common covariance.
- Logistic regression requires fewer parameters. Given an M-dimensional feature space and a two-class problem:
  - Logistic regression requires M adjustable parameters.
  - LDA requires M(M+5)/2 + 1 parameters: 2M parameters for the means, M(M+1)/2 parameters for the shared covariance matrix, and 1 for the class prior.
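
The parameter counts can be tabulated with a short sketch. The function names and the chosen values of M are illustrative; the counts follow the breakdown stated on the slide.

```python
def logistic_param_count(m):
    """Adjustable parameters for two-class logistic regression, as stated on the slide."""
    return m

def lda_param_count(m):
    """2M for the two means, M(M+1)/2 for the shared covariance, 1 for the class prior."""
    return 2 * m + m * (m + 1) // 2 + 1

for m in (2, 10, 100):   # assumed example dimensions
    print(m, logistic_param_count(m), lda_param_count(m))
```

The LDA count grows quadratically in M while the logistic regression count grows linearly.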


More information

Mean shift-based clustering

Mean shift-based clustering Pattern Recognition (7) www.elsevier.com/locate/r Mean shift-based clustering Kuo-Lung Wu a, Miin-Shen Yang b, a Deartment of Information Management, Kun Shan University of Technology, Yung-Kang, Tainan

More information

Pressure Drop in Air Piping Systems Series of Technical White Papers from Ohio Medical Corporation

Pressure Drop in Air Piping Systems Series of Technical White Papers from Ohio Medical Corporation Pressure Dro in Air Piing Systems Series of Technical White Paers from Ohio Medical Cororation Ohio Medical Cororation Lakeside Drive Gurnee, IL 600 Phone: (800) 448-0770 Fax: (847) 855-604 info@ohiomedical.com

More information
