CIS603 - Artificial Intelligence. Logistic regression. (some material adopted from notes by M. Hauskrecht) CIS603 - AI. Supervised learning




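CIS603 - Artificial Intelligence
Logistic regression
Vasileios Megalooikonomou
(some material adopted from notes by M. Hauskrecht)

Supervised learning
Data: $D = \{d_1, d_2, \ldots, d_n\}$, a set of $n$ examples, where $d_i = \langle \mathbf{x}_i, y_i \rangle$, $\mathbf{x}_i$ is the input vector and $y_i$ is the desired output (given by a teacher).
Objective: learn the mapping $f : X \to Y$ such that $y_i \approx f(\mathbf{x}_i)$ for all $i = 1, \ldots, n$.
Two types of problems:
Regression: $Y$ is continuous. Example: earnings, product orders $\to$ company stock price.
Classification: $Y$ is discrete. Example: temperature, heart rate $\to$ disease.
Now: BINARY classification problems.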

Binary classification
Two classes: $Y = \{0, 1\}$.
Our goal is to learn to classify correctly two types of examples: class 0, labeled as 0, and class 1, labeled as 1.
We would like to learn $f : X \to \{0, 1\}$.
First step: we need to devise a model of the function $f$.
Inspiration: neurons (nerve cells).

Neuron
[Figure: a neuron (nerve cell) and its activities]

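Neuron-based binary classification model
[Figure: neuron-based model; the inputs (up to index $k$) are combined into $z$ and passed through a threshold function]

Binary classification
Instead of learning the mapping to discrete values, $f : X \to \{0, 1\}$, it is easier to learn a probabilistic function $f : X \to [0, 1]$, where $f$ describes the probability of class 1 given $\mathbf{x}$: $f(\mathbf{x}) = p(y = 1 \mid \mathbf{x})$.
Transformation to discrete class values: if $p(y = 1 \mid \mathbf{x}) \ge 1/2$, then choose 1; else choose 0.
The logistic regression model uses a probabilistic function.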

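Logistic regression
$f(\mathbf{x}) = p(y = 1 \mid \mathbf{x}, \mathbf{w}) = g(z) = g(w_0 + w_1 x_1 + \ldots + w_k x_k)$
where $\mathbf{w}$ are the parameters of the model and $g(z)$ is the logistic function, $g(z) = 1 / (1 + e^{-z})$.
[Figure: logistic unit with bias term $w_0$, input vector $\mathbf{x} = (x_1, \ldots, x_k)$, weighted sum $z$ and output $p(y = 1 \mid \mathbf{x}, \mathbf{w})$]

Logistic function
$g(z) = \frac{1}{1 + e^{-z}}$
Also referred to as the sigmoid function.
Replaces the threshold function with smooth switching.
Takes a real number and outputs a number in the interval $[0, 1]$.
[Figure: plot of the logistic function $g(z)$ against $z$]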
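The model and the decision rule above can be sketched in a few lines of Python. This is only an illustration: the function names `sigmoid`, `predict_proba` and `predict_class`, the convention that `w[0]` holds the bias term, and the branch used to avoid overflow for large negative $z$ are assumptions of this sketch, not part of the slides.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """p(y=1 | x, w) = g(w_0 + w_1 x_1 + ... + w_k x_k); w[0] is the bias term."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return sigmoid(z)

def predict_class(w, x):
    """Choose class 1 if p(y=1 | x, w) >= 1/2, otherwise class 0."""
    return 1 if predict_proba(w, x) >= 0.5 else 0
```

For example, with the hypothetical weights `w = [-1.0, 2.0]`, the input `[1.0]` gives $z = 1$ and is assigned to class 1, while the input `[0.0]` gives $z = -1$ and is assigned to class 0.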

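Logistic regression - Decision boundary
The logistic regression model defines a linear decision boundary.
Example: 2 classes (crosses and circles).
[Figure: decision boundary separating the two classes]

Binary classification - Error
Two classes: $Y = \{0, 1\}$.
Our goal is to classify correctly as many examples as possible.
Zero-one error function: $\mathrm{Error}_1(\mathbf{x}_i, y_i) = 1$ if $f(\mathbf{x}_i, \mathbf{w}) \ne y_i$, and $0$ if $f(\mathbf{x}_i, \mathbf{w}) = y_i$.
Error we would like to minimize: $E_{(\mathbf{x}, y)}[\mathrm{Error}_1(\mathbf{x}, y)]$.
The error is minimized if we choose $f(\mathbf{x}) = 1$ if $p(y = 1 \mid \mathbf{x}) > p(y = 0 \mid \mathbf{x})$, and $0$ otherwise.
We construct a probabilistic version of the error function based on the likelihood of the data: $L(D, \mathbf{w}) = P(D \mid \mathbf{w})$.
Inverse optimization problem: $\mathrm{Error}(D, \mathbf{w}) = -L(D, \mathbf{w})$.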
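To make the two error notions concrete, the sketch below computes the zero-one error and the likelihood for a tiny data set. Everything specific here is an assumption for illustration: the toy data, the two weight vectors, the helper names, and the treatment of the examples as independent so that $P(D \mid \mathbf{w})$ becomes a product over examples.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def prob_class1(w, x):
    """p(y=1 | x, w) for the logistic model; w[0] is the bias term."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def zero_one_error(w, data):
    """Fraction of examples whose predicted class differs from the label."""
    mistakes = 0
    for x, y in data:
        predicted = 1 if prob_class1(w, x) > 0.5 else 0  # p(y=1|x) > p(y=0|x)
        mistakes += (predicted != y)
    return mistakes / len(data)

def likelihood(w, data):
    """L(D, w) = P(D | w), assuming the examples are independent."""
    value = 1.0
    for x, y in data:
        p = prob_class1(w, x)
        value *= p if y == 1 else (1.0 - p)
    return value

# Hypothetical toy data: one input feature, label 1 for the larger inputs.
toy_data = [([0.0], 0), ([2.0], 1), ([3.0], 1)]
good_w = [-1.0, 1.0]   # boundary at x = 1, separates the toy data
bad_w = [1.0, -1.0]    # same boundary with the classes flipped
```

On this toy data the weights with zero classification error also have the higher likelihood, which is the sense in which minimizing $-L(D, \mathbf{w})$ stands in for minimizing the zero-one error.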

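Likelihood of data
We want weights that maximize the likelihood of data.
Trick: maximize the log-likelihood of data instead.
Rationale: the optimal weights are the same for both the likelihood and the log-likelihood.

Logistic regression: parameter learning
$l(D, \mathbf{w}) = \log L(D, \mathbf{w}) = \log P(D \mid \mathbf{w}) = \sum_{i=1}^{n} \left[ y_i \log g(z_i) + (1 - y_i) \log (1 - g(z_i)) \right] = -\sum_{i=1}^{n} J_{online}(d_i, \mathbf{w})$
where $d_i = \langle \mathbf{x}_i, y_i \rangle$ and $g(z_i) = p(y = 1 \mid \mathbf{x}_i, \mathbf{w}) = g(w_0 + \sum_{j=1}^{k} w_j x_{i,j})$.

Logistic regression: parameter estimation
Log-likelihood: $l(D, \mathbf{w}) = -\sum_{i=1}^{n} J_{online}(d_i, \mathbf{w})$
On-line component of the log-likelihood: $J_{online}(d_i, \mathbf{w}) = -\left[ y_i \log g(z_i) + (1 - y_i) \log (1 - g(z_i)) \right]$
Derivatives of the online error component (in terms of weights):
$\frac{\partial}{\partial w_0} J_{online}(d_i, \mathbf{w}) = -(y_i - g(z_i))$
$\frac{\partial}{\partial w_j} J_{online}(d_i, \mathbf{w}) = -(y_i - g(z_i)) \, x_{i,j}$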
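The derivative formulas above can be checked numerically. The sketch below compares the analytic gradient of the on-line component with a central finite-difference estimate; the weights, the input and the step size `eps` are arbitrary values chosen for illustration, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model_output(w, x):
    """g(z) with z = w_0 + sum_j w_j x_j; w[0] is the bias term."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def j_online(w, x, y):
    """J_online(d, w) = -[y log g(z) + (1 - y) log(1 - g(z))]."""
    p = model_output(w, x)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def grad_j_online(w, x, y):
    """Analytic gradient: -(y - g(z)) for w_0 and -(y - g(z)) * x_j for w_j."""
    err = y - model_output(w, x)
    return [-err] + [-err * xj for xj in x]

def numeric_grad(w, x, y, eps=1e-6):
    """Central finite-difference estimate of the same gradient."""
    grads = []
    for j in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grads.append((j_online(w_plus, x, y) - j_online(w_minus, x, y)) / (2 * eps))
    return grads

# Arbitrary illustrative values (assumptions, not from the slides).
w, x, y = [0.1, -0.3, 0.5], [1.5, -2.0], 1
analytic = grad_j_online(w, x, y)
numeric = numeric_grad(w, x, y)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
```

If the derivation is right, `max_diff` should be tiny, limited only by floating-point rounding in the finite-difference estimate.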

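Logistic regression. Online gradient.
We want to find the set of parameters optimizing the log-likelihood of data (or minimizing the error).
On-line learning update for weight $\mathbf{w}$: $\mathbf{w} \leftarrow \mathbf{w} - \alpha \nabla_{\mathbf{w}} [J_{online}(d_i, \mathbf{w})]$
The $(i+1)$-th update for the logistic regression and $d_i = \langle \mathbf{x}_i, y_i \rangle$, with $z_i$ evaluated at the current weights $\mathbf{w}^{(i)}$:
$w_0^{(i+1)} = w_0^{(i)} + \alpha(i+1) \, [y_i - g(z_i)]$
$w_j^{(i+1)} = w_j^{(i)} + \alpha(i+1) \, [y_i - g(z_i)] \, x_{i,j}$
$\alpha$ is an annealed learning rate (it depends on the number of updates).
The same easy update rule as used in the linear regression!

Online logistic regression algorithm
Online-logistic-regression (D, number of iterations)
  initialize weights $w_0, w_1, \ldots, w_k$
  for i = 1:1:number of iterations do
    select a data point $d = \langle \mathbf{x}, y \rangle$ from D
    set $\alpha = 1/i$
    update weights (in parallel):
      $w_0 \leftarrow w_0 + \alpha \, [y - p(y = 1 \mid \mathbf{x}, \mathbf{w})]$
      $w_j \leftarrow w_j + \alpha \, [y - p(y = 1 \mid \mathbf{x}, \mathbf{w})] \, x_j$
  end for
  return weights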
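A minimal Python sketch of the online algorithm follows. Several details are assumptions made here because the slides leave them open: the weights start at zero, the data point is selected uniformly at random with a fixed seed, and the names `online_logistic_regression` and `toy_data` are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def online_logistic_regression(data, num_iterations, seed=0):
    """Online gradient updates with annealed learning rate alpha = 1/i.

    data is a list of (x, y) pairs, x a list of k inputs, y in {0, 1}.
    Returns [w_0, w_1, ..., w_k], where w_0 is the bias term.
    """
    rng = random.Random(seed)           # assumption: uniform random selection
    k = len(data[0][0])
    w = [0.0] * (k + 1)                 # assumption: zero initialization
    for i in range(1, num_iterations + 1):
        x, y = rng.choice(data)         # select a data point d = <x, y>
        alpha = 1.0 / i
        p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
        err = y - p
        # Build the new weight list from the old one so all weights
        # are updated "in parallel" from the same prediction p.
        w = [w[0] + alpha * err] + [wj + alpha * err * xj
                                    for wj, xj in zip(w[1:], x)]
    return w

# Hypothetical one-dimensional data: negative inputs are class 0, positive class 1.
toy_data = [([-2.0], 0), ([-1.0], 0), ([1.0], 1), ([2.0], 1)]
weights = online_logistic_regression(toy_data, num_iterations=200)
```

On this toy data every update pushes $w_1$ upward, since $(y - p)\,x$ is positive for both classes, so the learned slope is positive regardless of the order in which points are drawn.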

Online algorithm. Example.
[Figures: example run of the online algorithm]

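Limitations of basic linear units
Linear regression: $f(\mathbf{x}) = w_0 + \sum_{j=1}^{k} w_j x_j$. The function is linear in the inputs.
Logistic regression: $f(\mathbf{x}) = p(y = 1 \mid \mathbf{x}, \mathbf{w}) = g(w_0 + \sum_{j=1}^{k} w_j x_j)$. Linear decision boundary.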

Logistic regression - Decision boundary
The logistic regression model defines a linear decision boundary.

Linear decision boundary
Example when the logistic regression model is not optimal, but not that bad.
[Figure: linear decision boundary on data that is not perfectly separable]

When does logistic regression fail?
Example in which the logistic regression model fails.
[Figure: data set on which the logistic regression model fails]

Limitations of logistic regression
Parity function: no linear decision boundary.
[Figure: parity function data]
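The parity limitation can be illustrated with the two-input parity (XOR) function. The sketch below searches a coarse grid of weights and reports the best number of correctly classified points. The grid and the helper names are assumptions of this sketch. The search only illustrates the claim; the underlying reason is that the four required inequalities on $w_0$, $w_1$, $w_2$ contradict each other.

```python
from itertools import product

# Two-input parity (XOR): the label is 1 when exactly one input is 1.
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def num_correct(w, data):
    """Count points classified correctly by the linear rule z >= 0 -> class 1.

    z >= 0 is equivalent to g(z) >= 1/2, so this is the logistic decision rule.
    """
    correct = 0
    for (x1, x2), y in data:
        z = w[0] + w[1] * x1 + w[2] * x2
        predicted = 1 if z >= 0 else 0
        correct += (predicted == y)
    return correct

# Illustrative grid of candidate weights: -2.0, -1.5, ..., 2.0 for each weight.
grid = [v / 2 for v in range(-4, 5)]
best = max(num_correct(w, xor_data) for w in product(grid, repeat=3))
```

No weight vector on the grid gets all four points right; the best linear rules classify three of them, for example `(-0.5, 1, 1)`, which implements OR and misses only the input `(1, 1)`.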

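Extensions of simple linear units
Replace inputs to linear units with feature (basis) functions to model nonlinearities:
$f(\mathbf{x}) = w_0 + \sum_{j=1}^{m} w_j \phi_j(\mathbf{x})$, where $\phi_j(\mathbf{x})$ is an arbitrary function of $\mathbf{x}$.
[Figure: linear unit whose inputs are $\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots, \phi_m(\mathbf{x})$]
The same trick can be done for the logistic regression.

Extension of simple linear units
Example: fitting of a polynomial of degree $m$.
Data points: pairs of $\langle x, y \rangle$.
Feature functions: $\phi_i(x) = x^i$.
Function to learn: $f(x, \mathbf{w}) = w_0 + \sum_{i=1}^{m} w_i \phi_i(x) = w_0 + \sum_{i=1}^{m} w_i x^i$
On-line update for a $\langle x, y \rangle$ pair:
$w_0 \leftarrow w_0 + \alpha \, (y - f(x, \mathbf{w}))$
$w_j \leftarrow w_j + \alpha \, (y - f(x, \mathbf{w})) \, \phi_j(x)$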
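The polynomial example can be written out as a short sketch. One on-line update is shown for a single hypothetical data pair; the degree, the learning rate and the function names are assumptions chosen for illustration.

```python
def poly_features(x, m):
    """Feature functions phi_i(x) = x**i for i = 1..m."""
    return [x ** i for i in range(1, m + 1)]

def f(w, x):
    """f(x, w) = w_0 + sum_i w_i * x**i, with m = len(w) - 1."""
    phi = poly_features(x, len(w) - 1)
    return w[0] + sum(wi * pi for wi, pi in zip(w[1:], phi))

def online_update(w, x, y, alpha):
    """One on-line update for the pair <x, y>; the input is replaced by phi_j(x)."""
    err = y - f(w, x)
    phi = poly_features(x, len(w) - 1)
    return [w[0] + alpha * err] + [wj + alpha * err * pj
                                   for wj, pj in zip(w[1:], phi)]

# Hypothetical values: degree m = 2, zero initial weights, one data pair.
w_start = [0.0, 0.0, 0.0]
w_next = online_update(w_start, x=2.0, y=3.0, alpha=0.01)
```

Starting from zero weights, the error on the pair is 3, so the update adds `0.01 * 3` times the features `(1, 2, 4)` to the weights, and the prediction for `x = 2.0` moves from 0 toward the target.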

Multi-layered neural networks
An alternative way to introduce nonlinearities to regression/classification models.
Idea: cascade several simple neural models (based on logistic regression), much like neuron connections.
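As a rough illustration of the cascading idea, the sketch below wires three logistic units into two layers so that the network computes the two-input parity function, which a single logistic unit cannot represent. The weights are hand-chosen for this illustration, not learned and not taken from the slides, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_unit(w, inputs):
    """A single logistic regression unit: g(w_0 + sum_j w_j * input_j)."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], inputs)))

# Hand-chosen weights (assumptions): large magnitudes make each unit nearly binary.
W_OR = [-10.0, 20.0, 20.0]     # hidden unit 1: close to x1 OR x2
W_NAND = [30.0, -20.0, -20.0]  # hidden unit 2: close to NOT (x1 AND x2)
W_AND = [-30.0, 20.0, 20.0]    # output unit: close to h1 AND h2

def two_layer_output(x1, x2):
    """Cascade: two hidden logistic units feed one output logistic unit."""
    h1 = logistic_unit(W_OR, [x1, x2])
    h2 = logistic_unit(W_NAND, [x1, x2])
    return logistic_unit(W_AND, [h1, h2])

def classify(x1, x2):
    return 1 if two_layer_output(x1, x2) >= 0.5 else 0

parity_predictions = [classify(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

With these weights the cascade outputs class 1 exactly when one of the two inputs is 1, so the hidden layer plays the role of learned feature functions feeding a final logistic unit.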