Lecture 4: More classifiers and classes. Logistic regression, AdaBoost, optimization, multiple class classification
1 Lecture 4: More classifiers and classes
C4B Machine Learning Hilary 20
A. Zisserman

- Logistic regression
- Loss functions revisited
- AdaBoost
- Loss functions revisited
- Optimization
- Multiple class classification

Logistic Regression
2 Overview
Logistic regression is actually a classification method. LR introduces an extra non-linearity over a linear classifier, f(x) = w^T x + b, by using a logistic (or sigmoid) function, σ(). The LR classifier is defined as

    y_i = +1  if σ(f(x_i)) ≥ 0.5
    y_i = −1  if σ(f(x_i)) < 0.5

where σ(f(x)) = 1/(1 + e^{−f(x)}).

The logistic function or sigmoid function:

    σ(z) = 1/(1 + e^{−z})

As z goes from −∞ to ∞, σ(z) goes from 0 to 1: a "squashing" function. It has a sigmoid shape (i.e. an S-like shape). σ(0) = 0.5, and if z = w^T x + b then

    dσ(z)/dx |_{z=0} = (1/4) w
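The sigmoid and the resulting decision rule fit in a few lines of code. The following is a minimal Python sketch (the function names are ours, not from the lecture):

```python
import math

def sigmoid(z):
    # squashing function: maps z in (-inf, inf) to (0, 1), with sigma(0) = 0.5
    return 1.0 / (1.0 + math.exp(-z))

def lr_classify(w, b, x):
    # f(x) = w'x + b; predict y = +1 when sigma(f(x)) >= 0.5, else y = -1
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return +1 if sigmoid(f) >= 0.5 else -1
```

Note that σ(f(x)) ≥ 0.5 exactly when f(x) ≥ 0, so the decision boundary is the same hyperplane as for the underlying linear classifier.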
3 Intuition: why use a sigmoid?
Here, choose binary classification to be represented by y_i ∈ {0, 1}, rather than y_i ∈ {−1, 1}.

[Figure: least squares fits to toy 1D data, σ(wx + b) vs wx + b.] The fit of wx + b is dominated by the more distant points, which causes misclassification; instead LR regresses the sigmoid to the class data.

Similarly in 2D. [Figure: the LR fit σ(w_1 x_1 + w_2 x_2 + b) vs the linear fit w_1 x_1 + w_2 x_2 + b.]
4 Learning
In logistic regression we fit a sigmoid function to the data {x_i, y_i} by minimizing the classification errors y_i − σ(w^T x_i).

Margin property: a sigmoid favours a larger margin, cf. a step classifier.
5 Probabilistic interpretation
Think of σ(f(x)) as the posterior probability that y = 1, i.e. P(y = 1|x) = σ(f(x)). Hence, if σ(f(x)) > 0.5 then class y = 1 is selected. Then, after a rearrangement,

    f(x) = log [ P(y = 1|x) / (1 − P(y = 1|x)) ] = log [ P(y = 1|x) / P(y = 0|x) ]

which is the log odds ratio.

Maximum Likelihood Estimation
Assume

    p(y = 1|x; w) = σ(w^T x)
    p(y = 0|x; w) = 1 − σ(w^T x)

and write this more compactly as

    p(y|x; w) = ( σ(w^T x) )^y ( 1 − σ(w^T x) )^{1−y}

Then the likelihood (assuming data independence) is

    ∏_{i=1}^N ( σ(w^T x_i) )^{y_i} ( 1 − σ(w^T x_i) )^{1−y_i}

and the negative log likelihood is

    L(w) = − Σ_i [ y_i log σ(w^T x_i) + (1 − y_i) log(1 − σ(w^T x_i)) ]
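The negative log likelihood above transcribes directly into code; here is a sketch (function and variable names are ours):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neg_log_likelihood(w, xs, ys):
    # L(w) = -sum_i [ y_i log sigma(w'x_i) + (1 - y_i) log(1 - sigma(w'x_i)) ],  y_i in {0, 1}
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total
```

A handy sanity check: with w = 0 every point has p = 0.5, so each point contributes exactly log 2 to L(w).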
6 Logistic Regression: loss function
Use the notation y_i ∈ {−1, 1}. Then

    P(y = 1|x)  = σ(f(x)) = 1/(1 + e^{−f(x)})
    P(y = −1|x) = 1 − σ(f(x)) = 1/(1 + e^{+f(x)})

So in both cases

    P(y_i|x_i) = 1/(1 + e^{−y_i f(x_i)})

Assuming independence, the likelihood is

    ∏_{i=1}^N 1/(1 + e^{−y_i f(x_i)})

and the negative log likelihood is

    Σ_i log( 1 + e^{−y_i f(x_i)} )

which defines the loss function.

Logistic Regression: learning
Learning is formulated as the optimization problem

    min_{w ∈ R^d}  Σ_i log( 1 + e^{−y_i f(x_i)} )  +  λ ||w||^2
                   (loss function)                    (regularization)

For correctly classified points −y_i f(x_i) is negative, and log(1 + e^{−y_i f(x_i)}) is near zero. For incorrectly classified points −y_i f(x_i) is positive, and log(1 + e^{−y_i f(x_i)}) can be large. Hence the optimization penalizes parameters which lead to such misclassifications.
7 Comparison of SVM and LR cost functions
SVM:

    min_{w ∈ R^d}  C Σ_{i=1}^N max(0, 1 − y_i f(x_i)) + ||w||^2

Logistic regression:

    min_{w ∈ R^d}  Σ_i log( 1 + e^{−y_i f(x_i)} ) + λ ||w||^2

Note:
- both approximate the 0-1 loss
- very similar asymptotic behaviour
- the main difference is smoothness, and the non-zero values outside the margin
- the SVM gives a sparse solution for α_i

[Figure: the hinge and logistic losses plotted against y_i f(x_i).]

AdaBoost
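Written as functions of the margin m = y_i f(x_i), the comparison between the two losses becomes concrete. A sketch (our own function names):

```python
import math

def hinge_loss(m):
    # SVM: max(0, 1 - m); exactly zero once the margin m exceeds 1
    return max(0.0, 1.0 - m)

def logistic_loss(m):
    # LR: log(1 + e^{-m}); smooth, and strictly positive for every finite m
    return math.log(1.0 + math.exp(-m))
```

Both grow roughly linearly for badly misclassified points (m well below 0) and shrink for confidently correct ones, but only the hinge reaches zero exactly, which is what yields the SVM's sparse solution.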
8 Overview
AdaBoost is an algorithm for constructing a strong classifier out of a linear combination

    Σ_{t=1}^T α_t h_t(x)

of simple weak classifiers h_t(x). It provides a method of choosing the weak classifiers and setting the weights α_t.

Terminology:
- weak classifier: h_t(x) ∈ {−1, 1}
- strong classifier: H(x) = sign( Σ_{t=1}^T α_t h_t(x) ), for data vector x

Example: combination of linear classifiers h_t(x) ∈ {−1, 1}. [Figure: weak classifiers h_1(x), h_2(x), h_3(x) and the resulting strong classifier H(x).]

    H(x) = sign( α_1 h_1(x) + α_2 h_2(x) + α_3 h_3(x) )

Note, this linear combination is not a simple majority vote (it would be if all the α_t were equal). We need to compute the α_t as well as selecting the weak classifiers.
9 AdaBoost algorithm: building a strong classifier
Start with equal weights ω_i on each x_i, and a set of weak classifiers.

For t = 1, ..., T:
- Select the weak classifier h_t(x) with minimum error

      ε_t = Σ_i ω_i [h_t(x_i) ≠ y_i]

  where the ω_i are weights.
- Set α_t = (1/2) ln( (1 − ε_t)/ε_t )
- Reweight the examples (boosting) to give misclassified examples more weight:

      ω_{t+1,i} = ω_{t,i} e^{−α_t y_i h_t(x_i)}

- Add the weak classifier with weight α_t.

    H(x) = sign( Σ_{t=1}^T α_t h_t(x) )

Example: start with equal weights on each data point. [Figure sequence: weak classifier 1, with ε_j = Σ_i ω_i [h_j(x_i) ≠ y_i]; weights increased; weak classifier 2; weak classifier 3; the final classifier is a linear combination of the weak classifiers.]
10 The AdaBoost algorithm (Freund & Schapire, 1995)
Given example data (x_1, y_1), ..., (x_n, y_n), where y_i = −1, 1 for negative and positive examples respectively.

Initialize the weights ω_{1,i} = 1/(2m), 1/(2l) for y_i = −1, 1 respectively, where m and l are the number of negatives and positives respectively.

For t = 1, ..., T:
1. Normalize the weights,

       ω_{t,i} ← ω_{t,i} / Σ_{j=1}^n ω_{t,j}

   so that ω_t is a probability distribution.
2. For each j, train a weak classifier h_j with error evaluated with respect to ω_{t,i}:

       ε_j = Σ_i ω_{t,i} [h_j(x_i) ≠ y_i]

3. Choose the classifier h_t with the lowest error ε_t.
4. Set

       α_t = (1/2) ln( (1 − ε_t)/ε_t )

5. Update the weights:

       ω_{t+1,i} = ω_{t,i} e^{−α_t y_i h_t(x_i)}

The final strong classifier is

    H(x) = sign( Σ_{t=1}^T α_t h_t(x) )

Why does it work? The AdaBoost algorithm carries out a greedy optimization of a loss function:
- AdaBoost: min_{α,h} Σ_i e^{−y_i H(x_i)} (exponential loss)
- SVM loss function: max(0, 1 − y_i f(x_i)) (hinge loss)
- Logistic regression loss function: log( 1 + e^{−y_i f(x_i)} )

[Figure: the exponential, hinge, and logistic losses plotted against y_i f(x_i).]
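The five steps above fit in a short program. Below is a sketch that takes an arbitrary pool of weak classifiers; the clamp on ε (to avoid log(1/0) when a weak classifier is perfect) is our addition, and the lecture's example uses linear weak classifiers rather than the 1-D threshold stumps used in the usage note:

```python
import math

def adaboost(xs, ys, weak_classifiers, T):
    """Returns [(alpha_t, h_t)] defining H(x) = sign(sum_t alpha_t h_t(x))."""
    n = len(xs)
    w = [1.0 / n] * n                              # start with equal weights
    model = []
    for _ in range(T):
        # steps 2-3: evaluate every weak classifier, keep the lowest weighted error
        errors = [(sum(wi for wi, x, y in zip(w, xs, ys) if h(x) != y), h)
                  for h in weak_classifiers]
        eps, h = min(errors, key=lambda e: e[0])
        # step 4: classifier weight (eps clamped so a perfect stump stays finite)
        alpha = 0.5 * math.log((1.0 - eps) / max(eps, 1e-10))
        model.append((alpha, h))
        # steps 5 and 1: boost the misclassified points, then renormalize
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def strong_classify(model, x):
    return 1 if sum(a * h(x) for a, h in model) >= 0 else -1
```

For instance, stumps of the form h(x) = +1 if x > t else −1, over a grid of thresholds t, serve as a simple weak-classifier pool for 1-D data.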
11 Sketch derivation (non-examinable)
The objective function used by AdaBoost is

    J(H) = Σ_i e^{−y_i H(x_i)}

For a correctly classified point the penalty is exp(−|H|), and for an incorrectly classified point the penalty is exp(+|H|). The AdaBoost algorithm incrementally decreases the cost by adding simple functions to

    H(x) = Σ_t α_t h_t(x)

Suppose that we have a function B and we propose to add the function α h(x), where the scalar α is to be determined and h(x) is some function that takes values in +1 or −1 only. The new function is B(x) + α h(x), and the new cost is

    J(B + αh) = Σ_i e^{−y_i B(x_i)} e^{−α y_i h(x_i)}

Differentiating with respect to α and setting the result to zero gives

    −e^{−α} Σ_{y_i = h(x_i)} e^{−y_i B(x_i)} + e^{+α} Σ_{y_i ≠ h(x_i)} e^{−y_i B(x_i)} = 0

Rearranging, the optimal value of α is therefore determined to be

    α = (1/2) log [ Σ_{y_i = h(x_i)} e^{−y_i B(x_i)} / Σ_{y_i ≠ h(x_i)} e^{−y_i B(x_i)} ]

The classification error is defined as

    ε = Σ_i ω_i [h(x_i) ≠ y_i]   where   ω_i = e^{−y_i B(x_i)} / Σ_j e^{−y_j B(x_j)}

Then it can be shown that

    α = (1/2) log( (1 − ε)/ε )

The update from B to H therefore involves evaluating the weighted performance (with the weights ω_i given above) ε of the weak classifier h.

If the current function B is B(x) = 0, then the weights will be uniform. This is a common starting point for the minimization. As a numerical convenience, note that at the next round of boosting the required weights are obtained by multiplying the old weights by exp(−α y_i h(x_i)) and then normalizing. This gives the update formula

    ω_{t+1,i} = (1/Z_t) ω_{t,i} e^{−α_t y_i h_t(x_i)}

where Z_t is a normalizing factor.

Choosing h: the function h is not chosen arbitrarily, but is chosen to give a good performance (a low value of ε) on the training data weighted by the ω_i.
12 Optimization
We have seen many cost functions, e.g.

SVM:

    min_{w ∈ R^d}  C Σ_{i=1}^N max(0, 1 − y_i f(x_i)) + ||w||^2

Logistic regression:

    min_{w ∈ R^d}  Σ_i log( 1 + e^{−y_i f(x_i)} ) + λ ||w||^2

Do these have a unique solution? Does the solution depend on the starting point of an iterative optimization algorithm (such as gradient descent)? [Figure: a function with a local minimum and a global minimum.]

If the cost function is convex, then a locally optimal point is globally optimal (provided the optimization is over a convex set, which it is in our case).
13 Convex functions
[Figure: examples of convex and non-convex functions.]

A non-negative sum of convex functions is convex.
14 Logistic regression:

    min_{w ∈ R^d}  Σ_i log( 1 + e^{−y_i f(x_i)} ) + λ ||w||^2

is convex. SVM:

    min_{w ∈ R^d}  C Σ_{i=1}^N max(0, 1 − y_i f(x_i)) + ||w||^2

is convex.

Gradient (or steepest) descent algorithms
To minimize a cost function C(w), use the iterative update

    w_{t+1} ← w_t − η_t ∇_w C(w_t)

where η is the learning rate.

In our case the loss function is a sum over the training data. For example, for LR:

    min_{w ∈ R^d} C(w) = Σ_{i=1}^N log( 1 + e^{−y_i f(x_i)} ) + λ ||w||^2 = Σ_i L(x_i, y_i; w) + λ ||w||^2

This means that one iterative update consists of a pass through the training data, with an update for each point:

    w_{t+1} ← w_t − η_t ( ∇_w L(x_i, y_i; w_t) + 2λ w_t )

The advantage is that for large amounts of data, this can be carried out point by point.
15 Gradient descent algorithm for LR
Minimizing L(w) using gradient descent gives the update rule [exercise]

    w ← w + η ( y_i − σ(w^T x_i) ) x_i

where y_i ∈ {0, 1}. Note: this is similar, but not identical, to the perceptron update rule

    w ← w − η sign(w^T x_i) x_i

- there is a unique solution for w
- in practice more efficient Newton methods are used to minimize L
- there can be problems with ||w|| becoming infinite for linearly separable data

Gradient descent algorithm for SVM
First, rewrite the optimization problem as an average:

    min_w C(w) = (λ/2) ||w||^2 + (1/N) Σ_i max(0, 1 − y_i f(x_i))
               = (1/N) Σ_i ( (λ/2) ||w||^2 + max(0, 1 − y_i f(x_i)) )

(with λ = 2/(NC), up to an overall scale of the problem) and f(x) = w^T x + b.

Because the hinge loss is not differentiable, a sub-gradient is computed.
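The per-point LR update can be sketched directly. In this sketch the bias is handled by appending a constant 1 feature to each x_i, and the learning rate and epoch count are illustrative choices of ours:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lr_gradient_descent(xs, ys, eta=0.5, epochs=200):
    # cycle through the data, applying, for each point,
    #   w <- w + eta * (y_i - sigma(w'x_i)) * x_i,   with y_i in {0, 1}
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            w = [wi + eta * (y - p) * xi for wi, xi in zip(w, x)]
    return w
```

On linearly separable data ||w|| keeps growing as epochs increase, which is exactly the problem noted above.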
16 Sub-gradient for the hinge loss

    L(x_i, y_i; w) = max(0, 1 − y_i f(x_i)),   f(x_i) = w^T x_i + b

    ∂L/∂w = −y_i x_i   if y_i f(x_i) < 1
    ∂L/∂w = 0          if y_i f(x_i) ≥ 1

Sub-gradient descent algorithm for SVM

    C(w) = (1/N) Σ_i ( (λ/2) ||w||^2 + L(x_i, y_i; w) )

The iterative update is

    w_{t+1} ← w_t − η ∇_{w_t} C(w_t) = w_t − (η/N) Σ_i ( λ w_t + ∇_w L(x_i, y_i; w_t) )

where η is the learning rate. Then each iteration t involves cycling through the training data with the updates:

    w_{t+1} ← (1 − ηλ) w_t + η y_i x_i   if y_i (w^T x_i + b) < 1
    w_{t+1} ← (1 − ηλ) w_t               otherwise
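The cycling update above can be sketched as follows. In this sketch the bias b is folded into w via a constant 1 feature, and λ and η are illustrative values of ours:

```python
def svm_subgradient_descent(xs, ys, lam=0.1, eta=0.1, epochs=100):
    # per point: w <- (1 - eta*lam) w + eta*y_i*x_i  if y_i (w'x_i) < 1
    #            w <- (1 - eta*lam) w                otherwise
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1.0:
                w = [(1.0 - eta * lam) * wi + eta * y * xi for wi, xi in zip(w, x)]
            else:
                w = [(1.0 - eta * lam) * wi for wi in w]
    return w
```

Note that the weight-decay factor (1 − ηλ) is applied at every step, while the data term η y_i x_i is added only for points inside or beyond the margin.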
17 Multi-class Classification
Multi-class classification: what we would like. Assign an input vector x to one of K classes C_k. Goal: a decision rule that divides the input space into K decision regions separated by decision boundaries.
18 Reminder: K Nearest Neighbour (K-NN) classifier
Algorithm:
- For each test point, x, to be classified, find the K nearest samples in the training data.
- Classify the point, x, according to the majority vote of their class labels.

e.g. K = 3. Naturally applicable to the multi-class case.

Build from binary classifiers
Learn K two-class "1 vs the rest" classifiers f_k(x): 1 vs 2 & 3; 2 vs 1 & 3; 3 vs 1 & 2. [Figure: three classes C_1, C_2, C_3 and the three "one vs the rest" boundaries.]
19 Build from binary classifiers
Learn K two-class "1 vs the rest" classifiers f_k(x) (1 vs 2 & 3; 2 vs 1 & 3; 3 vs 1 & 2). Classification: choose the class with the most positive score,

    max_k f_k(x)

[Figure: the resulting decision regions for C_1, C_2, C_3.]

Application: handwritten digit recognition
- Feature vectors: each image is 28 x 28 pixels. Rearrange it as a 784-vector x.
- Training: learn k = 10 two-class "1 vs the rest" SVM classifiers f_k(x).
- Classification: choose the class with the most positive score,

      f(x) = max_k f_k(x)
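The "one vs the rest" recipe is independent of the underlying binary classifier; a minimal sketch, where `train_binary` is any callback that returns a scoring function f_k (our stand-in for the SVM trainer):

```python
def one_vs_rest_train(xs, ys, classes, train_binary):
    # learn one "class k vs the rest" scorer f_k per class
    scorers = {}
    for k in classes:
        labels = [1 if y == k else -1 for y in ys]
        scorers[k] = train_binary(xs, labels)
    return scorers

def one_vs_rest_classify(scorers, x):
    # choose the class with the most positive score: argmax_k f_k(x)
    return max(scorers, key=lambda k: scorers[k](x))
```

Any scorer works as `train_binary`; for a quick test, a toy 1-D scorer that returns minus the distance to the positive-class centroid is enough to exercise the argmax rule.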
20 Example: hand-drawn classification

Background reading and more
Other multiple-class classifiers (not covered here): neural networks, random forests.
- Bishop, chapters and 4.3
- Hastie et al., chapters
More on the web page: