Prototype based methods: Mathematical foundations, interpretability, and data visualization
1 Prototype based methods: Mathematical foundations, interpretability, and data visualization Barbara Hammer, Xibin Zhu CITEC Centre of Excellence Bielefeld University ijcnn14_tutorial.html
3 Why LVQ? [Machine Learning that Matters, Kiri L. Wagstaff, ICML 2012]
- ... of 152 non-cross-conference papers published at ICML 2011: there is a need for machine learning techniques which facilitate a direct interpretation of the results
4 Why LVQ?
- LVQ is a prime example of a Machine Learning model which is intuitive and interpretable
- but classical LVQ is a mere heuristic
- This Tutorial: modern LVQ variants and their mathematics
5 Prototypes
- prototypes are points in the data space: \(\vec{w}_i \in \mathbb{R}^n\)
- which decompose the space into receptive fields: \(R(\vec{w}_i) = \{\vec{x} \mid \|\vec{w}_i - \vec{x}\|^2 \le \|\vec{w}_j - \vec{x}\|^2 \ \forall j \ne i\}\)
- induce a classification
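The winner-takes-all assignment to receptive fields can be sketched in a few lines of NumPy; the prototypes and data points below are made-up illustrations:

```python
import numpy as np

def receptive_fields(X, W):
    """Assign each data point to the prototype whose squared
    Euclidean distance is smallest (winner-takes-all)."""
    # pairwise squared distances, shape (num_points, num_prototypes)
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# two prototypes on the x-axis split the plane at x = 0
W = np.array([[-1.0, 0.0], [1.0, 0.0]])
X = np.array([[-2.0, 0.5], [0.3, -1.0], [2.0, 2.0]])
print(receptive_fields(X, W))  # [0 1 1]
```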
6 Prototypes
- prototypes offer a sparse encoding
- prototypes represent data
- manual inspection possible
7 Prototypes [WSOM 2005, Paris]
8 Prototypes [WSOM 2005, Paris]
9 Prototype learning
- supervised: classes are known a priori; training set: \(P = \{(\vec{x}_i, y_i) \mid i = 1,\dots,p\} \subset \mathbb{R}^n \times \{1,\dots,C\}\); LVQ, GLVQ, RSLVQ, ...
- unsupervised: clusters are not known a priori; NG, GTM, AP, ...
- ... usually a solid mathematical foundation is available
10 LVQ: Learning vector quantization [Kohonen, 1988]
init positions of \(\vec{w}_j\), labels are \(c(\vec{w}_j)\)
repeat:
  pick a data point \((\vec{x}_i, y_i)\) randomly
  determine the winner \(\vec{w}_I\)
  if \(y_i = c(\vec{w}_I)\): \(\vec{w}_I \mathrel{+}= \eta\,(\vec{x}_i - \vec{w}_I)\)
  otherwise: \(\vec{w}_I \mathrel{-}= \eta\,(\vec{x}_i - \vec{w}_I)\)
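A minimal runnable sketch of this LVQ1 loop; the learning rate, toy data, and initialization are illustrative choices of my own:

```python
import numpy as np

def lvq1_train(X, y, W, c, eta=0.05, epochs=20, seed=0):
    """LVQ1 [Kohonen, 1988]: attract the winning prototype for a
    correctly labelled point, repel it otherwise."""
    rng = np.random.default_rng(seed)
    W = W.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            d2 = ((W - X[i]) ** 2).sum(axis=1)  # squared distances
            I = d2.argmin()                     # winner
            sign = 1.0 if c[I] == y[i] else -1.0
            W[I] += sign * eta * (X[i] - W[I])
    return W

# toy 1D data: two well separated classes
X = np.array([[0.0], [0.2], [1.0], [1.2]])
y = np.array([0, 0, 1, 1])
W = lvq1_train(X, y, W=np.array([[0.4], [0.8]]), c=np.array([0, 1]))
print(W.round(2))  # each prototype has moved towards its class
```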
11 LVQ 2.1 [Kohonen, 1990]
init positions of \(\vec{w}_j\), labels are \(c(\vec{w}_j)\)
repeat:
  pick a data point \((\vec{x}_i, y_i)\) randomly
  determine the closest prototype \(\vec{w}^+\) with \(y_i = c(\vec{w}^+)\)
  determine the closest prototype \(\vec{w}^-\) with \(y_i \ne c(\vec{w}^-)\)
  if the prototypes fall into a window around the decision boundary:
    \(\vec{w}^+ \mathrel{+}= \eta\,(\vec{x}_i - \vec{w}^+)\), \(\vec{w}^- \mathrel{-}= \eta\,(\vec{x}_i - \vec{w}^-)\)
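One LVQ2.1 step can be sketched as follows; I use Kohonen's standard window test \(\min(d^+/d^-, d^-/d^+) > (1-w)/(1+w)\), and the data values are made up:

```python
import numpy as np

def lvq21_step(x, y, W, c, eta=0.05, window=0.3):
    """One LVQ2.1 update [Kohonen, 1990]: adapt the closest correct
    and closest wrong prototype, but only if the point falls into a
    window around the decision boundary."""
    d2 = ((W - x) ** 2).sum(axis=1)
    correct = np.where(c == y)[0]
    wrong = np.where(c != y)[0]
    p = correct[d2[correct].argmin()]   # closest correct prototype
    m = wrong[d2[wrong].argmin()]       # closest wrong prototype
    dp, dm = d2[p], d2[m]
    # window rule: min(d+/d-, d-/d+) > (1 - w) / (1 + w)
    if min(dp / dm, dm / dp) > (1 - window) / (1 + window):
        W[p] += eta * (x - W[p])
        W[m] -= eta * (x - W[m])
    return W

W = lvq21_step(np.array([0.45]), 0, np.array([[0.0], [1.0]]), np.array([0, 1]))
print(W)  # correct prototype attracted, wrong one pushed away
```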
13 Online detection of faults: sensors [Cognitive Interaction Technology Center of Excellence]
14 Online detection of faults [T. Bojer et al., 2003]
Setting: high-dimensional features, few training data, online training
LVQ: close to 100% accuracy; prototypes can be stored and can be inspected
15 Clinical proteomics: unhappy because possibly ill... take serum, put it into a mass spectrometer, observe a characteristic spectrum which tells us more about the peptides in the serum
16 Clinical proteomics: prostate cancer [F.-M. Schleif et al., 2009]
[National Cancer Institute, Prostate Cancer Dataset]:
- 318 examples, SELDI-TOF from blood serum, 130 dim after preprocessing (normalization, peak detection)
- 2 classes (healthy versus cancer in different states)
Accuracy: LVQ 62.5%, GRLVQ 93.7%, SVM 92.7%
17 Steroid metabolomics: unhappy because possibly ill... take serum, extract steroid markers (32 selected steroid metabolites) by means of GC/MS; ACC / ACA
18 Steroid metabolomics [W. Arlt, M. Biehl et al., 2011]
19 Object recognition [S. Kirstein, H. Wersing, H.-M. Gross, E. Koerner, 2012]
20 Take home message
- LVQ offers an intuitive classifier with high potential for industrial applications
- interpretability of the technique is a big plus
21 LVQ code
- LVQ_PAK: only basic versions
- included in popular software such as WEKA: only basic versions
- SOM toolbox: also GLVQ, matrix learning
- mloss: also GLVQ, matrix learning
- see also the material at the tutorial web site, in particular for the advanced versions covered in the following
23 LVQ
- LVQ 1 does not have a valid cost function: \(\sum_i f_{LVQ}(d^+, d^-)\), where \(d^\pm = (\vec{x}_i - \vec{w}^\pm)^2\) is the squared distance to the closest correct / wrong prototype and \(f_{LVQ}(a, b) = a\) if \(a \le b\), else \(-b\)
24 LVQ 2.1
- LVQ 2.1 has a valid cost function: \(\sum_i f_{LVQ2.1}(d^+, d^-)\), where \(d^\pm = (\vec{x}_i - \vec{w}^\pm)^2\) is the squared distance to the closest correct / wrong prototype and \(f_{LVQ2.1}(a, b) = \mathrm{window}(a - b)\)
- But this is unbounded!
25 LVQ 2.1
- behavior without a window in simple model situations: the generalization error of LVQ depends on its initialization; in a simple model setting (\(p_+ > p_-\)) the result can be far from the optimum [Biehl, Ghosh, Hammer, 2007]
- so a tricky choice of the window is necessary...
26 More reasonable cost functions for LVQ
- based on margin maximization: GLVQ [Sato/Yamada 1996, Hammer/Villmann 2002, Crammer et al. 2002, Schneider et al. 2009]
- based on probabilistic modeling: RSLVQ [Seo/Obermayer 2003]
27 COLT (computational learning theory) for LVQ in a nutshell
- function class F given by the possible LVQ networks
- training data \((\vec{x}_i, y_i)\) → machine learner → LVQ function f in F
- often: \(f(\vec{x}_i) = y_i\) for training points (i.e. small empirical error)
- desired: \(P(f(\vec{x}) = y)\) should be large (i.e. small real error)
28 COLT for LVQ in a nutshell (safe versus insecure classification)
- (hypothesis) margin of \(\vec{x}_i\): \(m(\vec{x}_i) = d^- - d^+\), where \(d^+\) / \(d^-\) is the squared distance to the closest correct / wrong prototype
- the generalization error is bounded by \(E/m + O\big(p^2 (B^3 \ln(1/\delta))^{1/2} / (\rho\, m^{1/2})\big)\), where E = number of misclassified training data with margin smaller than \(\rho\) (including errors), \(\delta\) = confidence, m = number of examples, B = support, p = number of prototypes
- good bounds for few training errors and a large margin; the bound does not include the dimensionality
29 (same bound, annotated: the first term counts training data with a (too) small margin, the second term shrinks with growing margin \(\rho\))
30 Margin maximization: mathematical objective: maximize the margin
31 Margin maximization
- mathematical objective: \(\min \sum_i d^+(\vec{x}_i) - d^-(\vec{x}_i)\) is unbounded
32 Margin maximization
- mathematical objective: minimize \(\sum_i \dfrac{d^+(\vec{x}_i) - d^-(\vec{x}_i)}{d^+(\vec{x}_i) + d^-(\vec{x}_i)}\)
33 Generalized LVQ (GLVQ) [Sato/Yamada 1996]: derivatives of the cost yield the GLVQ updates
34 Generalized LVQ (GLVQ): derivatives
35 Generalized LVQ (GLVQ): derivatives and scaling; LVQ 2.1 versus GLVQ
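The GLVQ updates on slides 33-35 appear as images in the original; a minimal stochastic-gradient sketch of the cost \(\sum_i (d^+ - d^-)/(d^+ + d^-)\) with an identity transfer function looks roughly as follows (toy data, learning rate, and stabilizing epsilon are my own choices):

```python
import numpy as np

def glvq_train(X, y, W, c, eta=0.1, epochs=50, seed=0):
    """Stochastic gradient descent on the GLVQ cost
    sum_i (d+ - d-) / (d+ + d-)  [Sato/Yamada 1996]."""
    rng = np.random.default_rng(seed)
    W = W.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            d2 = ((W - X[i]) ** 2).sum(axis=1)
            same, diff = c == y[i], c != y[i]
            p = np.where(same)[0][d2[same].argmin()]   # closest correct
            m = np.where(diff)[0][d2[diff].argmin()]   # closest wrong
            dp, dm = d2[p], d2[m]
            denom = (dp + dm) ** 2 + 1e-12
            # chain rule: dmu/dd+ = 2 d- / (d+ + d-)^2 and dd+/dw+ = -2 (x - w+)
            W[p] += eta * (4 * dm / denom) * (X[i] - W[p])
            W[m] -= eta * (4 * dp / denom) * (X[i] - W[m])
    return W

X = np.array([[0.0], [0.2], [1.0], [1.2]])
y = np.array([0, 0, 1, 1])
W = glvq_train(X, y, W=np.array([[0.4], [0.8]]), c=np.array([0, 1]))
print(W.round(2))
```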
36 Probabilistic modeling: mixture of Gaussians with labels
37 Robust soft LVQ (RSLVQ)
38 RSLVQ
39 RSLVQ
40 Prototype locations
41 Take home
- LVQ can be substantiated by large margin generalization bounds (independent of the dimensionality)
- LVQ can be based on cost functions:
  - probabilistic modeling: excellent results; the bandwidth is a very critical parameter (the crisp limit does not perform well); prototypes are not always representative
  - margin maximization: very good results; parameters not critical; prototypes are representative for the data
- enables stable training and principled mathematical modelling
43 Why metric learning? Example: acceptance of papers at some conference. L - layout, T - technical quality, I - interesting subject, F - famous author, S - appropriate subject, Q - overall quality, P - author registers for conference, E - appropriate length, B - likes beer, P - looks pretty, G - gives good talks, K - knows program committee, M - member of program committee, C - special session, R - has red hair
44 Why metric learning?
- data are usually represented by feature vectors
- feature vectors are compared using the Euclidean distance
- but this might tell you nothing useful (figure: smell / head / belly features of a human: (42,42,42,0,...), (41,43,44,1,...), (-41,43,44,1,...))
45 Why metric learning?
46 Metric parameterization
47 Metric learning: Generalized relevance LVQ
- mathematical objective: minimize \(\sum_i \dfrac{d^{\lambda+}(\vec{x}_i) - d^{\lambda-}(\vec{x}_i)}{d^{\lambda+}(\vec{x}_i) + d^{\lambda-}(\vec{x}_i)}\), where \(d^{\lambda}(\vec{x}, \vec{y}) = \sum_l \lambda_l (x_l - y_l)^2\)
- normalize the relevance terms: relevance learning
48 GRLVQ
- mathematical objective: \(\min \sum_i \dfrac{d^{\lambda+}(\vec{x}_i) - d^{\lambda-}(\vec{x}_i)}{d^{\lambda+}(\vec{x}_i) + d^{\lambda-}(\vec{x}_i)}\); derivatives; intuitive, fast, well founded, flexible, suited for large dimensions
49 GRLVQ: derivatives, scaling (as in LVQ 2.1), relevance update; intuitive, fast, well founded, flexible, suited for large dimensions
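A single GRLVQ step can be sketched as a GLVQ step under the weighted metric plus a gradient step on the relevances, followed by clipping and normalization; step sizes and the toy input are illustrative assumptions:

```python
import numpy as np

def grlvq_step(x, y, W, c, lam, eta=0.1, eta_lam=0.01):
    """One GRLVQ update: prototype update as in GLVQ but with the
    weighted metric d_lambda(x, w) = sum_l lambda_l (x_l - w_l)^2,
    plus a relevance gradient step, clipping to >= 0, normalization."""
    d2 = ((W - x) ** 2 * lam).sum(axis=1)
    p = np.where(c == y)[0][d2[c == y].argmin()]   # closest correct
    m = np.where(c != y)[0][d2[c != y].argmin()]   # closest wrong
    dp, dm = d2[p], d2[m]
    denom = (dp + dm) ** 2 + 1e-12
    # relevance gradient of (d+ - d-)/(d+ + d-) w.r.t. lambda_l
    grad = (2 * dm / denom) * (x - W[p]) ** 2 - (2 * dp / denom) * (x - W[m]) ** 2
    W[p] += eta * (4 * dm / denom) * lam * (x - W[p])
    W[m] -= eta * (4 * dp / denom) * lam * (x - W[m])
    lam = np.maximum(lam - eta_lam * grad, 0.0)    # clip at zero
    return W, lam / lam.sum()                      # normalize

x, y_lab = np.array([0.0, 0.7]), 0
W = np.array([[0.0, 0.0], [1.0, 1.0]])
lam = np.array([0.5, 0.5])
W, lam = grlvq_step(x, y_lab, W, np.array([0, 1]), lam)
print(lam)  # relevances stay nonnegative and sum to one
```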
50 GRLVQ 2D data embedded in 10 D with noise/noisy copies
51 Generalized Matrix LVQ (GMLVQ): substitute the metric by a general quadratic form: \(d^{\Lambda}(\vec{x}, \vec{w}) = (\vec{x} - \vec{w})^T \Lambda\, (\vec{x} - \vec{w})\)
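The quadratic form with the positive semidefinite parameterization \(\Lambda = \Omega^T \Omega\) (used by GMLVQ, see the following slides) can be sketched as follows; the matrix and vectors are made-up examples:

```python
import numpy as np

def gmlvq_dist(x, w, Omega):
    """GMLVQ distance d(x, w) = (x - w)^T Lambda (x - w) with the
    positive semidefinite parameterization Lambda = Omega^T Omega,
    so that d(x, w) = ||Omega (x - w)||^2 >= 0."""
    diff = x - w
    return float(np.dot(Omega @ diff, Omega @ diff))

Omega = np.array([[1.0, 0.5],
                  [0.0, 1.0]])
x = np.array([1.0, 2.0])
w = np.array([0.0, 0.0])
Lambda = Omega.T @ Omega
# both expressions compute the same quadratic form
print(gmlvq_dist(x, w, Omega), (x - w) @ Lambda @ (x - w))
```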
52 LGMLVQ
53 UCI benchmarks...
54 Interpretability: Steroid metabolomics [W. Arlt, M. Biehl et al., 2011]
56 GMLVQ yields (local) matrices, i.e. (local) scalings and rotations of the space:
- GRLVQ: global scaling
- GMLVQ: global scaling and rotation
- LGMLVQ: local scaling and rotation
57 GMLVQ
- GMLVQ with positive semidefinite matrices: \(\Lambda = \Omega^T \Omega\)
- quadratic complexity w.r.t. the data dimensionality
58 Low rank GMLVQ
- GMLVQ with positive semidefinite low rank matrices: \(\Lambda = \Omega^T \Omega\) with rectangular \(\Omega\)
- linear complexity w.r.t. the data dimensionality
- equivalent to the full version (if the data are intrinsically low dimensional)
59 Low rank GMLVQ
60 LiRaM LVQ [Bunte et al. 2012]: the (global or local) low rank matrix \(\Lambda = \Omega^T \Omega\) induces a global projection \(f: \vec{x} \mapsto \Omega\, \vec{x}\)
61 Discriminative visualization Example: USPS digits
63 Stationary solutions of GMLVQ
- assume fixed receptive fields; what is the optimum metric?
- the update of the matrix has the form (prefactor indicates the sign) of outer products of the data (x centered in the prototype) plus normalization
- similar to the von Mises (power) iteration
- converges to the first eigenvector of the corresponding matrix
- in particular, convergence to a low rank matrix!
64 Stationary solution (figure: one group of terms contributes with +, the other with -)
66 Interpretation of matrix terms
infra-red spectral data: 124 wine samples, 256 wavelengths, 30 training data, 94 test spectra; classes: high / medium / low alcohol content
67 Interpretation of matrix terms
- often: the diagonal terms are interpreted as relevances
- problem: for high dimensional data, the same mapping holds for all matrices whose differences lie in the null space of \(C = X X^t\)
68 Interpretation of matrix terms
- dividing out the null space yields the profile
- a direct interpretation of the relevance profile is misleading for high dimensional data; get rid of the null space first!
69 Interpretation of matrix terms (figure: GMLVQ over-fitting effect; best performance with 7 remaining dimensions after null-space correction; P = 30 dimensions)
70 Take home
- metric adaptation: increases the accuracy without deteriorating the generalization ability
- low rank matrix: allows efficient training and data visualization; no restriction as compared to the optimum metric
- interpretation: by looking at the feature weighting; for high dimensional data, normalization is necessary
71 Schneider, Biehl, Hammer...matrix learning is cool! Neural Computation 2009
73 Dissimilarity or similarity data
- feature extraction → vectorial data (size, softness, color, curvature, ... → (20,7,...))
- pairwise (dis)similarity measurement → (dis)similarity matrix
74 (Dis)similarity data
(dis)similarity measures, e.g.:
1. Alignment (e.g. of sequences such as GTTACAGGT, GGTACACGT, GTGACAAGT)
2. Normalized Compression Distance
3. Graph structure kernels
4. ...
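The Normalized Compression Distance is easy to sketch with an off-the-shelf compressor; zlib here is a stand-in choice of mine, and the sequences (taken from the slide, repeated to give the compressor something to work with) are illustrative:

```python
import zlib

def ncd(a: bytes, b: bytes) -> float:
    """Normalized Compression Distance:
    NCD(a, b) = (C(ab) - min(C(a), C(b))) / max(C(a), C(b)),
    where C(.) is the compressed length."""
    ca, cb = len(zlib.compress(a)), len(zlib.compress(b))
    cab = len(zlib.compress(a + b))
    return (cab - min(ca, cb)) / max(ca, cb)

s1 = b"GTTACAGGT" * 20   # sequences from the slide, repeated
s2 = b"GGTACACGT" * 20
s3 = b"AAAAAAAAA" * 20   # an unrelated, trivially compressible string
# the two related sequences should come out closer than the unrelated one
print(round(ncd(s1, s2), 2), round(ncd(s1, s3), 2))
```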
75 LVQ for dis-/similarities
- kernel GLVQ (Suganthan et al.)
- differentiable kernel GLVQ (Villmann et al.)
- relational GLVQ/SRLVQ (Xibin et al.)
- kernel SRLVQ (Hofmann et al.)
- ...
76 Relational GLVQ
Assumption: prototypes are expressed as linear combinations \(\vec{w}_i = \sum_j \alpha_{ij} \vec{x}_j\) where \(\sum_j \alpha_{ij} = 1\)
Fact: for every symmetric bilinear form and a linear representation as above, we find \(\|\vec{x}_j - \vec{w}_i\|^2 = (D \alpha_i)_j - \frac{1}{2}\, \alpha_i^T D\, \alpha_i\)
Method: substitute all terms \(\|\vec{x}_j - \vec{w}_i\|^2\) in the original methods and use this identity
77 Relational GLVQ: assume prototypes have the form above; then the GLVQ costs become "... ugly formulas"
78 Benchmark data
79 Similarities/dissimilarities
- euclidean: \(\|\vec{x}_i - \vec{x}_j\|^2\) and \(\langle \vec{x}_i, \vec{x}_j \rangle\); general: \(d_{ij} = d(x_i, x_j)\) and \(s_{ij} = s(x_i, x_j)\)
- assumption: symmetric (\(d_{ij} = d_{ji}\), \(s_{ij} = s_{ji}\)), zero diagonal (\(d_{ii} = 0\)); normalization of s is possible (\(s_{ii} = 1\))
80 Similarities/dissimilarities: from similarities to dissimilarities: \(d_{ij} = s_{ii} - 2 s_{ij} + s_{jj}\)
81 Similarities/dissimilarities: from dissimilarities to similarities (double centering): \(s_{ij} = -\frac{1}{2}\big(d_{ij} - \frac{1}{n}\sum_l d_{il} - \frac{1}{n}\sum_l d_{lj} + \frac{1}{n^2}\sum_{l,l'} d_{ll'}\big)\)
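The two conversion formulas between similarities and dissimilarities can be checked numerically (random made-up data; for data centered at the origin, double centering recovers the Gram matrix exactly):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
X -= X.mean(axis=0)                 # center the data at the origin
S = X @ X.T                         # Gram matrix s_ij = <x_i, x_j>
# dissimilarities: d_ij = s_ii - 2 s_ij + s_jj
d = np.diag(S)
D = d[:, None] - 2 * S + d[None, :]
# double centering: s_ij = -1/2 (d_ij - row mean - column mean + grand mean)
S_rec = -0.5 * (D - D.mean(axis=1, keepdims=True)
                  - D.mean(axis=0, keepdims=True) + D.mean())
print(np.allclose(S, S_rec))  # True
```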
82 Pseudo-euclidean embedding
- pseudo-euclidean: \(\|\vec{x}_i - \vec{x}_j\|^2_{pq} = \|\vec{x}_i^1 - \vec{x}_j^1\|^2 - \|\vec{x}_i^2 - \vec{x}_j^2\|^2\) and \(\langle \vec{x}_i, \vec{x}_j \rangle_{pq} = \langle \vec{x}_i^1, \vec{x}_j^1 \rangle - \langle \vec{x}_i^2, \vec{x}_j^2 \rangle\); general: \(d_{ij} = d(x_i, x_j)\), \(s_{ij} = s(x_i, x_j)\)
- signature \((p, q, n - p - q)\); euclideanity can be obtained by clip / flip
83 Pseudo-Euclidean space: for every symmetric D, a vector space embedding in pseudo-euclidean space exists; the symmetric bilinear form induces the dissimilarities (example figure: six points P1-P6 embedded with a +1/-1 signature)
84 LVQ for dis-/similarities
- classification based on \(\|\vec{x}_i - \vec{w}_j\|^2 = \|\vec{x}_i\|^2 - 2\langle \vec{x}_i, \vec{w}_j \rangle + \|\vec{w}_j\|^2\); training optimizes \(f\big(\|\vec{x}_i - \vec{w}_j\|^2\big)_{i,j}\)
85 LVQ for dis-/similarities
- prototypes as linear combinations \(\vec{w}_j = \sum_i \alpha_{ji} \vec{x}_i\); possible assumptions: \(\sum_i \alpha_{ji} = 1\), \(\alpha_{ji} \ge 0\)
86 LVQ for dis-/similarities
- kernel approach: \(\|\vec{x}_i - \vec{w}_j\|^2 = s_{ii} - 2 \sum_l \alpha_{jl} s_{il} + \sum_{l,l'} \alpha_{jl} \alpha_{jl'} s_{ll'}\)
87 LVQ for dis-/similarities
- relational approach: \(\|\vec{x}_i - \vec{w}_j\|^2 = \sum_l \alpha_{jl} d_{il} - \frac{1}{2} \sum_{l,l'} \alpha_{jl} \alpha_{jl'} d_{ll'}\) for normalized \(\alpha_{jl}\)
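The relational formula on slide 87 can be verified numerically against the explicit Euclidean distance (random made-up data; alpha is a normalized coefficient vector):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))
# pairwise squared Euclidean dissimilarities d_ll'
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
# a prototype as a normalized linear combination of data points
alpha = rng.random(6)
alpha /= alpha.sum()
w = alpha @ X
# relational distance: sum_l alpha_l d_il - 1/2 sum_ll' alpha_l alpha_l' d_ll'
i = 2
d_rel = alpha @ D[i] - 0.5 * alpha @ D @ alpha
d_exp = ((X[i] - w) ** 2).sum()   # explicit Euclidean distance
print(np.isclose(d_rel, d_exp))  # True
```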
88 LVQ for dis-/similarities
optimize \(f\big(\sum_l \alpha_{jl} d_{il} - \frac{1}{2} \sum_{l,l'} \alpha_{jl} \alpha_{jl'} d_{ll'}\big)_{i,j}\) resp. \(f\big(s_{ii} - 2\sum_l \alpha_{jl} s_{il} + \sum_{l,l'} \alpha_{jl} \alpha_{jl'} s_{ll'}\big)_{i,j}\); gradient descent with respect to \(\alpha_{jl}\) followed by normalization → relational GLVQ / SRLVQ
89 LVQ for dis-/similarities
gradient descent with respect to the prototype: \(\partial_{\vec{w}_j} f\big(\|\vec{x}_i - \vec{w}_j\|^2\big)_{i,j}\) yields terms \(-2 f' (\vec{x}_i - \vec{w}_j) = -2 f' \big(\vec{x}_i - \sum_l \alpha_{jl} \vec{x}_l\big)\) with \(\vec{w}_j = \sum_l \alpha_{jl} \vec{x}_l\); this can be decomposed into contributions of the coefficients only for the euclidean form → kernel GLVQ / SRLVQ
90 LVQ for dis-/similarities (overview)
- GLVQ and RSLVQ variants exist for similarities (gradient w.r.t. the coefficients) and dissimilarities (gradient w.r.t. the prototypes)
- only in the euclidean case: the kernel variants resemble the gradient w.r.t. \(\vec{w}\)
- large margin generalization bounds (GLVQ); interpretation as a likelihood ratio (RSLVQ)
91 Results
92 Computational effort: size of the full \(n \times n\) matrix (double precision, 8 bytes per entry):
n = 10,000: approx. 800 MB; n = 20,000: approx. 3.2 GB; n = 50,000: approx. 20 GB; n = 200,000: approx. 320 GB
93 Computational effort?
\(\|\vec{x}_i - \vec{w}_j\|^2 = s_{ii} - 2 \sum_l \alpha_{jl} s_{il} + \sum_{l,l'} \alpha_{jl} \alpha_{jl'} s_{ll'} = e_i^t S e_i - 2\, e_i^t S \alpha_j + \alpha_j^t S \alpha_j\)
sample m landmarks only: \(S_{m,n}\); approximate \(S \approx S_{n,m}\, S_{m,m}^{-1}\, S_{m,n}\) [Nyström approximation, Williams/Seeger]
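A sketch of the Nyström approximation for an intrinsically low rank similarity matrix (made-up data; with rank(S) ≤ m and randomly chosen landmarks, the reconstruction is essentially exact):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
S = X @ X.T            # similarity matrix of rank 5 (intrinsically low dim)
m = 20                 # number of landmarks, m << n
L = rng.choice(len(X), size=m, replace=False)
S_nm = S[:, L]                          # n x m slice of S
S_mm = S[np.ix_(L, L)]                  # m x m landmark block
# Nystroem: S is approximated by S_nm S_mm^+ S_nm^T
S_approx = S_nm @ np.linalg.pinv(S_mm) @ S_nm.T
print(np.allclose(S, S_approx))  # exact here, since rank(S) <= m
```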
94 Experiments
96 Take home
- there exist cool methods which enable the application of LVQ for similarities / dissimilarities
- quadratic complexity; the Nyström approximation for low rank data reduces this to linear complexity
- metric adaptation is possible in a similar way as for GMLVQ: adapt w.r.t. the similarity/dissimilarity parameters (has been done for the alignment distance → ESANN 14)
98 Confidence measures: certainty of a classification? (figure: query point x)
99 Conformal prediction [Shafer & Vovk, 2008]
framework to accompany the pointwise classification of online methods with provable guarantees: a classifier trained on N (exchangeable) data together with a conformity measure yields a set of possible labels such that, for a new point, the true label is contained with a prescribed probability
100 Conformal prediction
- pick a conformity measure
- it induces two terms:
  - Credibility: how sure we are that the prediction is correct
  - Confidence: how sure we are that ALL OTHER labels are incorrect
- any measure is valid, but some measures are more useful (figure: higher credibility and confidence versus lower credibility and confidence)
101 Conformal prediction algorithm [Shafer,Vovk]
102 Simplified conformal prediction
given training data and a new point:
1. train the model on the training data
2. compute the nonconformity of the training set
3. for every possible label, compute the nonconformity of the new point
4. compare the values: the r-value of a label is the fraction of training nonconformities at least as large
5. output the label with the best r-value
credibility: largest r-value; confidence: 1 - second largest r-value
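The steps above can be sketched for a nearest-prototype model; the concrete nonconformity measure (distance ratio \(d^+/d^-\)) and all data below are illustrative choices of my own:

```python
import numpy as np

def nonconformity(x, label, W, c):
    """Nonconformity for prototype models: distance to the closest
    prototype of the label divided by the distance to the closest
    prototype of any other label (small = conforming)."""
    d2 = ((W - x) ** 2).sum(axis=1)
    return d2[c == label].min() / (d2[c != label].min() + 1e-12)

def conformal_predict(x, X, y, W, c):
    """Simplified conformal prediction: the r-value of a label is the
    fraction of training nonconformities at least as large."""
    train_nc = np.array([nonconformity(X[i], y[i], W, c) for i in range(len(X))])
    r = {}
    for label in np.unique(c):
        a = nonconformity(x, label, W, c)
        r[label] = (train_nc >= a).mean()
    best = max(r, key=r.get)
    credibility = r[best]                       # largest r-value
    confidence = 1 - sorted(r.values())[-2]     # 1 - second largest
    return best, credibility, confidence

X = np.array([[0.0], [0.2], [1.0], [1.2]])
y = np.array([0, 0, 1, 1])
W = np.array([[0.1], [1.1]]); c = np.array([0, 1])
print(conformal_predict(np.array([0.15]), X, y, W, c))
```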
103 Qualitative result
104 Growing conformal semi-supervised LVQ
given labeled data and unlabeled data:
- init the model with a minimum number of prototypes, train the model on the labeled data
- loop:
  - predict confidence/credibility on the unlabeled data
  - predict labels on the secure part and add the part of the unlabeled data with high confidence/credibility
  - identify regions with poor confidence/credibility and generate a new prototype for them
105 Growing conformal LVQ
106 Semi-supervised growing conformal LVQ
107 Example evaluations
108 Take home
- conformal prediction makes it possible to accompany classification results with confidence values
- can be realised efficiently for LVQ based on distance measures
- allows incremental versions (also for the relational setting and semi-supervised training)
110 Literature
- T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.
- T. Kohonen. Learning vector quantization. In: M.A. Arbib, editor, The Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge, MA, 1995.
- M. Biehl, B. Hammer, P. Schneider, T. Villmann. Metric Learning for Prototype-based. In: Innovations in Neural Information Paradigms and Applications, M. Bianchini, M. Maggini, F. Scarselli, L.C. Jain (eds.), Springer Studies in Computational Intelligence, Vol. 247, 2009.
- M. Biehl, B. Hammer, F.-M. Schleif, P. Schneider, T. Villmann. Stationarity of Matrix Relevance Learning Vector Quantization. Machine Learning Reports 01/2009, Univ. Leipzig, 2009.
- M. Biehl, A. Ghosh, B. Hammer. Dynamics and generalization ability of LVQ algorithms. J. Machine Learning Research 8 (Feb), 2007.
- W. Arlt, M. Biehl, A.E. Taylor, S. Hahner, R. Libe, B.A. Hughes, P. Schneider, D.J. Smith, H. Stiekema, N. Krone, E. Porfiri, G. Opocher, J. Bertherat, F. Mantero, B. Allolio, M. Terzolo, P. Nightingale, C.H.L. Shackleton, X. Bertagna, M. Fassnacht, P.M. Stewart. Urine steroid metabolomics as a biomarker tool for detecting malignancy in adrenal tumors. J. of Clinical Endocrinology & Metabolism 96, 2011.
- F.-M. Schleif, T. Villmann, M. Kostrzewa, B. Hammer, A. Gammerman. Cancer informatics by prototype networks in mass spectrometry. Artificial Intelligence in Medicine 45(2-3), 2009.
- S. Kirstein, H. Wersing, H.-M. Gross, E. Körner. A Life-Long Learning Vector Quantization Approach for Interactive Learning of Multiple Categories. Neural Networks 28, 2012.
- S. Seo, K. Obermayer. Soft Learning Vector Quantization. Neural Computation 15(7), 2003.
- B. Hammer, D. Hofmann, F.-M. Schleif, X. Zhu. Learning vector quantization for (dis-)similarities. Neurocomputing 131:43-51, 2014.
- M. Strickert, B. Hammer, T. Villmann, M. Biehl. Regularization and improved interpretation of linear data mappings and adaptive distance measures. CIDM 2013:10-17.
- A. Sato, K. Yamada. Generalized Learning Vector Quantization. NIPS 1996.
111 Literature
- B. Mokbel, B. Paassen, B. Hammer. Adaptive distance measures for sequential data. In: M. Verleysen, editor, ESANN, 2014.
- D. Hofmann, F.-M. Schleif, B. Paaßen, B. Hammer. Learning interpretable kernelized prototype-based models. Neurocomputing, accepted, 2013.
- X. Zhu, F.-M. Schleif, B. Hammer. Semi-supervised vector quantization for proximity data. In: ESANN, pages 89-94, 2013.
- F.-M. Schleif, X. Zhu, B. Hammer. Sparse conformal prediction for dissimilarity data. Annals of Mathematics and Artificial Intelligence (AMAI), 2014.
- B. Hammer, D. Hofmann, F.-M. Schleif, X. Zhu. Learning vector quantization for (dis-)similarities. Neurocomputing 131:43-51, 2014.
- X. Zhu, F.-M. Schleif, B. Hammer. Patch processing for relational learning vector quantization. In: J. Wang, G.G. Yen, M.M. Polycarpou, editors, Advances in Neural Networks - ISNN, 9th International Symposium on Neural Networks, Shenyang, China, July 11-14, Proceedings, Part I, volume 7367, Springer, 2012.
- A. Gisbrecht, B. Mokbel, F.-M. Schleif, X. Zhu, B. Hammer. Linear time relational prototype based learning. Int. J. Neural Syst. 22(5), 2012.
- K. Bunte, P. Schneider, B. Hammer, F.-M. Schleif, T. Villmann, M. Biehl. Limited rank matrix learning, discriminative dimension reduction and visualization. Neural Networks 26, 2012.
- P. Schneider, K. Bunte, H. Stiekema, B. Hammer, T. Villmann, M. Biehl. Regularization in matrix relevance learning. IEEE Transactions on Neural Networks 21, 2010.
- M. Biehl, B. Hammer, F.-M. Schleif, P. Schneider, T. Villmann. Stationarity of matrix relevance learning vector quantization. Machine Learning Reports, Technical Report 01/2009, University of Leipzig, 2009.
- P. Schneider, M. Biehl, B. Hammer. Adaptive Relevance Matrices in Learning Vector Quantization. Neural Computation 21(12), 2009.
- K. Crammer, R. Gilad-Bachrach, A. Navot, N. Tishby. Margin Analysis of the LVQ Algorithm. NIPS 2002.
- G. Shafer, V. Vovk. A Tutorial on Conformal Prediction. JMLR, 2008.
More informationMachine Learning in FX Carry Basket Prediction
Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines
More informationThe Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
More informationHow To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
More informationMA2823: Foundations of Machine Learning
MA2823: Foundations of Machine Learning École Centrale Paris Fall 2015 Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr TAs: Jiaqian Yu jiaqian.yu@centralesupelec.fr
More informationMaking Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research
More informationAn Introduction to Neural Networks
An Introduction to Vincent Cheung Kevin Cannons Signal & Data Compression Laboratory Electrical & Computer Engineering University of Manitoba Winnipeg, Manitoba, Canada Advisor: Dr. W. Kinsner May 27,
More informationMachine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationComparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
More informationLearning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu
Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of
More informationLarge-Scale Similarity and Distance Metric Learning
Large-Scale Similarity and Distance Metric Learning Aurélien Bellet Télécom ParisTech Joint work with K. Liu, Y. Shi and F. Sha (USC), S. Clémençon and I. Colin (Télécom ParisTech) Séminaire Criteo March
More informationIntegrated Data Mining Strategy for Effective Metabolomic Data Analysis
The First International Symposium on Optimization and Systems Biology (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 45 51 Integrated Data Mining Strategy for Effective Metabolomic
More informationLearning Feedback in Intelligent Tutoring Systems
Learning Feedback in Intelligent Tutoring Systems Sebastian Gross Bassam Mokbel Barbara Hammer Niels Pinkwart (This is a preprint of the publication [9], as provided by the authors.) Abstract Intelligent
More informationNovelty Detection in image recognition using IRF Neural Networks properties
Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationVisualization of Breast Cancer Data by SOM Component Planes
International Journal of Science and Technology Volume 3 No. 2, February, 2014 Visualization of Breast Cancer Data by SOM Component Planes P.Venkatesan. 1, M.Mullai 2 1 Department of Statistics,NIRT(Indian
More informationTensor Methods for Machine Learning, Computer Vision, and Computer Graphics
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Part I: Factorizations and Statistical Modeling/Inference Amnon Shashua School of Computer Science & Eng. The Hebrew University
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationK-Means Clustering Tutorial
K-Means Clustering Tutorial By Kardi Teknomo,PhD Preferable reference for this tutorial is Teknomo, Kardi. K-Means Clustering Tutorials. http:\\people.revoledu.com\kardi\ tutorial\kmean\ Last Update: July
More informationLearning with Local and Global Consistency
Learning with Local and Global Consistency Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 7276 Tuebingen, Germany
More informationCSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
More informationBindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8
Spaces and bases Week 3: Wednesday, Feb 8 I have two favorite vector spaces 1 : R n and the space P d of polynomials of degree at most d. For R n, we have a canonical basis: R n = span{e 1, e 2,..., e
More informationProbabilistic Latent Semantic Analysis (plsa)
Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References
More informationLearning with Local and Global Consistency
Learning with Local and Global Consistency Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 7276 Tuebingen, Germany
More informationEM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
More informationSPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
More informationMathematical Models of Supervised Learning and their Application to Medical Diagnosis
Genomic, Proteomic and Transcriptomic Lab High Performance Computing and Networking Institute National Research Council, Italy Mathematical Models of Supervised Learning and their Application to Medical
More informationA Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images
A Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images Małgorzata Charytanowicz, Jerzy Niewczas, Piotr A. Kowalski, Piotr Kulczycki, Szymon Łukasik, and Sławomir Żak Abstract Methods
More informationEfficient online learning of a non-negative sparse autoencoder
and Machine Learning. Bruges (Belgium), 28-30 April 2010, d-side publi., ISBN 2-93030-10-2. Efficient online learning of a non-negative sparse autoencoder Andre Lemme, R. Felix Reinhart and Jochen J. Steil
More informationJPEG compression of monochrome 2D-barcode images using DCT coefficient distributions
Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai
More informationAn Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
More informationReal-Life Industrial Process Data Mining RISC-PROS. Activities of the RISC-PROS project
Real-Life Industrial Process Data Mining Activities of the RISC-PROS project Department of Mathematical Information Technology University of Jyväskylä Finland Postgraduate Seminar in Information Technology
More informationData Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
More informationLecture 8 February 4
ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt
More informationARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
More informationVector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty
More informationINTRODUCTION TO MACHINE LEARNING 3RD EDITION
ETHEM ALPAYDIN The MIT Press, 2014 Lecture Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml3e CHAPTER 1: INTRODUCTION Big Data 3 Widespread
More informationMaximum Margin Clustering
Maximum Margin Clustering Linli Xu James Neufeld Bryce Larson Dale Schuurmans University of Waterloo University of Alberta Abstract We propose a new method for clustering based on finding maximum margin
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationPerformance Metrics for Graph Mining Tasks
Performance Metrics for Graph Mining Tasks 1 Outline Introduction to Performance Metrics Supervised Learning Performance Metrics Unsupervised Learning Performance Metrics Optimizing Metrics Statistical
More informationAnalysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j
Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet
More informationNeural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm michele.scarpiniti@uniroma1.it Rome, 29
More informationMachine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel
More informationLeast-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
More informationVisualization of Topology Representing Networks
Visualization of Topology Representing Networks Agnes Vathy-Fogarassy 1, Agnes Werner-Stark 1, Balazs Gal 1 and Janos Abonyi 2 1 University of Pannonia, Department of Mathematics and Computing, P.O.Box
More informationAdaBoost. Jiri Matas and Jan Šochman. Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz
AdaBoost Jiri Matas and Jan Šochman Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz Presentation Outline: AdaBoost algorithm Why is of interest? How it works? Why
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More informationVisualization of General Defined Space Data
International Journal of Computer Graphics & Animation (IJCGA) Vol.3, No.4, October 013 Visualization of General Defined Space Data John R Rankin La Trobe University, Australia Abstract A new algorithm
More informationData Mining and Neural Networks in Stata
Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it
More informationLecture 2: The SVM classifier
Lecture 2: The SVM classifier C19 Machine Learning Hilary 2015 A. Zisserman Review of linear classifiers Linear separability Perceptron Support Vector Machine (SVM) classifier Wide margin Cost function
More informationSeveral Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
More information6.2.8 Neural networks for data mining
6.2.8 Neural networks for data mining Walter Kosters 1 In many application areas neural networks are known to be valuable tools. This also holds for data mining. In this chapter we discuss the use of neural
More informationMusic Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
More informationBig Data - Lecture 1 Optimization reminders
Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Schedule Introduction Major issues Examples Mathematics
More informationProbabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur
Probabilistic Linear Classification: Logistic Regression Piyush Rai IIT Kanpur Probabilistic Machine Learning (CS772A) Jan 18, 2016 Probabilistic Machine Learning (CS772A) Probabilistic Linear Classification:
More informationLecture 9: Introduction to Pattern Analysis
Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns
More information