Visualization by Linear Projections as Information Retrieval
Jaakko Peltonen

Helsinki University of Technology, Department of Information and Computer Science, P. O. Box 5400, FI-0015 TKK, Finland

Abstract. We apply a recent formalization of visualization as information retrieval to linear projections. We introduce a method that optimizes a linear projection for an information retrieval task: retrieving neighbors of input samples based on their low-dimensional visualization coordinates only. The simple linear projection makes the method easy to interpret, while the visualization task is made well-defined by the novel information retrieval criterion. The method has a further advantage: it projects input features, but the input neighborhoods it preserves can be given separately from the input features, e.g. by external data of sample similarities. Thus the visualization can reveal the relationship between data features and complicated data similarities. We further extend the method to kernel-based projections.

Key words: visualization, information retrieval, linear projection

1 Introduction

Linear projections are widely used to visualize high-dimensional data. They have the advantage of easy interpretation: each axis in the visualization is a simple combination of original data features, which in turn often have clear meanings. Linear projections are also fast to apply to new data. In contrast, nonlinear projections can be hard to interpret, if a functional form of the mapping is available at all. Some nonlinear methods also need much computation or approximation of the mapping to embed new points. Kernel-based projections are a middle ground between linear and nonlinear projections; their computation is linear in the kernel space, and their interpretability depends on the chosen kernel.

The crucial question in linear visualization is what criterion to use for finding the projection. Traditional answers include preservation of maximum variance as in principal component analysis (PCA); preservation of an independence structure as in independent component analysis; preservation of distances and pairwise constraints as in [1]; or maximization of class predictive power as in linear discriminant analysis, informative discriminant analysis [2], neighborhood components analysis [3], metric learning by collapsing classes [4], and others.

When the linear projection is intended for visualization, the previous criteria are insufficient, as they are only indirectly related to visualization. One must first formalize what the task of visualization is, and what good performance measures for that task are.
This question has recently been answered in [5], where the task of visualization is formalized as an information retrieval task, and goodness measures are derived which are generalizations of precision and recall. Based on the goodness measures one can form an optimization criterion and directly optimize the goodness of a visualization in the information retrieval task; however, so far this approach has only been used for nonlinear embedding, where output coordinates are directly optimized without any parametric mapping [5, 6].

We introduce a novel method for linear and kernel-based visualization called the Linear Neighbor Retrieval Visualizer (LINNEA): we apply the formalization of visualization as an information retrieval task, and optimize precision and recall of such retrieval. A useful property is that the input features being projected and the distances used to compute the input neighborhoods can be given separately: for example, the features can be word occurrence vectors of text documents and the distances can be distances of the documents in a citation graph. In special cases, LINNEA is related to the methods stochastic neighbor embedding [7] and metric learning by collapsing classes [4], but it is more general; it can be used for unsupervised and supervised visualization, and allows the user to set the tradeoff between precision and recall of information retrieval. We show by preliminary experiments that LINNEA yields good visualizations of several data sets.

2 Visualization as Information Retrieval

We briefly summarize the formalization of visualization introduced in [5]. The task is visualization of neighborhood or proximity relationships within a high-dimensional data set. For a set of input points x_i ∈ R^{d_0}, i = 1, ..., N, a visualization method yields output coordinates y_i ∈ R^d, which should reveal the neighborhood relationships. This is formalized as an information retrieval task: for any data point, the visualization should allow the user to retrieve its neighboring data points in the original high-dimensional data. Perfect retrieval from a low-dimensional visualization is usually not possible, and the retrieval will make two kinds of errors: not retrieving a neighbor decreases recall of the retrieval, and erroneously retrieving a non-neighbor decreases precision.

To apply the information retrieval concepts of precision and recall to visualization, in [5] they are generalized to continuous and probabilistic measures as follows. For each point i, a neighborhood probability distribution p_{i,j} over all other points j is defined; in [5] an exponentially decaying probability based on input distances d(x_i, x_j) is used. In this paper we allow the d(x_i, x_j) to arise from any definition of distance between points i and j. The retrieval of points from the visualization is also probabilistic: for each point i a distribution q_{i,j} is defined which tells the probability that a particular nearby point j is retrieved from the visualization. The q_{i,j} are defined similarly to the p_{i,j}, but using Euclidean distances \|y_i - y_j\| between the visualization coordinates y_i. This yields

p_{i,j} = \frac{\exp(-d^2(x_i, x_j)/\sigma_i^2)}{\sum_{k \neq i} \exp(-d^2(x_i, x_k)/\sigma_i^2)}, \qquad q_{i,j} = \frac{\exp(-\|y_i - y_j\|^2/\sigma_i^2)}{\sum_{k \neq i} \exp(-\|y_i - y_k\|^2/\sigma_i^2)}    (1)

where the σ_i are scale parameters which can be set by fixing the entropy of the p_{i,j} as suggested in [5]. Since p_{i,j} and q_{i,j} are probability distributions, it is natural to use Kullback-Leibler divergences to measure how well the retrieved distributions correspond to the input neighborhoods. The divergence D_{KL}(p_i, q_i) = \sum_{j \neq i} p_{i,j} \log(p_{i,j}/q_{i,j}) turns out to be a generalization of recall, and D_{KL}(q_i, p_i) turns out to be a generalization of precision. The values of the divergences are averaged over the points i, which yields the final goodness measures.
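As a concrete illustration, the following sketch (hypothetical NumPy code, not taken from the paper) computes the neighborhood distributions p_{i,j} and retrieval distributions q_{i,j} of Eq. (1) from a precomputed matrix of input distances and the current visualization coordinates. The helper name neighborhood_distributions is an assumption, and the scales sigma are assumed to be given directly rather than set by the entropy heuristic of [5].

```python
import numpy as np

def neighborhood_distributions(D, Y, sigma):
    """p[i, j] from input distances D (N x N), q[i, j] from projection coordinates Y (N x d).

    D[i, j] is any user-supplied distance d(x_i, x_j); sigma is a length-N array of scales.
    Each row of the returned matrices sums to one over j != i, as in Eq. (1).
    """
    # Input-space neighborhood probabilities p_{i,j}
    P = np.exp(-D**2 / sigma[:, None]**2)
    np.fill_diagonal(P, 0.0)                      # exclude j = i from the neighborhood
    P /= P.sum(axis=1, keepdims=True)

    # Output-space retrieval probabilities q_{i,j}, from Euclidean distances in the visualization
    Dy = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    Q = np.exp(-Dy**2 / sigma[:, None]**2)
    np.fill_diagonal(Q, 0.0)
    Q /= Q.sum(axis=1, keepdims=True)
    return P, Q
```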
3 The Method: Linear Neighborhood Retrieval Visualizer

The generalizations of precision and recall above can be directly used as optimization goals, but as precision and recall cannot usually both be maximized at the same time, the user must set a tradeoff between them. Given the tradeoff, a single cost function can be defined and visualizations can be directly optimized in terms of that cost function. In the earlier works [5, 6] this approach was used to compute a nonlinear embedding, that is, the output coordinates y_i of data points were optimized directly. In this paper we instead consider a parametric, linear projection y_i = W x_i, where W ∈ R^{d \times d_0} is the projection matrix. We wish to optimize W so that the projection is good for the information retrieval task of visualization. We call the method the Linear Neighborhood Retrieval Visualizer (LINNEA). We use the same cost function as in [5], that is,

E = \lambda \sum_i D_{KL}(p_i, q_i) + (1 - \lambda) \sum_i D_{KL}(q_i, p_i)
  = \sum_i \sum_{j \neq i} \left[ -\lambda p_{i,j} \log q_{i,j} + (1 - \lambda) q_{i,j} \log \frac{q_{i,j}}{p_{i,j}} \right] + \text{const.}    (2)

where the tradeoff parameter λ is to be set by the user to reflect whether precision or recall is more important. We simply use a conjugate gradient algorithm to minimize E with respect to the matrix W. The gradient is

\frac{\partial E}{\partial W} = 2 \sum_i \sum_{j \neq i} \left[ \lambda (p_{i,j} - q_{i,j}) + (1 - \lambda) q_{i,j} \left( D_{KL}(q_i, p_i) - \log \frac{q_{i,j}}{p_{i,j}} \right) \right] \frac{(y_i - y_j)(x_i - x_j)^T}{\sigma_i^2}    (3)

which yields O(N^2) computational complexity per gradient step.

Optimization details. In this paper we simply initialize the elements of W to uniform random numbers between 0 and 1; more complicated initialization, say by initializing W as a principal component analysis projection, is naturally possible. To avoid local optima, we use two simple methods. Firstly, in each run we first set the neighborhood scales to large values, and decrease them after each optimization step until the final scales are reached, after which we run 40 conjugate gradient steps with the final scales. Secondly, we run the algorithm from 10 random initializations and take the result with the best cost function value.
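A minimal sketch of how the projection matrix could be optimized against cost (2), under stated assumptions: the function names linnea_cost and fit_linnea are hypothetical, the code reuses neighborhood_distributions from the previous sketch, and SciPy's conjugate-gradient routine with finite-difference gradients stands in for the analytic gradient (3); the random-restart and scale-annealing schedule described above are omitted.

```python
import numpy as np
from scipy.optimize import minimize

def linnea_cost(W_flat, X, D, sigma, lam, d_out):
    """Cost (2): lam * sum_i KL(p_i || q_i) + (1 - lam) * sum_i KL(q_i || p_i)."""
    W = W_flat.reshape(d_out, X.shape[1])
    Y = X @ W.T                                                  # y_i = W x_i
    P, Q = neighborhood_distributions(D, Y, sigma)
    eps = 1e-12                                                  # numerical safety for the logs
    recall_term = np.sum(P * np.log((P + eps) / (Q + eps)))      # sum_i D_KL(p_i, q_i)
    precision_term = np.sum(Q * np.log((Q + eps) / (P + eps)))   # sum_i D_KL(q_i, p_i)
    return lam * recall_term + (1.0 - lam) * precision_term

def fit_linnea(X, D, sigma, lam=0.5, d_out=2, seed=0):
    rng = np.random.default_rng(seed)
    W0 = rng.uniform(0.0, 1.0, size=(d_out, X.shape[1]))         # random init, as in the paper
    res = minimize(linnea_cost, W0.ravel(),
                   args=(X, D, sigma, lam, d_out), method="CG")  # conjugate gradient
    return res.x.reshape(d_out, X.shape[1])
```

Supplying the analytic gradient (3) as the jac argument of minimize would avoid the finite-difference approximation and follow the paper's procedure more closely.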
3.1 Kernel Version

We now present a kernel version of LINNEA. Instead of simple linear projections we optimize projections from a kernel space: we set y_i = W Φ(x_i), where Φ(·) is some nonlinear transformation to a potentially infinite-dimensional space with inner products given by a kernel function k(x_i, x_j) = Φ(x_i)^T Φ(x_j). As usual, the kernel turns out to be all we need, and knowing Φ is not required. The task is the same as before: to optimize the projection (visualization) so that it is good for information retrieval according to the cost function (2). It is reasonable to assume that the rows w_l^T of W can be expressed as linear combinations of the Φ(x_i), so that w_l = \sum_m a^l_m Φ(x_m), where the a^l_m are the coefficients. Then the projection has the simple form

y_i = \left[ \sum_m a^1_m \Phi(x_m), \ldots, \sum_m a^d_m \Phi(x_m) \right]^T \Phi(x_i) = A K(x_i)    (4)

where the matrix A ∈ R^{d \times N} contains the coefficients A(l, m) = a^l_m and K(x_i) = [k(x_1, x_i), \ldots, k(x_N, x_i)]^T. As before, the coordinates y_i can be used to compute the neighborhoods q_{i,j}, the cost function, and so on. To optimize this kernel-based projection, it is sufficient to optimize the cost function with respect to the coefficient matrix A. We can again use a standard conjugate gradient method: the gradient with respect to A is the same as equation (3), except that x_i and x_j are replaced by K(x_i) and K(x_j). Since A has N columns, the computational complexity becomes O(N^3) per gradient step.

3.2 Properties of LINNEA

A crucial property of LINNEA is that the input features x_i being projected and the distances d(x_i, x_j) used to compute the input neighborhoods can be given separately. At its simplest, d(x_i, x_j) can be the Euclidean distance \|x_i - x_j\|, but it can also be based on other data: for example, x_i can be word occurrence vectors of text documents and d(x_i, x_j) can be distances of the documents in a citation graph (we test this example in Section 4). When the distances are directly computed from the input features, the projection is unsupervised. When the distances are given separately, the projection is supervised by the distances; the projection is then optimized to allow retrieval of neighbors defined by the separately given distances, so it reveals the relationship between the features and the distances. Note that a visualization based on the distances only, say multidimensional scaling computed from citation graph distances between documents, would not provide any relationship between the visualization and the features (document content); in contrast, the LINNEA visualization is directly a projection of the features, which is optimized for retrieval of neighbors based on the distances.

If we set λ = 1 in (2), that is, we maximize recall, this yields the cost function of stochastic neighbor embedding (SNE; [7]); thus LINNEA includes a linear version of SNE as a special case. More generally, the cost function of LINNEA implements a flexible user-defined tradeoff between precision and recall.
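The following sketch illustrates the kernel-based projection (4) under the same assumptions as the earlier sketches; a Gaussian kernel is used as one possible choice of k(·,·), and the function names are hypothetical. The coefficient matrix A would be optimized against cost (2) in the same way as W above, only with K(x_i) in place of x_i.

```python
import numpy as np

def gaussian_kernel_matrix(X, X_ref, gamma=1.0):
    """k(x_m, x_i) = exp(-gamma * ||x_m - x_i||^2); one possible kernel choice."""
    sq = np.sum((X_ref[:, None, :] - X[None, :, :])**2, axis=-1)
    return np.exp(-gamma * sq)                        # shape (N_ref, N): column i is K(x_i)

def kernel_projection(A, X_train, X, gamma=1.0):
    """Eq. (4): y_i = A K(x_i), with K(x_i) = [k(x_1, x_i), ..., k(x_N, x_i)]^T."""
    K = gaussian_kernel_matrix(X, X_train, gamma)     # kernel values against the training set
    return (A @ K).T                                  # one row of output coordinates per point in X
```

Because the projection only needs kernel evaluations against the training points, new samples can be embedded by calling kernel_projection on them with the learned A.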
Another interesting special case follows if the input neighborhoods are derived from class labels of the data points. Consider a straightforward neighborhood p_{i,j}: for any point i in class c_i, the neighbors are the other points from the same class, with equal probabilities. It is easy to show that if we set λ = 1 in the cost function (that is, we maximize recall), this is equivalent to maximizing

\sum_i \sum_{j \neq i} \delta_{c_i, c_j} \log \frac{\exp(-\|y_i - y_j\|^2/\sigma_i^2)}{\sum_{k \neq i} \exp(-\|y_i - y_k\|^2/\sigma_i^2)}    (5)

where δ_{c_i, c_j} = 1 if the classes c_i and c_j are the same and zero otherwise, and for simplicity the classes are assumed equiprobable. This is the cost function of metric learning by collapsing classes (MCML; [4]), which was introduced as a supervised, linear version of SNE. LINNEA includes MCML as a special case. We thus give a new interpretation of MCML: it maximizes recall of same-class points.

Note that LINNEA yields meaningful solutions for the above kind of straightforward input neighborhoods because the mapping is a linear projection of input features. In contrast, methods that freely optimize output coordinates could yield trivial solutions mapping all input points of each class to a single output point. To avoid trivial solutions, such methods can e.g. apply topology-preserving supervised metrics as in [6]; such complicated metrics are not needed in LINNEA.

In summary, LINNEA can be used for both supervised and unsupervised visualization; it is related to well-known methods but is more general, allowing the user to set the tradeoff between the different costs of information retrieval.

4 Experiments

In this first paper we do not yet make thorough comparisons between LINNEA and earlier methods. We show the potential of LINNEA in four experiments; we use principal component analysis (PCA) as a baseline. We use the non-kernel version of LINNEA (Section 3), and set the user-defined tradeoff between precision and recall to λ = 0 (favoring precision only) in experiments 1-3 and λ = 0.1 in experiment 4. Other parameters were defaults from the code of [5].

Experiment 1: Extracting the relevant dimensions. We first test LINNEA on toy data where the visualization can perfectly recover the given input neighborhoods. Consider a spherical Gaussian cloud of 500 points in the Hue-Saturation-Value (HSV) color space, shown in Fig. 1 (left). One cannot represent all three dimensions in one two-dimensional visualization, and without additional knowledge one cannot tell which features are important to preserve, as the shape of the data is the same in all directions. However, if we are also given pairwise distances between points, they determine which features to preserve. Suppose those distances have secretly been computed based only on Hue and Value; then the correct visualization is to take those two dimensions, ignoring Saturation. We optimized a two-dimensional projection with LINNEA; we gave the HSV components of each data point as input features, and computed input neighborhoods from the known pairwise distances. As shown in Fig. 1 (right), LINNEA found the Hue and Value dimensions and ignored Saturation, as desired; the weight of Saturation in the projection directions is close to zero.

Fig. 1. Projection of points in the Hue-Saturation-Value color space. Left: the original three-dimensional data set is a Gaussian point cloud; the coordinates correspond to the Hue, Saturation (grayishness-colorfulness) and Value (lightness-darkness) of each dot. Pairwise input distances were computed from Hue and Value only. Right: LINNEA has correctly found the Hue and Value dimensions in the projection and ignored Saturation.
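A sketch of how this kind of setup could be reproduced (hypothetical code, not the paper's experiment script): the features are a three-dimensional Gaussian cloud standing in for the HSV coordinates, while the pairwise input distances are computed from only two of the three features, so the p_{i,j} of Eq. (1) are driven by those two dimensions alone. The scale choice is a simplification (the paper fixes the entropy of each p_i instead), and fit_linnea refers to the earlier sketch.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                  # stand-in for (Hue, Saturation, Value) features
D = squareform(pdist(X[:, [0, 2]]))            # distances "secretly" use only dimensions 0 and 2
sigma = np.full(500, np.median(D))             # one crude global scale; the paper fixes the
                                               # entropy of each p_i instead
W = fit_linnea(X, D, sigma, lam=0.0, d_out=2)  # favor precision, as in experiments 1-3
```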
Experiment 2: S-curve. We visualize a data set having a simple underlying manifold: 1000 points sampled along a two-dimensional manifold, embedded in three-dimensional space as an S-shaped curve as shown in Fig. 2 (left). No external pairwise distances are given, and the input neighborhoods are computed from the three-dimensional input features. The task is to find a visualization where the original neighbors on the manifold can be retrieved well. Unfolding the manifold would suffice; however, a linear projection cannot unfold the nonlinear S-curve perfectly. The PCA solution in Fig. 2 (middle) leaves out the original Z-axis, which is suboptimal for retrieving original neighbors as it leaves visible only one of the coordinates of the underlying two-dimensional manifold. The LINNEA result in Fig. 2 (right) emphasizes the Z and Y directions; this shows the coordinates of the underlying manifold well and allows retrieval of neighbors of input points.

Fig. 2. Projections of an S-curve. Left: the original three-dimensional data. Middle: PCA neglects the original Z-direction. Right: LINNEA finds a projection where neighbors on the underlying manifold can be retrieved well from the visualization.
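For reference, a small sketch (again hypothetical, using scikit-learn as a stand-in for the paper's data generation) that produces an S-curve like the one above and computes the PCA baseline projection; the LINNEA projection would be obtained by replacing the PCA step with the optimization sketched in Section 3, with input neighborhoods computed from the three-dimensional features themselves since this case is unsupervised.

```python
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA

X, t = make_s_curve(n_samples=1000, random_state=0)  # 3-D points on an S-shaped 2-D manifold
Y_pca = PCA(n_components=2).fit_transform(X)         # baseline linear projection
# For LINNEA, the input distances would here simply be Euclidean distances between rows of X,
# e.g. D = squareform(pdist(X)), since no external distances are given.
```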
Experiment 3: Projection of face images. We visualize a data set of human faces [8]. The data set has 698 synthetic face images in different orientations and lighting directions; each image is given as a high-dimensional vector of pixel values. We first find linear projections of the face images using the pixel images as input features, without giving any additional knowledge. As shown in Fig. 3 (top left), PCA reveals part of the data structure, but the result is unsatisfactory for retrieving neighboring faces, since PCA has clumped together the back-lit faces. In contrast, as shown in Fig. 3 (top right), LINNEA spreads out both front-lit and back-lit faces. The projection directions can be interpreted as linear filters of the images. For PCA the filter on the horizontal axis roughly responds to a left-facing head, and the filter on the vertical axis roughly detects the left-right lighting direction. The LINNEA filters are complicated; more analysis of the filters is needed in future work.

Projection to retrieve known pose/lighting neighbors. For the face data the pose and lighting parameters of the faces are available. We can then compute pairwise input distances based on these parameters, and use LINNEA to find a supervised visualization of the pixel images that best allows retrieval of the pose/lighting neighbors of each face. The LINNEA projection is shown in Fig. 3 (bottom). The face images are arranged quite well in terms of pose and lighting; the top-left to bottom-right axis roughly separates left- and right-facing front-lit faces, and the top-right to bottom-left axis roughly separates left- and right-facing back-lit faces. The corresponding filters are somewhat complicated; the filters on the vertical axis and horizontal axis seem to roughly detect edges and lighting direction, respectively. The underlying pose/lighting space is three-dimensional and cannot be represented exactly by a two-dimensional mapping, thus the filters are compromises between representing several aspects of pose/lighting. Note that running e.g. PCA on the known pose/lighting parameters would not yield filters of the pixel images, thus it would not tell how the pixel data is related to pose/lighting; in contrast, LINNEA optimizes filters for retrieval of pose/lighting neighbors.

Fig. 3. Projections of face images. Top: unsupervised projections of the pixel images by PCA (left) and LINNEA (right). The linear projection directions can be interpreted as linear filters of the images, which are shown for each axis. Bottom: a supervised projection by LINNEA. Pairwise distances were derived from known pose/lighting parameters of the faces. LINNEA has optimized projections of the pixel images for retrieving neighbors having similar pose/lighting parameters. See the text for more analysis.

Experiment 4: Visualization of scientific documents. We visualize the CiteSeer data set, which contains scientific articles and their citations; the data set is publicly available. Each article is described by a binary 3703-dimensional vector telling which words appeared in the article; we used these vectors as the input features. To reduce the computational load we took the subset of 1000 articles having the highest number of inbound plus outbound citations. We provide separate pairwise input distances, simply taking the graph distance in the citation graph: that is, two documents where one cites the other have distance 1, documents that cite the same other document have distance 2, and so on. As a simplification we assumed citation to be a symmetric relation, and as a regularization we upper bounded the graph distance to 10.
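A sketch of the graph-distance construction described above, assuming a symmetric 0/1 citation adjacency matrix; the function name citation_graph_distances is hypothetical, and SciPy's shortest-path routine is used to obtain hop counts, which are then capped at 10 as in the text.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def citation_graph_distances(adjacency, cap=10.0):
    """Hop-count distances in the (symmetrized) citation graph, upper bounded by `cap`.

    adjacency[i, j] = 1 if document i cites document j, else 0.
    """
    A = csr_matrix(np.maximum(adjacency, adjacency.T))   # treat citation as a symmetric relation
    D = shortest_path(A, method="D", unweighted=True)    # breadth-first hop counts (Dijkstra, unit weights)
    D[np.isinf(D)] = cap                                 # unreachable pairs get the upper bound
    return np.minimum(D, cap)
```

The resulting matrix can be passed to LINNEA as the separately given input distances, while the word occurrence vectors remain the projected features.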
We use LINNEA (with λ = 0.1) to optimize a two-dimensional visualization where neighbors according to this graph distance can be best retrieved. The result is shown in Fig. 4. In the baseline PCA projection (left subfigure) citations are spread all over the data with little visible structure, whereas the LINNEA projection (right subfigure) shows clear structure: clusters where documents cite each other, and citation connections between the clusters. For this data each feature is a word; unfortunately the identities of the words are unavailable. In general one can interpret the projection directions given by LINNEA by listing, for each direction, the words having the largest weights.

Fig. 4. Projections of scientific documents. Documents are shown as dots and citations between two documents are shown as lines. Left: the PCA projection of document content vectors does not reveal citation neighborhoods well. Right: the projection by LINNEA shows clusters of citing documents and connections between the clusters.

5 Conclusions

We introduced a novel method for visualization by linear or kernel-based projection. The projection is optimized for information retrieval of original neighbor points from the visualization, with a user-defined tradeoff between precision and recall. The method can either find projections for the input features as such, or find projections that reveal the relationships between input features and separately given input distances. The method yields good visualizations of several data sets.
Acknowledgments. The author belongs to the Adaptive Informatics Research Centre and to the Helsinki Institute for Information Technology HIIT. He was supported by the Academy of Finland, decision 13983, and in part by the PASCAL Network of Excellence. He thanks Samuel Kaski for very useful discussions.

References

1. Cevikalp, H., Verbeek, J., Jurie, F., Kläser, A.: Semi-supervised dimensionality reduction using pairwise equivalence constraints. In: Proc. VISAPP 2008 (2008)
2. Peltonen, J., Kaski, S.: Discriminative components of data. IEEE Trans. Neural Networks 16(1) (2005)
3. Goldberger, J., Roweis, S., Hinton, G., Salakhutdinov, R.: Neighbourhood components analysis. In: Proc. NIPS 2004. MIT Press, Cambridge, MA (2005)
4. Globerson, A., Roweis, S.: Metric learning by collapsing classes. In: Proc. NIPS 2005. MIT Press, Cambridge, MA (2006)
5. Venna, J., Kaski, S.: Nonlinear dimensionality reduction as information retrieval. In: Proc. AISTATS*07 (2007)
6. Peltonen, J., Aidos, H., Kaski, S.: Supervised nonlinear dimensionality reduction by neighbor retrieval. In: Proc. ICASSP 2009. In press.
7. Hinton, G., Roweis, S.T.: Stochastic neighbor embedding. In: Proc. NIPS 2002. MIT Press, Cambridge, MA (2002)
8. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290 (December 2000)