Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps
|
|
|
- Bernadette Ford
- 10 years ago
- Views:
Transcription
1 Technical Report OeFAI-TR , extended version published in Proceedings of the International Conference on Artificial Neural Networks, Springer Lecture Notes in Computer Science, Madrid, Spain, Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps Elias Pampalk 1, Andreas Rauber 2,, and Dieter Merkl 2 1 Austrian Research Institute for Artificial Intelligence (OeFAI) Schottengasse 3, A-1010 Vienna, Austria [email protected] 2 Department of Software Technology and Interactive Systems Vienna University of Technology Favoritenstr. 9-11/188, A-1040 Vienna, Austria {andi, dieter}@ifs.tuwien.ac.at Abstract. Several methods to visualize clusters in high-dimensional data sets using the Self-Organizing Map (SOM) have been proposed. However, most of these methods only focus on the information extracted from the model vectors of the SOM. This paper introduces a novel method to visualize the clusters of a SOM based on smoothed data histograms. The method is illustrated using a simple 2-dimensional data set and similarities to other SOM based visualizations and to the posterior probability distribution of the Generative Topographic Mapping are discussed. Furthermore, the method is evaluated on a real world data set consisting of pieces of music. 1 Introduction The Self-Organizing Map (SOM) [1] is frequently employed in exploratory data analysis to project multivariate data onto a 2-dimensional map in such a way that data items close to each other in the high-dimensional data space are close to each other on the map. In the interactive process of data mining such maps can be utilized to visualize clusters in the data set to support the user in understanding the inherent structure of the data. Several methods to visualize clusters based on the SOM can be found in the literature. The most commonly used method, i.e. the U-Matrix [2], visualizes the distances between the model vectors of units which are immediate neighbors. Alternatively, the model vectors can be clustered [3] using techniques such as Part of this work was supported by the Austrian FWF under grant Y99-INF. Part of the work was performed while the author was an ERCIM Research Fellow at IEI, Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy.
2 k-means or hierarchical agglomerative clustering. The clusters or dendrograms can be visualized on top of the map. Another possibility to analyze the cluster structure of the SOM is to project the model vectors into low-dimensional spaces and use a color coding to link the projections with the original map [4, 5]. A similar approach is to mirror the movement of the model vectors during SOM training in a two-dimensional output space using Adaptive Coordinates [6]. A rather different approach to visualize clusters are data histograms which count how many data items are best represented by a specific unit. For an overview of different possibilities to visualize the data histograms see e.g. [7]. However, the problem with data histograms is that considering only the best matching unit for each data item ignores the fact that data items are usually represented well by more than just one unit. The visualization method presented in this paper is based on smoothed data histograms (SDH). The underlying idea is, that clusters are areas in the data space with a high density of data items. The results are related to the posterior probability distribution obtained by e.g. the Generative Topographic Mapping (GTM) [8], however the technique itself is very simple and computationally not heavier than the calculation of the standard data histogram. The remainder of this paper is organized as follows. Section 2 presents the calculation of SDHs and discusses the similarities to other methods. In Section 3 the method is evaluated using a data set consisting of 359 pieces of music, and in Section 4 some conclusions are drawn. 2 Smoothed Data Histograms 2.1 Principles The SOM consists of units which are usually ordered on a rectangular 2-dimensional grid which is referred to as map. A model vector in the high-dimensional data space is assigned to each of the units. During the training process the model vectors are fitted to the data in such a way that the distances between the data items and the corresponding closest model vectors are minimized under the constraint that model vectors which belong to units close to each other on the map, are also close to each other in the data space. The objective of the SDH is to visualize the clusters in the data set through estimation of the probability density of the high-dimensional data on the map. This is achieved by using the SOM as basis for a smoothed data histogram. The map units are interpreted as bins. The bin centers in the data space are defined by the model vectors and the varying bin widths are defined through the distances between the model vectors. The membership degree of a data item to a specific bin is governed by the smoothing parameter s and calculated based on the rank of the distances between the data item and all bin centers. In particular, the membership degree is s/c s to the closest bin, (s 1)/c s to the second, (s 2)/c s to the third, and so forth. The membership to all but the closest s bins is 0. The constant c s = s 1 i=0 s i ensures that the total membership of each data item adds up to 1.
3 (a) (b) (c) (a) (b) Fig. 1. The 2-dimensional data space. (a) The probability distribution from which (b) the sample was drawn and (c) the model vectors of the SOM. Fig. 2. The SDH (s = 8) visualized using (a) gray shadings and (b) contours. (a) s = 1 (b) s = 3 (c) s = 5 (d) s = 15 (e) s = 20 (f) s = 50 Fig. 3. Different values for the smoothing parameters s and their effects on the SDH. The SDH is illustrated and the similarities to other methods are discussed using the 2-dimensional data set presented in Figures 1(a) and 1(b). The data set consists of 5000 samples, randomly drawn from a probability distribution that is a mixture of 5 Gaussians. The SOM consists of units which are arranged on a rectangular grid. The model vectors in the data space, where the immediate neighbors are connected by a line, are shown in Figure 1(c). Figure 2(a) illustrates the SDH for s = 8. The SOM has been rotated so that the orientation of the latent space corresponds to the orientation of the data space. The gray shadings are chosen so that darker shadings correspond to lower values, while areas shown in lighter shadings correspond to higher values of the SDH. To further simplify the SDH visualization a 5-level contour plot is presented in Figure 2(b), where the values between units are interpolated. A detailed comparison of the SDH and the model vectors in the data space reveals, that the cluster centers found by the SDH correspond well to the cluster centers of the data set. The influence of the parameter s can be seen in Figure 3. For s = 1 the results are identical with those of the data histogram, where clusters cannot easily be identified. For example, the cluster in the upper left corner is represented by three different peaks. For increasing values of s the general characteristics of the data space become more obvious until levelling out into a coarser cluster representation. For extremely high values of s there is only one big cluster which has its peak approximately in the center of the map. The different cluster shapes reflect the hierarchical structure of the clusters in the data. In particular, s 50 resembles the top level in the hierarchy, where the whole data set appears as unity. On the next level, with s 20, the difference between the upper right cluster and the rest of the data set becomes noticeable.
4 (a) (b) (c) (d) (e) (f) Fig. 4. Alternatives to the SDH, in particular: (a) U-matrix, (b) distance matrix, (c) as contour plot, (d) k-means, (e) posterior distribution of GTM, (f) as contour plot. In the range 5 s 15 the 5 clusters can easily be identified, and s < 5 depicts the noise within the data. The appropriate hierarchical level and thus the optimal value for the smoothing parameter s depends on the respective application and in particular on the noise level contained in the data. 2.2 Similarities to other Visualization Methods The U-matrix of the SOM presented in Figure 1(c) can be seen in Figure 4(a). The large distances between the model vectors which are in the center right of the map become clearly visible in the U-matrix. However, these large distances are so dominant that the other distances seem less significant. Figure 4(b) depicts the distance matrix which is calculated as the median of the all the distances of the model vectors between a unit and its immediate neighbors. Although the big gap in the center right of the map is still very dominating, the contour plot depicts that there are 5 clusters (cf. Figure 4(c)). Generally, there is a close relationship between the distance of the model vectors and the probability density of the data due to an important characteristic of the SOM algorithm known as magnification factors. Areas with a high density of data items are represented by more model vectors and thus with more detail than sparse areas. Another advantage of visualizing the distances between the model vectors is that there is no need for further parameters. Note that the cluster centers found by the SDH with s = 8 and with the distance matrix are approximately identical. Yet, as we will see in Section 3, the distance matrix provides sub-optimal results in many real-world data mining applications. Figure 4(d) illustrates the k-means (with k = 5) clustering of the model vectors where all map units belonging to one cluster have the same gray shading. Although the 5 clusters can clearly be identified, it is necessary to know how many clusters there are beforehand. The Davies-Bouldin index [9], for example, indicates that k = 2 would be a better choice, resulting only in the separation of the upper right cluster from the other four clusters. The main advantage of the posterior probability of the GTM (cf. Figures 4(e - f)) is the precisely defined statistical interpretation. The GTM was trained using latent points, 4 basis functions, and σ = 3. Although this paper does not deal with visualizing the GTM, it is interesting to note the similarities between the posterior probability and the SDH.
5 (a) s = 1 (b) s = 2 (c) s = 4 (d) s = 15 Fig. 5. Influence of the parameter s on the SDH visualization of the music collection (a) (b) Fig. 6. Music collection with SDH, s = 3. Fig. 7. Alternative visualizations: (a) distance matrix, (b) k-means, k = 6. 3 Visualizing a Music Collection The SDH was evaluated using a data set consisting of 359 pieces of music which are represented by 1200 features based on Fourier-transformed frequency spectra. The main goal of this experiment is to obtain a clustering of music according to perceived acoustic similarities similar to the results presented in [10]. For full details on the feature extraction and experiments see [11]. Figure 5 depicts the effects of different s values for the SDH of the SOM trained with the music collection. The best results were obtained with s = 3 (cf. Figure 6). Using, for example, s = 15 the cluster in the upper left of the map disappears. Using values below s = 3 the cluster in the lower right becomes too dominant. Yet, in all cases the cluster structure is clearly visible. We will now take a closer look at the interpretation of some of the clusters. Cluster 1 in Figure 6 represents music with very strong beats. In particular, several songs of the group Bomfunk MCs are located there, but also songs such as Blue by Eiffel 65 or Let s get loud by Jennifer Lopez. Cluster 2 represents music mainly by the rock band Red Hot Chili Peppers including songs like Californication and Otherside. Cluster 3 represents more aggressive music by bands such as Limp Bizkit, Papa Roaches, Korn. Cluster 4 represents slightly less aggressive music by groups such as Guano Apes and K s Choice. Cluster 5 represents concert music and classical music used for films, e.g., the well known Starwars theme. Cluster 6 represents peaceful, classical pieces such as Für Elise by Beethoven or Eine kleine Nachtmusik by Mozart. Figure 7 depicts the distance matrix and the k-means visualization of the same SOM. While the former clearly identifies Cluster 1 in the upper left corner, all other clusters are not revealed, mainly because of the general similarity of data in that area and the dominance of the difference to the cluster in the upper
6 left. The features extracted represent the dynamic patterns of the music pieces and emphasize in particular beats reoccurring in fixed time intervals. Thus, for example, the values of songs by Bomfunk MCs are much higher than those of Für Elise. The general tendency of the increasing strength of the beats can also be guessed from the k-means visualization for k = 6 in Figure 7(b), yet, the cluster structure itself is not visible. 4 Conclusion The smoothed data histogram is a simple, robust, and computationally light cluster visualization method for self-organizing maps. We illustrated SDHs using a simple 2-dimensional data set and discussed the similarities to other SOM based visualizations as well as to the posterior probability distribution of the GTM. Furthermore, the SDH was evaluated on a real world data set consisting of pieces of music. Observed results show improved cluster identification compared to alternative visualizations based solely on the information extracted from the model vectors of the SOM. The SDH is able to identify clusters by resembling the probability distribution of the data on the map. The low complexity of the SDH calculation allows interactive determination of the smoothing parameter s, and thus is well suited for interactive data analysis. We provide a Matlab r toolbox to visualize SDHs together with several demonstrations at elias/sdh. References 1. Kohonen, T.: Self-Organizing Maps. 3rd edn. Springer-Verlag, Berlin (2001) 2. Ultsch, A., Siemon, H.: Kohonen s self-organizing feature maps for exploratory data analysis. In: Proc Int l Neural Network Conference (INNC 90), Dordrecht, Netherlands, Kluwer (1990) Vesanto, J., Alhoniemi, E.: Clustering of the Self-Organizing Map. IEEE Transactions on Neural Networks 11 (2000) Himberg, J.: Enhancing the SOM based data visualization by linking different data projections. In: Proc Int l Symp on Intelligent Data Engineering and Learning (IDEAL 98), Hong Kong (1998) Kaski, S., Kohonen, T., Venna, J.: Tips for SOM processing and colorcoding of maps. In: Visual Explorations in Finance. Springer Verlag, Berlin, Germany (1998) 6. Merkl, D., Rauber, A.: Alternative ways for cluster visualization in self-organizing maps. In: Proc Workshop on Self-Organizing Maps (WSOM97), Espoo, Finland 7. Vesanto, J.: SOM-Based data visualization methods. Intelligent Data Analysis 3 (1999) Bishop, C., Svensen, M., Williams, C.: GTM: The generative topographic mapping. Neural Computation 10 (1998) Davies, D., Bouldin, D.: A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence 1 (1979) Rauber, A., Frühwirth, M.: Automatically analyzing and organizing music archives. In: Proc 5. Europ Conf on Research and Advanced Technology for Digital Libraries (ECDL 2001). Darmstadt, Germany, Springer (2001) 11. Pampalk, E.: Islands of Music: Analysis, Organization, and Visualization of Music Archives. Master s thesis, Vienna University of Technology, Austria (2001) elias/music
Data topology visualization for the Self-Organizing Map
Data topology visualization for the Self-Organizing Map Kadim Taşdemir and Erzsébet Merényi Rice University - Electrical & Computer Engineering 6100 Main Street, Houston, TX, 77005 - USA Abstract. The
Visualization of Breast Cancer Data by SOM Component Planes
International Journal of Science and Technology Volume 3 No. 2, February, 2014 Visualization of Breast Cancer Data by SOM Component Planes P.Venkatesan. 1, M.Mullai 2 1 Department of Statistics,NIRT(Indian
A Discussion on Visual Interactive Data Exploration using Self-Organizing Maps
A Discussion on Visual Interactive Data Exploration using Self-Organizing Maps Julia Moehrmann 1, Andre Burkovski 1, Evgeny Baranovskiy 2, Geoffrey-Alexeij Heinze 2, Andrej Rapoport 2, and Gunther Heidemann
On the use of Three-dimensional Self-Organizing Maps for Visualizing Clusters in Geo-referenced Data
On the use of Three-dimensional Self-Organizing Maps for Visualizing Clusters in Geo-referenced Data Jorge M. L. Gorricha and Victor J. A. S. Lobo CINAV-Naval Research Center, Portuguese Naval Academy,
USING SELF-ORGANIZING MAPS FOR INFORMATION VISUALIZATION AND KNOWLEDGE DISCOVERY IN COMPLEX GEOSPATIAL DATASETS
USING SELF-ORGANIZING MAPS FOR INFORMATION VISUALIZATION AND KNOWLEDGE DISCOVERY IN COMPLEX GEOSPATIAL DATASETS Koua, E.L. International Institute for Geo-Information Science and Earth Observation (ITC).
Visualizing an Auto-Generated Topic Map
Visualizing an Auto-Generated Topic Map Nadine Amende 1, Stefan Groschupf 2 1 University Halle-Wittenberg, information manegement technology [email protected] 2 media style labs Halle Germany [email protected]
Local Anomaly Detection for Network System Log Monitoring
Local Anomaly Detection for Network System Log Monitoring Pekka Kumpulainen Kimmo Hätönen Tampere University of Technology Nokia Siemens Networks [email protected] [email protected] Abstract
A Computational Framework for Exploratory Data Analysis
A Computational Framework for Exploratory Data Analysis Axel Wismüller Depts. of Radiology and Biomedical Engineering, University of Rochester, New York 601 Elmwood Avenue, Rochester, NY 14642-8648, U.S.A.
3D Content-Based Visualization of Databases
3D Content-Based Visualization of Databases Anestis KOUTSOUDIS Dept. of Electrical and Computer Engineering, Democritus University of Thrace 12 Vas. Sofias Str., Xanthi, 67100, Greece Fotis ARNAOUTOGLOU
Visualization of large data sets using MDS combined with LVQ.
Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk
Self-Organizing g Maps (SOM) COMP61021 Modelling and Visualization of High Dimensional Data
Self-Organizing g Maps (SOM) Ke Chen Outline Introduction ti Biological Motivation Kohonen SOM Learning Algorithm Visualization Method Examples Relevant Issues Conclusions 2 Introduction Self-organizing
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,
Knowledge Discovery in Stock Market Data
Knowledge Discovery in Stock Market Data Alfred Ultsch and Hermann Locarek-Junge Abstract This work presents the results of a Data Mining and Knowledge Discovery approach on data from the stock markets
Self Organizing Maps for Visualization of Categories
Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, [email protected]
A user friendly toolbox for exploratory data analysis of underwater sound
061215-064 1 A user friendly toolbox for exploratory data analysis of underwater sound Fernando J. Pires 1 and Victor Lobo 2, Member, IEEE Abstract The underwater acoustics research group at the Portuguese
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: [email protected]
Monitoring of Complex Industrial Processes based on Self-Organizing Maps and Watershed Transformations
Monitoring of Complex Industrial Processes based on Self-Organizing Maps and Watershed Transformations Christian W. Frey 2012 Monitoring of Complex Industrial Processes based on Self-Organizing Maps and
Comparing large datasets structures through unsupervised learning
Comparing large datasets structures through unsupervised learning Guénaël Cabanes and Younès Bennani LIPN-CNRS, UMR 7030, Université de Paris 13 99, Avenue J-B. Clément, 93430 Villetaneuse, France [email protected]
INTERACTIVE DATA EXPLORATION USING MDS MAPPING
INTERACTIVE DATA EXPLORATION USING MDS MAPPING Antoine Naud and Włodzisław Duch 1 Department of Computer Methods Nicolaus Copernicus University ul. Grudziadzka 5, 87-100 Toruń, Poland Abstract: Interactive
Exploratory Data Analysis with MATLAB
Computer Science and Data Analysis Series Exploratory Data Analysis with MATLAB Second Edition Wendy L Martinez Angel R. Martinez Jeffrey L. Solka ( r ec) CRC Press VV J Taylor & Francis Group Boca Raton
Topic Maps Visualization
Topic Maps Visualization Bénédicte Le Grand, Laboratoire d'informatique de Paris 6 Introduction Topic maps provide a bridge between the domains of knowledge representation and information management. Topics
ViSOM A Novel Method for Multivariate Data Projection and Structure Visualization
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 1, JANUARY 2002 237 ViSOM A Novel Method for Multivariate Data Projection and Structure Visualization Hujun Yin Abstract When used for visualization of
Clustering. Data Mining. Abraham Otero. Data Mining. Agenda
Clustering 1/46 Agenda Introduction Distance K-nearest neighbors Hierarchical clustering Quick reference 2/46 1 Introduction It seems logical that in a new situation we should act in a similar way as in
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY
QÜESTIIÓ, vol. 25, 3, p. 509-520, 2001 PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY GEORGES HÉBRAIL We present in this paper the main applications of data mining techniques at Electricité de France,
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
2002 IEEE. Reprinted with permission.
Laiho J., Kylväjä M. and Höglund A., 2002, Utilization of Advanced Analysis Methods in UMTS Networks, Proceedings of the 55th IEEE Vehicular Technology Conference ( Spring), vol. 2, pp. 726-730. 2002 IEEE.
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
Online data visualization using the neural gas network
Online data visualization using the neural gas network Pablo A. Estévez, Cristián J. Figueroa Department of Electrical Engineering, University of Chile, Casilla 412-3, Santiago, Chile Abstract A high-quality
Data Mining. Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based
Annotated bibliographies for presentations in MUMT 611, Winter 2006
Stephen Sinclair Music Technology Area, McGill University. Montreal, Canada Annotated bibliographies for presentations in MUMT 611, Winter 2006 Presentation 4: Musical Genre Similarity Aucouturier, J.-J.
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Exploring Data Lecture Notes for Chapter 3 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Topics Exploratory Data Analysis Summary Statistics Visualization What is data exploration?
Integration of Cluster Analysis and Visualization Techniques for Visual Data Analysis
Integration of Cluster Analysis and Visualization Techniques for Visual Data Analysis M. Kreuseler, T. Nocke, H. Schumann, Institute of Computer Graphics University of Rostock, D-18059 Rostock, Germany
IMPROVISATION OF STUDYING COMPUTER BY CLUSTER STRATEGIES
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND SCIENCE IMPROVISATION OF STUDYING COMPUTER BY CLUSTER STRATEGIES C.Priyanka 1, T.Giri Babu 2 1 M.Tech Student, Dept of CSE, Malla Reddy Engineering
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
Advanced Web Usage Mining Algorithm using Neural Network and Principal Component Analysis
Advanced Web Usage Mining Algorithm using Neural Network and Principal Component Analysis Arumugam, P. and Christy, V Department of Statistics, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu,
High-dimensional labeled data analysis with Gabriel graphs
High-dimensional labeled data analysis with Gabriel graphs Michaël Aupetit CEA - DAM Département Analyse Surveillance Environnement BP 12-91680 - Bruyères-Le-Châtel, France Abstract. We propose the use
Visualization of textual data: unfolding the Kohonen maps.
Visualization of textual data: unfolding the Kohonen maps. CNRS - GET - ENST 46 rue Barrault, 75013, Paris, France (e-mail: [email protected]) Ludovic Lebart Abstract. The Kohonen self organizing
Visualization and Data Mining of Pareto Solutions Using Self-Organizing Map
Visualization and Data Mining of Solutions Using Self-Organizing Map Shigeru Obayashi and Daisuke Sasaki Institute of Fluid Science, Tohoku University, Sendai, 980-8577 JAPAN [email protected], [email protected]
ADVANCED MACHINE LEARNING. Introduction
1 1 Introduction Lecturer: Prof. Aude Billard ([email protected]) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures
VISUALIZATION OF GEOSPATIAL DATA BY COMPONENT PLANES AND U-MATRIX
VISUALIZATION OF GEOSPATIAL DATA BY COMPONENT PLANES AND U-MATRIX Marcos Aurélio Santos da Silva 1, Antônio Miguel Vieira Monteiro 2 and José Simeão Medeiros 2 1 Embrapa Tabuleiros Costeiros - Laboratory
Exploratory Data Analysis Using Radial Basis Function Latent Variable Models
Exploratory Data Analysis Using Radial Basis Function Latent Variable Models Alan D. Marrs and Andrew R. Webb DERA St Andrews Road, Malvern Worcestershire U.K. WR14 3PS {marrs,webb}@signal.dera.gov.uk
Proceedings - AutoCarto 2012 - Columbus, Ohio, USA - September 16-18, 2012
Data Mining of Collaboratively Collected Geographic Crime Information Using an Unsupervised Neural Network Approach Julian Hagenauer, Marco Helbich (corresponding author), Michael Leitner, Jerry Ratcliffe,
In Apostolos-Paul N. Refenes, Yaser Abu-Mostafa, John Moody, and Andreas Weigend (Eds.) Neural Networks in Financial Engineering. Proceedings of the Third International Conference on Neural Networks in
Visualizing class probability estimators
Visualizing class probability estimators Eibe Frank and Mark Hall Department of Computer Science University of Waikato Hamilton, New Zealand {eibe, mhall}@cs.waikato.ac.nz Abstract. Inducing classifiers
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode
Iris Sample Data Set Basic Visualization Techniques: Charts, Graphs and Maps CS598 Information Visualization Spring 2010 Many of the exploratory data techniques are illustrated with the Iris Plant data
Visualization of High Dimensional Scientific Data
Visualization of High Dimensional Scientific Data Roberto Tagliaferri and Antonino Staiano Department of Mathematics and Computer Science, University of Salerno, Italy {robtag,astaiano}@unisa.it Copyright
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
CITY UNIVERSITY OF HONG KONG 香 港 城 市 大 學. Self-Organizing Map: Visualization and Data Handling 自 組 織 神 經 網 絡 : 可 視 化 和 數 據 處 理
CITY UNIVERSITY OF HONG KONG 香 港 城 市 大 學 Self-Organizing Map: Visualization and Data Handling 自 組 織 神 經 網 絡 : 可 視 化 和 數 據 處 理 Submitted to Department of Electronic Engineering 電 子 工 程 學 系 in Partial Fulfillment
An Introduction to Point Pattern Analysis using CrimeStat
Introduction An Introduction to Point Pattern Analysis using CrimeStat Luc Anselin Spatial Analysis Laboratory Department of Agricultural and Consumer Economics University of Illinois, Urbana-Champaign
Analyzing The Role Of Dimension Arrangement For Data Visualization in Radviz
Analyzing The Role Of Dimension Arrangement For Data Visualization in Radviz Luigi Di Caro 1, Vanessa Frias-Martinez 2, and Enrique Frias-Martinez 2 1 Department of Computer Science, Universita di Torino,
Utilizing spatial information systems for non-spatial-data analysis
Jointly published by Akadémiai Kiadó, Budapest Scientometrics, and Kluwer Academic Publishers, Dordrecht Vol. 51, No. 3 (2001) 563 571 Utilizing spatial information systems for non-spatial-data analysis
Data Visualization with Simultaneous Feature Selection
1 Data Visualization with Simultaneous Feature Selection Dharmesh M. Maniyar and Ian T. Nabney Neural Computing Research Group Aston University, Birmingham. B4 7ET, United Kingdom Email: {maniyard,nabneyit}@aston.ac.uk
Visual Cluster Analysis of Trajectory Data With Interactive Kohonen Maps
Visual Cluster Analysis of Trajectory Data With Interactive Kohonen Maps Tobias Schreck Interactive Graphics Systems Group TU Darmstadt, Germany Jürgen Bernard Interactive Graphics Systems Group TU Darmstadt,
Personalized Hierarchical Clustering
Personalized Hierarchical Clustering Korinna Bade, Andreas Nürnberger Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, D-39106 Magdeburg, Germany {kbade,nuernb}@iws.cs.uni-magdeburg.de
Clustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is
Clustering 15-381 Artificial Intelligence Henry Lin Modified from excellent slides of Eamonn Keogh, Ziv Bar-Joseph, and Andrew Moore What is Clustering? Organizing data into clusters such that there is
Neural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm [email protected] Rome, 29
Colour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone
9. Text & Documents. Visualizing and Searching Documents. Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08
9. Text & Documents Visualizing and Searching Documents Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08 Slide 1 / 37 Outline Characteristics of text data Detecting patterns SeeSoft
A Study of Web Log Analysis Using Clustering Techniques
A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
Non-Stop Optical Illusions A Teacher s Guide to the Empire State Plaza Art Collection
Non-Stop Optical Illusions A Teacher s Guide to the Empire State Plaza Art Collection New York State Office of General Services Curatorial/Tour Services 29 th Floor, Corning Tower Albany, NY 12242 518.473.7521
Machine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
Template-based Eye and Mouth Detection for 3D Video Conferencing
Template-based Eye and Mouth Detection for 3D Video Conferencing Jürgen Rurainsky and Peter Eisert Fraunhofer Institute for Telecommunications - Heinrich-Hertz-Institute, Image Processing Department, Einsteinufer
