Topological Data Analysis


1 INF563 Topological Data Analysis. Steve Oudot, Mathieu Carrière

2-3 Context: The data deluge
Speaker note (translated from French, truncated in source): data of this type arise in scientific and industrial contexts.
Data are generated at an unprecedented rate by academia, industry, and the general public.
Need for new scalable methods to analyze and classify these data automatically.

4-5 Exploratory analysis of geometric data
Speaker note (translated from French, truncated in source): my research is set in the context of exploratory data analysis, whose obj...
Input: a set of data points with a metric or (dis-)similarity measure. A data point may be a 3D point, an image patch, an image or 3D shape in a collection, a Facebook user, etc.
Goal: describe the underlying structure of the data, for interpretation or summary.

6 Challenges: noise, scale, dimensionality (R^d, R^k)

7-8 Challenges: topology
Example: 4 million data points in R^9 (source: [Lee, Pedersen, Mumford 2003]).
Motivation: study the cognitive representation of the space of images.
Underlying structure: Klein bottle (source: [Carlsson, Ishkhanov, de Silva, Zomorodian 2008]).
Figure labels: PCA, k-PCA, Isomap.

9 Challenges: topology
Each node represents an NBA player; links represent proximity relations in a 7-dimensional space (source truncated in transcript).

10 The topology of data (TDA)
Speaker note (truncated in source): This is our goal at large. To achieve it, we use concepts and tools from algebraic t...
Algebraic topology (A.T.) in the 20th century: from a triangulation, compute topological invariants for classification, like homology groups, or the dimension of their free part (called Betti numbers). Example: β0 = β2 = 1, β1 = 2.
A.T. in the 21st century: from a point cloud or compact set, compute topological descriptors (β0, β1, β2) for inference and comparison.
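The Betti numbers quoted on this slide (β0 = β2 = 1, β1 = 2) are those of a torus. As an illustration only, the following sketch computes Betti numbers from a triangulation using ranks of boundary matrices over the rationals. The 3 x 3 periodic grid triangulation and all function names are assumptions made for this example, not material from the course. The input is assumed to be a list of triangles given as sorted vertex tuples, with every vertex and edge belonging to some triangle.

```python
from fractions import Fraction
from itertools import combinations


def torus_triangles(n=3):
    """Triangles of an n x n periodic grid (n >= 3), each cell split by a diagonal."""
    def vid(i, j):
        return (i % n) * n + (j % n)

    tris = set()
    for i in range(n):
        for j in range(n):
            a, b = vid(i, j), vid(i + 1, j)
            c, d = vid(i, j + 1), vid(i + 1, j + 1)
            tris.add(tuple(sorted((a, b, d))))
            tris.add(tuple(sorted((a, c, d))))
    return sorted(tris)


def boundary_matrix(simplices, faces):
    """Matrix of the boundary map: one row per face, one column per simplex."""
    row_of = {f: r for r, f in enumerate(faces)}
    mat = [[0] * len(simplices) for _ in faces]
    for col, s in enumerate(simplices):
        for k in range(len(s)):
            # Dropping the k-th vertex gives a face with sign (-1)^k.
            mat[row_of[s[:k] + s[k + 1:]]][col] = (-1) ** k
    return mat


def rank(mat):
    """Rank over the rationals by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in mat]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def betti_numbers(triangles):
    """(b0, b1, b2) of a 2-dimensional simplicial complex given by its triangles."""
    edges = sorted({e for t in triangles for e in combinations(t, 2)})
    verts = sorted({(v,) for e in edges for v in e})
    r1 = rank(boundary_matrix(edges, verts))
    r2 = rank(boundary_matrix(triangles, edges))
    # b_k = dim C_k - rank(d_k) - rank(d_{k+1}), with d_0 = 0 and d_3 = 0.
    return (len(verts) - r1, len(edges) - r1 - r2, len(triangles) - r2)


print(betti_numbers(torus_triangles(3)))  # (1, 2, 1)
```

Exact rational arithmetic is used so that the ranks do not depend on floating-point tolerances; on larger complexes a sparse reduction would be more appropriate.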

11 The TDA community (as of 2003)
Stanford (G. Carlsson); Duke (H. Edelsbrunner).
2 research groups (5-10 researchers).

12 The TDA community (as of 2007)
Stanford (G. Carlsson, L. Guibas); Pomona (V. de Silva); Rutgers (K. Mischaikow); UPenn (Rob Ghrist); Duke (H. Edelsbrunner, J. Harer); Jagiellonian (M. Mrozek); IST Austria (H. Edelsbrunner); Technion (R. Adler); Geometrica (F. Chazal, S. Oudot, J.-D. Boissonnat, D. Cohen-Steiner).
8-10 research groups [researcher count missing in transcript].

13-14 The TDA community (as of 2014)
Stanford (G. Carlsson, L. Guibas); Edinburgh, MPI, Münster; IMA, TTI, OSU, UConn; Jagiellonian (M. Mrozek); Rutgers (K. Mischaikow); IST Austria (H. Edelsbrunner); UPenn (Rob Ghrist); ETH, Bologna; Pomona (V. de Silva); Duke (H. Edelsbrunner, J. Harer); Technion (R. Adler); ENS Paris, U. Paris-Est; Geometrica (F. Chazal, S. Oudot, J.-D. Boissonnat, D. Cohen-Steiner); Gipsa-lab, LJK.
Researchers at the theory level and at the applications level [counts missing in transcript].
Research themes: applied topology, algorithmics, data science.
Speaker note (translated from French): one of the distinctive features of the Geometrica team is that it looks at all three aspects.
Success stories: natural images, dynamical systems, NBA, breast cancer, ...

15 A few applications
Speaker note (translated from French, truncated in source): building on our stability result and our new theoretical framework for the analy...
Scalar field analysis over sensor networks [Gao, Guibas, O., Wang 2010] [Chazal, Guibas, O., Skraba 2011].
Stable signatures for shape comparison [Chazal, Cohen-Steiner, Guibas, Mémoli, O. 2009] [Chazal, de Silva, O. 2013] [Chazal, Glisse, Labruère, Michel 2014]. Figure labels: camel, cat, elephant, face, head, horse.
Unsupervised learning with guarantees on the number of clusters [Chazal, Guibas, O., Skraba 2013].

16 Course outline
Session 1: dimensionality reduction (linear vs. non-linear) + lab
Session 2: clustering (hierarchical, mode-seeking) + lab
Session 3: homology theory + exercises
Session 4: size theory, persistence + exercises
Session 5: topological inference I + exercises/lab
Session 6: topological inference II + exercises/lab
Session 7: topological signatures I + lab
Session 8: topological signatures II + lab
Session 9: Mapper + lab
Evaluation: written exam

Data Analysis using Computational Topology and Geometric Statistics

Data Analysis using Computational Topology and Geometric Statistics Data Analysis using Computational Topology and Geometric Statistics Peter Bubenik (Cleveland State University), Gunnar Carlsson (Stanford University), Peter T. Kim (University of Guelph) Mar 8 Mar 13,

More information

Topological Data Analysis Applications to Computer Vision

Topological Data Analysis Applications to Computer Vision Topological Data Analysis Applications to Computer Vision Vitaliy Kurlin, http://kurlin.org Microsoft Research Cambridge and Durham University, UK Topological Data Analysis quantifies topological structures

More information

BARCODES: THE PERSISTENT TOPOLOGY OF DATA

BARCODES: THE PERSISTENT TOPOLOGY OF DATA BARCODES: THE PERSISTENT TOPOLOGY OF DATA ROBERT GHRIST Abstract. This article surveys recent work of Carlsson and collaborators on applications of computational algebraic topology to problems of feature

More information

Clustering and mapper

Clustering and mapper June 17th, 2014 Overview Goal of talk Explain Mapper, which is the most widely used and most successful TDA technique. (At core of Ayasdi, TDA company founded by Gunnar Carlsson.) Basic idea: perform clustering

More information

A fast and robust algorithm to count topologically persistent holes in noisy clouds

A fast and robust algorithm to count topologically persistent holes in noisy clouds A fast and robust algorithm to count topologically persistent holes in noisy clouds Vitaliy Kurlin Durham University Department of Mathematical Sciences, Durham, DH1 3LE, United Kingdom [email protected],

More information

How To Understand And Understand The Theory Of Computational Finance

How To Understand And Understand The Theory Of Computational Finance This course consists of three separate modules. Coordinator: Omiros Papaspiliopoulos Module I: Machine Learning in Finance Lecturer: Argimiro Arratia, Universitat Politecnica de Catalunya and BGSE Overview

More information

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376 Course Director: Dr. Kayvan Najarian (DCM&B, [email protected]) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

More information

Introduction to the R package TDA

Introduction to the R package TDA Introduction to the R package TDA Brittany T. Fasy, Jisu Kim, Fabrizio Lecci, Clément Maria, and Vincent Rouvreau In collaboration with the CMU TopStat Group Abstract We present a short tutorial and introduction

More information

Topological Data Analysis and Machine Learning Theory

Topological Data Analysis and Machine Learning Theory Topological Data Analysis and Machine Learning Theory Gunnar Carlsson (Stanford University), Rick Jardine (University of Western Ontario), Dmitry Feichtner-Kozlov (University of Bremen), Dmitriy Morozov

More information

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

More information

WORKSHOP ON TOPOLOGY AND ABSTRACT ALGEBRA FOR BIOMEDICINE

WORKSHOP ON TOPOLOGY AND ABSTRACT ALGEBRA FOR BIOMEDICINE WORKSHOP ON TOPOLOGY AND ABSTRACT ALGEBRA FOR BIOMEDICINE ERIC K. NEUMANN Foundation Medicine, Cambridge, MA 02139, USA Email: [email protected] SVETLANA LOCKWOOD School of Electrical Engineering

More information

RESEARCH SUMMARY PETER BUBENIK

RESEARCH SUMMARY PETER BUBENIK RESEARCH SUMMARY PETER BUBENIK I am active in three areas of research: computational algebraic topology and data analysis, directed homotopy theory and concurrent computing, and homotopy theory, differential

More information

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

More information

TDA and Machine Learning: Better Together

TDA and Machine Learning: Better Together TDA and Machine Learning: Better Together TDA AND MACHINE LEARNING: BETTER TOGETHER 2 TABLE OF CONTENTS The New Data Analytics Dilemma... 3 Introducing Topology and Topological Data Analysis... 3 The Promise

More information

Statistiques en grande dimension

Statistiques en grande dimension Statistiques en grande dimension Christophe Giraud 1,2 et Tristan Mary-Huart 3,4 (1) Université Paris-Sud (2) Ecole Polytechnique (3) AgroParistech (4) INRA - Le Moulon M2 MathSV & Maths Aléa C. Giraud

More information

Machine Learning for Data Science (CS4786) Lecture 1

Machine Learning for Data Science (CS4786) Lecture 1 Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:

More information

Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

More information

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate

More information

Liste d'adresses URL

Liste d'adresses URL Liste de sites Internet concernés dans l' étude Le 25/02/2014 Information à propos de contrefacon.fr Le site Internet https://www.contrefacon.fr/ permet de vérifier dans une base de donnée de plus d' 1

More information

CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES

CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,

More information

TOPOLOGY AND DATA GUNNAR CARLSSON

TOPOLOGY AND DATA GUNNAR CARLSSON BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY Volume 46, Number 2, April 2009, Pages 255 308 S 0273-0979(09)01249-X Article electronically published on January 29, 2009 TOPOLOGY AND DATA GUNNAR

More information

Unsupervised Data Mining (Clustering)

Unsupervised Data Mining (Clustering) Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in

More information

Introduction to Topology and its Applications to Complex Data

Introduction to Topology and its Applications to Complex Data Task: In each of these two beautiful parks, try to find a path (to walk) that uses each bridge once and exactly once. Start with the park on the right first. (The blue denotes a river that contains alligators,

More information

Machine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected]

Machine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected] What is Learning? Merriam-Webster: learn = to acquire knowledge, understanding, or skill

More information

Geometry and Topology from Point Cloud Data

Geometry and Topology from Point Cloud Data Geometry and Topology from Point Cloud Data Tamal K. Dey Department of Computer Science and Engineering The Ohio State University Dey (2011) Geometry and Topology from Point Cloud Data WALCOM 11 1 / 51

More information

A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization

A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization Ángela Blanco Universidad Pontificia de Salamanca [email protected] Spain Manuel Martín-Merino Universidad

More information

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

More information

Learning to Process Natural Language in Big Data Environment

Learning to Process Natural Language in Big Data Environment CCF ADL 2015 Nanchang Oct 11, 2015 Learning to Process Natural Language in Big Data Environment Hang Li Noah s Ark Lab Huawei Technologies Part 1: Deep Learning - Present and Future Talk Outline Overview

More information

Machine learning for algo trading

Machine learning for algo trading Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with

More information

Visualization of General Defined Space Data

Visualization of General Defined Space Data International Journal of Computer Graphics & Animation (IJCGA) Vol.3, No.4, October 013 Visualization of General Defined Space Data John R Rankin La Trobe University, Australia Abstract A new algorithm

More information

Big Data and Complex Networks Analytics. Timos Sellis, CSIT Kathy Horadam, MGS

Big Data and Complex Networks Analytics. Timos Sellis, CSIT Kathy Horadam, MGS Big Data and Complex Networks Analytics Timos Sellis, CSIT Kathy Horadam, MGS Big Data What is it? Most commonly accepted definition, by Gartner (the 3 Vs) Big data is high-volume, high-velocity and high-variety

More information

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II

More information

Fanny Dos Reis. Visiting Assistant Professor, Texas A&M University. September 2006 - May 2008

Fanny Dos Reis. Visiting Assistant Professor, Texas A&M University. September 2006 - May 2008 Fanny Dos Reis Positions Held Visiting Assistant Professor, Texas A&M University. September 2006 - May 2008 Visiting Assistant Professor, University of Lille 1, France. September 2004 - August 2006 Visiting

More information

Divvy: Fast and Intuitive Exploratory Data Analysis

Divvy: Fast and Intuitive Exploratory Data Analysis Journal of Machine Learning Research 14 (2013) 3159-3163 Submitted 6/13; Revised 8/13; Published 10/13 Divvy: Fast and Intuitive Exploratory Data Analysis Joshua M. Lewis Virginia R. de Sa Department of

More information

How To Cluster

How To Cluster Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

More information

DATA ANALYTICS Unlocking knowledge and value from data

DATA ANALYTICS Unlocking knowledge and value from data DATA ANALYTICS Unlocking knowledge and value from data November 2014 Summary Inria Industry Meetings p 3 Your contacts at the Inria Saclay - Île-de-France research center p 4 Technologies Bertifier Sparklificator

More information

Virtual Landmarks for the Internet

Virtual Landmarks for the Internet Virtual Landmarks for the Internet Liying Tang Mark Crovella Boston University Computer Science Internet Distance Matters! Useful for configuring Content delivery networks Peer to peer applications Multiuser

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])

More information

ADVANCED MACHINE LEARNING. Introduction

ADVANCED MACHINE LEARNING. Introduction 1 1 Introduction Lecturer: Prof. Aude Billard ([email protected]) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

Stable Topological Signatures for Points on 3D Shapes

Stable Topological Signatures for Points on 3D Shapes Eurographics Symposium on Geometry Processing 2015 Mirela Ben-Chen and Ligang Liu (Guest Editors) Volume 34 (2015), Number 5 Stable Topological Signatures for Points on 3D Shapes Mathieu Carrière 1 and

More information

INTERACTIVE DATA EXPLORATION USING MDS MAPPING

INTERACTIVE DATA EXPLORATION USING MDS MAPPING INTERACTIVE DATA EXPLORATION USING MDS MAPPING Antoine Naud and Włodzisław Duch 1 Department of Computer Methods Nicolaus Copernicus University ul. Grudziadzka 5, 87-100 Toruń, Poland Abstract: Interactive

More information

Robust Blind Watermarking Mechanism For Point Sampled Geometry

Robust Blind Watermarking Mechanism For Point Sampled Geometry Robust Blind Watermarking Mechanism For Point Sampled Geometry Parag Agarwal Balakrishnan Prabhakaran Department of Computer Science, University of Texas at Dallas MS EC 31, PO Box 830688, Richardson,

More information

Visualization of Large Font Databases

Visualization of Large Font Databases Visualization of Large Font Databases Martin Solli and Reiner Lenz Linköping University, Sweden ITN, Campus Norrköping, Linköping University, 60174 Norrköping, Sweden [email protected], [email protected]

More information

Semi-Supervised and Unsupervised Machine Learning. Novel Strategies

Semi-Supervised and Unsupervised Machine Learning. Novel Strategies Brochure More information from http://www.researchandmarkets.com/reports/2179190/ Semi-Supervised and Unsupervised Machine Learning. Novel Strategies Description: This book provides a detailed and up to

More information

DSSP Data Science Starter Program - Polytechnique

DSSP Data Science Starter Program - Polytechnique DSSP Data Science Starter Program - Polytechnique A novel professional training on Data Science and Bigdata, offered by École Polytechnique jointly by the Applied Mathematics and Informatics Department

More information

A quick trip through geometrical shape comparison

A quick trip through geometrical shape comparison A quick trip through geometrical shape comparison Patrizio Frosini 1,2 1 Department of Mathematics, University of Bologna, Italy 2 ARCES - Vision Mathematics Group, University of Bologna, Italy [email protected]

More information

Final Project Report

Final Project Report CPSC545 by Introduction to Data Mining Prof. Martin Schultz & Prof. Mark Gerstein Student Name: Yu Kor Hugo Lam Student ID : 904907866 Due Date : May 7, 2007 Introduction Final Project Report Pseudogenes

More information

Exploratory Data Analysis with MATLAB

Exploratory Data Analysis with MATLAB Computer Science and Data Analysis Series Exploratory Data Analysis with MATLAB Second Edition Wendy L Martinez Angel R. Martinez Jeffrey L. Solka ( r ec) CRC Press VV J Taylor & Francis Group Boca Raton

More information

Neural Networks Lesson 5 - Cluster Analysis

Neural Networks Lesson 5 - Cluster Analysis Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm [email protected] Rome, 29

More information

PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS. PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS Project Project Title Area of Abstract No Specialization 1. Software

More information

Supervised Feature Selection & Unsupervised Dimensionality Reduction

Supervised Feature Selection & Unsupervised Dimensionality Reduction Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or

More information

Evaluating Ayasdi s Topological Data Analysis For Big Data

Evaluating Ayasdi s Topological Data Analysis For Big Data Master Thesis Evaluating Ayasdi s Topological Data Analysis For Big Data Author: Hee Eun Kim Advisor 1: Advisor 2: Prof. Dr. rer. nat. Stephan Trahasch Prof. Dott. Ing. Roberto V. Zicari 31 August 2015

More information

Secure Because Math: Understanding ML- based Security Products (#SecureBecauseMath)

Secure Because Math: Understanding ML- based Security Products (#SecureBecauseMath) Secure Because Math: Understanding ML- based Security Products (#SecureBecauseMath) Alex Pinto Chief Data Scientist Niddel / MLSec Project @alexcpsec @MLSecProject @NiddelCorp MLSec Project / Niddel MLSec

More information

High-dimensional labeled data analysis with Gabriel graphs

High-dimensional labeled data analysis with Gabriel graphs High-dimensional labeled data analysis with Gabriel graphs Michaël Aupetit CEA - DAM Département Analyse Surveillance Environnement BP 12-91680 - Bruyères-Le-Châtel, France Abstract. We propose the use

More information

Clustering Connectionist and Statistical Language Processing

Clustering Connectionist and Statistical Language Processing Clustering Connectionist and Statistical Language Processing Frank Keller [email protected] Computerlinguistik Universität des Saarlandes Clustering p.1/21 Overview clustering vs. classification supervised

More information

Introduction to nonparametric regression: Least squares vs. Nearest neighbors

Introduction to nonparametric regression: Least squares vs. Nearest neighbors Introduction to nonparametric regression: Least squares vs. Nearest neighbors Patrick Breheny October 30 Patrick Breheny STA 621: Nonparametric Statistics 1/16 Introduction For the remainder of the course,

More information

A Computational Framework for Exploratory Data Analysis

A Computational Framework for Exploratory Data Analysis A Computational Framework for Exploratory Data Analysis Axel Wismüller Depts. of Radiology and Biomedical Engineering, University of Rochester, New York 601 Elmwood Avenue, Rochester, NY 14642-8648, U.S.A.

More information

Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction

Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration

More information

Object class recognition using unsupervised scale-invariant learning

Object class recognition using unsupervised scale-invariant learning Object class recognition using unsupervised scale-invariant learning Rob Fergus Pietro Perona Andrew Zisserman Oxford University California Institute of Technology Goal Recognition of object categories

More information

Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015

Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Lecture: MWF: 1:00-1:50pm, GEOLOGY 4645 Instructor: Mihai

More information

Machine Learning. 01 - Introduction

Machine Learning. 01 - Introduction Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge

More information

Steven C.H. Hoi School of Information Systems Singapore Management University Email: [email protected]

Steven C.H. Hoi School of Information Systems Singapore Management University Email: chhoi@smu.edu.sg Steven C.H. Hoi School of Information Systems Singapore Management University Email: [email protected] Introduction http://stevenhoi.org/ Finance Recommender Systems Cyber Security Machine Learning Visual

More information

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering 27-30 August 2000, Nice, FRANCE

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering 27-30 August 2000, Nice, FRANCE Copyright SFA - InterNoise 2000 1 inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering 27-30 August 2000, Nice, FRANCE I-INCE Classification: 6.1 AUDITORY COMFORT

More information

USING SELF-ORGANIZING MAPS FOR INFORMATION VISUALIZATION AND KNOWLEDGE DISCOVERY IN COMPLEX GEOSPATIAL DATASETS

USING SELF-ORGANIZING MAPS FOR INFORMATION VISUALIZATION AND KNOWLEDGE DISCOVERY IN COMPLEX GEOSPATIAL DATASETS USING SELF-ORGANIZING MAPS FOR INFORMATION VISUALIZATION AND KNOWLEDGE DISCOVERY IN COMPLEX GEOSPATIAL DATASETS Koua, E.L. International Institute for Geo-Information Science and Earth Observation (ITC).

More information

IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS

IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS Alexander Velizhev 1 (presenter) Roman Shapovalov 2 Konrad Schindler 3 1 Hexagon Technology Center, Heerbrugg, Switzerland 2 Graphics & Media

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

Self-Organizing g Maps (SOM) COMP61021 Modelling and Visualization of High Dimensional Data

Self-Organizing g Maps (SOM) COMP61021 Modelling and Visualization of High Dimensional Data Self-Organizing g Maps (SOM) Ke Chen Outline Introduction ti Biological Motivation Kohonen SOM Learning Algorithm Visualization Method Examples Relevant Issues Conclusions 2 Introduction Self-organizing

More information

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

More information

Account Manager H/F - CDI - France

Account Manager H/F - CDI - France Account Manager H/F - CDI - France La société Fondée en 2007, Dolead est un acteur majeur et innovant dans l univers de la publicité sur Internet. En 2013, Dolead a réalisé un chiffre d affaires de près

More information

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!

More information

CURRICULUM VITAE. August 2008 now: Lecturer in Analysis at the University of Birmingham.

CURRICULUM VITAE. August 2008 now: Lecturer in Analysis at the University of Birmingham. CURRICULUM VITAE Name: Olga Maleva Work address: School of Mathematics, Watson Building, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK Telephone: +44(0)121 414 6584 Fax: +44(0)121 414 3389

More information
