ADVANCED MACHINE LEARNING. Introduction


 Shawn O’Connor’
 1 years ago
 Views:
Transcription
1 1 1 Introduction Lecturer: Prof. Aude Billard Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer
2 2 2 Course Format Alternate between: Lectures + Exercises and Practice Sessions: 9h1511h00 (Thursdays) 10h1511h00 (Fridays) Check class timetable!!
3 3 3 Class Timetable
4 4 4 Grading 50% of the grade based on personal work. Choice between: 1. Miniproject implementing and evaluating the algorithm performance and sensibility to parameter choices (should be done individually preferably). OR 2. A literature survey on a topic chosen among a list provided in class (can be done in team of two people, report conf. paper format, 8 pages, 10pt, double column format) ~2530 hours of personal work, i.e. count one full week of work. 50% based on final oral exam 20 minutes preparation 20 minutes answer on the black board (closed book, but allowed to bring a rectoverso A4 page with personal notes)
5 5 5 Prerequisites Linear Algebra, Probabilities and Statistics Familiar with basic Machine Learning Techniques: Principal Component Analysis Clustering with Kmeans and Gaussian Mixture Models Linear and weighted regression
6 6 6 Today s class content Taxonomy and basic concepts of ML Examples of ML applications Overview of class topics
7 7 7 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement learning where the algorithm learns a policy or model of the set of transitions across a discrete set of inputoutput states (Markovian world) in order to maximize a reward value (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
8 8 8 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Classes of supervised algorithms: Classification Regression
9 9 9 Classification: Principle N Map Ndim. input x to a set of class labels y. C: number of classes ( ( )) { 1,..., C} If two classes (binary classification), learn a function of the type: N h: and y = sgn h x. y = +1 y =? Train on subset of datapoints y = 1 Hope to generalize class labels to unseen datapoints
10 10 10 Classification: overview Classification is a supervised clustering process. Classification is usually multiclass; Given a set of known classes, the algorithm learns to extract combinations of data features that best predict the true class of the data. Original Data After 4class classification using Support Vector Machine (SVM)
11 111 Classification: example Classification of finance data to assess solvability using Support Vector Machine (SVM). 5classes to measure insolvency risks: Excellent, good, satisfactory, passable, poor (credit) Swiderski et al, Decision Multistage classification by using logistic regression and neural networks for assessment of financial condition of company, Support System, 2012
12 12 12 Classification: issues Classification of finance data to assess solvability using Support Vector Machine (SVM). The features are crucial. If they are poor, poor learning Usually determine features by hand (prior knowledge) Novel approaches learn the features Swiderski et al, Decision Multistage classification by using logistic regression and neural networks for assessment of financial condition of company, Support System, classes to measure insolvency risks: Excellent, good, satisfactory, passable, poor (credit)
13 13 13 Classification: issues A recurrent problem when applying classification to real life problems is that classes are often unbalanced. More samples from positive class than from negative class Swiderski et al, 2012 This can affect drastically classification performance, as classes with many data points have more influence on the error measure during training.
14 14 14 Classification Algorithms in this Course Support Vector Machine Relevance Vector Machine Boosting random projections Boosting random gaussians Random forest Gaussian Process
15 15 15 Regression: Principle N Map Ndim. input x to a continuous output y. Learn a function of the type: ( ) N f : and y = f x. y 3 y 2 y True function Estimate 4 y 1 y 1 x 2 x 3 x 4 x x { i i} i= 1,... Estimate f that best predict set of training points x, y? M
16 16 16 Regression: Issues N Map Ndim. input x to a continuous output y. Learn a function of the type: ( ) N f : and y = f x. y 3 y 2 y 4 y 1 y Fit strongly influenced by choice of:  datapoints for training  complexity of the model (interpolation) True function Estimate 1 x 2 x 3 x 4 x x { i i} i= 1,... Estimate f that best predict set of training points x, y? M
17 Regression: example Predicting the optimal position and orientation of the golf club to hit the ball so as to sink it into the goal. Kronander, Khansari and Billard, JTSC award, IEEE Int. Conf. on Int. and Rob. Systems
18 Regression: example Predicting the optimal position and orientation of the golf club to hit the ball so as to sink it into the goal. Kronander, Khansari and Billard, JTSC award, IEEE Int. Conf. on Int. and Rob. Systems
19 Regression: example Contrast prediction of two methods (Gaussian Process Regression and Gaussian Mixture Regression) in terms of precision and generalization. Generalization power outside the training data. GPR GMR Kronander, Khansari and Billard, JTSC award, IEEE Int. Conf. on Int. and Rob. Systems
20 20 20 Regression Algorithms in this Course Support Vector Machine Relevance Vector Machine Support vector regression Boosting random projections Relevance vector regression Boosting random gaussians Random Gaussian forest process regression Gaussian Process Gradient boosting Locally weighted projected regression
21 21 21 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement learning where the algorithm learns a policy or model of the set of transitions across a discrete set of inputoutput states (Markovian world) in order to maximize a reward value (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
22 222 Reinforcement Learning: Principle Generate a series of input for t time steps by performing action state x to state x: t 1 a a a 1 2 t 1 x x x 1 2 t... t a to go from Collecte data D= { x, x,... x, a, a,..., a } 1 2 t 1 2 t 1 Receive a reward at t: r t = 1 Map the reward to choices of actions a for each state (Credit Assignment Problem) Repeat the procedure (generate a series of trials and errors) until you converge to a good map that represents the best choice of action in any state to maximize total rewards.
23 23 23 t 1 Reinforcement Learning: Issues Generate a series of input for t time steps by performing action a to go from state x to state x: 1 2 t 1 x x x 1 2 t... Collecte data D= { x, x,... x, a, a,..., a } 1 2 t 1 2 t 1 t actionstate pairs may be prohibitive. Receive a reward at t: r t = 1 a a a Number of states and actions may be infinite (continuous world). In discrete world, visiting all Map the reward to choices of actions a for each state (Credit Assignment Problem) Repeat the procedure (generate a series of trials and errors) until you converge to a good map that represents the best choice of action in any state to maximize total rewards.
24 Reinforcement Learning: example Goal: to stand up Reward: 1 if up, 0 otherwise Requires 750 trials in simulation on real robot to learn First set of trials At convergence Morimoto and Doya, Robotics and Autonomous Systems,
25 Reinforcement Learning Algorithms in this Course Random walk Dynamic programming Power Genetic algorithm Discrete and continuous state reinforcement learning methods Above: solutions found by 4 different techniques for learning 25 25
26 26 26 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement learning where the algorithm learns a policy or model of the set of transitions across a discrete set of inputoutput states (Markovian world) in order to maximize a reward value (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
27 27 27 Unsupervised Learning: Principle Observe a series of input: D= { x, x,... x }, x 1 2 M 1 N Find regularities in the input, e.g. correlations, patterns. Use these to reduce the dimensionality of the data (with correlations) or to group datapoints (patterns).
28 Unsupervised learning ~ Structure discovery Raw Data Trying to find some structure in the data.. Pattern Recognition 28 28
29 29 29 Pattern Recognition: Similar Shape Results of decomposition with Principal Component Analysis: eigenvectors
30 30 30 Pattern Recognition: Similar Shape Encapsulate main differences across groups of images (in the first eigenvectors)
31 31 31 Pattern Recognition: Similar Shape Detailed features (glasses) get encapsulated next (in the following eigenvectors)
32 32 32 Pattern Recognition: Similar Shape Data are classifiable in a lower dimensional space (here plotted on 1 st and 2 nd eigenvector) Pattern recognition techniques of the type of PCA can be used to reduce the dimensionality of the dataset
33 333 Pattern Recognition: Similar Shape Subclasses can be discovered by looking at other combinations of projections (here plotted on 2 nd and 3 rd eigenvector) Pattern recognition techniques of the type of PCA can be used to reduce the dimensionality of the dataset
34 34 34 Pattern Recognition: Determining Trends Goal: Predicting evolution of prices of stock market Issues: Data vary over time Time is the input variable
35 35 35 Pattern Recognition: Determining Trends Moving time window Size of time window? Goal: Predicting evolution of prices of stock market Challenges: are previous data meaningful to predict future data? Data are very noisy; each datapoint is seen only once Must decouple noise from temporal variations of time series
36 36 36 Pattern Recognition: Clustering Clustering for automatic processing of medical images: enhance visibility Multispectral medical image segmentation. (left: MRIimage from 1 channel) (right: classification from a 9cluster semisupervised learning); Clusters should identify patterns, such as cerebrospinal fluid, white matter, striated muscle, tumor. (Lundervolt et al, 1996). Cluster groups of pixels by tissue type use grey shadings to denote different groups
37 Pattern Recognition: Clustering Clustering assume groups of points are similar according to the same metric of similarity. Challenge: Clustering may fail when metric changes depending on the region of space Jain, 2010, Data clustering: 50 years beyond Kmeans, Pattern Recognition Letters 37 37
38 Pattern Recognition: Clustering Different techniques or heuristics can be developed to help the algorithm determine the right boundaries across clusters: Add manually set of constraints across pairs of datapoints determine links between datapoints in same/different clusters use semisupervised clustering Jain, 2010, Data clustering: 50 years beyond Kmeans, Pattern Recognition Letters 38 38
39 39 Pattern Recognition: Clustering Clustering encompasses a large set of methods that try to find patterns that are similar in some way. Hierarchical clustering builds treelike structure by pairing datapoints according to increasing levels of similarity.
40 40 40 Pattern Recognition: Clustering Hierarchical clustering can be used with arbitrary sets of data. Example: Hierarchical clustering to discover similar temporal pattern of crimes across districts in India. Chandra et al, A Multivariate Time Series Clustering Approach for Crime Trends Prediction, IEEE SMC 2008.
41 Unsupervised Learning Algorithms in this Course PCA Kernel PCA Kernel Kmeans Genetic algorithm Isomap Laplacian map Projections on Laplacian eigenvector Swissroll example 41 41
42 42 42 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement Classification learning where the algorithm learns a policor model of the set of transitions across a discrete set of semisupervised clustering inputoutput states (Markovian world) in order to maximize a reward value Clustering (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
43 Data Mining Pattern recognition with very large amount of highdimensional data (Tens of thousands to billions) (Several hundreds and more) 43 43
44 Mining webpages Data Mining: examples Cluster groups of webpage by topics Cluster links across webpages Other algorithms required: Fast methods for crawling the web Text processing (Natural Language Processing) Understanding semantics Issues: Domainspecific language / terminology Foreign languages Dynamics of web (pages disappear / get created) 444
45 Mining webpages Data Mining: examples Personalizing search Improvising user s experience Enhance commercial market value Build userspecific data country of origin languages spoken list of websites visited in the past days/weeks frequency of occurrence of some queries Link userspecific data to other groups of data e.g. robotics in Switzerland academic institutions in Europe 45 45
46 Data Mining Main Issue: Curse of Dimensionality Computational Costs O(M 2 ) O(M) M: Nb of datapoints Computational costs may also grow as a function of number of dimensions ML techniques seek linear growth whenever possible Apply methods for dimensionality reduction whenever possible 46 46
47 47 47 Some Machine Learning Resources Network of excellence on Pattern Recognition, Statistical Modelling and Computational Learning (summer schools and workshops) Databases: Journals: Machine Learning Journal, Kluwer Publisher IEEE Transactions on Signal processing IEEE Transactions on Pattern Analysis IEEE Transactions on Pattern Recognition The Journal of Machine Learning Research Conferences: ICML: int. conf. on machine learning Neural Information Processing Conference online repository of all research papers,
48 48 48 Topics for Literature survey and MiniProjects The exact list of topics for lit. survey and miniproject will be posted by March 4 Topics for survey will entail: Data mining methods for crawling mailboxes Data mining methods for crawling github Ethical issues on data mining Method for learning the features Metrics for clustering Topics for miniproject will entail implementing either of these:  Manifold learning (Isomap, Eigenmaps, Local Linear Embedding)  Classification (Random Forest)  Regression (Relevance Vector Regression)  NonParametric ApproximationsTechniques for Mixture Models Programming language open: matlab, C/C++, Python, or embed into MLDemo software
49 49 49 Overview of Practicals
50 50 50 Summary of today s class content Taxonomy and basic concepts of ML Structure discovery, classification, regression Supervised, unsupervised and reinforcement learning Data mining and curse of dimensionality Examples of ML applications Pattern recognition in images Data mining of the web Overview of projects and survey topics
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationBIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM 10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.
Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationIntroduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011
Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More informationIntroduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
More informationIntroduction to Machine Learning Using Python. Vikram Kamath
Introduction to Machine Learning Using Python Vikram Kamath Contents: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Introduction/Definition Where and Why ML is used Types of Learning Supervised Learning Linear Regression
More informationDistance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center
Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I  Applications Motivation and Introduction Patient similarity application Part II
More informationHT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
More informationSupervised Feature Selection & Unsupervised Dimensionality Reduction
Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Daybyday Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationMachine learning for algo trading
Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with
More informationMS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
More informationEECS 445: Introduction to Machine Learning Winter 2015
Instructor: Prof. Jenna Wiens Office: 3609 BBB wiensj@umich.edu EECS 445: Introduction to Machine Learning Winter 2015 Graduate Student Instructor: Srayan Datta Office: 3349 North Quad (**office hours
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationAPPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
More informationMachine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationPractical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950F, Spring 2012 Instructor: Erik Sudderth Graduate TAs: Dae Il Kim & Ben Swanson Head Undergraduate TA: William Allen Undergraduate TAs: Soravit
More informationClass Overview and General Introduction to Machine Learning
Class Overview and General Introduction to Machine Learning Piyush Rai www.cs.utah.edu/~piyush CS5350/6350: Machine Learning August 23, 2011 (CS5350/6350) Intro to ML August 23, 2011 1 / 25 Course Logistics
More informationCOMP 598 Applied Machine Learning Lecture 21: Parallelization methods for largescale machine learning! Big Data by the numbers
COMP 598 Applied Machine Learning Lecture 21: Parallelization methods for largescale machine learning! Instructor: (jpineau@cs.mcgill.ca) TAs: PierreLuc Bacon (pbacon@cs.mcgill.ca) Ryan Lowe (ryan.lowe@mail.mcgill.ca)
More informationSupervised and unsupervised learning  1
Chapter 3 Supervised and unsupervised learning  1 3.1 Introduction The science of learning plays a key role in the field of statistics, data mining, artificial intelligence, intersecting with areas in
More informationCSci 538 Articial Intelligence (Machine Learning and Data Analysis)
CSci 538 Articial Intelligence (Machine Learning and Data Analysis) Course Syllabus Fall 2015 Instructor Derek Harter, Ph.D., Associate Professor Department of Computer Science Texas A&M University  Commerce
More informationEnvironmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
More informationUnsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
More informationMachine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machinelearning Logistics Lectures M 9:3011:30 am Room 4419 Personnel
More informationMachine Learning with MATLAB David Willingham Application Engineer
Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the
More informationAn Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
More informationFacebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
More informationEM Clustering Approach for MultiDimensional Analysis of Big Data Set
EM Clustering Approach for MultiDimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
More informationMSCA 31000 Introduction to Statistical Concepts
MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced
More informationCS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 RealTime Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
More informationMachine Learning. 01  Introduction
Machine Learning 01  Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge
More informationIntroduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones
Introduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationMachine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu
Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu What is Learning? MerriamWebster: learn = to acquire knowledge, understanding, or skill
More informationFeature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier
Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract  This paper presents,
More informationComparison of Nonlinear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Nonlinear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Nonlinear
More informationLearning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
More informationMachine Learning in Computer Vision A Tutorial. Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN
Machine Learning in Computer Vision A Tutorial Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN Outline Introduction Supervised Learning Unsupervised Learning SemiSupervised
More informationMusic Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
More informationFace Recognition using SIFT Features
Face Recognition using SIFT Features Mohamed Aly CNS186 Term Project Winter 2006 Abstract Face recognition has many important practical applications, like surveillance and access control.
More informationNetwork Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016
Network Machine Learning Research Group S. Jiang InternetDraft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draftjiangnmlrgnetworkmachinelearning00
More information10601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601f10/index.html
10601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601f10/index.html Course data All uptodate info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601f10/index.html
More information8/29/2015. Data Mining and Machine Learning. Erich Seamon University of Idaho
Data Mining and Machine Learning Erich Seamon University of Idaho www.webpages.uidaho.edu/erichs erichs@uidaho.edu 1 I am NOMAD 2 1 Data Mining and Machine Learning Outline Outlining the data mining and
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, MayJun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationMA2823: Foundations of Machine Learning
MA2823: Foundations of Machine Learning École Centrale Paris Fall 2015 ChloéAgathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr TAs: Jiaqian Yu jiaqian.yu@centralesupelec.fr
More informationC19 Machine Learning
C9 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks
More informationMACHINE LEARNING AN INTRODUCTION
AN INTRODUCTION JOSEFIN ROSÉN, SENIOR ANALYTICAL EXPERT, SAS INSTITUTE JOSEFIN.ROSEN@SAS.COM TWITTER: @ROSENJOSEFIN AGENDA What is machine learning? When, where and how is machine learning used? Exemple
More informationConcepts in Machine Learning, Unsupervised Learning & Astronomy Applications
Data Mining In Modern Astronomy Sky Surveys: Concepts in Machine Learning, Unsupervised Learning & Astronomy Applications ChingWa Yip cwyip@pha.jhu.edu; Bloomberg 518 Human are Great Pattern Recognizers
More informationKnearestneighbor: an introduction to machine learning
Knearestneighbor: an introduction to machine learning Xiaojin Zhu jerryzhu@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison slide 1 Outline Types of learning Classification:
More informationRole of Social Networking in Marketing using Data Mining
Role of Social Networking in Marketing using Data Mining Mrs. Saroj Junghare Astt. Professor, Department of Computer Science and Application St. Aloysius College, Jabalpur, Madhya Pradesh, India Abstract:
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationData Clustering. Dec 2nd, 2013 Kyrylo Bessonov
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms kmeans Hierarchical Main
More informationARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
More informationThe Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
More informationA Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization Ángela Blanco Universidad Pontificia de Salamanca ablancogo@upsa.es Spain Manuel MartínMerino Universidad
More informationDATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDDLAB ISTI CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
More informationMachine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 TuTh 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
More informationRobust Outlier Detection Technique in Data Mining: A Univariate Approach
Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,
More informationIntroduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
More informationClustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012
Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering
More informationColour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screenprinting is an industry with a large number of applications ranging from printing mobile phone
More informationMachine Learning: Overview
Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave
More informationJournée Thématique Big Data 13/03/2015
Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationMultiple Kernel Learning on the Limit Order Book
JMLR: Workshop and Conference Proceedings 11 (2010) 167 174 Workshop on Applications of Pattern Analysis Multiple Kernel Learning on the Limit Order Book Tristan Fletcher Zakria Hussain John ShaweTaylor
More informationSimple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRALab,
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationPrinciples of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationKATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics
ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE CQAS 747 Principles of
More informationUnsupervised Learning: Clustering with DBSCAN Mat Kallada
Unsupervised Learning: Clustering with DBSCAN Mat Kallada STAT 2450  Introduction to Data Mining Supervised Data Mining: Predicting a column called the label The domain of data mining focused on prediction:
More informationFeature Extraction and Selection. More Info == Better Performance? Curse of Dimensionality. Feature Space. HighDimensional Spaces
More Info == Better Performance? Feature Extraction and Selection APR Course, Delft, The Netherlands Marco Loog Feature Space Curse of Dimensionality A pdimensional space, in which each dimension is a
More informationA Survey on Outlier Detection Techniques for Credit Card Fraud Detection
IOSR Journal of Computer Engineering (IOSRJCE) eissn: 22780661, p ISSN: 22788727Volume 16, Issue 2, Ver. VI (MarApr. 2014), PP 4448 A Survey on Outlier Detection Techniques for Credit Card Fraud
More informationBayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationStatistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSABulletine 2014) Before
More informationMachine Learning What, how, why?
Machine Learning What, how, why? Rémi Emonet (@remiemonet) 20150930 Web En Vert $ whoami $ whoami Software Engineer Researcher: machine learning, computer vision Teacher: web technologies, computing
More informationCHAPTER 3 DATA MINING AND CLUSTERING
CHAPTER 3 DATA MINING AND CLUSTERING 3.1 Introduction Nowadays, large quantities of data are being accumulated. The amount of data collected is said to be almost doubled every 9 months. Seeking knowledge
More informationHow to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
More informationRecognizing Informed Option Trading
Recognizing Informed Option Trading Alex Bain, Prabal Tiwaree, Kari Okamoto 1 Abstract While equity (stock) markets are generally efficient in discounting public information into stock prices, we believe
More informationGovernment of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence
Government of Russian Federation Federal State Autonomous Educational Institution of High Professional Education National Research University «Higher School of Economics» Faculty of Computer Science School
More informationIntroduction to Pattern Recognition
Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
More informationMaking Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research
More informationEnsemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 20150305
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 20150305 Roman Kern (KTI, TU Graz) Ensemble Methods 20150305 1 / 38 Outline 1 Introduction 2 Classification
More informationModelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
More informationECS289 Data Mining Lecture Outline
ECS289 Data Mining Lecture Outline What is Data Mining? Absurdly Simple Student Grade Example Fundamental Tasks in Data Mining Course Structure Overview Emphasis on social networks and graphs Our aim
More informationClassification of Bad Accounts in Credit Card Industry
Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition
More informationExample application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health
Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining
More informationData Mining Analysis of HIV1 Protease Crystal Structures
Data Mining Analysis of HIV1 Protease Crystal Structures Gene M. Ko, A. Srinivas Reddy, Sunil Kumar, and Rajni Garg AP0907 09 Data Mining Analysis of HIV1 Protease Crystal Structures Gene M. Ko 1, A.
More informationAn Introduction to Statistical Machine Learning  Overview 
An Introduction to Statistical Machine Learning  Overview  Samy Bengio bengio@idiap.ch Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP) CP 592, rue du Simplon 4 1920 Martigny, Switzerland
More informationBEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
More informationCOLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics
ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE: COSSTAT747 Principles of Statistical Data Mining
More informationThe Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
More informationBig Data Analytics CSCI 4030
High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising
More informationAn Analysis on Density Based Clustering of Multi Dimensional Spatial Data
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,
More informationCROP CLASSIFICATION WITH HYPERSPECTRAL DATA OF THE HYMAP SENSOR USING DIFFERENT FEATURE EXTRACTION TECHNIQUES
Proceedings of the 2 nd Workshop of the EARSeL SIG on Land Use and Land Cover CROP CLASSIFICATION WITH HYPERSPECTRAL DATA OF THE HYMAP SENSOR USING DIFFERENT FEATURE EXTRACTION TECHNIQUES Sebastian Mader
More information