ADVANCED MACHINE LEARNING. Introduction
|
|
|
- Shawn O’Connor’
- 9 years ago
- Views:
Transcription
1 1 1 Introduction Lecturer: Prof. Aude Billard ([email protected]) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer
2 2 2 Course Format Alternate between: Lectures + Exercises and Practice Sessions: 9h15-11h00 (Thursdays) 10h15-11h00 (Fridays) Check class timetable!!
3 3 3 Class Timetable
4 4 4 Grading 50% of the grade based on personal work. Choice between: 1. Mini-project implementing and evaluating the algorithm performance and sensibility to parameter choices (should be done individually preferably). OR 2. A literature survey on a topic chosen among a list provided in class (can be done in team of two people, report conf. paper format, 8 pages, 10pt, double column format) ~25-30 hours of personal work, i.e. count one full week of work. 50% based on final oral exam 20 minutes preparation 20 minutes answer on the black board (closed book, but allowed to bring a recto-verso A4 page with personal notes)
5 5 5 Prerequisites Linear Algebra, Probabilities and Statistics Familiar with basic Machine Learning Techniques: Principal Component Analysis Clustering with K-means and Gaussian Mixture Models Linear and weighted regression
6 6 6 Today s class content Taxonomy and basic concepts of ML Examples of ML applications Overview of class topics
7 7 7 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement learning where the algorithm learns a policy or model of the set of transitions across a discrete set of inputoutput states (Markovian world) in order to maximize a reward value (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
8 8 8 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Classes of supervised algorithms: Classification Regression
9 9 9 Classification: Principle N Map N-dim. input x to a set of class labels y. C: number of classes ( ( )) { 1,..., C} If two classes (binary classification), learn a function of the type: N h: and y = sgn h x. y = +1 y =? Train on subset of datapoints y = 1 Hope to generalize class labels to unseen datapoints
10 10 10 Classification: overview Classification is a supervised clustering process. Classification is usually multi-class; Given a set of known classes, the algorithm learns to extract combinations of data features that best predict the true class of the data. Original Data After 4-class classification using Support Vector Machine (SVM)
11 111 Classification: example Classification of finance data to assess solvability using Support Vector Machine (SVM). 5-classes to measure insolvency risks: Excellent, good, satisfactory, passable, poor (credit) Swiderski et al, Decision Multistage classification by using logistic regression and neural networks for assessment of financial condition of company, Support System, 2012
12 12 12 Classification: issues Classification of finance data to assess solvability using Support Vector Machine (SVM). The features are crucial. If they are poor, poor learning Usually determine features by hand (prior knowledge) Novel approaches learn the features Swiderski et al, Decision Multistage classification by using logistic regression and neural networks for assessment of financial condition of company, Support System, classes to measure insolvency risks: Excellent, good, satisfactory, passable, poor (credit)
13 13 13 Classification: issues A recurrent problem when applying classification to real life problems is that classes are often unbalanced. More samples from positive class than from negative class Swiderski et al, 2012 This can affect drastically classification performance, as classes with many data points have more influence on the error measure during training.
14 14 14 Classification Algorithms in this Course Support Vector Machine Relevance Vector Machine Boosting random projections Boosting random gaussians Random forest Gaussian Process
15 15 15 Regression: Principle N Map N-dim. input x to a continuous output y. Learn a function of the type: ( ) N f : and y = f x. y 3 y 2 y True function Estimate 4 y 1 y 1 x 2 x 3 x 4 x x { i i} i= 1,... Estimate f that best predict set of training points x, y? M
16 16 16 Regression: Issues N Map N-dim. input x to a continuous output y. Learn a function of the type: ( ) N f : and y = f x. y 3 y 2 y 4 y 1 y Fit strongly influenced by choice of: - datapoints for training - complexity of the model (interpolation) True function Estimate 1 x 2 x 3 x 4 x x { i i} i= 1,... Estimate f that best predict set of training points x, y? M
17 Regression: example Predicting the optimal position and orientation of the golf club to hit the ball so as to sink it into the goal. Kronander, Khansari and Billard, JTSC award, IEEE Int. Conf. on Int. and Rob. Systems
18 Regression: example Predicting the optimal position and orientation of the golf club to hit the ball so as to sink it into the goal. Kronander, Khansari and Billard, JTSC award, IEEE Int. Conf. on Int. and Rob. Systems
19 Regression: example Contrast prediction of two methods (Gaussian Process Regression and Gaussian Mixture Regression) in terms of precision and generalization. Generalization power outside the training data. GPR GMR Kronander, Khansari and Billard, JTSC award, IEEE Int. Conf. on Int. and Rob. Systems
20 20 20 Regression Algorithms in this Course Support Vector Machine Relevance Vector Machine Support vector regression Boosting random projections Relevance vector regression Boosting random gaussians Random Gaussian forest process regression Gaussian Process Gradient boosting Locally weighted projected regression
21 21 21 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement learning where the algorithm learns a policy or model of the set of transitions across a discrete set of inputoutput states (Markovian world) in order to maximize a reward value (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
22 222 Reinforcement Learning: Principle Generate a series of input for t time steps by performing action state x to state x: t 1 a a a 1 2 t 1 x x x 1 2 t... t a to go from Collecte data D= { x, x,... x, a, a,..., a } 1 2 t 1 2 t 1 Receive a reward at t: r t = 1 Map the reward to choices of actions a for each state (Credit Assignment Problem) Repeat the procedure (generate a series of trials and errors) until you converge to a good map that represents the best choice of action in any state to maximize total rewards.
23 23 23 t 1 Reinforcement Learning: Issues Generate a series of input for t time steps by performing action a to go from state x to state x: 1 2 t 1 x x x 1 2 t... Collecte data D= { x, x,... x, a, a,..., a } 1 2 t 1 2 t 1 t action-state pairs may be prohibitive. Receive a reward at t: r t = 1 a a a Number of states and actions may be infinite (continuous world). In discrete world, visiting all Map the reward to choices of actions a for each state (Credit Assignment Problem) Repeat the procedure (generate a series of trials and errors) until you converge to a good map that represents the best choice of action in any state to maximize total rewards.
24 Reinforcement Learning: example Goal: to stand up Reward: 1 if up, 0 otherwise Requires 750 trials in simulation on real robot to learn First set of trials At convergence Morimoto and Doya, Robotics and Autonomous Systems,
25 Reinforcement Learning Algorithms in this Course Random walk Dynamic programming Power Genetic algorithm Discrete and continuous state reinforcement learning methods Above: solutions found by 4 different techniques for learning 25 25
26 26 26 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement learning where the algorithm learns a policy or model of the set of transitions across a discrete set of inputoutput states (Markovian world) in order to maximize a reward value (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
27 27 27 Unsupervised Learning: Principle Observe a series of input: D= { x, x,... x }, x 1 2 M 1 N Find regularities in the input, e.g. correlations, patterns. Use these to reduce the dimensionality of the data (with correlations) or to group datapoints (patterns).
28 Unsupervised learning ~ Structure discovery Raw Data Trying to find some structure in the data.. Pattern Recognition 28 28
29 29 29 Pattern Recognition: Similar Shape Results of decomposition with Principal Component Analysis: eigenvectors
30 30 30 Pattern Recognition: Similar Shape Encapsulate main differences across groups of images (in the first eigenvectors)
31 31 31 Pattern Recognition: Similar Shape Detailed features (glasses) get encapsulated next (in the following eigenvectors)
32 32 32 Pattern Recognition: Similar Shape Data are classifiable in a lower dimensional space (here plotted on 1 st and 2 nd eigenvector) Pattern recognition techniques of the type of PCA can be used to reduce the dimensionality of the dataset
33 333 Pattern Recognition: Similar Shape Subclasses can be discovered by looking at other combinations of projections (here plotted on 2 nd and 3 rd eigenvector) Pattern recognition techniques of the type of PCA can be used to reduce the dimensionality of the dataset
34 34 34 Pattern Recognition: Determining Trends Goal: Predicting evolution of prices of stock market Issues: Data vary over time Time is the input variable
35 35 35 Pattern Recognition: Determining Trends Moving time window Size of time window? Goal: Predicting evolution of prices of stock market Challenges: are previous data meaningful to predict future data? Data are very noisy; each datapoint is seen only once Must decouple noise from temporal variations of time series
36 36 36 Pattern Recognition: Clustering Clustering for automatic processing of medical images: enhance visibility Multispectral medical image segmentation. (left: MRI-image from 1 channel) (right: classification from a 9-cluster semi-supervised learning); Clusters should identify patterns, such as cerebro-spinal fluid, white matter, striated muscle, tumor. (Lundervolt et al, 1996). Cluster groups of pixels by tissue type use grey shadings to denote different groups
37 Pattern Recognition: Clustering Clustering assume groups of points are similar according to the same metric of similarity. Challenge: Clustering may fail when metric changes depending on the region of space Jain, 2010, Data clustering: 50 years beyond K-means, Pattern Recognition Letters 37 37
38 Pattern Recognition: Clustering Different techniques or heuristics can be developed to help the algorithm determine the right boundaries across clusters: Add manually set of constraints across pairs of datapoints determine links between datapoints in same/different clusters use semi-supervised clustering Jain, 2010, Data clustering: 50 years beyond K-means, Pattern Recognition Letters 38 38
39 39 Pattern Recognition: Clustering Clustering encompasses a large set of methods that try to find patterns that are similar in some way. Hierarchical clustering builds tree-like structure by pairing datapoints according to increasing levels of similarity.
40 40 40 Pattern Recognition: Clustering Hierarchical clustering can be used with arbitrary sets of data. Example: Hierarchical clustering to discover similar temporal pattern of crimes across districts in India. Chandra et al, A Multivariate Time Series Clustering Approach for Crime Trends Prediction, IEEE SMC 2008.
41 Unsupervised Learning Algorithms in this Course PCA Kernel PCA Kernel K-means Genetic algorithm Isomap Laplacian map Projections on Laplacian eigenvector Swissroll example 41 41
42 42 42 Main Types of Learning Methods Supervised learning where the algorithm learns a function or model that maps best a set of inputs to a set of desired outputs. Reinforcement Classification learning where the algorithm learns a policor model of the set of transitions across a discrete set of semi-supervised clustering input-output states (Markovian world) in order to maximize a reward value Clustering (external reinforcement). Unsupervised learning where the algorithm learns a model that best represent a set of inputs without any feedback (no desired output, no external reinforcement)
43 Data Mining Pattern recognition with very large amount of high-dimensional data (Tens of thousands to billions) (Several hundreds and more) 43 43
44 Mining webpages Data Mining: examples Cluster groups of webpage by topics Cluster links across webpages Other algorithms required: Fast methods for crawling the web Text processing (Natural Language Processing) Understanding semantics Issues: Domain-specific language / terminology Foreign languages Dynamics of web (pages disappear / get created) 444
45 Mining webpages Data Mining: examples Personalizing search Improvising user s experience Enhance commercial market value Build user-specific data country of origin languages spoken list of websites visited in the past days/weeks frequency of occurrence of some queries Link user-specific data to other groups of data e.g. robotics in Switzerland academic institutions in Europe 45 45
46 Data Mining Main Issue: Curse of Dimensionality Computational Costs O(M 2 ) O(M) M: Nb of datapoints Computational costs may also grow as a function of number of dimensions ML techniques seek linear growth whenever possible Apply methods for dimensionality reduction whenever possible 46 46
47 47 47 Some Machine Learning Resources Network of excellence on Pattern Recognition, Statistical Modelling and Computational Learning (summer schools and workshops) Databases: Journals: Machine Learning Journal, Kluwer Publisher IEEE Transactions on Signal processing IEEE Transactions on Pattern Analysis IEEE Transactions on Pattern Recognition The Journal of Machine Learning Research Conferences: ICML: int. conf. on machine learning Neural Information Processing Conference on-line repository of all research papers,
48 48 48 Topics for Literature survey and Mini-Projects The exact list of topics for lit. survey and mini-project will be posted by March 4 Topics for survey will entail: Data mining methods for crawling mailboxes Data mining methods for crawling git-hub Ethical issues on data mining Method for learning the features Metrics for clustering Topics for mini-project will entail implementing either of these: - Manifold learning (Isomap, Eigenmaps, Local Linear Embedding) - Classification (Random Forest) - Regression (Relevance Vector Regression) - Non-Parametric ApproximationsTechniques for Mixture Models Programming language open: matlab, C/C++, Python, or embed into MLDemo software
49 49 49 Overview of Practicals
50 50 50 Summary of today s class content Taxonomy and basic concepts of ML Structure discovery, classification, regression Supervised, unsupervised and reinforcement learning Data mining and curse of dimensionality Examples of ML applications Pattern recognition in images Data mining of the web Overview of projects and survey topics
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, [email protected]) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
Introduction to Machine Learning Using Python. Vikram Kamath
Introduction to Machine Learning Using Python Vikram Kamath Contents: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Introduction/Definition Where and Why ML is used Types of Learning Supervised Learning Linear Regression
Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research [email protected]
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research [email protected] Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center
Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II
Supervised Feature Selection & Unsupervised Dimensionality Reduction
Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or
HT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
Machine learning for algo trading
Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with
Azure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Unsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
The Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
Supervised and unsupervised learning - 1
Chapter 3 Supervised and unsupervised learning - 1 3.1 Introduction The science of learning plays a key role in the field of statistics, data mining, artificial intelligence, intersecting with areas in
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
COMP 598 Applied Machine Learning Lecture 21: Parallelization methods for large-scale machine learning! Big Data by the numbers
COMP 598 Applied Machine Learning Lecture 21: Parallelization methods for large-scale machine learning! Instructor: ([email protected]) TAs: Pierre-Luc Bacon ([email protected]) Ryan Lowe ([email protected])
APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
CSci 538 Articial Intelligence (Machine Learning and Data Analysis)
CSci 538 Articial Intelligence (Machine Learning and Data Analysis) Course Syllabus Fall 2015 Instructor Derek Harter, Ph.D., Associate Professor Department of Computer Science Texas A&M University - Commerce
Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. [email protected]
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang [email protected] http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel
Environmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
Machine Learning. 01 - Introduction
Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
Machine Learning with MATLAB David Willingham Application Engineer
Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
Machine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected]
Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected] What is Learning? Merriam-Webster: learn = to acquire knowledge, understanding, or skill
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012
Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization
A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization Ángela Blanco Universidad Pontificia de Salamanca [email protected] Spain Manuel Martín-Merino Universidad
Journée Thématique Big Data 13/03/2015
Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
Simple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,
Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier
Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
10-601. Machine Learning. http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html
10-601 Machine Learning http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html Course data All up-to-date info is on the course web page: http://www.cs.cmu.edu/afs/cs/academic/class/10601-f10/index.html
MSCA 31000 Introduction to Statistical Concepts
MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced
MA2823: Foundations of Machine Learning
MA2823: Foundations of Machine Learning École Centrale Paris Fall 2015 Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr TAs: Jiaqian Yu [email protected]
Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016
Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00
Introduction to Pattern Recognition
Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University [email protected] CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
How To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
Machine Learning for Data Science (CS4786) Lecture 1
Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
Robust Outlier Detection Technique in Data Mining: A Univariate Approach
Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics
ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE- CQAS- 747- Principles of
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Machine Learning in Computer Vision A Tutorial. Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN
Machine Learning in Computer Vision A Tutorial Ajay Joshi, Anoop Cherian and Ravishankar Shivalingam Dept. of Computer Science, UMN Outline Introduction Supervised Learning Unsupervised Learning Semi-Supervised
Machine Learning What, how, why?
Machine Learning What, how, why? Rémi Emonet (@remiemonet) 2015-09-30 Web En Vert $ whoami $ whoami Software Engineer Researcher: machine learning, computer vision Teacher: web technologies, computing
Government of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence
Government of Russian Federation Federal State Autonomous Educational Institution of High Professional Education National Research University «Higher School of Economics» Faculty of Computer Science School
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning
Non-negative Matrix Factorization (NMF) in Semi-supervised Learning Reducing Dimension and Maintaining Meaning SAMSI 10 May 2013 Outline Introduction to NMF Applications Motivations NMF as a middle step
Big Data Analytics CSCI 4030
High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising
Machine Learning: Overview
Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave
The Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
Role of Social Networking in Marketing using Data Mining
Role of Social Networking in Marketing using Data Mining Mrs. Saroj Junghare Astt. Professor, Department of Computer Science and Application St. Aloysius College, Jabalpur, Madhya Pradesh, India Abstract:
Knowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
AMIS 7640 Data Mining for Business Intelligence
The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2013, Session
Music Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
Neural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm [email protected] Rome, 29
Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University [email protected]
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University [email protected] 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015
Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Lecture: MWF: 1:00-1:50pm, GEOLOGY 4645 Instructor: Mihai
Machine Learning and Data Mining. Fundamentals, robotics, recognition
Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification
Colour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone
Maschinelles Lernen mit MATLAB
Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical
Exploratory Data Analysis with MATLAB
Computer Science and Data Analysis Series Exploratory Data Analysis with MATLAB Second Edition Wendy L Martinez Angel R. Martinez Jeffrey L. Solka ( r ec) CRC Press VV J Taylor & Francis Group Boca Raton
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
MACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
Semi-Supervised and Unsupervised Machine Learning. Novel Strategies
Brochure More information from http://www.researchandmarkets.com/reports/2179190/ Semi-Supervised and Unsupervised Machine Learning. Novel Strategies Description: This book provides a detailed and up to
Analysis Tools and Libraries for BigData
+ Analysis Tools and Libraries for BigData Lecture 02 Abhijit Bendale + Office Hours 2 n Terry Boult (Waiting to Confirm) n Abhijit Bendale (Tue 2:45 to 4:45 pm). Best if you email me in advance, but I
Multiple Kernel Learning on the Limit Order Book
JMLR: Workshop and Conference Proceedings 11 (2010) 167 174 Workshop on Applications of Pattern Analysis Multiple Kernel Learning on the Limit Order Book Tristan Fletcher Zakria Hussain John Shawe-Taylor
Mathematical Models of Supervised Learning and their Application to Medical Diagnosis
Genomic, Proteomic and Transcriptomic Lab High Performance Computing and Networking Institute National Research Council, Italy Mathematical Models of Supervised Learning and their Application to Medical
Recognizing Informed Option Trading
Recognizing Informed Option Trading Alex Bain, Prabal Tiwaree, Kari Okamoto 1 Abstract While equity (stock) markets are generally efficient in discounting public information into stock prices, we believe
Classification of Bad Accounts in Credit Card Industry
Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition
Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016
Clustering Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 1 Supervised learning vs. unsupervised learning Supervised learning: discover patterns in the data that relate data attributes with
Is a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
How To Identify A Churner
2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management
