INTRODUCTION TO MACHINE LEARNING
David Kauchak
CS 451, Fall 2013

Why are you here?
  - What is Machine Learning?
  - Why are you taking this course?
  - What topics would you like to see covered?

Machine Learning is
  Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data.

Machine Learning is
  Machine learning is programming computers to optimize a performance criterion using example data or past experience.
  -- Ethem Alpaydin

  The goal of machine learning is to develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest.
  -- Kevin P. Murphy

  The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions.
  -- Christopher M. Bishop
Machine Learning is
  Machine learning is about predicting the future based on the past.
  -- Hal Daume III

  past:   Training -> learn -> model/predictor
  future: Testing -> model/predictor -> predict

Machine Learning, aka
  - data mining: machine learning applied to databases, i.e. collections of data
  - inference and/or estimation in statistics
  - pattern recognition in engineering
  - signal processing in electrical engineering
  - induction
  - optimization

Goals of the course: Learn about
  - Different machine learning problems
  - Common techniques/tools used
      - theoretical understanding
      - practical implementation
  - Proper experimentation and evaluation
  - Dealing with large (huge) data sets
      - Parallelization frameworks
      - Programming tools
Goals of the course
  - Be able to laugh at these signs (or at least know why one might)

Administrative
  - Course page: http://www.cs.middlebury.edu/~dkauchak/classes/cs451/ (go/cs451)
  - Assignments
      - Weekly
      - Mostly programming (Java, mostly)
      - Some written/write-up
      - Generally due Friday evenings
  - Two exams
  - Late Policy
  - Honor code

Course expectations
  - 400-level course
  - Plan to stay busy!
  - Applied class, so lots of programming
  - Machine learning involves math

Machine learning problems
  What high-level machine learning problems have you seen or heard of before?
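Supervised learning
  [Figure: examples paired with labels 1, 3, 4, 5]
  Supervised learning: given labeled examples

Supervised learning
  [Figure: labeled examples (labels 1, 3, 4, 5) -> model/predictor]
  Supervised learning: given labeled examples

Supervised learning
  [Figure: new example -> model/predictor -> predicted label]
  Supervised learning: learn to predict new example

Supervised learning: classification
  [Figure: examples labeled apple, apple, banana, banana]
  Classification: a finite set of labels
  Supervised learning: given labeled examples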
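
The classification setup above (labeled examples in, a model/predictor out, then a predicted label for a new example) can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not name a model, so a 1-nearest-neighbor predictor is swapped in here, and the two features (length and width in cm) and all data values are hypothetical.

```python
import math

def train(labeled_examples):
    """'Training' for 1-nearest-neighbor simply stores the labeled examples."""
    return list(labeled_examples)

def predict(model, example):
    """Return the label of the stored example closest to `example`."""
    _, nearest_label = min(
        model, key=lambda pair: math.dist(pair[0], example)
    )
    return nearest_label

# Hypothetical labeled examples: (length_cm, width_cm) -> label
training_data = [
    ((7.0, 7.0), "apple"),
    ((8.0, 7.5), "apple"),
    ((18.0, 3.5), "banana"),
    ((20.0, 4.0), "banana"),
]

model = train(training_data)

# Predict labels for two new, unlabeled examples
print(predict(model, (19.0, 3.8)))
print(predict(model, (7.5, 7.2)))
```

The label set here is finite (apple, banana), which is what makes this classification rather than regression.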
Classification Example
  Differentiate between low-risk and high-risk customers from their income and savings

Classification Applications
  - Face recognition
  - Character recognition
  - Spam detection
  - Medical diagnosis: from symptoms to illnesses
  - Biometrics: recognition/authentication using physical and/or behavioral characteristics: face, iris, signature, etc.

Supervised learning: regression
  [Figure: examples labeled -4.5, 10.1, 3.2, 4.3]
  Regression: label is real-valued
  Supervised learning: given labeled examples

Regression Example
  Price of a used car
  x : car attributes (e.g. mileage)
  y : price
  y = wx + w_0
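
As a rough sketch of the used-car example, the line y = wx + w_0 can be fit by ordinary least squares. The slides only give the form of the model, so the fitting method is an assumption, and the mileage/price numbers below are made up for illustration (they lie on the line price = 20 - 0.1 * mileage, so the fit should recover roughly those values).

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + w0; returns (w, w0)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    w0 = mean_y - w * mean_x
    return w, w0

def predict_price(w, w0, mileage):
    """Apply the fitted line to a new example."""
    return w * mileage + w0

# Hypothetical data: mileage in thousands of miles, price in thousands of dollars
mileage = [10.0, 50.0, 100.0, 150.0]
price = [19.0, 15.0, 10.0, 5.0]

w, w0 = fit_line(mileage, price)
print(w, w0)
print(predict_price(w, w0, 80.0))
```

Unlike classification, the predicted label is a real number, so any mileage maps to some price on the line.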
Regression Applications
  - Economics/Finance: predict the value of a stock
  - Epidemiology
  - Car/plane navigation: angle of the steering wheel, acceleration
  - Temporal trends: weather over time

Supervised learning: ranking
  [Figure: examples labeled 1, 4, 2, 3]
  Ranking: label is a ranking
  Supervised learning: given labeled examples

Ranking example
  Given a query and a set of web pages, rank them according to relevance

Ranking Applications
  - User preference, e.g. Netflix "My List" -- movie queue ranking
  - iTunes
  - flight search (search in general)
  - reranking N-best output lists
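
To make the ranking example concrete, here is a small sketch of ordering web pages by relevance to a query. It is not a learned ranker: the `relevance` function is a hand-written stand-in (count of distinct query words found in the page) for whatever scoring a trained model would provide, and the page names and texts are hypothetical.

```python
def relevance(query, page_text):
    """Hypothetical score: number of distinct query words that occur in the page."""
    query_words = set(query.lower().split())
    page_words = set(page_text.lower().split())
    return len(query_words & page_words)

def rank(query, pages):
    """Return page names ordered from most to least relevant."""
    return sorted(
        pages,
        key=lambda name: relevance(query, pages[name]),
        reverse=True,
    )

# Hypothetical pages: name -> text
pages = {
    "a.html": "used car prices and mileage",
    "b.html": "introduction to machine learning course",
    "c.html": "machine learning for ranking web pages",
}

ranking = rank("machine learning ranking", pages)
print(ranking)
```

The output is an ordering of the whole set rather than a single label per page, which is the distinguishing feature of the ranking problem.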
Unsupervised learning
  Unsupervised learning: given data, i.e. examples, but no labels

Unsupervised learning applications
  - learn clusters/groups without any label
      - customer segmentation (i.e. grouping)
      - image compression
      - bioinformatics: learn motifs

Reinforcement learning
  left, right, straight, left, left, left, straight -> GOOD
  left, straight, straight, left, right, straight, straight -> BAD

  left, right, straight, left, left, left, straight -> 18.5
  left, straight, straight, left, right, straight, straight -> -3

  Given a sequence of examples/states and a reward after completing that sequence, learn to predict the action to take for an individual example/state

Reinforcement learning example
  Backgammon
  WIN! / LOSE!
  Given sequences of moves and whether or not the player won at the end, learn to make good moves
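
One very simple way to turn whole-sequence rewards into per-step action choices is to average, for each (step, action) pair, the final reward of every sequence that took that action at that step. The sketch below does this for the two left/right/straight sequences from the slides. Several things are assumptions made for illustration: the "state" is taken to be just the position in the sequence, GOOD and BAD are encoded as rewards of +1 and -1, and the averaging scheme itself is not something the slides specify.

```python
from collections import defaultdict

def estimate_action_values(episodes):
    """Average the final reward over all episodes that took an action at a step."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for actions, reward in episodes:
        for step, action in enumerate(actions):
            totals[(step, action)] += reward
            counts[(step, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}

def best_action(values, step):
    """Pick the action with the highest estimated value at the given step."""
    candidates = {a: v for (s, a), v in values.items() if s == step}
    return max(candidates, key=candidates.get)

# The two sequences from the slides; +1 / -1 are assumed encodings of GOOD / BAD
episodes = [
    (["left", "right", "straight", "left", "left", "left", "straight"], 1.0),
    (["left", "straight", "straight", "left", "right", "straight", "straight"], -1.0),
]

values = estimate_action_values(episodes)
print(best_action(values, 1))
print(values[(0, "left")])
```

Steps where both sequences agree (such as the first move) get an uninformative average, which hints at why many sequences are needed before the reward signal says much about any individual action.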
Reinforcement learning example
  http://www.youtube.com/watch?v=vcdxqn0fcne

Other learning variations
  What data is available:
    - Supervised, unsupervised, reinforcement learning
    - semi-supervised, active learning
  How are we getting the data:
    - online vs. offline learning
  Type of model:
    - generative vs. discriminative
    - parametric vs. non-parametric