Introduction to Pattern Recognition




Selim Aksoy
Department of Computer Engineering, Bilkent University
saksoy@cs.bilkent.edu.tr
CS 551, Spring 2009
© 2009, Selim Aksoy (Bilkent University)

Human Perception
Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what they observe, e.g.,
- recognizing a face,
- understanding spoken words,
- reading handwriting,
- distinguishing fresh food from its smell.
We would like to give similar capabilities to machines.

What is Pattern Recognition?
A pattern is an entity, vaguely defined, that could be given a name, e.g.,
- fingerprint image,
- handwritten word,
- human face,
- speech signal,
- DNA sequence,
- ...
Pattern recognition is the study of how machines can
- observe the environment,
- learn to distinguish patterns of interest,
- make sound and reasonable decisions about the categories of the patterns.

Human and Machine Perception
We are often influenced by the knowledge of how patterns are modeled and recognized in nature when we develop pattern recognition algorithms.
Research on machine perception also helps us gain deeper understanding and appreciation for pattern recognition systems in nature.
Yet, we also apply many techniques that are purely numerical and do not have any correspondence in natural systems.

Pattern Recognition Applications
Table 1: Example pattern recognition applications.

| Problem Domain | Application | Input Pattern | Pattern Classes |
|---|---|---|---|
| Document image analysis | Optical character recognition | Document image | Characters, words |
| Document classification | Internet search | Text document | Semantic categories |
| Document classification | Junk mail filtering | Email | Junk/non-junk |
| Multimedia database retrieval | Internet search | Video clip | Video genres |
| Speech recognition | Telephone directory assistance | Speech waveform | Spoken words |
| Natural language processing | Information extraction | Sentences | Parts of speech |
| Biometric recognition | Personal identification | Face, iris, fingerprint | Authorized users for access control |
| Medical | Computer aided diagnosis | Microscopic image | Cancerous/healthy cell |
| Military | Automatic target recognition | Optical or infrared image | Target type |
| Industrial automation | Printed circuit board inspection | Intensity or range image | Defective/non-defective product |
| Industrial automation | Fruit sorting | Images taken on a conveyor belt | Grade of quality |
| Remote sensing | Forecasting crop yield | Multispectral image | Land use categories |
| Bioinformatics | Sequence analysis | DNA sequence | Known types of genes |
| Data mining | Searching for meaningful patterns | Points in multidimensional space | Compact and well-separated clusters |

Pattern Recognition Applications
Figure 1: English handwriting recognition.
Figure 2: Chinese handwriting recognition.
Figure 3: Fingerprint recognition.
Figure 4: Biometric recognition.
Figure 5: Cancer detection and grading using microscopic tissue data.
Figure 6: Cancer detection and grading using microscopic tissue data.
Figure 7: Land cover classification using satellite data.
Figure 8: Building and building group recognition using satellite data.
Figure 9: License plate recognition: US license plates.
Figure 10: Clustering of microarray data.

An Example
Problem: Sorting incoming fish on a conveyor belt according to species.
Assume that we have only two kinds of fish:
- sea bass,
- salmon.
Figure 11: Picture taken from a camera.

An Example: Decision Process
What kind of information can distinguish one species from the other?
- length, width, weight, number and shape of fins, tail shape, etc.
What can cause problems during sensing?
- lighting conditions, position of fish on the conveyor belt, camera noise, etc.
What are the steps in the process?
- capture image -> isolate fish -> take measurements -> make decision

An Example: Selecting Features
Assume a fisherman told us that a sea bass is generally longer than a salmon.
We can use length as a feature and decide between sea bass and salmon according to a threshold on length.
How can we choose this threshold?

Figure 12: Histograms of the length feature for two types of fish in training samples. How can we choose the threshold l to make a reliable decision?

Even though sea bass are longer than salmon on average, there are many examples of fish where this observation does not hold.
Try another feature: average lightness of the fish scales.

Figure 13: Histograms of the lightness feature for two types of fish in training samples. It looks easier to choose the threshold x but we still cannot make a perfect decision.
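One simple way to choose such a threshold is to try every candidate between neighbouring training values and keep the one with the fewest training errors. The Python sketch below illustrates this for a single feature. It is not taken from the slides: the function name `best_threshold`, the decision rule "sea bass if the value exceeds the threshold", and the sample lengths are all hypothetical and chosen only for illustration.

```python
def best_threshold(salmon, sea_bass):
    """Return (threshold, errors) with the fewest training errors.

    Assumed decision rule (hypothetical): label a fish "sea bass" if its
    feature value is greater than the threshold, otherwise "salmon".
    """
    values = sorted(set(salmon) | set(sea_bass))
    # Candidate thresholds: one below all values, the midpoints between
    # consecutive distinct values, and one above all values.
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best = None
    for t in candidates:
        salmon_errors = sum(1 for v in salmon if v > t)      # salmon called sea bass
        bass_errors = sum(1 for v in sea_bass if v <= t)     # sea bass called salmon
        errors = salmon_errors + bass_errors
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Hypothetical training lengths; the two classes overlap on purpose.
salmon_lengths = [20.0, 22.0, 24.0, 25.0, 27.0, 30.0]
sea_bass_lengths = [23.0, 26.0, 28.0, 29.0, 31.0, 33.0]

threshold, errors = best_threshold(salmon_lengths, sea_bass_lengths)
print(threshold, errors)  # 25.5 3
```

Even the best threshold leaves three of the twelve training samples misclassified here, which mirrors the observation that overlapping histograms rule out a perfect decision from this feature alone.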

An Example: Cost of Error
We should also consider costs of different errors we make in our decisions.
For example, if the fish packing company knows that:
- Customers who buy salmon will object vigorously if they see sea bass in their cans.
- Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans.
How does this knowledge affect our decision?
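One way to see the effect is to weight each kind of error by its cost and minimise the total cost instead of the error count. The following Python sketch does this for a single feature. The function name `min_cost_threshold`, the rule "sea bass if the value exceeds the threshold", the sample values, and the cost figures (5 versus 1) are hypothetical assumptions, not numbers from the slides.

```python
def min_cost_threshold(salmon, sea_bass, cost_bass_as_salmon, cost_salmon_as_bass):
    """Return (threshold, total_cost) minimising the weighted training cost.

    Assumed decision rule (hypothetical): "sea bass" if value > threshold.
    """
    values = sorted(set(salmon) | set(sea_bass))
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best = None
    for t in candidates:
        # Sea bass that would end up in salmon cans (the expensive mistake).
        bass_as_salmon = sum(1 for v in sea_bass if v <= t)
        # Salmon that would end up in sea bass cans (the cheap mistake).
        salmon_as_bass = sum(1 for v in salmon if v > t)
        cost = (cost_bass_as_salmon * bass_as_salmon
                + cost_salmon_as_bass * salmon_as_bass)
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Hypothetical feature values; larger values are assumed to suggest sea bass.
salmon_values = [20.0, 22.0, 24.0, 25.0, 27.0, 30.0]
sea_bass_values = [23.0, 26.0, 28.0, 29.0, 31.0, 33.0]

equal_t, equal_cost = min_cost_threshold(salmon_values, sea_bass_values, 1.0, 1.0)
weighted_t, weighted_cost = min_cost_threshold(salmon_values, sea_bass_values, 5.0, 1.0)
print(equal_t, weighted_t)  # 25.5 22.5
```

With equal costs the threshold sits at 25.5. When putting sea bass into salmon cans is five times as costly, the threshold moves down to 22.5, so fewer sea bass are labelled salmon at the price of more salmon being labelled sea bass.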

An Example: Multiple Features
Assume we also observed that sea bass are typically wider than salmon.
We can use two features in our decision:
- lightness: $x_1$
- width: $x_2$
Each fish image is now represented as a point (feature vector) $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ in a two-dimensional feature space.

An Example: Multiple Features
Figure 14: Scatter plot of lightness and width features for training samples. We can draw a decision boundary to divide the feature space into two regions. Does it look better than using only lightness?
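As an illustration of a decision boundary in a two-dimensional feature space, the Python sketch below uses a nearest-class-mean rule, which produces a straight-line boundary (the perpendicular bisector of the two class means). This is only one simple choice and is not claimed to be the rule behind the figure. The function names and the (lightness, width) values are hypothetical.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_mean_classifier(training):
    """Build a classifier from {label: [feature vectors]}.

    Returns (classify, means); classify(x) picks the label whose class
    mean is closest to x.
    """
    means = {label: mean_vector(points) for label, points in training.items()}

    def classify(x):
        return min(means, key=lambda label: squared_distance(x, means[label]))

    return classify, means


# Hypothetical training samples as (lightness, width) pairs.
training = {
    "salmon": [(2.0, 14.0), (3.0, 15.0), (4.0, 13.0), (3.0, 14.0)],
    "sea bass": [(6.0, 18.0), (7.0, 19.0), (8.0, 17.0), (7.0, 18.0)],
}

classify, means = nearest_mean_classifier(training)
print(means["salmon"], means["sea bass"])  # (3.0, 14.0) (7.0, 18.0)
print(classify((3.5, 14.5)))               # salmon
print(classify((6.5, 17.0)))               # sea bass
```

Because both features enter the distance, a fish that is ambiguous in lightness can still be separated by its width, which is the point of moving from one feature to two.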

An Example: Multiple Features
Does adding more features always improve the results?
- Avoid unreliable features.
- Be careful about correlations with existing features.
- Be careful about measurement costs.
- Be careful about noise in the measurements.
Is there some curse for working in very high dimensions?

An Example: Decision Boundaries
Can we do better with another decision rule?
More complex models result in more complex boundaries.
Figure 15: We may distinguish training samples perfectly but how can we predict how well we can generalize to unknown samples?

How can we manage the tradeoff between the complexity of decision rules and their performance on unknown samples?
Figure 16: Different criteria lead to different decision boundaries.

More on Complexity
Figure 17: Regression example: plot of 10 sample points for the input variable x along with the corresponding target variable t. Green curve is the true function that generated the data.

Figure 18: Polynomial curve fitting: plots of polynomials having various orders, shown as red curves, fitted to the set of 10 sample points. Panels: (a) 0th order polynomial, (b) 1st order polynomial, (c) 3rd order polynomial, (d) 9th order polynomial.

Figure 19: Polynomial curve fitting: plots of 9th order polynomials fitted to 15 and 100 sample points. Panels: (a) 15 sample points, (b) 100 sample points.
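The effect of model order can be reproduced with a small Python experiment. The sketch below fits a 0th order (constant), a 1st order (line), and a 9th order polynomial to 10 noisy sample points, then compares the error on the training points with the error against the noise-free function on new points. Several things here are assumptions: the slides do not name the generating function, so sin(2πx) is used as a stand-in; the noise level (0.3), the random seed, and all function names are hypothetical. The 9th order fit is computed as the Lagrange interpolating polynomial, which is what a least-squares fit of order 9 to 10 distinct points reduces to.

```python
import math
import random


def true_function(x):
    # Assumption: stand-in for the unnamed "true function" in the figure.
    return math.sin(2.0 * math.pi * x)


def fit_constant(xs, ts):
    """0th order least-squares fit: the mean of the targets."""
    mean_t = sum(ts) / len(ts)
    return lambda x: mean_t


def fit_line(xs, ts):
    """1st order least-squares fit in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    slope = sxt / sxx
    return lambda x: mean_t + slope * (x - mean_x)


def fit_interpolating_polynomial(xs, ts):
    """Polynomial of order len(xs) - 1 through every point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, ti) in enumerate(zip(xs, ts)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += ti * basis
        return total
    return predict


def rms_error(model, xs, ts):
    """Root-mean-square difference between model predictions and targets."""
    return math.sqrt(sum((model(x) - t) ** 2 for x, t in zip(xs, ts)) / len(xs))


random.seed(0)  # hypothetical seed, only to make the run repeatable
train_x = [i / 9.0 for i in range(10)]
train_t = [true_function(x) + random.gauss(0.0, 0.3) for x in train_x]
test_x = [(i + 0.5) / 50.0 for i in range(50)]
test_t = [true_function(x) for x in test_x]

models = {
    "order 0": fit_constant(train_x, train_t),
    "order 1": fit_line(train_x, train_t),
    "order 9": fit_interpolating_polynomial(train_x, train_t),
}
results = {
    name: (rms_error(m, train_x, train_t), rms_error(m, test_x, test_t))
    for name, m in models.items()
}
for name, (train_rms, test_rms) in results.items():
    print(f"{name}: train RMS = {train_rms:.3f}, test RMS = {test_rms:.3f}")
```

The 9th order polynomial passes through every training point, so its training error is zero, yet its error on new points is not. A perfect fit to the training samples therefore says little about how well the model generalizes.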

Pattern Recognition Systems
Figure 20: Object/process diagram of a pattern recognition system.
Classification path: Physical environment -> Data acquisition/sensing -> Pre-processing -> Feature extraction -> Features -> Classification -> Post-processing -> Decision
Training path: Training data -> Pre-processing -> Feature extraction/selection -> Features -> Model learning/estimation -> Model (used by Classification)

Pattern Recognition Systems
Data acquisition and sensing:
- Measurements of physical variables.
- Important issues: bandwidth, resolution, sensitivity, distortion, SNR, latency, etc.
Pre-processing:
- Removal of noise in data.
- Isolation of patterns of interest from the background.
Feature extraction:
- Finding a new representation in terms of features.

Model learning and estimation:
- Learning a mapping between features and pattern groups and categories.
Classification:
- Using features and learned models to assign a pattern to a category.
Post-processing:
- Evaluation of confidence in decisions.
- Exploitation of context to improve performance.
- Combination of experts.

The Design Cycle
Collect data -> Select features -> Select model -> Train classifier -> Evaluate classifier
Figure 21: The design cycle.
Data collection:
- Collecting training and testing data.
- How can we know when we have an adequately large and representative set of samples?

Feature selection:
- Domain dependence and prior information.
- Computational cost and feasibility.
- Discriminative features.
  - Similar values for similar patterns.
  - Different values for different patterns.
- Invariant features with respect to translation, rotation and scale.
- Robust features with respect to occlusion, distortion, deformation, and variations in environment.

Model selection:
- Domain dependence and prior information.
- Definition of design criteria.
- Parametric vs. non-parametric models.
- Handling of missing features.
- Computational complexity.
- Types of models: templates, decision-theoretic or statistical, syntactic or structural, neural, and hybrid.
- How can we know how close we are to the true model underlying the patterns?

The Design Cycle
Training:
- How can we learn the rule from data?
- Supervised learning: a teacher provides a category label or cost for each pattern in the training set.
- Unsupervised learning: the system forms clusters or natural groupings of the input patterns.
- Reinforcement learning: no desired category is given but the teacher provides feedback to the system, such as whether the decision is right or wrong.
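To make the unsupervised case concrete, the Python sketch below groups unlabeled one-dimensional values with k-means, a common clustering procedure that is used here only as an example and is not named in the slides. The function name `k_means_1d`, the data, the initial centers, and the fixed iteration count are hypothetical choices.

```python
def k_means_1d(values, initial_centers, iterations=10):
    """Cluster scalar values by alternating assignment and mean update.

    Returns (centers, clusters), where clusters[i] holds the values that
    were assigned to centers[i] in the final assignment step.
    """
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster;
        # an empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled measurements with two visible groups.
unlabeled = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centers, clusters = k_means_1d(unlabeled, initial_centers=[1.0, 2.0])
print(centers)   # [1.5, 8.5]
print(clusters)  # [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0]]
```

No labels are supplied at any point: the two groups emerge from the data alone, whereas a supervised method would be given the category of each training value by the teacher.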

The Design Cycle
Evaluation:
- How can we estimate the performance with training samples?
- How can we predict the performance with future data?
- Problems of overfitting and generalization.
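A common way to approach these questions is to hold out part of the labeled data, fit the classifier on the rest, and compare the two error rates. The Python sketch below does this with a single-feature threshold classifier. The holdout split, the function names, the rule "sea bass if the value exceeds the threshold", and all sample values are hypothetical; the data were constructed so that the gap between the two error rates is visible.

```python
def best_threshold(salmon, sea_bass):
    """Threshold with the fewest training errors for the assumed rule
    "sea bass if value > threshold, otherwise salmon"."""
    values = sorted(set(salmon) | set(sea_bass))
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)
    best = None
    for t in candidates:
        errors = (sum(1 for v in salmon if v > t)
                  + sum(1 for v in sea_bass if v <= t))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best[0]


def error_rate(threshold, salmon, sea_bass):
    """Fraction of samples misclassified by the threshold rule."""
    errors = (sum(1 for v in salmon if v > threshold)
              + sum(1 for v in sea_bass if v <= threshold))
    return errors / (len(salmon) + len(sea_bass))


# Hypothetical labeled data, split by hand into training and held-out sets.
train_salmon, train_sea_bass = [20.0, 22.0, 24.0, 25.0], [26.0, 28.0, 29.0, 31.0]
test_salmon, test_sea_bass = [21.0, 27.0], [24.0, 30.0]

threshold = best_threshold(train_salmon, train_sea_bass)
train_error = error_rate(threshold, train_salmon, train_sea_bass)
test_error = error_rate(threshold, test_salmon, test_sea_bass)
print(threshold, train_error, test_error)  # 25.5 0.0 0.5
```

The training error of zero is an optimistic estimate; the error on samples the classifier never saw is the better guide to performance on future data.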

Summary
Pattern recognition techniques find applications in many areas: machine learning, statistics, mathematics, computer science, biology, etc.
There are many sub-problems in the design process.
Many of these problems can indeed be solved.
More complex learning, searching and optimization algorithms are developed with advances in computer technology.
There remain many fascinating unsolved problems.