Neural Networks and Support Vector Machines


INF5390 Artificial Intelligence (Kunstig intelligens)
Neural Networks and Support Vector Machines
Roar Fjellheim
INF5390-13 Neural Networks and SVM

Outline
- Neural networks
- Perceptrons
- Support vector machines
- Summary
AIMA Chapter 18: Learning from Examples

Neural networks in AI
- The human brain is a huge network of neurons
- A neuron is a basic processing unit that collects, processes, and disseminates electrical signals
- Early AI tried to imitate the brain by building artificial neural networks (ANNs), but met theoretical limits, and the approach disappeared
- In the 1980s and 90s interest in ANNs resurfaced, with new theoretical developments and massive industrial interest and applications

The basic unit of neural networks
- The network consists of units ("nodes", "neurons") connected by links
- The link from unit i to unit j carries an activation a_i and has a weight W_i,j
- A bias weight W_0,j is attached to a fixed input a_0 = 1
- Activation of a unit j:
  - Calculate the input in_j = sum over i = 0..n of W_i,j * a_i
  - Derive the output a_j = g(in_j), where g is the activation function
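As a minimal sketch (function names are illustrative, not from the slides), the activation of a single unit is just a weighted sum passed through g:

```python
import math

def unit_activation(weights, inputs, g):
    """Output of one unit: a_j = g(in_j), where in_j = sum_i W_i,j * a_i.

    weights[0] is the bias weight W_0,j; the fixed input a_0 = 1 is prepended.
    """
    a = [1.0] + list(inputs)                       # a_0 = 1 for the bias weight
    in_j = sum(w * x for w, x in zip(weights, a))  # weighted sum of inputs
    return g(in_j)

sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
print(round(unit_activation([0.0, 1.0], [2.0], sigmoid), 3))  # 0.881
```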

Activation functions
- The activation function should separate well: active (near 1) for desired input, inactive (near 0) otherwise
- It should be non-linear
- Most used functions: the threshold function and the sigmoid function
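The two functions named above can be sketched as follows; both saturate near 0 and 1, but only the sigmoid is differentiable, which gradient-based learning needs:

```python
import math

def threshold(z):
    """Hard threshold: active (1) once the input reaches 0, inactive (0) below."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Smooth squashing into (0, 1); differentiable, unlike the threshold."""
    return 1.0 / (1.0 + math.exp(-z))

# Both are near 0 for strongly negative inputs and near 1 for strongly
# positive ones, which is what "separating well" asks for.
print(threshold(-5.0), threshold(5.0))                  # 0.0 1.0
print(round(sigmoid(-5.0), 3), round(sigmoid(5.0), 3))  # 0.007 0.993
```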

Neural networks as logical gates
With proper use of the bias weight W_0 to set thresholds, neural network units can compute standard logic gate functions.
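A sketch of the idea, with bias weights chosen by hand (the particular values are illustrative; any values giving the same thresholds work):

```python
def step_unit(bias, weights, inputs):
    """Threshold unit: fires iff bias + sum_i w_i * x_i >= 0."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) >= 0 else 0

AND = lambda x1, x2: step_unit(-1.5, [1, 1], [x1, x2])  # fires only if both on
OR  = lambda x1, x2: step_unit(-0.5, [1, 1], [x1, x2])  # fires if at least one on
NOT = lambda x1:     step_unit(0.5, [-1], [x1])         # inverts its input

print([AND(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
print([OR(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])   # [0, 1, 1, 1]
print(NOT(0), NOT(1))                                            # 1 0
```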

Neural network structures
Two main structures:
- Feed-forward (acyclic) networks: represent a function of their inputs; no internal state
- Recurrent networks: feed outputs back to inputs; may be stable, oscillate, or become chaotic; output depends on the initial state
Recurrent networks are the most interesting and brain-like, but also the most difficult to understand.

Feed-forward networks as functions
A feed-forward (FF) network calculates a function of its inputs. The network may contain hidden units/layers; by changing the number of layers/units and their weights, different functions can be realized. FF networks are often used for classification.

Perceptrons
Single-layer feed-forward neural networks are called perceptrons, and were the earliest networks to be studied. Perceptrons can only act as linear separators, a small subset of all interesting functions. This partly explains why neural network research was discontinued for a long time.

Perceptron learning algorithm
How do we train the network to perform a certain function (e.g. classification), based on a training set of input/output pairs?
[Figure: a perceptron with inputs x_1..x_4, weights W_j, and output y]
Basic idea: adjust the network link weights to minimize some measure of the error on the training set, moving the weights in the direction that reduces the error.

Perceptron learning algorithm (cont.)

function PERCEPTRON-LEARNING(examples, network) returns a perceptron hypothesis
  inputs: examples, a set of examples, each with inputs x_1, x_2, ... and output y
          network, a perceptron with weights W_j (j = 0..n) and activation function g
  repeat
    for each e in examples do
      in <- sum over j = 0..n of W_j * x_j[e]
      Err <- y[e] - g(in)
      W_j <- W_j + alpha * Err * x_j[e]
  until some stopping criterion is satisfied
  return NEURAL-NETWORK-HYPOTHESIS(network)

(alpha is the learning rate)
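A runnable Python version of the pseudocode above (names and the fixed epoch count are illustrative), trained here on the linearly separable OR function:

```python
def perceptron_learning(examples, n_inputs, alpha=0.1, epochs=50):
    """Train threshold-perceptron weights W, where W[0] is the bias weight.

    examples: list of (inputs, y) pairs with y in {0, 1}.
    """
    W = [0.0] * (n_inputs + 1)
    g = lambda z: 1 if z >= 0 else 0
    for _ in range(epochs):                            # stopping criterion: fixed epochs
        for x, y in examples:
            xs = [1.0] + list(x)                       # x_0 = 1 for the bias weight
            err = y - g(sum(w * xi for w, xi in zip(W, xs)))
            W = [w + alpha * err * xi for w, xi in zip(W, xs)]
    return W

# Learn OR, which is linearly separable
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
W = perceptron_learning(data, 2)
g = lambda z: 1 if z >= 0 else 0
print([g(W[0] + W[1] * a + W[2] * b) for (a, b), _ in data])  # [0, 1, 1, 1]
```

By the perceptron convergence theorem, the loop settles on a correct separator for any linearly separable training set.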

Performance of perceptrons vs. decision trees
- Perceptrons are better at learning a linearly separable problem
- Decision trees are better at the restaurant problem

Multi-layer feed-forward networks
- Add hidden layers; the most common choice is one extra layer
- The advantage is that more functions can be realized, in effect by combining several perceptron functions
- It can be shown that:
  - A feed-forward network with a single, sufficiently large hidden layer can represent any continuous function
  - With two hidden layers, even discontinuous functions can be represented
- However:
  - We cannot easily tell which functions a particular network is able to represent
  - It is not well understood how to choose the structure/number of layers for a particular problem

Example network structure
A feed-forward network with 10 inputs, one output, and one hidden layer, suitable for the restaurant problem.

More complex activation functions
Multi-layer networks can combine simple (linear-separation) perceptron activation functions into more complex functions.

Learning in multi-layer networks
- In principle as for perceptrons: adjust weights to minimize error
- The main difference is what the error at internal nodes means: there is nothing to compare them to
- Solution: propagate the error at the output nodes back to the hidden layers, and successively propagate backwards if the network has several hidden layers
- The resulting back-propagation algorithm is the standard learning method for neural networks
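The back-propagation of error can be sketched for a toy network with one input, one hidden unit, and one output unit (all sizes, names, and weight values here are illustrative); the analytic gradients are checked against a numerical estimate:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, wh, wo):
    """Toy 1-1-1 network: one input, one hidden unit, one output unit."""
    h = sigmoid(wh[0] + wh[1] * x)   # hidden activation (wh[0] is the bias)
    y = sigmoid(wo[0] + wo[1] * h)   # output activation (wo[0] is the bias)
    return h, y

def backprop_grads(x, target, wh, wo):
    """Gradients of E = (target - y)^2 / 2 for all four weights.

    delta_out is the error term at the output node; it is propagated back
    to the hidden node through the connecting weight wo[1].
    """
    h, y = forward(x, wh, wo)
    delta_out = (y - target) * y * (1 - y)     # dE/d(in_out)
    delta_h = delta_out * wo[1] * h * (1 - h)  # error pushed back one layer
    return [delta_h, delta_h * x], [delta_out, delta_out * h]

# Sanity check: compare dE/d(wh[0]) with a central-difference estimate
wh, wo = [0.1, -0.2], [0.3, 0.5]
grad_h, grad_o = backprop_grads(0.7, 1.0, wh, wo)
eps = 1e-6
def error(wh_):
    _, y = forward(0.7, wh_, wo)
    return 0.5 * (1.0 - y) ** 2
numeric = (error([wh[0] + eps, wh[1]]) - error([wh[0] - eps, wh[1]])) / (2 * eps)
print(abs(numeric - grad_h[0]) < 1e-8)  # True
```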

Learning neural network structure
- The learning algorithms so far have assumed a fixed network structure; we also need to learn the structure itself
- We do not know in advance what structure will be necessary and sufficient
- Solution approach: try different configurations and keep the best, but the search space is very large (number of layers and number of nodes)
- "Optimal brain damage": start with a full network and remove nodes selectively (optimally)
- "Tiling": start with a minimal network that covers a subset of the training set, and expand incrementally

Support Vector Machines (SVM)
- Currently the most popular approach for supervised learning: it does not require any prior knowledge and scales to very large problems
- Attractive features of SVMs:
  - Constructs a maximum-margin separator: a decision boundary with the maximum possible distance to the example points
  - Creates linear separators, but can embed the data in higher dimensions (the kernel trick)
  - A non-parametric method: it may retain examples (instances, in addition to parameters as in NNs), and can thus express more complex functions

Classification by SVM
The SVM finds the separator with the maximum margin between the examples. The example points nearest the separator are called support vectors.

The kernel trick in SVM
What if the examples are not linearly separable? The SVM maps each example to a new vector in a higher-dimensional space, using kernel functions. In the new space, a linear maximum-margin separator may be found (the kernel trick).
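A minimal hand-built illustration of the idea (not a full SVM; the points, mapping, and separator are made up for the example): data that no single threshold separates in one dimension becomes linearly separable after mapping x to (x, x^2):

```python
def feature_map(x):
    """Map a 1-D example into 2-D feature space: phi(x) = (x, x^2)."""
    return (x, x * x)

# Points at -2 and 2 (class 0) surround points at -1 and 1 (class 1):
# no single threshold on x separates the two classes in the original space.
data = [(-2, 0), (-1, 1), (1, 1), (2, 0)]

# After the mapping, a linear separator on the second coordinate works:
# classify as 1 iff x^2 < 2.5.
separator = lambda phi: 1 if phi[1] < 2.5 else 0
preds = [separator(feature_map(x)) for x, _ in data]
print(preds)  # [0, 1, 1, 0]
```

A kernel function lets the SVM compute dot products in the mapped space without ever constructing phi(x) explicitly.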

Ensemble learning
In ensemble learning, predictions from a collection of hypotheses are combined. Example: three linear separators are combined such that all of them must return positive for the overall classification to be positive. The combined classifier is more expressive without being much more complex. Boosting is a widely used ensemble learning method.
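The "all separators must agree" example can be sketched directly (the three half-planes are chosen arbitrarily for illustration); their conjunction carves out a triangular positive region that no single linear separator could express:

```python
# Three linear separators, each a half-plane test in 2-D
h1 = lambda x, y: x >= 0          # right of the y-axis
h2 = lambda x, y: y >= 0          # above the x-axis
h3 = lambda x, y: x + y <= 2      # below the line x + y = 2

# Combined classifier: positive only if all three separators say positive
combined = lambda x, y: 1 if (h1(x, y) and h2(x, y) and h3(x, y)) else 0

print(combined(0.5, 0.5), combined(3, 0.5), combined(-1, 0.5))  # 1 0 0
```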

Summary
- Neural networks (NNs) are inspired by the human brain; they are complex nonlinear functions with many parameters, learned from noisy data
- A perceptron is a feed-forward network with no hidden layers and can only represent linearly separable functions
- Multi-layer feed-forward NNs can represent arbitrary functions, and can be trained efficiently using the back-propagation algorithm
- Support vector machines (SVMs) are an effective method for learning classifiers from large data sets
- Ensemble learning combines several simpler classifiers into a more complex function