TECNICHE DI DIAGNOSI AUTOMATICA DEI GUASTI. Silvio Simani silvio.simani@unife.it. References



TECNICHE DI DIAGNOSI AUTOMATICA DEI GUASTI
Reti Neurali per l'Identificazione di Sistemi non Lineari e Pattern Recognition (Neural Networks for Non-linear System Identification and Pattern Recognition)
silvio.simani@unife.it

References. Suggested textbooks:
Neural Networks for Identification, Prediction, and Control, by Duc Truong Pham and Xing Liu. Springer Verlag; December 1995. ISBN: 3540199594.
Nonlinear Identification and Control: A Neural Network Approach, by G. P. Liu. Springer Verlag; October 2001. ISBN: 1852333421.
Fuzzy Modeling for Control, by Robert Babuska. Springer; 1st edition, May 1, 1998. ISBN-10: 0792381548, ISBN-13: 978-0792381549.

Course Overview
1. Introduction. Course introduction. Introduction to neural networks. Issues in neural networks.
2. Simple neural networks. Perceptron. Adaline.
3. Multilayer Perceptron. Basics.
4. Radial basis networks: overview.
5. Fuzzy systems: overview.
6. Application examples.

Machine Learning
Improve automatically with experience, imitating human learning. Human learning: fast recognition and classification of complex classes of objects and concepts, and fast adaptation. Example: neural networks. Some techniques assume a statistical source and select a statistical model of the source; other techniques are based on reasoning or inductive inference (e.g. decision trees).

Machine Learning: Definition
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience.

Examples of Learning Problems
Example 1: handwriting recognition. T: recognizing and classifying handwritten words within images. P: percentage of words correctly classified. E: a database of handwritten words with given classifications.
Example 2: learning to play checkers. T: playing checkers. P: percentage of games won in a tournament. E: the opportunity to play games against itself.

Type of Training Experience
Direct or indirect? Direct: board state -> correct move. Indirect: the credit assignment problem, i.e. the degree of credit or blame for each move towards the final outcome of win or loss. Teacher or not? A teacher selects board states and provides correct moves, or the learner can select board states itself. Is the training experience representative of the performance goal? Training by playing against itself; performance evaluated by playing against the world champion.

Issues in Machine Learning
What algorithms can approximate functions well, and when? How does the number of training examples influence accuracy? How does the complexity of the hypothesis representation matter? How does noisy data influence accuracy? How do you reduce a learning problem to a set of function approximation problems?

Summary
Machine learning is useful for data mining, poorly understood domains (face recognition) and programs that must dynamically adapt. It draws from many diverse disciplines. A learning problem needs a well-specified task, a performance metric and training experience. Learning involves searching a space of possible hypotheses. Different learning methods search different hypothesis spaces, such as numerical functions, neural networks, decision trees and symbolic rules.

Introduction to Neural Networks

Brain
About 10^11 neurons (processors), with on average 1000-10000 connections each.

Artificial Neuron
net_i = Σ_j w_ij y_j + b_i, where b_i is the bias of unit i.

Artificial Neuron
Input/output signals may be: real values; unipolar {0, 1}; bipolar {-1, +1}. Weight w_ij: strength of the connection. Note that w_ij refers to the weight from unit j to unit i (not the other way round).

Artificial Neuron
The bias b is a constant that can be written as w_0 y_0 with y_0 = b and w_0 = 1, so that
net = Σ_{j=0}^{n} w_j y_j.
The function f is the unit's activation function. In the simplest case f is the identity function, and the unit's output is just its net input; this is called a linear unit. Other activation functions are the step function, the sigmoid function and the Gaussian function.

Activation Functions
Identity function; binary step function; bipolar step function; sigmoid function; bipolar sigmoid function; Gaussian function:
f(x) = (1 / (sqrt(2π) σ)) exp(-(x - μ)² / (2σ²)).

Artificial Neural Networks (ANN)
[Diagram: input vector -> weights -> activation functions -> weights -> activation functions -> output vector, with signal routing between layers.]

When Should an ANN Solution Be Considered?
The solution to the problem cannot be explicitly described by an algorithm, a set of equations, or a set of rules. There is some evidence that an input-output mapping exists between a set of input and output variables. There should be a large amount of data available to train the network.

Problems That Can Lead to Poor Performance
The network has to distinguish between very similar cases with a very high degree of accuracy. The training data does not represent the range of cases that the network will encounter in practice. The network has several hundred inputs. The main discriminating factors are not present in the available data (e.g. trying to assess a loan application without knowledge of the applicant's salary). The network is required to implement a very complex function.

Applications of Artificial Neural Networks
Manufacturing: fault diagnosis, fraud detection. Retailing: fraud detection, forecasting, data mining. Finance: fraud detection, forecasting, data mining. Engineering: fault diagnosis, signal/image processing. Production: fault diagnosis, forecasting. Sales & marketing: forecasting, data mining.

Data Pre-processing
Neural networks very rarely operate on raw data; an initial pre-processing stage is essential. Some examples: feature extraction from images (for example, the analysis of x-rays requires pre-processing to extract features of interest within a specified region); representing input variables with numbers (for example "+1" if the person is married, "0" if divorced, and "-1" if single; another example is representing the pixels of an image: 255 = bright white, 0 = black). To ensure the generalization capability of a neural network, the data should be encoded in a form which allows for interpolation.

Data Pre-processing: Categorical Variables
A categorical variable is a variable that can belong to one of a number of discrete categories, for example red, green, blue. Categorical variables are usually encoded using 1-out-of-n coding, e.g. for three colours: red = (1 0 0), green = (0 1 0), blue = (0 0 1). If we used red = 1, green = 2, blue = 3, this type of encoding would impose an ordering on the values of the variable which does not exist.

Data Pre-processing: Continuous Variables
A continuous variable can be directly applied to a neural network. However, if the dynamic ranges of the input variables are not approximately the same, it is better to normalize all input variables of the neural network.
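A minimal NumPy sketch of the two pre-processing steps just described, 1-out-of-n coding and range normalization; the function names and toy values are illustrative only:

```python
import numpy as np

COLOURS = ["red", "green", "blue"]

def one_hot(value, categories=COLOURS):
    """Encode a categorical value as a 1-out-of-n vector, e.g. 'green' -> [0, 1, 0],
    so no artificial ordering is imposed on the categories."""
    code = np.zeros(len(categories))
    code[categories.index(value)] = 1.0
    return code

def normalize(X):
    """Bring all continuous input variables to zero mean and unit variance,
    so their dynamic ranges are approximately the same."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

print(one_hot("green"))                                   # [0. 1. 0.]
X = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]])  # two variables, very different ranges
print(normalize(X))
```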

Simple Neural Networks: the Simple Perceptron

Outline
The perceptron. Linearly separable problems. Network structure. Perceptron learning rule. Convergence of the perceptron.

THE PERCEPTRON
The perceptron is a simple ANN model introduced by Rosenblatt at the end of the 1950s, with the idea of learning. The perceptron is designed to accomplish a simple pattern recognition task: after learning with real-valued training data {x(i), d(i), i = 1, 2, ..., p}, where d(i) = 1 or -1, for a new signal (pattern) x(p+1) the perceptron is capable of telling you to which class the new signal belongs: perceptron(x(p+1)) = 1 or -1.

Perceptron: Linear Threshold Unit (LTU)
Inputs x_1, ..., x_n with weights w_1, ..., w_n, plus x_0 = 1 with w_0 = b:
o(x) = 1 if Σ_{i=0}^{n} w_i x_i > 0, and -1 otherwise.

Mathematically, the perceptron is
y = f( Σ_{i=1}^{m} w_i x_i + b ),
where f is the hard limiter function, i.e.
y = +1 if Σ_{i=1}^{m} w_i x_i + b > 0, and y = -1 if Σ_{i=1}^{m} w_i x_i + b < 0.
We can always treat the bias b as another weight whose input is equal to 1.

Why is the network capable of solving linearly separable problems?
The equation Σ_{i=1}^{m} w_i x_i + b = 0 defines a hyperplane: patterns with Σ_{i=1}^{m} w_i x_i + b > 0 lie on one side of it, and patterns with Σ_{i=1}^{m} w_i x_i + b < 0 on the other.

Learning Rule
An algorithm to update the weights w so that finally the input patterns lie on the correct sides of the line decided by the perceptron. Let t denote time; at t = 0 we have the initial boundary w(0) · x = 0.

Perceptron Learning Rule
In mathematical form:
w(t+1) = w(t) + η [ d(t) - sgn(w(t) · x(t)) ] x(t),
where η > 0 is the learning rate, sgn is the hard limiter function, sgn(x) = +1 if x > 0 and -1 if x <= 0, and d(t) = +1 if x(t) belongs to class (+) and -1 if x(t) belongs to class (-). N.B.: d(t) is the same as d(i), and x(t) the same as x(i).

In words: if the classification is right, do not update the weights; if the classification is not correct, update the weights in the opposite direction, so that the output moves closer to the right direction.

Perceptron Convergence Theorem (Rosenblatt, 1962)
Let the subsets of training vectors be linearly separable. Then after a finite number of learning steps we have lim_t w(t) = w*, which correctly separates the samples. The idea of the proof is to consider ||w(t+1) - w*|| - ||w(t) - w*||, which is a decreasing function of t.

Summary of Perceptron Learning: Variables and Parameters
x(t) = (m+1)-dimensional input vector at time t = (1, x_1(t), x_2(t), ..., x_m(t));
w(t) = (m+1)-dimensional weight vector = (b, w_1(t), ..., w_m(t));
b = bias; y(t) = actual response; η = learning rate parameter (a positive constant < 1); d(t) = desired response.

Summary of Perceptron Learning: Data
{ (x(i), d(i)), i = 1, ..., p }. Present the data to the network one point at a time; the presentation can be cyclic, (x(1), d(1)), (x(2), d(2)), ..., (x(p), d(p)), (x(p+1), d(p+1)), ..., or random. Hence we mix time t with index i here.

Summary of the Perceptron Learning Algorithm
1. Initialisation. Set w(0) = 0. Then perform the following computation for time steps t = 1, 2, ...
2. Activation. At time step t, activate the perceptron by applying the input vector x(t) and desired response d(t).
3. Computation of actual response. Compute the actual response of the perceptron: y(t) = sgn(w(t) · x(t)), where sgn is the sign function.
4. Adaptation of the weight vector. Update the weight vector of the perceptron: w(t+1) = w(t) + η [ d(t) - y(t) ] x(t).
5. Continuation.

Questions Remain
Where or when do we stop? By minimizing the generalization error. For the training data {x(i), d(i), i = 1, ..., p}, the training error after t steps of learning can be defined as
E(t) = Σ_{i=1}^{p} [ d(i) - sgn(w(t) · x(i)) ]².
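The five steps above translate almost line by line into code. A minimal NumPy sketch (the function name, toy data and parameter values are illustrative, not from the course material):

```python
import numpy as np

def train_perceptron(X, d, eta=0.1, epochs=100):
    """Perceptron learning: w(t+1) = w(t) + eta*[d(t) - sgn(w(t).x(t))]*x(t).
    X: p x m input patterns; d: p desired responses in {-1, +1}."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # treat the bias as an extra weight
    w = np.zeros(X.shape[1])                      # 1. initialisation: w(0) = 0
    for _ in range(epochs):
        for x_t, d_t in zip(X, d):                # 2. activation: present x(t), d(t) cyclically
            y_t = 1.0 if w @ x_t > 0 else -1.0    # 3. actual response: y(t) = sgn(w(t).x(t))
            w += eta * (d_t - y_t) * x_t          # 4. adaptation of the weight vector
    return w                                      # 5. continuation until epochs exhausted

# Linearly separable toy problem: logical OR with bipolar targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([-1, 1, 1, 1], dtype=float)
w = train_perceptron(X, d)
```

By the convergence theorem, on separable data such as this the loop stops changing w after a finite number of corrections.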

We next turn to ADALINE learning, from which we can understand the learning rule and, more generally, Back-Propagation (BP) learning.

Simple Neural Networks: ADALINE Learning

Outline
ADALINE. Gradient descent learning. Modes of training.

Unhappy Over Perceptron Training
When a perceptron gives the right answer, no learning takes place. Anything below the threshold is interpreted as 'no', even if it is just below the threshold. It might be better to train the neuron based on how far below the threshold it is.

ADALINE
ADALINE is an acronym for ADAptive LINear Element (or ADAptive LInear NEuron), developed by Bernard Widrow and Marcian Hoff (1960). There are several variations of the Adaline: one has a threshold like the perceptron, another is just a bare linear function. The Adaline learning rule is also known as the least-mean-squares (LMS) rule, the delta rule, or the Widrow-Hoff rule. It is a training rule that minimizes the output error using an approximate gradient descent method.

Adaline
Replace the step function in the perceptron with a continuous, differentiable function f; the simplest choice is the linear function. With or without the threshold, the Adaline is trained based on the output of the function f rather than the final (thresholded) output.

After each training pattern x(i) is presented, the correction applied to the weights is proportional to the error
E(i, t) = ½ [ d(i) - f(w(t) · x(i)) ]², i = 1, ..., p.
N.B. if f is a linear function, f(w · x) = w · x. Summing together, our purpose is to find the w which minimizes E(t) = Σ_i E(i, t).

General Approach: the Gradient Descent Method
Find an update g such that w(t+1) = w(t) + g(E(w(t))) makes w tend automatically to the global minimum of E(w):
w(t+1) = w(t) - η ∇E(w(t)) (see the figures below).

The gradient direction is the uphill direction: for example, in the figure, at position 0.4 the gradient points uphill (here F plays the role of E; consider the one-dimensional case). In the gradient descent algorithm we have w(t+1) = w(t) - η F'(w(t)), therefore the ball goes downhill, since -F'(w(t)) is the downhill direction. Gradually the ball will stop at a local minimum, where the gradient is zero. [Figures: F(w) vs. w, showing the gradient direction and the successive iterates w(t), w(t+1), ..., w(t+k).]

In words, the gradient method can be thought of as a ball rolling down a hill: the ball rolls down and finally stops in the valley. Thus the weights are adjusted by
w_j(t+1) = w_j(t) + η Σ_i [ d(i) - f(w(t) · x(i)) ] x_j(i) f'.
This corresponds to gradient descent on the quadratic error surface E. When f' = 1 we have the perceptron learning rule; in general f' > 0 in neural networks, so the ball moves in the right direction.

Two types of network training:
Sequential mode (on-line, stochastic, or per-pattern): weights updated after each pattern is presented. The perceptron is in this class.
Batch mode (off-line or per-epoch): weights updated after all patterns are presented.

Comparison of the Perceptron and Gradient Descent Rules
The perceptron learning rule is guaranteed to succeed if the training examples are linearly separable and the learning rate η is sufficiently small. The linear unit training rule (using gradient descent) is guaranteed to converge to the hypothesis with minimum squared error given a sufficiently small learning rate η, even when the training data contains noise, and even when the training data is not separable by hyperplanes.

Summary
Perceptron: w(t+1) = w(t) + η [ d(t) - sgn(w(t) · x(t)) ] x(t).
Adaline (gradient descent method): w(t+1) = w(t) + η [ d(t) - f(w(t) · x(t)) ] x(t) f'.
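A matching sketch of the Adaline (LMS) rule with a linear unit, so f(net) = net and f' = 1, in sequential (per-pattern) mode; names and values are illustrative:

```python
import numpy as np

def train_adaline(X, d, eta=0.01, epochs=200):
    """Widrow-Hoff / LMS / delta rule:
    w(t+1) = w(t) + eta*[d(t) - w(t).x(t)]*x(t)  (linear f, so f' = 1)."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # bias as an extra weight
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_t, d_t in zip(X, d):
            error = d_t - w @ x_t                 # raw output error, not thresholded
            w += eta * error * x_t                # gradient descent on E = error**2 / 2
    return w
```

Unlike the perceptron, learning happens on every pattern, even correctly classified ones, in proportion to how far the raw output is from the target.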

Multi-Layer Perceptron (MLP)
Idea: the credit assignment problem, i.e. the problem of assigning credit or blame to the individual elements (hidden units) involved in forming the overall response of a learning system. In neural networks the problem amounts to deciding which weights should be altered, by how much, and in which direction.

Example: a Three-layer Network
[Diagram: inputs x_1, x_2, ..., x_n -> input layer -> hidden layer -> output layer, with signal routing from input to output.]

Properties of the Architecture
No connections within a layer. No direct connections between the input and output layers. Fully connected between layers. Often more than 2 layers. The number of output units need not equal the number of input units. The number of hidden units per layer can be more or less than the number of input or output units. Each unit is a perceptron:
y_i = f( Σ_{j=1}^{m} w_ij x_j + b_i ).

BP (Back Propagation)

Multi-Layer Perceptron I: Back-Propagation Learning

BP Learning Algorithm
The solution to the credit assignment problem in MLPs (Rumelhart, Hinton and Williams, 1986). BP has two phases:
Forward pass phase: computes the functional signal; feed-forward propagation of the input pattern signals through the network.
Backward pass phase: computes the error signal; propagation of the error (the difference between actual and desired output values) backwards through the network, starting at the output units.

BP Learning for the Simplest MLP
Task: given data {I, d}, minimize
E = (d - o)²/2 = [ d - f(W y) ]²/2 = [ d - f(W f(w I)) ]²/2,
the error function at the output unit, for a two-layer example with input unit I, hidden output y = f(w I), and network output o = f(W y). The weights at time t are w(t) and W(t); we intend to find the weights w(t+1) and W(t+1) at time t+1.

Forward Pass Phase
Suppose we have w(t) and W(t) at time t. For a given input I we can calculate y = f(w(t) I) and o = f(W(t) y) = f(W(t) f(w(t) I)). The error function of the output unit is E = (d - o)²/2.

Backward Pass Phase
W(t+1) = W(t) - η dE/dW = W(t) - η (dE/df)(df/dW) = W(t) + η (d - o) f'(W(t) y) y = W(t) + η Δ y,
where Δ = (d - o) f'(W(t) y), with E = (d - o)²/2 and o = f(W(t) y).

Backward Pass Phase (input-to-hidden weights)
w(t+1) = w(t) - η dE/dw = w(t) - η (dE/dy)(dy/dw) = w(t) + η Δ W(t) f'(w(t) I) I = w(t) + η δ I,
where δ = Δ W(t) f'(w(t) I) and o = f(W(t) y) = f(W(t) f(w(t) I)).

Summary: Weight Updates Are Local
Output unit: W_kj(t+1) = W_kj(t) + η Δ_k y_j(t), with Δ_k = (d_k - O_k) f'(Ne_k).
Input (hidden) unit: w_ji(t+1) = w_ji(t) + η δ_j I_i(t), with δ_j = f'(ne_j) Σ_k Δ_k W_kj.
Once the weight changes are computed for all units, the weights are updated at the same time (biases included as weights here). We now compute the derivative of the activation function f.

Activation Functions
To compute δ_j and Δ_k we need the derivative of the activation function f; for the derivative to exist, the activation function must be smooth. The sigmoidal (logistic) function, common in MLPs:
f(net_i) = 1 / (1 + exp(-k net_i)),
where k is a positive constant. The sigmoidal function gives values in the range 0 to 1 (the input-output function of a neuron under the rate coding assumption).

Shape of the Sigmoidal Function
Note: when net_i = 0, f = 0.5.

Shape of the Sigmoidal Function Derivative
The derivative of the sigmoidal function has its maximum at x = 0, is symmetric about this point, and falls to zero as the sigmoid approaches its extreme values.

Returning to the local error gradients in the BP algorithm, for output units we have
Δ_k = (d_k - O_k) f'(Ne_k) = k (d_k - O_k) O_k (1 - O_k),
and for input (hidden) units
δ_j = f'(ne_j) Σ_k Δ_k W_kj = k y_j (1 - y_j) Σ_k Δ_k W_kj.
Since the degree of weight change is proportional to the derivative of the activation function, weight changes are greatest when units receive mid-range functional signals rather than extreme ones.
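The O_k(1 - O_k) and y_j(1 - y_j) factors above come from the derivative of the logistic function; a one-line derivation:

```latex
f(\mathrm{net}) = \frac{1}{1 + e^{-k\,\mathrm{net}}}
\quad\Longrightarrow\quad
f'(\mathrm{net})
  = \frac{k\,e^{-k\,\mathrm{net}}}{\bigl(1 + e^{-k\,\mathrm{net}}\bigr)^{2}}
  = k\, f(\mathrm{net})\,\bigl(1 - f(\mathrm{net})\bigr)
```

Substituting O_k = f(Ne_k) and y_j = f(ne_j) gives the two expressions for Δ_k and δ_j.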

Summary of the BP Learning Algorithm
Set the learning rate η. Set the initial weight values (incl. biases): w, W. Then loop until the stopping criteria are satisfied:
present the input pattern to the network inputs;
compute the functional signal for the hidden units;
compute the functional signal for the output units;
present the target response to the output units;
compute the error signal for the output units;
compute the error signal for the hidden units;
update all weights at the same time;
increment n to n+1 and select the next I and d;
end loop.

Network Training
The training set is shown repeatedly until the stopping criteria are met. Each full presentation of all patterns is called an epoch. Randomise the order in which the training patterns are presented in each epoch, in order to avoid correlations between consecutive training pairs being learnt (order effects). Two types of network training: sequential mode (on-line, stochastic, or per-pattern), with weights updated after each pattern is presented, and batch mode (off-line or per-epoch), with weights updated after all patterns are presented. A sequential-mode code sketch is given below.
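A compact NumPy sketch of the whole loop for one hidden layer, in sequential mode; the network size, learning rate and XOR data are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def sigmoid(x, k=1.0):
    return 1.0 / (1.0 + np.exp(-k * x))

def train_bp(I, D, n_hidden=4, eta=0.5, epochs=5000, seed=0):
    """Sequential-mode BP for an n-input, one-hidden-layer, m-output MLP.
    I: p x n input patterns; D: p x m targets in (0, 1). Biases folded into w, W."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.5, 0.5, (I.shape[1] + 1, n_hidden))  # input -> hidden weights
    W = rng.uniform(-0.5, 0.5, (n_hidden + 1, D.shape[1]))  # hidden -> output weights
    for _ in range(epochs):
        for x, d in zip(I, D):
            x1 = np.append(x, 1.0)                  # forward pass: functional signals
            y = sigmoid(x1 @ w)
            y1 = np.append(y, 1.0)
            o = sigmoid(y1 @ W)
            Delta = (d - o) * o * (1 - o)           # error signal for the output units
            delta = y * (1 - y) * (W[:-1] @ Delta)  # error signal for the hidden units
            W += eta * np.outer(y1, Delta)          # update all weights at the same time
            w += eta * np.outer(x1, delta)
    return w, W

# XOR: the classic task a single-layer perceptron cannot learn.
I = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)
w, W = train_bp(I, D)
```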

Advantages and Disadvantages of the Different Modes
Sequential mode: less storage for each weighted connection; the random order of presentation and per-pattern updating make the search of weight space stochastic, reducing the risk of local minima; able to take advantage of any redundancy in the training set (i.e. the same pattern occurring more than once, especially for large training sets); simpler to implement.
Batch mode: faster learning than sequential mode.

Multi-Layer Perceptron II: Dynamics of the Multi-Layer Perceptron

Summary of Network Training
Forward phase: I, w(t), ne(t), y(t), W(t), Ne(t), O(t).
Backward phase:
output unit: W_kj(t+1) - W_kj(t) = η (d_k - O_k) f'(Ne_k) y_j = η Δ_k y_j;
input (hidden) unit: w_ji(t+1) - w_ji(t) = η f'(ne_j) [ Σ_k Δ_k W_kj ] I_i = η δ_j I_i.

Network Training: Stopping Criteria
The training set is shown repeatedly until the stopping criteria are met. Possible convergence criteria are: the Euclidean norm of the gradient vector reaches a sufficiently small value (denoted θ); the absolute rate of change in the average squared error per epoch is sufficiently small (denoted θ); validation of generalization performance: stop when generalization reaches its peak (illustrated in this lecture).

Goals of Neural Network Training
To give the correct output for an input training vector (learning). To give good responses to new, unseen input patterns (generalization).

Training and Testing Problems
Stuck neurons: the degree of weight change is proportional to the derivative of the activation function, so weight changes are greatest when units receive mid-range functional signals rather than extreme ones; to avoid stuck neurons, weight initialization should give all neuron outputs approximately 0.5.
Insufficient number of training patterns: the training patterns will be learnt instead of the underlying relationship between inputs and output, i.e. the network just memorizes the patterns.
Too few hidden neurons: the network will not produce a good model of the problem.
Over-fitting: the training patterns will be learnt instead of the underlying function between inputs and output because there are too many hidden neurons; the network will then have a poor generalization capability.

Dynamics of BP Learning
The aim is to minimize an error function over all training patterns by adapting the weights in the MLP. Recall that the typical error function is the mean squared error
E(t) = ½ Σ_{k=1}^{p} ( d_k(t) - O_k(t) )².
The idea is to reduce E to its global minimum point.

Dynamics of BP Learning
In a single-layer perceptron with linear activation functions, the error function is simple, described by a smooth parabolic surface with a single minimum.

Dynamics of BP Learning
MLPs with non-linear activation functions have complex error surfaces (with plateaus, long valleys, etc.) and no single minimum. For complex error surfaces the problem is that the learning rate must be kept small to prevent divergence. Adding a momentum term is a simple approach to dealing with this problem.

Momentum
Reduces problems of instability while increasing the rate of convergence. Adding a term to the weight update equation effectively holds an exponentially weighted history of previous weight changes. The modified weight update equation is
w_ij(n+1) - w_ij(n) = η δ_j(n) y_i(n) + α [ w_ij(n) - w_ij(n-1) ].

Effect of the Momentum Term
If weight changes tend to have the same sign, the momentum term increases and gradient descent speeds up: faster convergence on shallow gradients. If weight changes tend to have opposing signs, the momentum term decreases and gradient descent slows down, reducing oscillations (it stabilizes). It can also help escape being trapped in local minima. A helper implementing this update is sketched below.

Selecting Initial Weight Values
The choice of initial weight values is important, as it decides the starting position in weight space, that is, how far away from the global minimum we begin. The aim is to select weight values which produce mid-range function signals. Select weight values randomly from a uniform probability distribution, and normalise them so that the number of weighted connections per unit still produces a mid-range function signal.
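The momentum update is a two-line change to the BP sketch above; here it is isolated as a helper (the names and the η, α values are illustrative assumptions):

```python
import numpy as np

def momentum_step(W, grad_term, dW_prev, eta=0.25, alpha=0.9):
    """One modified weight update: dW(n) = eta*grad_term + alpha*dW(n-1),
    where grad_term stands for the delta_j(n) * y_i(n) outer product in BP.
    Returning dW keeps the exponentially weighted history for the next step."""
    dW = eta * grad_term + alpha * dW_prev
    return W + dW, dW

# Usage inside the BP loop of the earlier sketch:
#   W, dW_prev = momentum_step(W, np.outer(y1, Delta), dW_prev)
```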

Convergence of Backprop
Avoiding local minima while keeping convergence fast: add momentum; use stochastic gradient descent; train multiple nets with different initial weights. Nature of convergence: initialize the weights near zero, so the initial network is near-linear; increasingly non-linear functions become possible as training progresses.

Use of the Available Data Set for Training
The available data set is normally split into three sets, as follows.
Training set: used to update the weights. Patterns in this set are presented repeatedly, in random order. The weight update equation is applied after a certain number of patterns.
Validation set: used to decide when to stop training, only by monitoring the error.
Test set: used to test the performance of the neural network. It should not be used as part of the neural network development cycle.

Early Stopping - Good Generalization
Running too many epochs may overtrain the network, resulting in overfitting and poor generalization. Keep a hold-out validation set and test accuracy on it after every epoch. Maintain the weights of the best-performing network on the validation set, and stop training when the validation error increases beyond this point (a skeleton of this loop is sketched below). [Figure: error vs. number of epochs; the validation-set curve rises after its minimum while the training-set curve keeps decreasing.]

Model Selection by Cross-validation
Too few hidden units prevent the network from learning adequately (fitting the data and learning the concept; for networks with more than two layers). Too many hidden units lead to overfitting. Similar cross-validation methods can be used to determine an appropriate number of hidden units, using the optimal validation error to select the model with the optimal number of hidden layers and nodes. [Figure: validation-set and training-set error vs. number of epochs.]
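A skeleton of the early-stopping loop just described; `train_one_epoch` and `validation_error` are hypothetical callables the caller must supply:

```python
import numpy as np

def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs=1000, patience=20):
    """Run training epochs, remember the weights of the best-performing
    network on the validation set, and stop once the validation error has
    risen for `patience` consecutive epochs. train_one_epoch() must return
    a snapshot (copy) of the current weights."""
    best_w, best_err, wait = None, np.inf, 0
    for _ in range(max_epochs):
        w = train_one_epoch()
        err = validation_error(w)
        if err < best_err:
            best_w, best_err, wait = w, err, 0   # new minimum of the validation curve
        else:
            wait += 1
            if wait >= patience:                 # past the minimum: stop training
                break
    return best_w, best_err
```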

Radial Basis Functions

Radial Basis Functions: Overview
Radial-basis function (RBF) networks. RBF = radial-basis function: a function which depends only on the radial distance from a point. The XOR problem is quadratically separable.

Radial-Basis Function (RBF) Networks
RBFs are functions taking the form φ(||x - x_i||), where φ is a non-linear activation function, x is the input, and x_i is the i-th position, prototype, basis or centre vector. The idea is that points near a centre have similar outputs, i.e. if x ≈ x_i then f(x) ≈ f(x_i), since they should have similar properties. The simplest is the linear RBF: φ(x) = ||x - x_i||.

Typical RBFs include:
(a) multiquadrics: φ(r) = (r² + c²)^(1/2) for some c > 0;
(b) inverse multiquadrics: φ(r) = (r² + c²)^(-1/2) for some c > 0;
(c) Gaussian: φ(r) = exp(-r²/(2σ²)) for some σ > 0.

Multiquadrics are non-localized functions; inverse multiquadrics and Gaussians are localized functions.

The idea is to use a weighted sum of the outputs from the basis functions to represent the data; the centres can thus be thought of as prototypes of the input data. [Figure: MLP vs. RBF; an MLP forms a distributed representation, an RBF network a local one.]

Starting Point: Exact Interpolation
Each input pattern x must be mapped onto a target value d. That is, given a set of N vectors x_i and a corresponding set of N real numbers d_i (the targets), find a function F that satisfies the interpolation condition F(x_i) = d_i for i = 1, ..., N; more exactly, find
F(x) = Σ_{j=1}^{N} w_j φ(||x - x_j||)
satisfying
F(x_i) = Σ_{j=1}^{N} w_j φ(||x_i - x_j||) = d_i.

Single-layer Networks
[Diagram: input y = (y_1, ..., y_p) -> basis functions φ_1(y) = φ(||y - x_1||), ..., φ_N(y) = φ(||y - x_N||) -> weights w_j -> Σ -> output.] The output is Σ_j w_j φ(||y - x_j||); the adjustable parameters are the weights w_j; the number of basis units equals the number of data points; the form of the basis functions is decided in advance.

To summarize: for a given data set containing N points (x_i, d_i), i = 1, ..., N: choose an RBF φ; calculate φ(||x_j - x_i||); solve the linear equation Φ W = D; get the unique solution. Done. Like MLPs, RBF networks can be shown to be able to approximate any function to arbitrary accuracy using an arbitrarily large number of basis functions. Unlike MLPs, however, they have the property of best approximation, i.e. there exists an RBF network with minimum approximation error.
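The choose-φ, build-Φ, solve-ΦW=D recipe in a few lines of NumPy, applied to noisy samples of the sine function used in Bishop's example discussed below (the width and noise level are illustrative):

```python
import numpy as np

def rbf_fit(X, d, sigma=0.2):
    """Exact interpolation: one Gaussian basis per data point (centres = data),
    then solve the linear system Phi W = D for the unique weights."""
    r2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # ||x_i - x_j||^2
    Phi = np.exp(-r2 / (2.0 * sigma ** 2))               # interpolation matrix
    return np.linalg.solve(Phi, d)

def rbf_predict(Xq, X, w, sigma=0.2):
    """F(x) = sum_j w_j * phi(||x - x_j||), evaluated at the query points Xq."""
    r2 = ((Xq[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-r2 / (2.0 * sigma ** 2)) @ w

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (30, 1))                           # 30 random sample points
d = 0.5 + 0.4 * np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.05, 30)
w = rbf_fit(X, d)
print(np.abs(rbf_predict(X, X, w) - d).max())            # ~0: interpolant hits every point
```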

[Figures: Gaussian interpolant with a large width, σ = 1, and with a small width, σ = 0.2.]

Problems with Exact Interpolation
Exact interpolation can produce poor generalization performance, since only the data points constrain the mapping. Bishop (1995) example (the overfitting problem): the underlying function f(x) = 0.5 + 0.4 sin(2πx) was sampled randomly at 30 points, with Gaussian noise added to each data point. With 30 data points and 30 hidden RBF units, the network fits all the data points but creates oscillations, due to the added noise and the lack of constraints between the data points. [Figures: fit with all 30 data points as centres vs. fit with only 5 basis functions.]

Fitting an RBF to every data point is very inefficient, due to the computational cost of the matrix inversion, and is very bad for generalization. So: use fewer RBFs than data points (M < N); the RBFs then need not be centred at the data points. We can also include bias terms, and use Gaussians with general covariance matrices, but there is a trade-off between complexity and the number of parameters to be found (a full d × d covariance adds d(d+1)/2 parameters per basis function, versus a single width for a spherical Gaussian).

Fuzzy Modelling and Identification: Fuzzy Clustering with Application to Data-Driven Modelling

Introduction
The ability to cluster data (concepts, perceptions, etc.) is an essential feature of human intelligence. A cluster is a set of objects that are more similar to each other than to objects from other clusters. Clustering techniques are applied in pattern recognition and image processing. Some machine-learning techniques are based on the notion of similarity (decision trees, case-based reasoning). Non-linear regression and black-box modelling can be based on partitioning the data into clusters.

Section Outline
Basic concepts in clustering: data set, partition matrix, distance measures. Clustering algorithms: fuzzy c-means, Gustafson-Kessel. Application examples: system identification and modelling, diagnosis.

Examples of Clusters
[Figure]

Problem Formulation
Given a set of data in R^n and the estimated number of clusters to look for (a difficult problem in itself; more on this later), find the partitioning of the data into subsets (clusters) such that samples within a subset are more similar to each other than to samples from other subsets. Similarity is mathematically formulated by means of a distance measure, i.e. a dissimilarity function. Usually each cluster has a prototype, and the distance is measured from this prototype.

Distance Measures
Euclidean norm: d²(z_j, v_i) = (z_j - v_i)ᵀ (z_j - v_i).
Inner-product norm: d²_A(z_j, v_i) = (z_j - v_i)ᵀ A (z_j - v_i).
Many other possibilities...

Generalized Prototype Varieties
[Figure]

Corresponding Distance Measures
[Figure]

Mathematical Formulation of Clustering
Given the data z_k = [z_1k, z_2k, ..., z_nk]ᵀ ∈ R^n, k = 1, ..., N, find the partition matrix
U = [μ_ik], i = 1, ..., c, k = 1, ..., N (a c × N matrix whose entry μ_ik is the membership of sample z_k in cluster i),
and the cluster prototypes (centres) V = {v_1, v_2, ..., v_c}, v_i ∈ R^n.

Fuzzy Clustering: an Optimisation Approach
Objective function (least-squares criterion): minimize
J(Z; U, V) = Σ_{i=1}^{c} Σ_{k=1}^{N} (μ_ik)^m d²(z_k, v_i),
subject to the constraints μ_ik ∈ [0, 1] and Σ_{i=1}^{c} μ_ik = 1 for each k.

Fuzzy c-Means Algorithm
Repeat:
1. Compute the cluster prototypes (means).
2. Calculate the distances.
3. Update the partition matrix.
(for i = 1, ..., c and k = 1, ..., N) until the partition matrix stops changing, to within a chosen tolerance. A code sketch follows below.

Failure to Discover Non-Spherical Clusters
[Figure: with the Euclidean norm, fuzzy c-means finds only spherical clusters.]
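A minimal NumPy sketch of the three-step loop above, with the Euclidean norm and the common fuzziness exponent m = 2 (values assumed, not from the slides):

```python
import numpy as np

def fuzzy_c_means(Z, c, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Fuzzy c-means. Z: N x n data; c: number of clusters; m > 1: fuzziness.
    Returns the partition matrix U (c x N) and the prototypes V (c x n)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, Z.shape[0]))
    U /= U.sum(axis=0)                                       # each column sums to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)         # 1. prototypes: weighted means
        d2 = ((Z[None, :, :] - V[:, None, :]) ** 2).sum(-1)  # 2. squared Euclidean distances
        d2 = np.fmax(d2, 1e-12)                              # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)                        # 3. update the partition matrix
        if np.abs(U_new - U).max() < tol:                    # until U stops changing
            return U_new, V
        U = U_new
    return U, V
```

Replacing the fixed Euclidean distance in step 2 with the cluster-wise inner-product norm (and inserting the covariance computation) turns this into the Gustafson-Kessel algorithm shown next.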

Adaptive Distance Measure
Inner-product norm: d²_{A_i}(z_k, v_i) = (z_k - v_i)ᵀ A_i (z_k - v_i), where the norm-inducing matrix A_i is derived from the cluster covariance matrix.

Gustafson-Kessel Algorithm
Repeat:
1. Compute the cluster prototypes (means).
2. Compute the covariance matrices.
3. Compute the distances.
4. Compute the partition matrix.
until convergence.

Clusters of Different Shape and Orientation
[Figure]

Number of Clusters
Validity measures: fuzzy hypervolume; average within-cluster distance; Xie-Beni index; ...

Validity Measures: Example
[Figure: data over 4 clusters.]

Validity Measures
[Figure]

Number of Clusters
[Figure]

Data-Driven (Black-Box) Modelling
Linear model: for linear systems only, limited in use. Neural network: black box, unreliable extrapolation. Rule-based model: more transparent, grey-box.

Extraction of Rules by Fuzzy Clustering
[Figure]

Extraction of Rules by Fuzzy Clustering
[Figure]

Example: a Non-linear Autoregressive System (NARX)

Structure Selection and Data Preparation
1. Choose the model order p.
2. Form the pattern matrix Z to be clustered (a sketch is given below).

Clustering Results
[Figure]
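A sketch of step 2 for a SISO NARX model, assuming the regressor holds the p most recent outputs and inputs (the toy system and all values are invented for illustration):

```python
import numpy as np

def build_pattern_matrix(u, y, p):
    """Each row of Z is [y(k), ..., y(k-p+1), u(k), ..., u(k-p+1), y(k+1)]:
    the regressors together with the one-step-ahead output, ready for clustering."""
    return np.array([np.r_[y[k-p+1:k+1][::-1], u[k-p+1:k+1][::-1], y[k+1]]
                     for k in range(p - 1, len(y) - 1)])

# Toy input-output record from a non-linear ARX system, model order p = 2.
u = np.sin(0.1 * np.arange(100))
y = np.zeros(100)
for k in range(1, 99):
    y[k+1] = 0.6 * y[k] - 0.1 * y[k-1] + 0.5 * np.tanh(u[k])
Z = build_pattern_matrix(u, y, p=2)    # N x (2p+1) pattern matrix
```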

Rules Obtained
[Figure]

Identification of Pressure Dynamics
[Figure]

Concluding Remarks
The optimisation approach to clustering is effective for metric (e.g. real-valued) data and gives accurate results for problems of small to medium complexity; for large problems it converges to local optima and is slow. Many other techniques exist: agglomerative methods, hierarchical splitting methods, graph-theoretic methods. There is a wide variety of applications.

Application Examples: Neural Networks for Non-linear Identification

Nonlinear System Identification
Target function: y_p(k+1) = f(·). Identified function: y_NET(k+1) = F(·). Estimation error: e(k+1).

Nonlinear System Identification
Neural network input generation: Pm.
Neural network target: Tm. Neural network response: angle & velocity.

Matlab NNtool GUI (Graphical User Interface)