Design of Output Codes for Fast Covering Learning using Basic Decomposition Techniques




Journal of Computer Science 2 (7): 565-571, 2006
ISSN 1549-3636
© 2006 Science Publications

Aruna Tiwari (1) and Narendra S. Chaudhari (2)
(1) Computer Engineering Department, Shri G. S. Institute of Technology & Science (SGSITS), 23 Park Road, Indore 452003 (M.P.), India
(2) School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Avenue, Singapore 639798, Singapore

Corresponding author: Aruna Tiwari, Computer Engineering Department, Shri G. S. Institute of Technology and Science (SGSITS), 23 Park Road, Indore 452003 (M.P.), India

Abstract: We propose the design of output codes for solving the classification problem in the Fast Covering Learning Algorithm (FCLA). For a complex multi-class problem, classifiers are normally constructed by combining the outputs of several binary ones. In this paper, we use two basic methods of decomposition, one per class (OPC) and Error Correcting Output Code (ECOC), with FCLA, a binary-to-binary mapping algorithm, as the base binary learner. The methods have been tested on Fisher's well-known Iris data set, and experimental results show that classification ability is improved by using the ECOC method.

Key words: Binary neural network, one per class, error correcting output code

INTRODUCTION

In the last two decades, binary neural networks (BNNs) have attracted the attention of many researchers, and there are now many established approaches for the construction of BNNs. They include the Boolean-Like Training Algorithm (BLTA) [3] and Improved Expand and Truncate Learning (IETL) [8]. In these methods, predefined output codes are used for the representation of multiple classes. Using predefined output codes makes the problem independent of the specific application and of the class of hypotheses used to construct the binary classifiers [9]. Experimental work has shown that output coding can greatly improve performance measures such as generalization and prediction accuracy [1]. Several output coding methods have been suggested and tested so far, such as comparing each class against the rest (One Per Class: OPC), comparing all pairs of classes (Pair Wise Coupling: PWC), random codes, exhaustive codes, Error Correcting Output Codes and margin classifiers [1,5,6,7].

In this paper, we extend the Fast Covering Learning Algorithm (FCLA) [2] to the multi-class problem (i.e., K classes, where K > 2). Further, this paper addresses the design of output codes for binary-to-binary mapping learning. In our work, we use two output coding schemes: One-Per-Class (OPC) and Error Correcting Output Code (ECOC). Output coding for multi-class problems is composed of two stages. In the training stage, we construct the hidden layer from K independent binary classifiers, where K is the number of classes to be learned; the output layer is then constructed by training a number of neurons determined by the coding scheme used. In the second stage, the classification part, the class of an applied sample is predicted by combining the various binary classifiers. OPC separates one class from all other classes, while ECOC consists of several dichotomizers with class redundancy, so as to obtain robustness in case some dichotomizers fail [5,6,7]; the ECOC approach improves generalization performance [1,5,7]. These coding schemes are used for output coding in the training phase of the neural network. In the reconstruction stage, when a new sample arrives, some similarity measure is required to find the class to which it belongs; since the generated string is in binary form, the Hamming distance criterion is used for deciding the class to which the new sample belongs [5,7].
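The nearest-codeword rule of the reconstruction stage can be written in a few lines. The following is a minimal Python sketch; the function names and the example codewords are ours, for illustration only:

    def hamming(a, b):
        # number of bit positions in which two binary strings differ
        return sum(x != y for x, y in zip(a, b))

    def nearest_class(output_bits, codewords):
        # predicted class = the one whose codeword is closest (in
        # Hamming distance) to the string the output layer generated
        return min(codewords, key=lambda c: hamming(output_bits, codewords[c]))

    codewords = {"A": [0, 0, 0, 0], "B": [1, 1, 1, 0], "C": [0, 1, 1, 1]}
    print(nearest_class([1, 1, 0, 0], codewords))   # -> "B" (one bit away)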
In the case of OPC, for the training of the output layer, a class is separated from the rest of the classes. Therefore, at the output layer, a single neuron per dichotomizer is taken to collect the outputs from the hidden layer neurons of its respective class. The weights and thresholds in the output layer are set to one for each dichotomizer/neuron.

In ECOC [1], each class is assigned a unique binary string; we refer to these strings as codewords. We then train K classifiers at the hidden layer and l neurons at the output layer, where l is the length of the codeword. The predicted class is the one whose codeword is closest to the generated output; the similarity measure is the Hamming distance, i.e., the number of bits differing from the codeword bits.

We show that the use of the ECOC method with FCLA improves generalization capability over OPC. This comparison has been tested by experimenting on the Iris data set.
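The two stages described above can be organized as below. This is an organizational sketch only: the base binary learner (FCLA in this paper) is abstracted behind a generic object, and the fit/predict interface is our illustrative naming:

    class OutputCodingClassifier:
        # Output coding sketch: one base binary learner per codeword
        # column; decoding is by minimum Hamming distance.
        def __init__(self, make_learner, code_matrix):
            self.code = code_matrix        # code_matrix[k] = codeword of class k
            self.learners = [make_learner() for _ in range(len(code_matrix[0]))]

        def fit(self, X, y):
            # Column j of the code matrix defines a binary relabeling of
            # the K-class training set; each learner solves one of them.
            for j, learner in enumerate(self.learners):
                learner.fit(X, [self.code[label][j] for label in y])
            return self

        def predict(self, x):
            s = [learner.predict(x) for learner in self.learners]  # generated string
            dists = [sum(a != b for a, b in zip(s, cw)) for cw in self.code]
            return dists.index(min(dists))  # index of the nearest codeword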

Also, by utilizing a binary-to-binary mapping algorithm, the convergence problem is resolved as compared to the backpropagation algorithm, and training time is thereby reduced. The use of integer weights and thresholds also reduces prediction time, as the computations are reduced.

In section 2 we discuss the basic concepts for extending the FCLA framework. In sections 3 and 4, we present the formulae used in training and the training algorithm of FCLA. In section 5, the extension of the FCLA framework is presented. Section 6 gives one illustrative example, in section 7 a performance comparison is given, and in section 8 we give concluding remarks.

BASIC CONCEPTS

Let S = {x1, x2, ..., xm} be the training examples. The proposed learning algorithm learns the classification function f(x) that takes these training examples and classifies each into one of K classes: f(x) in {c1, c2, ..., cK}. To learn this classification function, the algorithm analyzes a set of training examples {(x1, f(x1)), (x2, f(x2)), ..., (xm, f(xm))}. Each training example is a pair consisting of the description of an object xi and its correct classification f(xi). The FCLA algorithm is designed for solving any binary (2-class) classification problem in the three-layer network structure shown in Fig. 1.

Fig. 1: FCLA three-layer network structure used for the multi-class problem

For deciding the output codes for each class, let s1, s2, ..., sK be distinct binary strings of length L; the length of the strings depends on the type of decomposition method used, OPC or ECOC. We call the string sk the codeword for class ck. Now define L hypotheses, f1, f2, ..., fL. For OPC, K hypotheses f1, f2, ..., fK are learned, one function fk for each class, such that fk(x) = 1 if f(x) = ck and zero otherwise. During learning, the set of hypotheses {f1, f2, ..., fK} is learned; to classify a new example x', we compute the value of fk(x') for each k, and the predicted value of f(x') is the class ck for which fk(x') generates 1. For ECOC, L hypotheses f1, f2, ..., fL are learned. The codewords follow the exhaustive construction of [1]: for the first class, fj = 1 for all j = 1 to L; otherwise, for class i > 1, the codeword consists of alternating runs of 2^(K-i) zeros and 2^(K-i) ones.

For each of the K classes, the FCLA [2] algorithm can be applied separately for the training of the hidden layer; thus, for each of the K classes, FCLA can be run in parallel to find the hidden layer neurons with respect to each class. For combining the outputs of the hidden layer neurons, the FCLA approach is extended to the training of the output layer using either of the two coding schemes, OPC or ECOC, and the three-layered network structure depicted in Fig. 1 is formed. During learning, the hidden layer neurons are trained using the two-class learning algorithm to learn each function g over the examples x1, x2, ..., xm; the output layer neurons are trained according to the coding scheme used for classification, OPC or ECOC, as presented in the following sections. The output layer realizes the L hypotheses {f1, f2, ..., fL}. To classify a new example x', we apply each learned function to compute the binary string s' = <f1(x'), f2(x'), ..., fL(x')>; we then determine which codeword sk is nearest to this s'. The predicted value of f(x') is the class ck corresponding to the nearest codeword (minimum Hamming distance) sk.
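The exhaustive construction just described can be generated mechanically. A short Python sketch of the construction of Dietterich and Bakiri [1], which also produces the 15-bit code used later in Table 4, is as follows:

    def exhaustive_code(k):
        # Exhaustive ECOC codewords for k classes: row 1 is all ones;
        # row i alternates runs of 2**(k - i) zeros and ones, truncated
        # to the codeword length 2**(k - 1) - 1.
        length = 2 ** (k - 1) - 1
        rows = [[1] * length]
        for i in range(2, k + 1):
            run = 2 ** (k - i)
            rows.append([(j // run) % 2 for j in range(length)])
        return rows

    for row in exhaustive_code(5):   # the five 15-bit codewords of Table 4
        print("".join(map(str, row)))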

FORMULAE USED: FAST COVERING LEARNING ALGORITHM

While constructing the BNN, suppose that {x1, x2, ..., xv} are the v (true) vertices included in one hypersphere. The centre is defined as follows [2]:

    c = \frac{1}{v} \sum_{i=1}^{v} x_i    (1)

Three radii are defined as follows:

    r_1 = \max_{1 \le i \le v} \sum_{j=1}^{n} (x_{ij} - c_j)^2    (2)
    r_2 = r_1 + 1    (3)
    r_3 = r_2 + 1    (4)

The formulae for the weights and threshold values of a neuron are:

    w_j = \sum_{i=1}^{v} (2 x_{ij} - 1), \quad j = 1, \ldots, n    (5)
    t_1 = \min_{1 \le i \le v} \sum_{j=1}^{n} w_j x_{ij}    (6)
    t_2 = t_1 - v    (7)
    t_3 = t_2 - v    (8)

TRAINING FOR THE CONSTRUCTION OF NETWORK

For our extension, there are two broad steps involved in the construction of the network.

A. Training of hidden layer: The training of the hidden layer is done in parallel for each of the K classes using FCLA [2], as follows.

Algorithm 1
1. For a given class Ck, take the set of true vertices (x1, x2, ..., xm); each vertex is n bits long, represented as xi.
2. For each input datum, for i = 1 to m, do:
   begin
   if (i = 1) then
     add a new neuron with respect to this input (xi), and evaluate the following parameters:
     - centre c (using equation (1));
     - radii r1, r2, r3 (using equations (2), (3), (4));
     - weights (w1, w2, ..., wn), represented as the weight vector W (using equation (5));
     - thresholds (t1, t2, t3) (using equations (6), (7), (8)).
   else
     begin
     check this input datum (xi) with respect to the existing neurons; for each pth neuron perform the following checks:
     <Cond1> if (W xi >= t1) then this input is already covered by the pth neuron (match region), so simply exit and take the next input.
     <Cond2> if (t2 <= W xi <= t1) then the input datum is within the claim region; update the parameters of the pth neuron (centre c, radii, weights, thresholds) using the formulae of section 3, then exit and take the next input.
     <Cond3> if (t3 > W xi) is true for all the neurons, then a new neuron is added, evaluating all the parameters (centre, radii, weights and thresholds) of section 3.
     <Cond4> if (t3 <= W xi < t2) then the vertex is within the boundary region of the neuron, so we first examine whether other available neurons can claim it. If it cannot be included in any other available neuron, we put it aside for reconsideration after the other vertices are processed; the inclusion of other vertices in existing neurons results in the expansion of the match and claim regions of the neurons, so vertices put aside may later be claimed. <Cond1> and <Cond2> are then retested.
     end
   end (for)
3. Modification process: Apply all vertices belonging to the other classes (the false vertices) to the hidden layer neurons trained for a class. If the output is zero, omit the vertex; if the output is one, represent the wrongly represented vertices by additional hidden neurons by applying step 2.
4. Repeat steps 1 to 3 for each of the K classes.
5. Stop.
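As an illustration of the parameter evaluation in step 2, the following minimal Python sketch computes the weights and thresholds of one hidden neuron from the true vertices it covers; it follows equations (5) to (8) as given above, with the integer-weight form being our reading of those formulas:

    def neuron_parameters(vertices):
        # `vertices` is a list of equal-length 0/1 tuples covered by one neuron
        v = len(vertices)
        n = len(vertices[0])
        # equation (5): integer weights accumulated from the covered vertices
        w = [sum(2 * x[j] - 1 for x in vertices) for j in range(n)]
        # equation (6): t1 is the minimum activation over the covered vertices
        t1 = min(sum(wj * xj for wj, xj in zip(w, x)) for x in vertices)
        t2 = t1 - v    # equation (7): claim-region threshold
        t3 = t2 - v    # equation (8): boundary-region threshold
        return w, t1, t2, t3

    w, t1, t2, t3 = neuron_parameters([(1, 0, 1), (1, 1, 1)])
    # match region: W.x >= t1; claim region: t2 <= W.x <= t1;
    # boundary region: t3 <= W.x < t2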

B. Training the output layer: According to FCLA [2], at the output layer a single neuron is needed to collect the outputs of all the hidden neurons with respect to a two-class problem, as depicted in Fig. 1. Let w_i^o represent the weight from the ith hidden neuron to the output neuron. The total number of neurons for a given class is nc, out of which the first q are the hidden neurons which learned the true vertices (with generalization) and the remaining ones (q+1, ..., nc) are the neurons which learned the false vertices. The weights and threshold of the output neuron are assigned as follows:

    w_i^o = 1 for i = 1, ..., q;  w_i^o = -q for i = q+1, ..., nc;  t_o = 1    (9)

EXTENSION OF FCLA FRAMEWORK

We now use coding schemes to extend the FCLA framework for solving classification problems (Fig. 2).

A. Construction of hidden layer: For a given K-class problem {G1, G2, ..., GK}, we separately apply the FCLA [2] algorithm for each and every class; hidden neurons are thus evaluated for each of the K classes. After this, for collecting the outputs of the hidden neurons, we propose the approach below.

B. Training of output layer: The outputs generated by the hidden layer are combined at the output layer. We use two coding schemes for the construction of the output layer: (1) the OPC scheme and (2) the ECOC scheme. The number of neurons required at the output layer depends on the coding scheme used: as stated earlier, in OPC the number of neurons equals the number of classes, i.e., K, while in ECOC the number of neurons is 2^(K-1) - 1. The thresholds of the output neurons are set to 1 in both schemes. The weight setting is done as follows.

1. OPC: The weight value for the ith class, from the hidden layer neurons of that class to the qth neuron of the output layer, is decided as follows: w_iq = 1 if i = q; w_iq = 0 otherwise.

2. ECOC: Weight setting is done using the following algorithm.

Algorithm 2
1. For each of the kth classes,
2. for each of the ith hidden layer neurons with respect to this class,
3. make the assignment current_op_neuron = 1.
4. For each of the qth output layer neurons:
5. from the current_op_neuron up to the (current_op_neuron + 2^(K-k) - 1)th output neuron, assign the weight value w_iq = 0;
6. for the subsequent output neurons, up to the next (current_op_neuron + 2^(K-k) - 1)th, assign the weight value w_iq = 1.
7. Repeat steps 5 to 6 for each of the output neurons.
8. Repeat steps 3 to 7 for each of the hidden neurons.
9. Repeat steps 2 to 8 for each of the classes.

Fig. 2: Partial network showing the use of coding schemes for training the output layer
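Read together with the codeword construction of section 2, Algorithm 2 amounts to copying each class's codeword bits onto the outgoing weights of every hidden neuron trained for that class. A sketch under that reading follows; the 3-class code and the neuron-to-class assignment are illustrative:

    def output_layer_weights(code, hidden_class):
        # The weight from hidden neuron i to output neuron q is the qth
        # codeword bit of the class that neuron i was trained for; all
        # output-layer thresholds are set to 1.
        L = len(code[0])
        return [[code[k][q] for q in range(L)] for k in hidden_class]

    code = [[1, 1, 1], [0, 0, 1], [0, 1, 0]]       # codewords for K = 3 classes
    hidden_class = [0, 0, 1, 2, 2]                 # class of each hidden neuron
    W = output_layer_weights(code, hidden_class)   # 5 x 3 weight matrix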

ILLUSTRATIVE EXAMPLE

We illustrate the proposed approach with the following example. The regions marked A, B, C, D and E in Fig. 3 are approximated on a 6x6 grid; Table 1 gives the approximation of these regions through 6-bit binary values.

Fig. 3: Approximation of regions

Table 1: Data sets with respect to the approximated regions (the sets of 6-bit input vertices belonging to each of the regions/classes A, B, C, D and E)

Applying Algorithm 1 of section 4, the result of the construction of the hidden layer is as follows:

Table 2: Hidden layer solution (for each region/class A to E, the covered inputs and the resulting neuron weights w1-w6 and thresholds t1, t2, t3)

The output layer weights for the two methods are as follows:

Table 3: Output layer weights and thresholds using OPC (One Per Class)

    Hidden layer neurons / output layer neurons    f1   f2   f3   f4   f5   Threshold
    Neurons of region A                            1    0    0    0    0    1
    Neurons of region B                            0    1    0    0    0    1
    Neurons of region C                            0    0    1    0    0    1
    Neurons of region D                            0    0    0    1    0    1
    Neurons of region E                            0    0    0    0    1    1

Table 4: Output layer weights using ECOC (Error Correcting Output Code)

    Hidden layer neurons / output layer neurons    f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
    Neurons of region A                            1  1  1  1  1  1  1  1  1  1   1   1   1   1   1
    Neurons of region B                            0  0  0  0  0  0  0  0  1  1   1   1   1   1   1
    Neurons of region C                            0  0  0  0  1  1  1  1  0  0   0   0   1   1   1
    Neurons of region D                            0  0  1  1  0  0  1  1  0  0   1   1   0   0   1
    Neurons of region E                            0  1  0  1  0  1  0  1  0  1   0   1   0   1   0

Next, Tables 3 and 4 are depicted through the figures. As discussed in section 2 (Fig. 1), a three-layered network structure is formed: input layer, hidden layer and output layer. The input layer does not contain any processing elements; its nodes just provide the inputs to the hidden layer. The hidden and output layers contain the neurons. The network structure formed with respect to Table 3 is depicted in Fig. 4, and the network structure for Table 4 is shown in Fig. 5.

Fig. 4: Example solution using the OPC scheme

Fig. 5: Example solution using the ECOC scheme

PERFORMANCE COMPARISON

We make use of Fisher's Iris data set for comparing the performance of the two coding schemes, OPC and ECOC, for the design of classifiers in FCLA. Fisher's Iris data set contains 150 patterns representing three classes [10], with 50 patterns per class. There are four properties, and classification is done on the basis of the combination of these properties. For applying the inputs to the network, each of the four properties of the original pattern is represented by its 7-bit binary equivalent; thus each input contains a total of 28 bits.
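A minimal sketch of such an encoding is given below. The exact quantization used for the Iris features is not spelled out here, so the sketch assumes each feature is scaled into the integer range 0 to 127 and written as its 7-bit binary equivalent; the feature ranges passed in are illustrative:

    def encode_pattern(features, lo, hi, bits=7):
        # Scale each feature into 0 .. 2**bits - 1 and concatenate the
        # bits-bit binary representations (4 features x 7 bits = 28 bits).
        out = []
        for f, a, b in zip(features, lo, hi):
            level = round((f - a) / (b - a) * (2 ** bits - 1))
            out.extend(int(c) for c in format(level, f"0{bits}b"))
        return out

    x = encode_pattern([5.1, 3.5, 1.4, 0.2],
                       lo=[4.3, 2.0, 1.0, 0.1],
                       hi=[7.9, 4.4, 6.9, 2.5])
    assert len(x) == 28   # 28-bit network input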

The hidden layer neurons have been found using the FCLA approach. A total of 22 neurons is required in the hidden layer: 7 neurons for Setosa, 9 neurons for Versicolor and 6 neurons for Virginica. The number of output neurons is 3 for both of the coding schemes, OPC and ECOC. The weights and thresholds of the output layer neurons are given in Tables 5 and 6.

Table 5: Output layer neurons when using the OPC scheme

    Classes / neurons    f1   f2   f3   Threshold
    (1) Setosa           1    0    0    1
    (2) Versicolor       0    1    0    1
    (3) Virginica        0    0    1    1

Table 6: Output layer neurons when using the ECOC scheme

    Classes / neurons    f1   f2   f3   Threshold
    (1) Setosa           1    1    1    1
    (2) Versicolor       0    0    1    1
    (3) Virginica        0    1    0    1

For testing over these patterns, the 50 patterns of each class were split into training and test data. The testing results show that ECOC performs better in terms of classification accuracy: for Setosa and Versicolor, ECOC gives 100% accuracy (i.e., it classifies all the samples properly), and for Virginica about 80% accuracy is achieved with ECOC. Using OPC on the same cases, the results are not satisfactory.

CONCLUSION

In this paper, we extend the FCLA [2] method to multi-class problems by designing classifiers using coding schemes. The trained hidden layer is in modular form; modules in the hidden layer corresponding to each class can thus be trained independently [4] in parallel, which reduces training time. For output layer training, the paper has examined the use of the Error Correcting Output Code and One Per Class coding schemes for a binary-to-binary mapping learning algorithm. The performance of the method has been compared on Fisher's well-known Iris data set. The results show that ECOC gives better classification accuracy than OPC.

REFERENCES

1. Thomas G. Dietterich and Ghulum Bakiri, 1995. Solving Multiclass Learning Problems via Error-Correcting Output Codes. Journal of Artificial Intelligence Research, Vol. 2: 263-286.
2. Di Wang and Narendra S. Chaudhari, 2004. An Approach for Construction of Boolean Neural Networks Based on Geometrical Expansion. Neurocomputing, Vol. 57: 455-461.
3. Donald L. Gray and Anthony N. Michel, 1992. A Training Algorithm for Binary Feedforward Neural Networks. IEEE Transactions on Neural Networks, Vol. 3, No. 2: 176-194.
4. Rangachari Anand, Kishan Mehrotra, Chilukuri K. Mohan and Sanjay Ranka, 1995. Efficient Classification for Multiclass Problems Using Modular Neural Networks. IEEE Transactions on Neural Networks, Vol. 6: 117-124.
5. Francesco Masulli and Giorgio Valentini, 2000. Comparing Decomposition Methods for Classification. Proc. of the International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies, Vol. 2: 788-791.
6. Erin L. Allwein, Robert E. Schapire and Yoram Singer, 2000. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers. Proc. of the International Conference on Machine Learning: 9-16.
7. Francesco Masulli and Giorgio Valentini, 2000. Effectiveness of Error-Correcting Output Codes in Multiclass Learning Problems. In Proc. of MCS 2000, First International Workshop on Multiple Classifier Systems, Cagliari, Italy.
8. Atsushi Yamamoto and Toshimichi Saito, 1997. An Improved Expand-and-Truncate Learning. Proc. of the IEEE International Conference on Neural Networks (ICNN), Vol. 2: 1111-1116.
9. Koby Crammer and Yoram Singer, 2000. On the Learnability and Design of Output Codes for Multiclass Problems. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory: 35-46.
10. Kishan Mehrotra, Chilukuri K. Mohan and Sanjay Ranka, 1997. Elements of Artificial Neural Networks. Cambridge, MA: MIT Press.