CONNECTIONIST THEORIES OF LEARNING


Themis N. Karaminis, Michael S.C. Thomas
Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK
tkaram01@students.bbk.ac.uk, m.thomas@bbk.ac.uk, http://www.psyc.bbk.ac.uk/research/dnl/

Synonyms
Hebbian Learning; Associative Learning; Correlational Learning; Back-propagation of Error Algorithm; Self-Organizing Maps

Definition
The majority of connectionist theories of learning are based on the Hebbian learning rule (Hebb, 1949). According to this rule, connections between neurons that present correlated activity are strengthened. Connectionist theories of learning are essentially abstract implementations of general features of brain plasticity in architectures of artificial neural networks.

Theoretical Background
Connectionism provides a framework (Rumelhart, Hinton, & McClelland, 1986) for the study of cognition using artificial neural network models. Neural network models are architectures of simple processing units (artificial neurons) interconnected via weighted connections. An artificial neuron functions as a detector, which produces an output activation value determined by the level of its total input activation and an activation function. As a result, when a neural network is exposed to an environment, encoded as activation patterns presented to the input layer of the network, it responds with activation patterns across its output layer. In the connectionist framework, an artificial neural network model depicts cognition when it is able to respond to its environment with meaningful activation patterns. This is achieved by modifying the values of the connection weights so as to regulate the activation patterns in the network appropriately. Connectionism therefore suggests that learning involves the shaping of the connection weights. A learning algorithm is necessary to determine the changes in the weight values by which the network can acquire domain-appropriate input-output mappings.
The idea that learning in artificial neural networks should entail changes in the weight values was based on the observations of the neuropsychologist Donald Hebb on biological neural systems. Hebb (1949) proposed his cell assembly theory, also known as Hebb's rule or Hebb's postulate:

When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased. (1949, p. 62)

Hebb's rule suggested that connections between neurons which present correlated activity should be strengthened. This type of learning is also termed correlational or associative learning. A simple mathematical formulation of the Hebbian learning rule is:

Δw_ij = η α_i α_j (1)

The change of the weight (Δw_ij) from a sending unit j to a receiving unit i should be equal to a constant η multiplied by the product of the output activation values (α_i and α_j) of the two units. The constant η is known as the learning rate.

Important Scientific Research and Open Questions
Different learning algorithms have been proposed to implement learning in artificial neural networks. These algorithms can be considered variants of the Hebbian rule, adjusted to different architectures and different training methods.

A large class of neural network models uses a multilayered feed-forward architecture. This class of models is trained with supervised learning (figure 1). The environment is presented as pairs of input patterns and desired output patterns (or targets), where the target is provided by an external system (the notional supervisor). The network is trained on the task of producing the corresponding target in the output layer when an input pattern is presented.

Fig. 1. Supervised learning in a three-layered feed-forward neural network (input, internal [hidden], and output layers; output activity values are compared with target patterns).

The Backpropagation of Error algorithm (Rumelhart, Hinton, & Williams, 1986) was proposed for training such networks. Backpropagation is an error-driven algorithm: the aim of the weight changes is the minimization of the output error of the network. The Backpropagation algorithm is based on the delta rule:

Δw_ij = η (t_i − α_i) α_j (2)

The delta rule is a modification of the Hebbian learning rule (Eq. 1) for neurons that learn with supervised learning. In the delta rule, the weight change (Δw_ij) is proportional to the product of the difference between the target output (t_i) and the output activation of the receiving neuron (α_i), and the output activation of the sending neuron (α_j).
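The two update rules can be illustrated for a single connection. The following is a minimal sketch, assuming made-up activation values; the function names and numbers are illustrative, not from the entry:

```python
def hebbian_update(w, a_i, a_j, eta=0.1):
    """Eq. (1): strengthen the weight in proportion to the correlated
    activity of the receiving (a_i) and sending (a_j) units."""
    return w + eta * a_i * a_j

def delta_rule_update(w, t_i, a_i, a_j, eta=0.1):
    """Eq. (2): error-driven variant; the change is proportional to the
    discrepancy between the target (t_i) and the actual output (a_i)."""
    return w + eta * (t_i - a_i) * a_j

# Both units active together -> the Hebbian rule strengthens the connection.
w = hebbian_update(0.5, a_i=1.0, a_j=0.8)        # 0.5 + 0.1*1.0*0.8 = 0.58
# Output below target -> the delta rule pushes the weight up.
w2 = delta_rule_update(0.0, t_i=1.0, a_i=0.5, a_j=1.0)   # 0.1*0.5*1.0 = 0.05
```

Note the contrast: when the output matches the target (t_i = α_i) the delta rule leaves the weight unchanged, whereas the pure Hebbian rule keeps strengthening any co-active connection.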
Backpropagation generalizes the delta rule to networks with hidden layers, in which a target activation value is not available for the neurons of the internal layers. Internal layers are necessary to improve the computational power of the learning system. In a forward pass, the Backpropagation algorithm calculates the activations of the units of the network. Next, in a backward pass, the algorithm iteratively computes error signals (delta terms) for the units of the deeper layers of the network. The error signals express the contribution of each unit to the overall error of the network; they are computed based on the derivatives of the error function. Error signals determine changes in the weights which minimize the overall network error. The generalized delta rule is used for this purpose:

Δw_ij = η δ_i α_j (3)

According to this rule, the weight change equals the learning rate times the product of the output activation of the sending unit (α_j) and the delta term of the receiving unit (δ_i).

Although the Backpropagation algorithm has been widely used, it employs features which are biologically implausible. For example, it is implausible that error signals are calculated and transmitted between neurons. However, it has been argued that, since forward projections between neurons are often matched by backward projections permitting bidirectional signaling, the backward projections may allow the implementation of the abstract idea of the backpropagation of error. Pursuing this idea, other learning algorithms have been proposed to implement error-driven learning in a more biologically plausible way.

The Contrastive Hebbian Learning algorithm (Hinton, 1989) is a learning algorithm for bidirectionally connected networks. This algorithm involves two phases of training in each presentation of an input pattern. In the first, known as the minus phase or anti-Hebbian update, the network is allowed to settle as an input pattern is presented while the output units are free to adopt any activation state. These activations serve as noise. In the second phase (plus phase or Hebbian update), the network settles as the input is presented while the output units are clamped to the target outputs.
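For a single connection, the Contrastive Hebbian update derived from these two phases can be sketched as follows. This is a minimal illustration with made-up settled activation values (the plus/minus superscripts of the text become _plus/_minus suffixes), not an implementation of a settling network:

```python
def chl_update(w, a_i_plus, a_j_plus, a_i_minus, a_j_minus, eta=0.1):
    """Contrastive Hebbian update: reinforce the plus-phase (clamped)
    correlation and unlearn the minus-phase (free-running) correlation."""
    return w + eta * (a_i_plus * a_j_plus - a_i_minus * a_j_minus)

# Clamped-phase activity more correlated than free-phase -> the weight grows.
w = chl_update(0.5, a_i_plus=0.9, a_j_plus=0.8, a_i_minus=0.4, a_j_minus=0.3)
# 0.5 + 0.1 * (0.72 - 0.12) = 0.56
```

When the free-running (minus-phase) activity already matches the clamped (plus-phase) activity, the two products cancel and learning stops: this is the sense in which the rule reinforces signal and reduces noise.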
These activations serve as signal. The weight change is proportional to the difference between the products of the activations of the sending and receiving units in the two phases, so that the changes reinforce signal and reduce noise:

Δw_ij = η (α_i⁺ α_j⁺ − α_i⁻ α_j⁻) (4)

Learning is based on contrasting the two phases, hence the term Contrastive Hebbian Learning. O'Reilly and Munakata (2000) proposed the LEABRA (Local, Error-driven and Associative, Biologically Realistic Algorithm) algorithm. This algorithm combines error-driven and Hebbian learning, exploiting bidirectional connectivity to allow the propagation of error signals in a biologically plausible fashion.

The supervised learning algorithms assume a very detailed error signal telling each output unit how it should be responding. Other algorithms have been developed that assume less detailed information; these approaches are referred to as reinforcement learning.

Another class of neural networks is trained with unsupervised learning. In this type of learning, the network is presented with different input patterns, and the aim of the network is to form its own internal representations, which reflect regularities in the input patterns. The Self-Organizing Map (SOM; Kohonen, 1984) is an example of a neural network architecture that is trained with unsupervised learning. As shown in figure 2, a SOM consists of an array of neurons or nodes. Each node has coordinates on the map and is associated with a weight vector of the same dimensionality as the input patterns. For example, if there are three dimensions in the input, there will be three input units, and each output unit will have a vector of three weights connecting it to those input units. The aim of the SOM learning algorithm is to produce a topographic map that reflects regularities in the set of input patterns.

Fig. 2. Unsupervised learning in a simple self-organizing map (SOM): an input vector (x_1, x_2, x_3) is connected via weights w_ij to an array of nodes (the output layer), across which different pattern classes come to be represented.

When an input pattern is presented to the network, the SOM training algorithm computes the Euclidean distance between the weight vector and the input pattern for each node. The node with the least Euclidean distance (the winning node or best matching unit [BMU]) is associated with the input pattern. Next, the weight vectors of the winning node and its neighboring nodes are adjusted so as to become more similar to the input pattern. The extent of the weight change for each of the neighboring nodes is determined by its location on the map, using a neighborhood function. In effect, regions of the output layer compete to represent the input patterns, and regional organization is enforced by short-range excitatory and long-range inhibitory connections within the output layer. SOMs are thought to capture aspects of the organization of sensory input in the cerebral cortex. Hebbian learning to associate sensory and motor topographic maps then provides the basis for a system that learns to generate adaptive behavior in an environment.

Acknowledgements
The studies of the first author are funded by the Greek State Scholarship Foundation (IKY). The work of the second author is supported by UK MRC Grant G0300188.

Cross-References
Learning in artificial neural networks
Connectionism
Association learning
Competitive Learning
Adaptation and unsupervised learning
Categorical learning

Bayesian learning
Cognitive Learning
Human cognition and learning
Computational models of human learning

References
Hebb, D. O. (1949). The organization of behavior: A neuropsychological approach. New York: John Wiley & Sons.
Hinton, G. E. (1989). Deterministic Boltzmann learning performs steepest descent in weight-space. Neural Computation, 1, 143-150.
Kohonen, T. (1984). Self-organization and associative memory. Berlin: Springer-Verlag.
O'Reilly, R. C., & Munakata, Y. (2000). Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. Cambridge, MA: MIT Press.
Rumelhart, D. E., Hinton, G. E., & McClelland, J. L. (1986). A general framework for parallel distributed processing. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations (pp. 45-76). Cambridge, MA: MIT Press.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations (pp. 318-362). Cambridge, MA: MIT Press.