# 3 An Illustrative Example

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Objectives An Illustrative Example Objectives - Theory and Examples -2 Problem Statement -2 Perceptron - Two-Input Case -4 Pattern Recognition Example -5 Hamming Network -8 Feedforward Layer -8 Recurrent Layer -9 Hopfield Network -2 Epilogue -5 Exercise -6 Objectives Think of this chapter as a preview of coming attractions. We will take a simple pattern recognition problem and show how it can be solved using three different neural network architectures. It will be an opportunity to see how the architectures described in the previous chapter can be used to solve a practical (although extremely oversimplified) problem. Do not expect to completely understand these three networks after reading this chapter. We present them simply to give you a taste of what can be done with neural networks, and to demonstrate that there are many different types of networks that can be used to solve a given problem. The three networks presented in this chapter are representative of the types of networks discussed in the remaining chapters: feedforward networks (represented here by the perceptron), competitive networks (represented here by the Hamming network) and recurrent associative memory networks (represented here by the Hopfield network). -

2 An Illustrative Example Theory and Examples Problem Statement A produce dealer has a warehouse that stores a variety of fruits and vegetables. When fruit is brought to the warehouse, various types of fruit may be mixed together. The dealer wants a machine that will sort the fruit according to type. There is a conveyer belt on which the fruit is loaded. This conveyer passes through a set of sensors, which measure three properties of the fruit: shape, texture and weight. These sensors are somewhat primitive. The shape sensor will output a if the fruit is approximately round and a if it is more elliptical. The texture sensor will output a if the surface of the fruit is smooth and a if it is rough. The weight sensor will output a if the fruit is more than one pound and a if it is less than one pound. The three sensor outputs will then be input to a neural network. The purpose of the network is to decide which kind of fruit is on the conveyor, so that the fruit can be directed to the correct storage bin. To make the problem even simpler, letõs assume that there are only two kinds of fruit on the conveyor: apples and oranges. Neural Network Sensors Sorter Apples Oranges As each fruit passes through the sensors it can be represented by a threedimensional vector. The first element of the vector will represent shape, the second element will represent texture and the third element will represent weight: -2

3 Perceptron p = shape texture weight. (.) Therefore, a prototype orange would be represented by p =, (.2) and a prototype apple would be represented by The neural network will receive one three-dimensional input vector for each fruit on the conveyer and must make a decision as to whether the fruit is an orange ( p ) or an apple ( p 2 ). Now that we have defined this simple (trivial?) pattern recognition problem, letõs look briefly at three different neural networks that could be used to solve it. The simplicity of our problem will facilitate our understanding of the operation of the networks. Perceptron p 2 =. (.) The first network we will discuss is the perceptron. Figure. illustrates a single-layer perceptron with a symmetric hard limit transfer function hardlims. Inputs Sym. Hard - Title Limit - Layer R p R x W S x R b S x n S x a = hardlims - Exp (Wp - + b) a S S x Figure. Single-Layer Perceptron -

4 An Illustrative Example Two-Input Case Before we use the perceptron to solve the orange and apple recognition problem (which will require a three-input perceptron, i.e., R = ), it is useful to investigate the capabilities of a two-input/single-neuron perceptron ( R = 2 ), which can be easily analyzed graphically. The two-input perceptron is shown in Figure.2. Inputs Two-Input Neuron p w, p 2 w,2 Σ b Figure.2 Two-Input/Single-Neuron Perceptron Single-neuron perceptrons can classify input vectors into two categories. For example, for a two-input perceptron, if w, = and w 2, = then n a a = hardlims (Wp + b) a = hardlims( n) = hardlims( p + b). (.4) Therefore, if the inner product of the weight matrix (a single row vector in this case) with the input vector is greater than or equal to b, the output will be. If the inner product of the weight vector and the input is less than b, the output will be. This divides the input space into two parts. Figure. illustrates this for the case where b =. The blue line in the figure represents all points for which the net input n is equal to 0: n = p = 0. (.5) Notice that this decision boundary will always be orthogonal to the weight matrix, and the position of the boundary can be shifted by changing b. (In the general case, W is a matrix consisting of a number of row vectors, each of which will be used in an equation like Eq. (.5). There will be one boundary for each row of W. See Chapter 4 for more on this topic.) The shaded region contains all input vectors for which the output of the network will be. The output will be for all other input vectors. -4

5 Perceptron p 2 W n > 0 n < 0 - p Figure. Perceptron Decision Boundary The key property of the single-neuron perceptron, therefore, is that it can separate input vectors into two categories. The decision boundary between the categories is determined by the equation Wp + b = 0. (.6) Because the boundary must be linear, the single-layer perceptron can only be used to recognize patterns that are linearly separable (can be separated by a linear boundary). These concepts will be discussed in more detail in Chapter 4. Pattern Recognition Example Now consider the apple and orange pattern recognition problem. Because there are only two categories, we can use a single-neuron perceptron. The vector inputs are three-dimensional ( R = ), therefore the perceptron equation will be a = hardlims w, w 2, w, p 2 p p + b. (.7) We want to choose the bias b and the elements of the weight matrix so that the perceptron will be able to distinguish between apples and oranges. For example, we may want the output of the perceptron to be when an apple is input and when an orange is input. Using the concept illustrated in Figure., letõs find a linear boundary that can separate oranges and ap- -5

6 An Illustrative Example ples. The two prototype vectors (recall Eq. (.2) and Eq. (.)) are shown in Figure.4. From this figure we can see that the linear boundary that divides these two vectors symmetrically is the p, p plane. p p 2 p p (orange) p 2 (apple) Figure.4 Prototype Vectors The p, p plane, which will be our decision boundary, can be described by the equation or p 2 = 0, (.8) 00 p p = 0 p. (.9) Therefore the weight matrix and bias will be W = 00, b = 0. (.0) The weight matrix is orthogonal to the decision boundary and points toward the region that contains the prototype pattern p 2 (apple) for which we want the perceptron to produce an output of. The bias is 0 because the decision boundary passes through the origin. Now letõs test the operation of our perceptron pattern classifier. It classifies perfect apples and oranges correctly since -6

7 Perceptron Orange: a = hardlims = ( orange), (.) Apple: a = hardlims = ( apple). (.2) But what happens if we put a not-so-perfect orange into the classifier? LetÕs say that an orange with an elliptical shape is passed through the sensors. The input vector would then be p =. (.) The response of the network would be a = hardlims = ( orange). (.4) In fact, any input vector that is closer to the orange prototype vector than to the apple prototype vector (in Euclidean distance) will be classified as an orange (and vice versa). To experiment with the perceptron network and the apple/orange classification problem, use the Neural Network Design Demonstration Perceptron Classification (nndpc). This example has demonstrated some of the features of the perceptron network, but by no means have we exhausted our investigation of perceptrons. This network, and variations on it, will be examined in Chapters 4 through 2. LetÕs consider some of these future topics. In the apple/orange example we were able to design a network graphically, by choosing a decision boundary that clearly separated the patterns. What about practical problems, with high dimensional input spaces? In Chapters 4, 7, 0 and we will introduce learning algorithms that can be used to train networks to solve complex problems by using a set of examples of proper network behavior. -7

8 An Illustrative Example The key characteristic of the single-layer perceptron is that it creates linear decision boundaries to separate categories of input vector. What if we have categories that cannot be separated by linear boundaries? This question will be addressed in Chapter, where we will introduce the multilayer perceptron. The multilayer networks are able to solve classification problems of arbitrary complexity. Hamming Network The next network we will consider is the Hamming network [Lipp87]. It was designed explicitly to solve binary pattern recognition problems (where each element of the input vector has only two possible values Ñ in our example or ). This is an interesting network, because it uses both feedforward and recurrent (feedback) layers, which were both described in Chapter 2. Figure.5 shows the standard Hamming network. Note that the number of neurons in the first layer is the same as the number of neurons in the second layer. The objective of the Hamming network is to decide which prototype vector is closest to the input vector. This decision is indicated by the output of the recurrent layer. There is one neuron in the recurrent layer for each prototype pattern. When the recurrent layer converges, there will be only one neuron with nonzero output. This neuron indicates the prototype pattern that is closest to the input vector. Now letõs investigate the two layers of the Hamming network in detail. Feedforward Layer Recurrent Layer p R x R W S x R b S x a S n S x S x W 2 S x S n 2 (t + ) a 2 (t + ) S x S x S D a2(t) S x a = purelin - Exp (W - p + b ) a 2 (0) = a a 2 (t + ) = poslin (W 2 a 2 (t)) Feedforward Layer Figure.5 Hamming Network The feedforward layer performs a correlation, or inner product, between each of the prototype patterns and the input pattern (as we will see in Eq. (.7)). In order for the feedforward layer to perform this correlation, the -8

9 Hamming Network rows of the weight matrix in the feedforward layer, represented by the connection matrix W, are set to the prototype patterns. For our apple and orange example this would mean W p T = = p 2 T. (.5) The feedforward layer uses a linear transfer function, and each element of the bias vector is equal to R, where R is the number of elements in the input vector. For our example the bias vector would be b =. (.6) With these choices for the weight matrix and bias vector, the output of the feedforward layer is p T a = W p+ b = p + = p 2 T p T p + p T 2 p +. (.7) Note that the outputs of the feedforward layer are equal to the inner products of each prototype pattern with the input, plus R. For two vectors of equal length (norm), their inner product will be largest when the vectors point in the same direction, and will be smallest when they point in opposite directions. (We will discuss this concept in more depth in Chapters 5, 8 and 9.) By adding R to the inner product we guarantee that the outputs of the feedforward layer can never be negative. This is required for proper operation of the recurrent layer. This network is called the Hamming network because the neuron in the feedforward layer with the largest output will correspond to the prototype pattern that is closest in Hamming distance to the input pattern. (The Hamming distance between two vectors is equal to the number of elements that are different. It is defined only for binary vectors.) We leave it to the reader to show that the outputs of the feedforward layer are equal to 2R minus twice the Hamming distances from the prototype patterns to the input pattern. Recurrent Layer The recurrent layer of the Hamming network is what is known as a ÒcompetitiveÓ layer. The neurons in this layer are initialized with the outputs of the feedforward layer, which indicate the correlation between the prototype patterns and the input vector. Then the neurons compete with each other to determine a winner. After the competition, only one neuron will -9

10 An Illustrative Example have a nonzero output. The winning neuron indicates which category of input was presented to the network (for our example the two categories are apples and oranges). The equations that describe the competition are: a 2 ( 0) = a (Initial Condition), (.8) and a 2 ( t + ) = poslin( W 2 a 2 () t ). (.9) (DonÕt forget that the superscripts here indicate the layer number, not a power of 2.) The poslin transfer function is linear for positive values and zero for negative values. The weight matrix W 2 has the form W 2 = ε ε, (.20) where ε is some number less than ( S ), and S is the number of neurons in the recurrent layer. (Can you show why ε must be less than ( S )?) An iteration of the recurrent layer proceeds as follows: a 2 ( t + ) poslin ε = ε a2 () t = poslin a 2 () t εa 2 2 () t a 2 2 () t εa 2 () t. (.2) Each element is reduced by the same fraction of the other. The larger element will be reduced by less, and the smaller element will be reduced by more, therefore the difference between large and small will be increased. The effect of the recurrent layer is to zero out all neuron outputs, except the one with the largest initial value (which corresponds to the prototype pattern that is closest in Hamming distance to the input). To illustrate the operation of the Hamming network, consider again the oblong orange that we used to test the perceptron: p =. (.22) The output of the feedforward layer will be -0

11 Hamming Network a = + = ( + ) = ( + ) 4 2, (.2) which will then become the initial condition for the recurrent layer. The weight matrix for the recurrent layer will be given by Eq. (.20) with ε = 2 (any number less than would work). The first iteration of the recurrent layer produces a 2 ( ) = poslin( W 2 a 2 ( 0) ) = The second iteration produces poslin poslin = (.24) a 2 ( 2) = poslin( W 2 a 2 ( ) ) = poslin (.25) Since the outputs of successive iterations produce the same result, the network has converged. Prototype pattern number one, the orange, is chosen as the correct match, since neuron number one has the only nonzero output. (Recall that the first element of a was ( p T p + ).) This is the correct choice, since the Hamming distance from the orange prototype to the input pattern is, and the Hamming distance from the apple prototype to the input pattern is 2. To experiment with the Hamming network and the apple/orange classification problem, use the Neural Network Design Demonstration Hamming Classification (nndhamc). There are a number of networks whose operation is based on the same principles as the Hamming network; that is, where an inner product operation (feedforward layer) is followed by a competitive dynamic layer. These competitive networks will be discussed in Chapters through 6. They are self-organizing networks, which can learn to adjust their prototype vectors based on the inputs that have been presented. 0 poslin =.5 0 -

12 An Illustrative Example Hopfield Network The final network we will discuss in this brief preview is the Hopfield network. This is a recurrent network that is similar in some respects to the recurrent layer of the Hamming network, but which can effectively perform the operations of both layers of the Hamming network. A diagram of the Hopfield network is shown in Figure.6. (This figure is actually a slight variation of the standard Hopfield network. We use this variation because it is somewhat simpler to describe and yet demonstrates the basic concepts.) The neurons in this network are initialized with the input vector, then the network iterates until the output converges. When the network is operating correctly, the resulting output should be one of the prototype vectors. Therefore, whereas in the Hamming network the nonzero neuron indicates which prototype pattern is chosen, the Hopfield network actually produces the selected prototype pattern at its output. Initial Condition Recurrent Layer S p S x W S x S b S x n(t + ) a(t + ) D S x S x S a(t) S x a(0) = p a(t + ) = satlins (Wa(t) + b) Figure.6 Hopfield Network The equations that describe the network operation are a( 0) = p (.26) and a( t + ) = satlins( Wa() t + b), (.27) where satlins is the transfer function that is linear in the range [-, ] and saturates at for inputs greater than and at - for inputs less than -. The design of the weight matrix and the bias vector for the Hopfield network is a more complex procedure than it is for the Hamming network, -2

13 Hopfield Network where the weights in the feedforward layer are the prototype patterns. Hopfield design procedures will be discussed in detail in Chapter 8. To illustrate the operation of the network, we have determined a weight matrix and a bias vector that can solve our orange and apple pattern recognition problem. They are given in Eq. (.28). W = 0.2 0, b = (.28) Although the procedure for computing the weights and biases for the Hopfield network is beyond the scope of this chapter, we can say a few things about why the parameters in Eq. (.28) work for the apple and orange example. We want the network output to converge to either the orange pattern,, or the apple pattern, p 2. In both patterns, the first element is, and the third element is. The difference between the patterns occurs in the second element. Therefore, no matter what pattern is input to the network, we want the first element of the output pattern to converge to, the third element to converge to, and the second element to go to either or, whichever is closer to the second element of the input vector. The equations of operation of the Hopfield network, using the parameters given in Eq. (.28), are a ( t + ) = satlins( 0.2a () t + 0.9) a 2 ( t + ) = satlins(.2a 2 () t ) a ( t + ) = satlins( 0.2a () t 0.9) (.29) Regardless of the initial values, a i ( 0), the first element will be increased until it saturates at, and the third element will be decreased until it saturates at. The second element is multiplied by a number larger than. Therefore, if it is initially negative, it will eventually saturate at ; if it is initially positive it will saturate at. (It should be noted that this is not the only ( W, b) pair that could be used. You might want to try some others. See if you can discover what makes these work.) LetÕs again take our oblong orange to test the Hopfield network. The outputs of the Hopfield network for the first three iterations would be p -

14 An Illustrative Example 0.7 a( 0) =, a( ) =, a( 2) =, a( ) = (.0) The network has converged to the orange pattern, as did both the Hamming network and the perceptron, although each network operated in a different way. The perceptron had a single output, which could take on values of - (orange) or (apple). In the Hamming network the single nonzero neuron indicated which prototype pattern had the closest match. If the first neuron was nonzero, that indicated orange, and if the second neuron was nonzero, that indicated apple. In the Hopfield network the prototype pattern itself appears at the output of the network. To experiment with the Hopfield network and the apple/orange classification problem, use the Neural Network Design Demonstration Hopfield Classification (nndhopc). As with the other networks demonstrated in this chapter, do not expect to feel completely comfortable with the Hopfield network at this point. There are a number of questions that we have not discussed. For example, ÒHow do we know that the network will eventually converge?ó It is possible for recurrent networks to oscillate or to have chaotic behavior. In addition, we have not discussed general procedures for designing the weight matrix and the bias vector. These topics will be discussed in detail in Chapters 7 and 8. -4

15 Epilogue Epilogue The three networks that we have introduced in this chapter demonstrate many of the characteristics that are found in the architectures which are discussed throughout this book. Feedforward networks, of which the perceptron is one example, are presented in Chapters 4, 7, and 2. In these networks, the output is computed directly from the input in one pass; no feedback is involved. Feedforward networks are used for pattern recognition, as in the apple and orange example, and also for function approximation (see Chapter ). Function approximation applications are found in such areas as adaptive filtering (see Chapter 0) and automatic control. Competitive networks, represented here by the Hamming network, are characterized by two properties. First, they compute some measure of distance between stored prototype patterns and the input pattern. Second, they perform a competition to determine which neuron represents the prototype pattern closest to the input. In the competitive networks that are discussed in Chapters 4Ð6, the prototype patterns are adjusted as new inputs are applied to the network. These adaptive networks learn to cluster the inputs into different categories. Recurrent networks, like the Hopfield network, were originally inspired by statistical mechanics. They have been used as associative memories, in which stored data is recalled by association with input data, rather than by an address. They have also been used to solve a variety of optimization problems. We will discuss these recurrent networks in Chapters 7 and 8. We hope this chapter has piqued your curiosity about the capabilities of neural networks and has raised some questions. A few of the questions we will answer in later chapters are:. How do we determine the weight matrix and bias for perceptron networks with many inputs, where it is impossible to visualize the decision boundary? (Chapters 4 and 0) 2. If the categories to be recognized are not linearly separable, can we extend the standard perceptron to solve the problem? (Chapters and 2). Can we learn the weights and biases of the Hamming network when we donõt know the prototype patterns? (Chapters 4Ð6) 4. How do we determine the weight matrix and bias vector for the Hopfield network? (Chapter 8) 5. How do we know that the Hopfield network will eventually converge? (Chapters 7 and 8) -5

16 An Illustrative Example Exercise E. In this chapter we have designed three different neural networks to distinguish between apples and oranges, based on three sensor measurements (shape, texture and weight). Suppose that we want to distinguish between bananas and pineapples: p = (Banana) p 2 = (Pineapple) i. Design a perceptron to recognize these patterns. ii. Design a Hamming network to recognize these patterns. iii. Design a Hopfield network to recognize these patterns. iv. Test the operation of your networks by applying several different input patterns. Discuss the advantages and disadvantages of each network. -6

### Using Neural Networks for Pattern Classification Problems

Using Neural Networks for Pattern Classification Problems Converting an Image Camera captures an image Image needs to be converted to a form that can be processed by the Neural Network Converting an Image

### Neural Networks and Support Vector Machines

INF5390 - Kunstig intelligens Neural Networks and Support Vector Machines Roar Fjellheim INF5390-13 Neural Networks and SVM 1 Outline Neural networks Perceptrons Neural networks Support vector machines

### Introduction to Neural Networks for Senior Design

Introduction to Neural Networks for Senior Design Intro-1 Neural Networks: The Big Picture Artificial Intelligence Neural Networks Expert Systems Machine Learning not ruleoriented ruleoriented Intro-2

### Self Organizing Maps: Fundamentals

Self Organizing Maps: Fundamentals Introduction to Neural Networks : Lecture 16 John A. Bullinaria, 2004 1. What is a Self Organizing Map? 2. Topographic Maps 3. Setting up a Self Organizing Map 4. Kohonen

### A TUTORIAL. BY: Negin Yousefpour PhD Student Civil Engineering Department TEXAS A&M UNIVERSITY

ARTIFICIAL NEURAL NETWORKS: A TUTORIAL BY: Negin Yousefpour PhD Student Civil Engineering Department TEXAS A&M UNIVERSITY Contents Introduction Origin Of Neural Network Biological Neural Networks ANN Overview

### Learning. Artificial Intelligence. Learning. Types of Learning. Inductive Learning Method. Inductive Learning. Learning.

Learning Learning is essential for unknown environments, i.e., when designer lacks omniscience Artificial Intelligence Learning Chapter 8 Learning is useful as a system construction method, i.e., expose

### Neural Nets. General Model Building

Neural Nets To give you an idea of how new this material is, let s do a little history lesson. The origins are typically dated back to the early 1940 s and work by two physiologists, McCulloch and Pitts.

### Recurrent Neural Networks

Recurrent Neural Networks Neural Computation : Lecture 12 John A. Bullinaria, 2015 1. Recurrent Neural Network Architectures 2. State Space Models and Dynamical Systems 3. Backpropagation Through Time

### Systems of Linear Equations

Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

### CHAPTER 3 Numbers and Numeral Systems

CHAPTER 3 Numbers and Numeral Systems Numbers play an important role in almost all areas of mathematics, not least in calculus. Virtually all calculus books contain a thorough description of the natural,

### The Hebbian rule. The neural network stores and retrieves associations, which are learned as synaptic connection.

Hopfield Networks The Hebbian rule Donald Hebb hypothesised in 1949 how neurons are connected with each other in the brain: When an axon of cell A is near enough to excite a cell B and repeatedly or persistently

### Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

### Chapter 15 Introduction to Linear Programming

Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of

### Linear Dependence Tests

Linear Dependence Tests The book omits a few key tests for checking the linear dependence of vectors. These short notes discuss these tests, as well as the reasoning behind them. Our first test checks

### Neural Networks. Neural network is a network or circuit of neurons. Neurons can be. Biological neurons Artificial neurons

Neural Networks Neural network is a network or circuit of neurons Neurons can be Biological neurons Artificial neurons Biological neurons Building block of the brain Human brain contains over 10 billion

### INTRODUCTION TO NEURAL NETWORKS

INTRODUCTION TO NEURAL NETWORKS Pictures are taken from http://www.cs.cmu.edu/~tom/mlbook-chapter-slides.html http://research.microsoft.com/~cmbishop/prml/index.htm By Nobel Khandaker Neural Networks An

### Chapter 7. Diagnosis and Prognosis of Breast Cancer using Histopathological Data

Chapter 7 Diagnosis and Prognosis of Breast Cancer using Histopathological Data In the previous chapter, a method for classification of mammograms using wavelet analysis and adaptive neuro-fuzzy inference

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### NEURAL NETWORKS A Comprehensive Foundation

NEURAL NETWORKS A Comprehensive Foundation Second Edition Simon Haykin McMaster University Hamilton, Ontario, Canada Prentice Hall Prentice Hall Upper Saddle River; New Jersey 07458 Preface xii Acknowledgments

### 6.2.8 Neural networks for data mining

6.2.8 Neural networks for data mining Walter Kosters 1 In many application areas neural networks are known to be valuable tools. This also holds for data mining. In this chapter we discuss the use of neural

### An Introduction to Neural Networks

An Introduction to Vincent Cheung Kevin Cannons Signal & Data Compression Laboratory Electrical & Computer Engineering University of Manitoba Winnipeg, Manitoba, Canada Advisor: Dr. W. Kinsner May 27,

### Elementary Number Theory We begin with a bit of elementary number theory, which is concerned

CONSTRUCTION OF THE FINITE FIELDS Z p S. R. DOTY Elementary Number Theory We begin with a bit of elementary number theory, which is concerned solely with questions about the set of integers Z = {0, ±1,

### Chapter 19 Operational Amplifiers

Chapter 19 Operational Amplifiers The operational amplifier, or op-amp, is a basic building block of modern electronics. Op-amps date back to the early days of vacuum tubes, but they only became common

### Gradient Methods. Rafael E. Banchs

Gradient Methods Rafael E. Banchs INTRODUCTION This report discuss one class of the local search algorithms to be used in the inverse modeling of the time harmonic field electric logging problem, the Gradient

### 6. Feed-forward mapping networks

6. Feed-forward mapping networks Fundamentals of Computational Neuroscience, T. P. Trappenberg, 2002. Lecture Notes on Brain and Computation Byoung-Tak Zhang Biointelligence Laboratory School of Computer

### Neural Networks. Introduction to Artificial Intelligence CSE 150 May 29, 2007

Neural Networks Introduction to Artificial Intelligence CSE 150 May 29, 2007 Administration Last programming assignment has been posted! Final Exam: Tuesday, June 12, 11:30-2:30 Last Lecture Naïve Bayes

### degrees of freedom and are able to adapt to the task they are supposed to do [Gupta].

1.3 Neural Networks 19 Neural Networks are large structured systems of equations. These systems have many degrees of freedom and are able to adapt to the task they are supposed to do [Gupta]. Two very

### STRAND B: Number Theory. UNIT B2 Number Classification and Bases: Text * * * * * Contents. Section. B2.1 Number Classification. B2.

STRAND B: Number Theory B2 Number Classification and Bases Text Contents * * * * * Section B2. Number Classification B2.2 Binary Numbers B2.3 Adding and Subtracting Binary Numbers B2.4 Multiplying Binary

### Neural networks. Chapter 20, Section 5 1

Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of

### SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK

SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK N M Allinson and D Merritt 1 Introduction This contribution has two main sections. The first discusses some aspects of multilayer perceptrons,

### Big Data Analytics CSCI 4030

High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

### PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Lecture 8 Artificial neural networks: Unsupervised learning

Lecture 8 Artificial neural networks: Unsupervised learning Introduction Hebbian learning Generalised Hebbian learning algorithm Competitive learning Self-organising computational map: Kohonen network

### MEP Y9 Practice Book A

1 Base Arithmetic 1.1 Binary Numbers We normally work with numbers in base 10. In this section we consider numbers in base 2, often called binary numbers. In base 10 we use the digits 0, 1, 2, 3, 4, 5,

### (Refer Slide Time: 00:00:56 min)

Numerical Methods and Computation Prof. S.R.K. Iyengar Department of Mathematics Indian Institute of Technology, Delhi Lecture No # 3 Solution of Nonlinear Algebraic Equations (Continued) (Refer Slide

### A Simple Introduction to Support Vector Machines

A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear

### We call this set an n-dimensional parallelogram (with one vertex 0). We also refer to the vectors x 1,..., x n as the edges of P.

Volumes of parallelograms 1 Chapter 8 Volumes of parallelograms In the present short chapter we are going to discuss the elementary geometrical objects which we call parallelograms. These are going to

### α = u v. In other words, Orthogonal Projection

Orthogonal Projection Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v

### Feed-Forward mapping networks KAIST 바이오및뇌공학과 정재승

Feed-Forward mapping networks KAIST 바이오및뇌공학과 정재승 How much energy do we need for brain functions? Information processing: Trade-off between energy consumption and wiring cost Trade-off between energy consumption

### Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

### Lecture 8 February 4

ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt

### Appendix 4 Simulation software for neuronal network models

Appendix 4 Simulation software for neuronal network models D.1 Introduction This Appendix describes the Matlab software that has been made available with Cerebral Cortex: Principles of Operation (Rolls

### Data Mining Classification: Alternative Techniques. Instance-Based Classifiers. Lecture Notes for Chapter 5. Introduction to Data Mining

Data Mining Classification: Alternative Techniques Instance-Based Classifiers Lecture Notes for Chapter 5 Introduction to Data Mining by Tan, Steinbach, Kumar Set of Stored Cases Atr1... AtrN Class A B

### LCs for Binary Classification

Linear Classifiers A linear classifier is a classifier such that classification is performed by a dot product beteen the to vectors representing the document and the category, respectively. Therefore it

### The Counterpropagation Network

214 The Counterpropagation Network The Counterpropagation Network " The Counterpropagation network (CPN) is the most recently developed of the models that we have discussed so far in this text. The CPN

### SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS

UDC: 004.8 Original scientific paper SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS Tonimir Kišasondi, Alen Lovren i University of Zagreb, Faculty of Organization and Informatics,

### Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

### Threshold Logic. 2.1 Networks of functions

2 Threshold Logic 2. Networks of functions We deal in this chapter with the simplest kind of computing units used to build artificial neural networks. These computing elements are a generalization of the

### Chapter 4: Artificial Neural Networks

Chapter 4: Artificial Neural Networks CS 536: Machine Learning Littman (Wu, TA) Administration icml-03: instructional Conference on Machine Learning http://www.cs.rutgers.edu/~mlittman/courses/ml03/icml03/

### Linear Algebra Review. Vectors

Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka kosecka@cs.gmu.edu http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length

### Introduction to Neural Networks : Revision Lectures

Introduction to Neural Networks : Revision Lectures John A. Bullinaria, 2004 1. Module Aims and Learning Outcomes 2. Biological and Artificial Neural Networks 3. Training Methods for Multi Layer Perceptrons

### Math 215 HW #6 Solutions

Math 5 HW #6 Solutions Problem 34 Show that x y is orthogonal to x + y if and only if x = y Proof First, suppose x y is orthogonal to x + y Then since x, y = y, x In other words, = x y, x + y = (x y) T

### CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information

### Eigenvalues and eigenvectors of a matrix

Eigenvalues and eigenvectors of a matrix Definition: If A is an n n matrix and there exists a real number λ and a non-zero column vector V such that AV = λv then λ is called an eigenvalue of A and V is

### Lecture 1: Introduction to Neural Networks Kevin Swingler / Bruce Graham

Lecture 1: Introduction to Neural Networks Kevin Swingler / Bruce Graham kms@cs.stir.ac.uk 1 What are Neural Networks? Neural Networks are networks of neurons, for example, as found in real (i.e. biological)

### 7 Gaussian Elimination and LU Factorization

7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method

### Data Envelopment Analysis: A Primer for Novice Users and Students at all Levels

Data Envelopment Analysis: A Primer for Novice Users and Students at all Levels R. Samuel Sale Lamar University Martha Lair Sale Florida Institute of Technology In the three decades since the publication

### Guide to SRW Section 1.7: Solving inequalities

Guide to SRW Section 1.7: Solving inequalities When you solve the equation x 2 = 9, the answer is written as two very simple equations: x = 3 (or) x = 3 The diagram of the solution is -6-5 -4-3 -2-1 0

### Analecta Vol. 8, No. 2 ISSN 2064-7964

EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

### 1 The Brownian bridge construction

The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

### Definition 8.1 Two inequalities are equivalent if they have the same solution set. Add or Subtract the same value on both sides of the inequality.

8 Inequalities Concepts: Equivalent Inequalities Linear and Nonlinear Inequalities Absolute Value Inequalities (Sections 4.6 and 1.1) 8.1 Equivalent Inequalities Definition 8.1 Two inequalities are equivalent

### Building MLP networks by construction

University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2000 Building MLP networks by construction Ah Chung Tsoi University of

### AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

### 1.7 Graphs of Functions

64 Relations and Functions 1.7 Graphs of Functions In Section 1.4 we defined a function as a special type of relation; one in which each x-coordinate was matched with only one y-coordinate. We spent most

### Manifold Learning Examples PCA, LLE and ISOMAP

Manifold Learning Examples PCA, LLE and ISOMAP Dan Ventura October 14, 28 Abstract We try to give a helpful concrete example that demonstrates how to use PCA, LLE and Isomap, attempts to provide some intuition

### Quiz 1 for Name: Good luck! 20% 20% 20% 20% Quiz page 1 of 16

Quiz 1 for 6.034 Name: 20% 20% 20% 20% Good luck! 6.034 Quiz page 1 of 16 Question #1 30 points 1. Figure 1 illustrates decision boundaries for two nearest-neighbour classifiers. Determine which one of

### Inner Product Spaces

Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

### Wave Hydro Dynamics Prof. V. Sundar Department of Ocean Engineering Indian Institute of Technology, Madras

Wave Hydro Dynamics Prof. V. Sundar Department of Ocean Engineering Indian Institute of Technology, Madras Module No. # 02 Wave Motion and Linear Wave Theory Lecture No. # 03 Wave Motion II We will today

### The Graphical Method: An Example

The Graphical Method: An Example Consider the following linear program: Maximize 4x 1 +3x 2 Subject to: 2x 1 +3x 2 6 (1) 3x 1 +2x 2 3 (2) 2x 2 5 (3) 2x 1 +x 2 4 (4) x 1, x 2 0, where, for ease of reference,

### Making Sense of the Mayhem: Machine Learning and March Madness

Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research

### Improved Fuzzy C-means Clustering Algorithm Based on Cluster Density

Journal of Computational Information Systems 8: 2 (2012) 727 737 Available at http://www.jofcis.com Improved Fuzzy C-means Clustering Algorithm Based on Cluster Density Xiaojun LOU, Junying LI, Haitao

### Data Mining and Neural Networks in Stata

Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it

### Chapter 6. Cuboids. and. vol(conv(p ))

Chapter 6 Cuboids We have already seen that we can efficiently find the bounding box Q(P ) and an arbitrarily good approximation to the smallest enclosing ball B(P ) of a set P R d. Unfortunately, both

### Lecture No. # 02 Prologue-Part 2

Advanced Matrix Theory and Linear Algebra for Engineers Prof. R.Vittal Rao Center for Electronics Design and Technology Indian Institute of Science, Bangalore Lecture No. # 02 Prologue-Part 2 In the last

### CS229 Project Final Report. Sign Language Gesture Recognition with Unsupervised Feature Learning

CS229 Project Final Report Sign Language Gesture Recognition with Unsupervised Feature Learning Justin K. Chen, Debabrata Sengupta, Rukmani Ravi Sundaram 1. Introduction The problem we are investigating

### Digital Image Processing. Prof. P. K. Biswas. Department of Electronics & Electrical Communication Engineering

Digital Image Processing Prof. P. K. Biswas Department of Electronics & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture - 28 Colour Image Processing - III Hello,

### 2x + y = 3. Since the second equation is precisely the same as the first equation, it is enough to find x and y satisfying the system

1. Systems of linear equations We are interested in the solutions to systems of linear equations. A linear equation is of the form 3x 5y + 2z + w = 3. The key thing is that we don t multiply the variables

### Finite Element Analysis Prof. Dr. B. N. Rao Department of Civil Engineering Indian Institute of Technology, Madras. Lecture - 01

Finite Element Analysis Prof. Dr. B. N. Rao Department of Civil Engineering Indian Institute of Technology, Madras Lecture - 01 Welcome to the series of lectures, on finite element analysis. Before I start,

### Lecture 9: Introduction to Pattern Analysis

Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns

### Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013

Notes on Orthogonal and Symmetric Matrices MENU, Winter 201 These notes summarize the main properties and uses of orthogonal and symmetric matrices. We covered quite a bit of material regarding these topics,

### Solutions of Linear Equations in One Variable

2. Solutions of Linear Equations in One Variable 2. OBJECTIVES. Identify a linear equation 2. Combine like terms to solve an equation We begin this chapter by considering one of the most important tools

### Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems

### Second Order Linear Partial Differential Equations. Part I

Second Order Linear Partial Differential Equations Part I Second linear partial differential equations; Separation of Variables; - point boundary value problems; Eigenvalues and Eigenfunctions Introduction

### 3.5 Spline interpolation

3.5 Spline interpolation Given a tabulated function f k = f(x k ), k = 0,... N, a spline is a polynomial between each pair of tabulated points, but one whose coefficients are determined slightly non-locally.

### Math 54. Selected Solutions for Week Is u in the plane in R 3 spanned by the columns

Math 5. Selected Solutions for Week 2 Section. (Page 2). Let u = and A = 5 2 6. Is u in the plane in R spanned by the columns of A? (See the figure omitted].) Why or why not? First of all, the plane in

### Linear Threshold Units

Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

### Learning. CS461 Artificial Intelligence Pinar Duygulu. Bilkent University, Spring 2007. Slides are mostly adapted from AIMA and MIT Open Courseware

1 Learning CS 461 Artificial Intelligence Pinar Duygulu Bilkent University, Slides are mostly adapted from AIMA and MIT Open Courseware 2 Learning What is learning? 3 Induction David Hume Bertrand Russell

### Sub Module Design of experiments

Sub Module 1.6 6. Design of experiments Goal of experiments: Experiments help us in understanding the behavior of a (mechanical) system Data collected by systematic variation of influencing factors helps

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### Several Views of Support Vector Machines

Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min

### Solving Systems of Linear Equations; Row Reduction

Harvey Mudd College Math Tutorial: Solving Systems of Linear Equations; Row Reduction Systems of linear equations arise in all sorts of applications in many different fields of study The method reviewed

### (57) (58) (59) (60) (61)

Module 4 : Solving Linear Algebraic Equations Section 5 : Iterative Solution Techniques 5 Iterative Solution Techniques By this approach, we start with some initial guess solution, say for solution and

### 2.3 Solving Equations Containing Fractions and Decimals

2. Solving Equations Containing Fractions and Decimals Objectives In this section, you will learn to: To successfully complete this section, you need to understand: Solve equations containing fractions

### Solutions to Math 51 First Exam January 29, 2015

Solutions to Math 5 First Exam January 29, 25. ( points) (a) Complete the following sentence: A set of vectors {v,..., v k } is defined to be linearly dependent if (2 points) there exist c,... c k R, not

### Clustering and Data Mining in R

Clustering and Data Mining in R Workshop Supplement Thomas Girke December 10, 2011 Introduction Data Preprocessing Data Transformations Distance Methods Cluster Linkage Hierarchical Clustering Approaches