# A Non-Linear Schema Theorem for Genetic Algorithms

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA Abstract We generalize Holland s Schema Theorem to the setting that genes are arranged, not necessarily in a linear sequence, but as the nodes in a connected graph We have experimental results showing that the flourishing of building blocks can be expected for two distinct graphs we have investigated, one being a tree and the other being the lattice points in a cube in Euclidean 3-space 1 INTRODUCTION Holland developed the ground work for genetic algorithms in the 1970 s (Holland, 1975) His Schema Theorem says that building blocks should flourish Its proof relies on the fact that genes in a chromosome are arranged in a linear sequence But why limit ourselves to this arrangement of genes? Ultimately genes should be arranged in geometries most appropriate to the problem at hand, which may not be sequential For example, for Koza s work (Koza, 1992) in genetic programming, the natural arrangement is tree-like Other arrangements which might be appropriate to some problem are (a) a high-dimensional cube of short edge size, such as {0, 1, 2} 20, or (b) a web of points on the surface of a sphere In this paper we generalize Holland s Schema Theorem to the case that genes are arranged, not necessarily in a linear sequence, but as the nodes in a connected graph In this paper, a gene will simply be a bit 2 HOLLAND S SCHEMA THEOREM First we carefully review the setting in which the theorem is proved We follow the development found in (Goldberg, 1989) There is a universe U of individuals, each of which is a bit string (a list of bits) of a fixed length, call that length L In set-theoretics, U = 2 L There is a fitness function f : U R + which assigns to each individual a fitness value which is a positive real number Also present is a population P which is a subset of the universe U There will be generations of the population A schema is to be a pattern which can match individual bit strings We use the asterisk,, to be a wildcard character, and then a schema is a string of length L each of whose components is either a bit value 0 or 1, or the wildcard character The schema positions holding a 0 or 1 are called fixed positions; those holding the wildcard are unfixed The schema matches an individual bit string if the latter agrees identically with the schema at the schema s fixed positions If an individual bit pattern matches a schema, we say that bit pattern is a representative of the schema If L = 12, an example of a schema is s = (,, 0,,, 1, 0,,,,, ) This schema has 3 fixed positions, at indices 3, 6, and 7 The order of a schema, order(s), is the number of fixed positions in it, 3 for this example The defining length, δ() s, of a schema is the distance between its first and last fixed positions For the above example, the first and last fixed positions are at indices 3 and 7 respectively, so the defining length is 7-3 = 4 Now we return to the population P which is a subset of the universe The population undergoes generational changes by subjecting it to the evolutionary forces of survival of the fittest, mating with crossover, and mutation, soon to be detailed Recall the presence of the fitness function f Implicit in a generational change is the hope that fitter and fitter individuals begin to appear in the population as time t increases Holland talks of building blocks, meaning schemas of low defining length (so, also low order) whose representatives tend to have higher fitnesses The supposition is that the fixed values within a building block are ones that are beneficial to the individual Holland s Schema Theorem asserts that we can expect an increase, as time advances, in the number of representatives of the good building blocks within P Let the size of the population be N Let the individuals in P be denoted a 1, a 2,, a N, and let their associated fitness values be f 1, f 2,, f N Create the next generation population as follows We use the weighted roulette wheel approach for choosing individuals for parenting: an individual a i is selected for parenting with a probability equal to its relative fitness within the population, that is, with probability f i ( Σ j f j ) Having selected two parents in this way, we perform mating with one-point crossover:

2 at random choose one of the L-1 positions 2 through length L Cut each parent in two, between the chosen position and the preceding one, next interchange parental fragments to form two children, then add the children to the next generation Continue in this way until the next generation reaches the same size as the preceding one Note that two parents produce two children Finally, with some low probability p m, subject each bit in each child to a mutational change Our population at time t, P(t), consists of individuals a 1, a 2,, a N, with corresponding fitnesses f 1, f 2,, f N Let H be some given schema, and suppose within P(t) there are m representatives of H, denote them a i1, a i2,, a im Denote µ ( H) = ( Σ k f ik ) m which is the average fitness of the H-representatives in P(t), and denote µ ( P) = ( Σ j f j ) N which is the average fitness in the entire population P(t) The probability that a particular H-representative, a ik, will be chosen for parenting is f ik ( Σ j f j ), therefore, the probability that some H-representative is chosen for parenting is ( Σ k f ik ) ( Σ j f j ) There are altogether N parents chosen, so the number of parents that are H- representatives has expected value N ( Σ k f ik ) ( Σ j f j ) The latter number rewrites to m µ ( H) µ ( P) Assuming a representative of H is chosen for mating, and recalling that a cutpoint for crossover is chosen with uniform randomness to cut just in front of the positions 2L, then the bit values making the parent a representative of H will not be separated from one another under crossover if the cutpoint is chosen outside the span of those bit values, and the latter happens with probability 1 ( δ( H) ( L 1) ) Thus, after mating with crossover, the number of children that are representatives of H should be at least as big as m µ ( H) δ( H) µ ( P) L 1 Recall p m is the probability that a single bit is mutated to the complementary value, and order(h) is the number of fixed positions in H An H-representative remains such, despite mutation, with probability ( 1 p m ) order( H) Now, in the next generation P(t+1) of the population, the number of representatives of H should be at least as big as (E1) m µ ( H) δ( H) ( 1 p µ ( P) L 1 m ) order( H) If the factor complementary to m, (E2) µ ( H) δ( H) ( 1 p µ ( P) L 1 m ) order( H) is greater than 1, we conclude that the number of H- representatives can be expected to increase from P(t) to P(t+1) The smaller δ( H) is, the closer factor 1 ( δ( H) ( L 1) ) is to 1 The smaller p m and order(h) are, the closer factor ( 1 p m ) order( H) is to 1 If µ ( H), the average fitness of those H-representatives in P(t), is somewhat greater than µ ( P), the average fitness of all of P(t), then the quantity (E2) can exceed 1 Whence: Holland s Schema Theorem: Assume schema H has short defining length δ( H) (therefore, also low value for order(h)) Also assume the mutation rate p m is low If the representatives of H have somewhat above average fitness within the population P(t) at time t, then the number of representatives of H can be expected to increase in the next generation P(t+1) 3 A NON-LINEAR SCHEMA THEOREM Now we want to consider scenarios where individuals in a universe are described by some set of bit values but where the bits are arranged in a configuration which is not necessarily a linear sequence For example, the bits might be arranged as the nodes of a tree Or they might be arranged as the lattice points (the points with integer coordinates) in a parallelepiped (a box ) in multidimensional real space R n More generally, we will assume the bits are arranged as the nodes in some connected finite graph G We shall assume there is some notion of the distance between two nodes (perhaps path length in G) Also we will assume there is some natural notion of ways to cut G into two non-empty subsets of its nodes (possibly by clipping one or several edges in G) A schema will be as before: at G s nodes we assign values of either 0 or 1 or the wildcard, Fixed positions in the schema, versus unfixed positions, means the same as before The order of a schema means (as before) the number of fixed positions in it We need a notion analogous to a schema s defining length We have assumed there is a notion of distance between two nodes in G If B is some subset of G-nodes, define the diameter ( B) of B to be the maximum distance between any two nodes in B Since graph G is finite, the diameter of any G-subset, including G itself, is a well-defined (finite) positive real number Given a schema H, define its relative diameter rel ( H) = ( fixed( H) ) ( G) where fixed(h) is the set of fixed nodes in H Figure 1 suggestively illustrates the diameters ( G) and ( fixed( H) ) The relative diameter of a schema is a real number in the unit interval [0, 1] Note that in the original Holland scenario, when L bits are arranged in a linear

3 sequence, the relative diameter of a schema H is δ( H) ( L 1) (G) Figure 1: Diameters of Graph G and fixed(h) within it When graph G is cut in two (in some natural but as yet unspecified way), it is possible for two fixed positions in a schema to become separated (wind up in separate subsets of G-nodes); term this a disruption of the schema Define the disruption probability, dp(h), of a schema H to be the probability that a random cut (whatever that may mean) disrupts the schema Note that in the original Holland scenario, when L bits are arranged in a linear sequence, the disruption probability of a schema H is δ( H) ( L 1), so, is equal to the relative diameter rel ( H) Now we prove the non-linear analogue of Holland s Schema Theorem Let schema H be given Suppose there are m representatives of H in the population P(t) As before, µ ( H) denotes the average fitness of the H- representatives in P(t), and µ ( P) the average fitness of the entire population P(t) Again, when creating a generational change, we select individuals for parenting by a weighted roulette wheel Again the number of parents which are H-representatives has expected value m µ ( H) µ ( P) Two children are formed from two parents by making a random cut of G, then interchanging parental fragments A child of an H-representative continues to be an H-representative if the random cut did not disrupt H, which happens with probability 1 - dp(h) At this point we make the critical assumption: Assume Then the number of children which are H-representatives should be at least as big as 1 1 (fixed(h)) 0 m µ ( H), and after mutation, the µ ( P) ( 1 rel ( H )) number of children which are H-representatives should be at least as big as (E3) m µ ( H) µ ( P) ( 1 rel ( H )) ( 1 p m ) order( H) Of course, this is the analogue to the expression (E1) in section 2 above Non-Linear Schema Theorem: Let H be a schema Assume (i) the mutation rate p m is low, (ii) H has low order, (iii) H has low relative diameter rel ( H), and (iv) If representatives of H have somewhat above average fitness within the population P(t), then the number of representatives of H can be expected to increase in the next generation P(t+1) Of course, for this theorem it would be enough to replace (iii) and (iv) with (iii ) dp(h) is low But we anticipate that, in general, dp(h) is considerably harder to calculate than rel ( H) With regard to proving theorems about the burgeoning of building blocks, possible relations between rel ( H) and dp(h) are: 1 for all schema H; 2 for all schema H which have low relative diameter rel ( H) ; 3 for the majority of schema H which have low relative diameter 4 BITS ARRANGED IN A TREE Now we investigate connecting bits as the nodes in a tree In particular, consider the full binary tree T on 63 nodes In this tree, levels 0 through 5 are full; level 5 contains 32 leaf nodes The edges in this graph are those that connect children to parents Only the root node has no parent, so there are 62 edges altogether For our notion of distance, we will use path length between two nodes (we mean the shortest path that goes through their nearest common ancestor) The longest path in this tree has length 10; this is the length of path that connects any leaf in the root s right subtree to any leaf in the root s left subtree Manufacturing a schema H amounts to making some of the 63 nodes have fixed values of 0 or 1, and assigning an asterisk to the other nodes We can write a function that returns the length of the longest path between two fixed nodes in a schema Thus rel ( H) is easily calculated directly Given two parent bit trees, we perform mating with crossover in the style of (Koza, 1992): at random select

4 Table 1: Tree Experiments Schema Number dp(h) dp / rel # times order trials least most avg least most avg dp> rel K K K Column 2 is the number of random schema generated for the given order Columns 3-5 and 6-8 give values of the disruption probability dp(h) and of the ratio dp(h)/ rel ( H )) Column 9 shows that dp(h) never exceeded rel ( H )) in these experiments one of the 62 edges in the tree and interchange the parental subtrees beneath that edge Now we address the calculation of dp(h), the probability that a random cut will disrupt a schema H Of course, here by disrupt we mean that at least one fixed node of H is in the subtree underneath the edge being cut, and at least one fixed node of H lies in the rest of the tree Since there are only 62 edges which can be cut, we write a function that explicitly calculates dp(h) by examining the consequences of the 62 possible cuts It becomes too expensive to explicitly determine by exhaustion whether for all schemas H (nor has a mathematical proof occurred to us) For example, the number of schemas of order 4 is a multiple of C(63, 4), where C(63, 4) = the number of ways of combining 63 items 4 at a time = 595,665 Instead we make the following statistical analysis For some orders 2, 3, 4, etc, we manufacture many random schema, and for those schema we explicitly calculate and compare rel ( H) and dp(h) The results are given in Table 1 As the last column in Table 1 shows, in our experiments, dp(h) never exceeded rel ( H) Moreover, as columns 6-8 show, for low order schemas, dp(h) was on average quite a bit smaller than rel ( H) We thank our colleague Dr Terry Watkins of the University of New Orleans Mathematics Department for the statistical assertions of this paragraph Suppose we test the hypothesis that by taking n = 1000 randomly chosen schema H and for those seeing if the hypothesis holds These are Bernoulli trials where we will assume the probability of success (that is, ) is constant If every one of the 1000 trials results in success, then with confidence level we can assert that the proportion of schemas for which the hypothesis holds must be at least Thus, the experiments reported in Table 1 suggest with extraordinary statistical persuasion that for all the schema in our setting 5 BITS ARRANGED IN A CUBE Now we consider bits arranged as the lattice points within a parallelepiped in 3-dimensional Euclidean space R 3 A lattice point is one whose coordinates are integers Our parallelepiped will actually be a cube We will consider several cubes The first cube has 8 points on each edge and consists of the vectors (l, m, n) in R 3 such that each of the coordinates l, m, n is an integer in the range 07 Such vectors are the nodes in our graph G We intend there to be an edge between two vectors ( = nodes) if the two vectors agree in two of their coordinates and differ by 1 in the remaining coordinate But, in truth, since the notion of distance we will use will be Euclidean distance, the edges are less significant than the physical proximity of nodes to one another The other two cubes are similar but with, respectively, 10 and 12 points on each edge The number of bits in our three cubes are, respectively, 8 3 = 512, 10 3 = 1000, and 12 3 = 1728 We will cut such a cube with a hyperplane In 3-dimensional Euclidean space R 3, a hyperplane is determined by four real numbers a, b, c, d, and consists of those vectors (x, y, z) such that a x + b y + c z = d The half-spaces into which the hyperplane cuts the world consists of those (x, y, z) which satisfy

5 Table 2: Cube Experiments Edge Ball Avg Avg % Avg ratio dp/ rel size size order rel rel dp all 1000 when dp> rel For all three edge sizes, 1000 random schema were manufactured Column 2 is the maximum number of points in a ball with rel = 025 Column 5 is the percentage of the 1000 schema which satisfy Column 6 has the average ratio of dp(h) to rel ( H), taken over all 1000 schema, and column 7 has that average ratio over just the minority of schema for which dp(h) > rel ( H) a x + b y + c z > d (which we term the sky side) versus those which satisfy a x + b y + c z d (the sod side) We manufacture a random hyperplane by choosing four random real numbers a, b, c, d out of the real interval [-1, 1] A random hyperplane may not pass through the cube at all; for instance, the entire cube might lie in the sky side If a hyperplane does pass through the cube, it disrupts a schema if at least one fixed position in the schema is on the sky side and at least one schema fixed position is on the sod side Crossover between two parents consists of intersecting each parent with the same random hyperplane, then interchanging parental sky sides (or sod sides, of course) to form two children Our distance function will be Euclidean distance, the square root of the sum of the squares of the differences of corresponding vector coordinates For speaking of a Schema Theorem, we really want to make an assertion about schemas of low relative diameter So, for our experiments, we manufactured numerous trial schemas of low relative diameter, done as follows The relative diameter we here report on is rel = 025 (behavior for this value is typical) The schema s fixed positions are all chosen to lie within a geometric ball, of this relative diameter, about some center lattice point Geometry necessitates that there is some maximum number (see column 2 of Table 2) of lattice points that can fit in this ball (As can be deduced from Table 2, this maximum number amounts to just several percentage points of the entire cube) To manufacture random schemas, we choose a center point at random, then, with chance upon each addition, add each possible cube point that fits in this ball as another fixed position in the schema we are manufacturing In Table 2, the average order (column 3) of the schemas we manufacture is a bit less that half the maximum number (column 2) because balls centered near a cube face do not receive points outside the cube Having manufactured a random schema H of low relative diameter as just described, we estimate the probability dp(h) as follows We choose 1000 hyperplanes at random, then use the ratio of hyperplanes that disrupt the schema, to hyperplanes that pass through the cube, as our approximation of dp(h) (We would not be very interested at crossover time in hyperplanes that do not pass through the cube) Column 5 gives the percentage of the 1000 trial schema which satisfy This proportion ranges between 735% and 640% In the ranking of desirable relationships between rel ( H) and dp(h) which is given at the end of section 3, the relationship shown in Table 2 is the third: for a majority of schema of low relative diameter But the picture is actually a little rosier than that Column 6 of Table 2 shows the average ratio dp( H) rel ( H) over all 1000 schema This evidence says values rel ( H) and dp(h) are on average rather close to one another Column 7 looks at the average ratio dp( H) rel ( H) over just that minority of schemas for which dp( H) > rel ( H) This evidence says that when dp(h) exceeds rel ( H), it does not do so excessively These experiments indicate that arranging bits at lattice points in boxes, which for purposes of crossover are cut by hyperplanes, can entail the desired behavior that building blocks have their representation increased If a schema s representatives in the population have somewhat above average fitnesses, then their numbers should increase in the next generation provided the schema has low disruption probability dp(h) Quantity dp(h) is hard to calculate, but is usually less than relative diameter rel ( H) and on average is rather close to the value rel ( H) 6 CONCLUSIONS We have generalized Holland s Schema Theorem, to the setting that bits are arranged, not necessarily in a linear sequence, but as the nodes of a connected graph G To do

6 so we posited the existence of a distance function, and our theorem incorporates the premise that We gave experimental results showing that Holland-like behavior, meaning the flourishing of building blocks, can reasonably be expected for two distinct graphs G, one being a tree and the other being a cube of lattice points in Euclidean 3-space We have given both theory and experimental evidence that bits (genes) need not be limited to alignment in a sequence Ultimately genes should be arranged in ways that are natural to the problem at hand References Goldberg, David (1989) Genetic Algorithms in Search, Optimization, and Machine Learning Reading, MA: Addison-Wesley Publishing Holland, John (1975) Adaptation in Natural and Artificial Systems Ann Arbor, MI: University of Michigan Press Koza, John (1992) Genetic Programming: On the Programming of Computers by Means of Natural Selection Cambridge, MA: The MIT Press

### Full and Complete Binary Trees

Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

### Holland s GA Schema Theorem

Holland s GA Schema Theorem v Objective provide a formal model for the effectiveness of the GA search process. v In the following we will first approach the problem through the framework formalized by

### Optimizing CPU Scheduling Problem using Genetic Algorithms

Optimizing CPU Scheduling Problem using Genetic Algorithms Anu Taneja Amit Kumar Computer Science Department Hindu College of Engineering, Sonepat (MDU) anutaneja16@gmail.com amitkumar.cs08@pec.edu.in

### Lab 4: 26 th March 2012. Exercise 1: Evolutionary algorithms

Lab 4: 26 th March 2012 Exercise 1: Evolutionary algorithms 1. Found a problem where EAs would certainly perform very poorly compared to alternative approaches. Explain why. Suppose that we want to find

### Lecture 7: Approximation via Randomized Rounding

Lecture 7: Approximation via Randomized Rounding Often LPs return a fractional solution where the solution x, which is supposed to be in {0, } n, is in [0, ] n instead. There is a generic way of obtaining

### A very brief introduction to genetic algorithms

A very brief introduction to genetic algorithms Radoslav Harman Design of experiments seminar FACULTY OF MATHEMATICS, PHYSICS AND INFORMATICS COMENIUS UNIVERSITY IN BRATISLAVA 25.2.2013 Optimization problems:

### College of information technology Department of software

University of Babylon Undergraduate: third class College of information technology Department of software Subj.: Application of AI lecture notes/2011-2012 ***************************************************************************

### 6 Creating the Animation

6 Creating the Animation Now that the animation can be represented, stored, and played back, all that is left to do is understand how it is created. This is where we will use genetic algorithms, and this

### The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1].

Probability Theory Probability Spaces and Events Consider a random experiment with several possible outcomes. For example, we might roll a pair of dice, flip a coin three times, or choose a random real

### Worksheet for Teaching Module Probability (Lesson 1)

Worksheet for Teaching Module Probability (Lesson 1) Topic: Basic Concepts and Definitions Equipment needed for each student 1 computer with internet connection Introduction In the regular lectures in

### We call this set an n-dimensional parallelogram (with one vertex 0). We also refer to the vectors x 1,..., x n as the edges of P.

Volumes of parallelograms 1 Chapter 8 Volumes of parallelograms In the present short chapter we are going to discuss the elementary geometrical objects which we call parallelograms. These are going to

### Uniform Multicommodity Flow through the Complete Graph with Random Edge-capacities

Combinatorics, Probability and Computing (2005) 00, 000 000. c 2005 Cambridge University Press DOI: 10.1017/S0000000000000000 Printed in the United Kingdom Uniform Multicommodity Flow through the Complete

### A Robust Method for Solving Transcendental Equations

www.ijcsi.org 413 A Robust Method for Solving Transcendental Equations Md. Golam Moazzam, Amita Chakraborty and Md. Al-Amin Bhuiyan Department of Computer Science and Engineering, Jahangirnagar University,

### Reading 13 : Finite State Automata and Regular Expressions

CS/Math 24: Introduction to Discrete Mathematics Fall 25 Reading 3 : Finite State Automata and Regular Expressions Instructors: Beck Hasti, Gautam Prakriya In this reading we study a mathematical model

### CSC 310: Information Theory

CSC 310: Information Theory University of Toronto, Fall 2011 Instructor: Radford M. Neal Week 2 What s Needed for a Theory of (Lossless) Data Compression? A context for the problem. What are we trying

### Genetic programming with regular expressions

Genetic programming with regular expressions Børge Svingen Chief Technology Officer, Open AdExchange bsvingen@openadex.com 2009-03-23 Pattern discovery Pattern discovery: Recognizing patterns that characterize

### COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Binary numbers The reason humans represent numbers using decimal (the ten digits from 0,1,... 9) is that we have ten fingers. There is no other reason than that. There is nothing special otherwise about

### PROBLEM SET 7: PIGEON HOLE PRINCIPLE

PROBLEM SET 7: PIGEON HOLE PRINCIPLE The pigeonhole principle is the following observation: Theorem. Suppose that > kn marbles are distributed over n jars, then one jar will contain at least k + marbles.

### Course Notes for Math 320: Fundamentals of Mathematics Chapter 3: Induction.

Course Notes for Math 320: Fundamentals of Mathematics Chapter 3: Induction. February 21, 2006 1 Proof by Induction Definition 1.1. A subset S of the natural numbers is said to be inductive if n S we have

### Metric Spaces. Chapter 7. 7.1. Metrics

Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some

### Evolutionary SAT Solver (ESS)

Ninth LACCEI Latin American and Caribbean Conference (LACCEI 2011), Engineering for a Smart Planet, Innovation, Information Technology and Computational Tools for Sustainable Development, August 3-5, 2011,

### Math 4310 Handout - Quotient Vector Spaces

Math 4310 Handout - Quotient Vector Spaces Dan Collins The textbook defines a subspace of a vector space in Chapter 4, but it avoids ever discussing the notion of a quotient space. This is understandable

### Regular Languages and Finite Automata

Regular Languages and Finite Automata 1 Introduction Hing Leung Department of Computer Science New Mexico State University Sep 16, 2010 In 1943, McCulloch and Pitts [4] published a pioneering work on a

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### WHERE DOES THE 10% CONDITION COME FROM?

1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

### x if x 0, x if x < 0.

Chapter 3 Sequences In this chapter, we discuss sequences. We say what it means for a sequence to converge, and define the limit of a convergent sequence. We begin with some preliminary results about the

### 3. Mathematical Induction

3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)

### Planar Tree Transformation: Results and Counterexample

Planar Tree Transformation: Results and Counterexample Selim G Akl, Kamrul Islam, and Henk Meijer School of Computing, Queen s University Kingston, Ontario, Canada K7L 3N6 Abstract We consider the problem

### Section 1.1. Introduction to R n

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

### Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### 6.080 / 6.089 Great Ideas in Theoretical Computer Science Spring 2008

MIT OpenCourseWare http://ocw.mit.edu 6.080 / 6.089 Great Ideas in Theoretical Computer Science Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

### Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

### Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80)

Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80) CS 30, Winter 2016 by Prasad Jayanti 1. (10 points) Here is the famous Monty Hall Puzzle. Suppose you are on a game show, and you

### Course 221: Analysis Academic year , First Semester

Course 221: Analysis Academic year 2007-08, First Semester David R. Wilkins Copyright c David R. Wilkins 1989 2007 Contents 1 Basic Theorems of Real Analysis 1 1.1 The Least Upper Bound Principle................

### Prime Numbers and Irreducible Polynomials

Prime Numbers and Irreducible Polynomials M. Ram Murty The similarity between prime numbers and irreducible polynomials has been a dominant theme in the development of number theory and algebraic geometry.

### INTRODUCTION TO PROOFS: HOMEWORK SOLUTIONS

INTRODUCTION TO PROOFS: HOMEWORK SOLUTIONS STEVEN HEILMAN Contents 1. Homework 1 1 2. Homework 2 6 3. Homework 3 10 4. Homework 4 16 5. Homework 5 19 6. Homework 6 21 7. Homework 7 25 8. Homework 8 28

### INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

### 14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

Overview Kyrre Glette kyrrehg@ifi INF3490 Swarm Intelligence Particle Swarm Optimization Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) 3 Swarms in nature Fish, birds,

### Introduction To Genetic Algorithms

1 Introduction To Genetic Algorithms Dr. Rajib Kumar Bhattacharjya Department of Civil Engineering IIT Guwahati Email: rkbc@iitg.ernet.in References 2 D. E. Goldberg, Genetic Algorithm In Search, Optimization

### 1.3 Induction and Other Proof Techniques

4CHAPTER 1. INTRODUCTORY MATERIAL: SETS, FUNCTIONS AND MATHEMATICAL INDU 1.3 Induction and Other Proof Techniques The purpose of this section is to study the proof technique known as mathematical induction.

### NP-Completeness and Cook s Theorem

NP-Completeness and Cook s Theorem Lecture notes for COM3412 Logic and Computation 15th January 2002 1 NP decision problems The decision problem D L for a formal language L Σ is the computational task:

### 2 The Euclidean algorithm

2 The Euclidean algorithm Do you understand the number 5? 6? 7? At some point our level of comfort with individual numbers goes down as the numbers get large For some it may be at 43, for others, 4 In

### MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

MATH 433/533, Fourier Analysis Section 11, The Discrete Fourier Transform Now, instead of considering functions defined on a continuous domain, like the interval [, 1) or the whole real line R, we wish

### Notes on Complexity Theory Last updated: August, 2011. Lecture 1

Notes on Complexity Theory Last updated: August, 2011 Jonathan Katz Lecture 1 1 Turing Machines I assume that most students have encountered Turing machines before. (Students who have not may want to look

### Principle of (Weak) Mathematical Induction. P(1) ( n 1)(P(n) P(n + 1)) ( n 1)(P(n))

Outline We will cover (over the next few weeks) Mathematical Induction (or Weak Induction) Strong (Mathematical) Induction Constructive Induction Structural Induction Principle of (Weak) Mathematical Induction

### SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION

CHAPTER 5 SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION Alessandro Artale UniBZ - http://www.inf.unibz.it/ artale/ SECTION 5.2 Mathematical Induction I Copyright Cengage Learning. All rights reserved.

### CHAPTER 3 Numbers and Numeral Systems

CHAPTER 3 Numbers and Numeral Systems Numbers play an important role in almost all areas of mathematics, not least in calculus. Virtually all calculus books contain a thorough description of the natural,

### Alpha Cut based Novel Selection for Genetic Algorithm

Alpha Cut based Novel for Genetic Algorithm Rakesh Kumar Professor Girdhar Gopal Research Scholar Rajesh Kumar Assistant Professor ABSTRACT Genetic algorithm (GA) has several genetic operators that can

### Solutions to Exercises 8

Discrete Mathematics Lent 2009 MA210 Solutions to Exercises 8 (1) Suppose that G is a graph in which every vertex has degree at least k, where k 1, and in which every cycle contains at least 4 vertices.

### diffy boxes (iterations of the ducci four number game) 1

diffy boxes (iterations of the ducci four number game) 1 Peter Trapa September 27, 2006 Begin at the beginning and go on till you come to the end: then stop. Lewis Carroll Consider the following game.

### ISSN: 2319-5967 ISO 9001:2008 Certified International Journal of Engineering Science and Innovative Technology (IJESIT) Volume 2, Issue 3, May 2013

Transistor Level Fault Finding in VLSI Circuits using Genetic Algorithm Lalit A. Patel, Sarman K. Hadia CSPIT, CHARUSAT, Changa., CSPIT, CHARUSAT, Changa Abstract This paper presents, genetic based algorithm

### Chapter Three. Functions. In this section, we study what is undoubtedly the most fundamental type of relation used in mathematics.

Chapter Three Functions 3.1 INTRODUCTION In this section, we study what is undoubtedly the most fundamental type of relation used in mathematics. Definition 3.1: Given sets X and Y, a function from X to

### 1 The Brownian bridge construction

The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

### Asexual Versus Sexual Reproduction in Genetic Algorithms 1

Asexual Versus Sexual Reproduction in Genetic Algorithms Wendy Ann Deslauriers (wendyd@alumni.princeton.edu) Institute of Cognitive Science,Room 22, Dunton Tower Carleton University, 25 Colonel By Drive

### Fast Sequential Summation Algorithms Using Augmented Data Structures

Fast Sequential Summation Algorithms Using Augmented Data Structures Vadim Stadnik vadim.stadnik@gmail.com Abstract This paper provides an introduction to the design of augmented data structures that offer

### Classification/Decision Trees (II)

Classification/Decision Trees (II) Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Right Sized Trees Let the expected misclassification rate of a tree T be R (T ).

### WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT? introduction Many students seem to have trouble with the notion of a mathematical proof. People that come to a course like Math 216, who certainly

### Comparison of Optimization Techniques in Large Scale Transportation Problems

Journal of Undergraduate Research at Minnesota State University, Mankato Volume 4 Article 10 2004 Comparison of Optimization Techniques in Large Scale Transportation Problems Tapojit Kumar Minnesota State

### Genetic Algorithms and Sudoku

Genetic Algorithms and Sudoku Dr. John M. Weiss Department of Mathematics and Computer Science South Dakota School of Mines and Technology (SDSM&T) Rapid City, SD 57701-3995 john.weiss@sdsmt.edu MICS 2009

### Genetic Algorithms commonly used selection, replacement, and variation operators Fernando Lobo University of Algarve

Genetic Algorithms commonly used selection, replacement, and variation operators Fernando Lobo University of Algarve Outline Selection methods Replacement methods Variation operators Selection Methods

### 5.4. HUFFMAN CODES 71

5.4. HUFFMAN CODES 7 Corollary 28 Consider a coding from a length n vector of source symbols, x = (x... x n ), to a binary codeword of length l(x). Then the average codeword length per source symbol for

### that simple hill-climbing schemes would perform poorly because a large number of bit positions must be optimized simultaneously in order to move from

B.2.7.5: Fitness Landscapes: Royal Road Functions Melanie Mitchell Santa Fe Institute 1399 Hyde Park Road Santa Fe, NM 87501 mm@santafe.edu Stephanie Forrest Dept. of Computer Science University of New

### Theorem (The division theorem) Suppose that a and b are integers with b > 0. There exist unique integers q and r so that. a = bq + r and 0 r < b.

Theorem (The division theorem) Suppose that a and b are integers with b > 0. There exist unique integers q and r so that a = bq + r and 0 r < b. We re dividing a by b: q is the quotient and r is the remainder,

### Introduction to Genetic Programming

Introduction to Genetic Programming Matthew Walker October 7, 2001 1 The Basic Idea Genetic Programming (GP) is a method to evolve computer programs. And the reason we would want to try this is because,

### Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras

Theory of Computation Prof. Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology, Madras Lecture No. # 31 Recursive Sets, Recursively Innumerable Sets, Encoding

### Lecture 3. Mathematical Induction

Lecture 3 Mathematical Induction Induction is a fundamental reasoning process in which general conclusion is based on particular cases It contrasts with deduction, the reasoning process in which conclusion

### POWER SETS AND RELATIONS

POWER SETS AND RELATIONS L. MARIZZA A. BAILEY 1. The Power Set Now that we have defined sets as best we can, we can consider a sets of sets. If we were to assume nothing, except the existence of the empty

### Genetic Algorithm for Solving Simple Mathematical Equality Problem

Genetic Algorithm for Solving Simple Mathematical Equality Problem Denny Hermawanto Indonesian Institute of Sciences (LIPI), INDONESIA Mail: denny.hermawanto@gmail.com Abstract This paper explains genetic

### Sequences and Convergence in Metric Spaces

Sequences and Convergence in Metric Spaces Definition: A sequence in a set X (a sequence of elements of X) is a function s : N X. We usually denote s(n) by s n, called the n-th term of s, and write {s

### Genetic Algorithm an Approach to Solve Global Optimization Problems

Genetic Algorithm an Approach to Solve Global Optimization Problems PRATIBHA BAJPAI Amity Institute of Information Technology, Amity University, Lucknow, Uttar Pradesh, India, pratibha_bajpai@rediffmail.com

### THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE. Alexander Barvinok

THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE Alexer Barvinok Papers are available at http://www.math.lsa.umich.edu/ barvinok/papers.html This is a joint work with J.A. Hartigan

### Summary of week 8 (Lectures 22, 23 and 24)

WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry

### Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi

Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi Lecture - 15 Limited Resource Allocation Today we are going to be talking about

### Binomial lattice model for stock prices

Copyright c 2007 by Karl Sigman Binomial lattice model for stock prices Here we model the price of a stock in discrete time by a Markov chain of the recursive form S n+ S n Y n+, n 0, where the {Y i }

### Lecture Notes on Spanning Trees

Lecture Notes on Spanning Trees 15-122: Principles of Imperative Computation Frank Pfenning Lecture 26 April 26, 2011 1 Introduction In this lecture we introduce graphs. Graphs provide a uniform model

### Applications of Methods of Proof

CHAPTER 4 Applications of Methods of Proof 1. Set Operations 1.1. Set Operations. The set-theoretic operations, intersection, union, and complementation, defined in Chapter 1.1 Introduction to Sets are

### Genetic Algorithm Evolution of Cellular Automata Rules for Complex Binary Sequence Prediction

Brill Academic Publishers P.O. Box 9000, 2300 PA Leiden, The Netherlands Lecture Series on Computer and Computational Sciences Volume 1, 2005, pp. 1-6 Genetic Algorithm Evolution of Cellular Automata Rules

### 7 Hypothesis testing - one sample tests

7 Hypothesis testing - one sample tests 7.1 Introduction Definition 7.1 A hypothesis is a statement about a population parameter. Example A hypothesis might be that the mean age of students taking MAS113X

### Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Part 4: Geometric Distribution Negative Binomial Distribution Hypergeometric Distribution Sections 3-7, 3-8 The remaining discrete random

### Math212a1010 Lebesgue measure.

Math212a1010 Lebesgue measure. October 19, 2010 Today s lecture will be devoted to Lebesgue measure, a creation of Henri Lebesgue, in his thesis, one of the most famous theses in the history of mathematics.

### Analysis of Algorithms I: Binary Search Trees

Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary

### The Standard Normal distribution

The Standard Normal distribution 21.2 Introduction Mass-produced items should conform to a specification. Usually, a mean is aimed for but due to random errors in the production process we set a tolerance

### Lecture 14: Fourier Transform

Machine Learning: Foundations Fall Semester, 2010/2011 Lecture 14: Fourier Transform Lecturer: Yishay Mansour Scribes: Avi Goren & Oron Navon 1 14.1 Introduction In this lecture, we examine the Fourier

### Sets and Cardinality Notes for C. F. Miller

Sets and Cardinality Notes for 620-111 C. F. Miller Semester 1, 2000 Abstract These lecture notes were compiled in the Department of Mathematics and Statistics in the University of Melbourne for the use

### 6.3 Conditional Probability and Independence

222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

### Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

### If f is a 1-1 correspondence between A and B then it has an inverse, and f 1 isa 1-1 correspondence between B and A.

Chapter 5 Cardinality of sets 51 1-1 Correspondences A 1-1 correspondence between sets A and B is another name for a function f : A B that is 1-1 and onto If f is a 1-1 correspondence between A and B,

### A Partition-Based Efficient Algorithm for Large Scale. Multiple-Strings Matching

A Partition-Based Efficient Algorithm for Large Scale Multiple-Strings Matching Ping Liu Jianlong Tan, Yanbing Liu Software Division, Institute of Computing Technology, Chinese Academy of Sciences, Beijing,

### A HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA: A FIRST REPORT

New Mathematics and Natural Computation Vol. 1, No. 2 (2005) 295 303 c World Scientific Publishing Company A HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA:

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### Fixed Point Theorems

Fixed Point Theorems Definition: Let X be a set and let T : X X be a function that maps X into itself. (Such a function is often called an operator, a transformation, or a transform on X, and the notation

### Lecture 7: NP-Complete Problems

IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Basic Course on Computational Complexity Lecture 7: NP-Complete Problems David Mix Barrington and Alexis Maciel July 25, 2000 1. Circuit

### 2.3 Convex Constrained Optimization Problems

42 CHAPTER 2. FUNDAMENTAL CONCEPTS IN CONVEX OPTIMIZATION Theorem 15 Let f : R n R and h : R R. Consider g(x) = h(f(x)) for all x R n. The function g is convex if either of the following two conditions

### 1 Limiting distribution for a Markov chain

Copyright c 2009 by Karl Sigman Limiting distribution for a Markov chain In these Lecture Notes, we shall study the limiting behavior of Markov chains as time n In particular, under suitable easy-to-check

### Lecture 3: Linear Programming Relaxations and Rounding

Lecture 3: Linear Programming Relaxations and Rounding 1 Approximation Algorithms and Linear Relaxations For the time being, suppose we have a minimization problem. Many times, the problem at hand can

### A Genetic Algorithm Processor Based on Redundant Binary Numbers (GAPBRBN)

ISSN: 2278 1323 All Rights Reserved 2014 IJARCET 3910 A Genetic Algorithm Processor Based on Redundant Binary Numbers (GAPBRBN) Miss: KIRTI JOSHI Abstract A Genetic Algorithm (GA) is an intelligent search