Holland s GA Schema Theorem



Similar documents
A Non-Linear Schema Theorem for Genetic Algorithms

The Influence of Binary Representations of Integers on the Performance of Selectorecombinative Genetic Algorithms

Introduction To Genetic Algorithms

Genetic Algorithms and Sudoku

Comparison of Major Domination Schemes for Diploid Binary Genetic Algorithms in Dynamic Environments

A Robust Method for Solving Transcendental Equations

Numerical Research on Distributed Genetic Algorithm with Redundant

Genetic Algorithms commonly used selection, replacement, and variation operators Fernando Lobo University of Algarve

A Binary Model on the Basis of Imperialist Competitive Algorithm in Order to Solve the Problem of Knapsack 1-0

Alpha Cut based Novel Selection for Genetic Algorithm

Row Echelon Form and Reduced Row Echelon Form

Mathematical Induction

Genetic Algorithm Based Interconnection Network Topology Optimization Analysis

Lab 4: 26 th March Exercise 1: Evolutionary algorithms

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

A Brief Study of the Nurse Scheduling Problem (NSP)

A Comparison of Genotype Representations to Acquire Stock Trading Strategy Using Genetic Algorithms

CHAPTER 6 GENETIC ALGORITHM OPTIMIZED FUZZY CONTROLLED MOBILE ROBOT

Evolutionary Turing in the Context of Evolutionary Machines

ISSN: ISO 9001:2008 Certified International Journal of Engineering Science and Innovative Technology (IJESIT) Volume 2, Issue 3, May 2013

Genetic Algorithm Performance with Different Selection Strategies in Solving TSP

A Parallel Processor for Distributed Genetic Algorithm with Redundant Binary Number

Evolutionary Detection of Rules for Text Categorization. Application to Spam Filtering

CHAPTER 3 SECURITY CONSTRAINED OPTIMAL SHORT-TERM HYDROTHERMAL SCHEDULING

CS 3719 (Theory of Computation and Algorithms) Lecture 4

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

A Fast Computational Genetic Algorithm for Economic Load Dispatch

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Genetic Algorithm. Based on Darwinian Paradigm. Intrinsically a robust search and optimization mechanism. Conceptual Algorithm

Online Appendix to Stochastic Imitative Game Dynamics with Committed Agents

Programming Risk Assessment Models for Online Security Evaluation Systems

Symbol Tables. Introduction

Genetic Algorithm. Tom V. Mathew Assistant Professor, Department of Civil Engineering, Indian Institute of Technology Bombay, Mumbai

Introduction to Schema Theory

Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects

Regular Languages and Finite Automata

Continued Fractions and the Euclidean Algorithm

Multiobjective Multicast Routing Algorithm

Notes on Complexity Theory Last updated: August, Lecture 1

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

STUDY ON APPLICATION OF GENETIC ALGORITHM IN CONSTRUCTION RESOURCE LEVELLING

Genetic Algorithm Approach for Risk Reduction of Information Security

The Binary Genetic Algorithm

HYBRID GENETIC ALGORITHM PARAMETER EFFECTS FOR OPTIMIZATION OF CONSTRUCTION RESOURCE ALLOCATION PROBLEM. Jin-Lee KIM 1, M. ASCE

1 Error in Euler s Method

On line construction of suffix trees 1

DEALING WITH THE DATA An important assumption underlying statistical quality control is that their interpretation is based on normal distribution of t

Original Article Efficient Genetic Algorithm on Linear Programming Problem for Fittest Chromosomes

HYBRID GENETIC ALGORITHMS FOR SCHEDULING ADVERTISEMENTS ON A WEB PAGE

Implementation of Recursively Enumerable Languages using Universal Turing Machine in JFLAP

Turing Machines: An Introduction

Optimization in Strategy Games: Using Genetic Algorithms to Optimize City Development in FreeCiv

Design of Web Ranking Module using Genetic Algorithm

A Study of Crossover Operators for Genetic Algorithm and Proposal of a New Crossover Operator to Solve Open Shop Scheduling Problem

Solving Banana (Rosenbrock) Function Based on Fitness Function

Volume 3, Issue 2, February 2015 International Journal of Advance Research in Computer Science and Management Studies

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

Automata Theory. Şubat 2006 Tuğrul Yılmaz Ankara Üniversitesi

GRID SEARCHING Novel way of Searching 2D Array

Improved Multiprocessor Task Scheduling Using Genetic Algorithms

Background knowledge-enrichment for bottom clauses improving.

Biology Notes for exam 5 - Population genetics Ch 13, 14, 15

A Multi-Objective Performance Evaluation in Grid Task Scheduling using Evolutionary Algorithms

Genetic programming with regular expressions

A Genetic-Fuzzy Logic Based Load Balancing Algorithm in Heterogeneous Distributed Systems

Application of Genetic Algorithm to Scheduling of Tour Guides for Tourism and Leisure Industry

Improving the Performance of a Computer-Controlled Player in a Maze Chase Game using Evolutionary Programming on a Finite-State Machine

New binary representation in Genetic Algorithms for solving TSP by mapping permutations to a list of ordered numbers

Hidden Markov Models in Bioinformatics. By Máthé Zoltán Kőrösi Zoltán 2006

Optimization of Preventive Maintenance Scheduling in Processing Plants

Evolutionary SAT Solver (ESS)

International Journal of Information Technology, Modeling and Computing (IJITMC) Vol.1, No.3,August 2013

Management Science Letters

THE BANACH CONTRACTION PRINCIPLE. Contents

Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

USING GENETIC ALGORITHM IN NETWORK SECURITY

PROCESS OF LOAD BALANCING IN CLOUD COMPUTING USING GENETIC ALGORITHM

Architectural Design for Space Layout by Genetic Algorithms

GARY J. KOEHLER. Biography. The Foundation for The Gator Nation

Inventory Optimization in Efficient Supply Chain Management

A Hybrid Tabu Search Method for Assembly Line Balancing

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

The Dynamics of a Genetic Algorithm on a Model Hard Optimization Problem

CLOUD DATABASE ROUTE SCHEDULING USING COMBANATION OF PARTICLE SWARM OPTIMIZATION AND GENETIC ALGORITHM

6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation

Mathematical Induction

Near Optimal Solutions

What is Linear Programming?

Universal Turing Machine: A Model for all Computational Problems

Computer Science 281 Binary and Hexadecimal Review

Automatic parameter regulation for a tracking system with an auto-critical function

Today s Topics. Primes & Greatest Common Divisors

Evaluation of Crossover Operator Performance in Genetic Algorithms With Binary Representation

Section 1.3 P 1 = 1 2. = P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., =

A HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA: A FIRST REPORT

Lecture 1. Basic Concepts of Set Theory, Functions and Relations

Transcription:

Holland s GA Schema Theorem v Objective provide a formal model for the effectiveness of the GA search process. v In the following we will first approach the problem through the framework formalized by Holland [1] and popularized by Goldberg [2]. v This concentrates on providing a model for the expectation of schema survival, where this naturally represents a limitation in itself. v We will then move on to consider further drawbacks of the original Schema Theorem and some of the recent attempts to provide more representative Schema Theorems. v Consider the case of a canonical GA, Binary alphabet; Fixed length individuals of equal length, l; Fitness Proportional Selection; Single Point Crossover; Gene wise mutation. v Definition 1 Schema, H. A schema is a subset of the space of all possible individuals for which all the genes match the template for schema H. v If A denotes the alphabet of gene alleles then A» * is the schema alphabet, where * is the wild card symbol matching any allele value. E.g. for the binary alphabet A Œ {0, 1, *} where * Œ {0, 1} Example v For a binary individual with the gene sequence {0 1 1 1 0 0 0}, then it follows that one (of many) matching schema might have the form, H = [* 1 1 * 0 * *] v The schema H = [0 1 * 1 *] identifies the chromosome set, 0 1 0 1 0 0 1 0 1 1 0 1 1 1 0 0 1 1 1 1 v Needless to say, not all schema are created equal, thus, 1 * * * * * * does not tell us as much as 0 1 * * 1 1 0. 1

v Moreover, schema 1 * * * * * 0 spans the entire length of an individual whereas 1 * 1 * * * * does not. v Definition 2 Schema Order, o(h). Schema order, o( ), is the number of non * genes in schema H. Example, o(* * * 0 * * *) = 1 v Definition 3 Schema Defining Length, d(h). Schema Defining Length, d(h), is the distance between first and last non * gene in schema H. Example, d(* * * 0 * * *) = 4 4 = 0 v Note For cardinality, k, there are (k + 1) l schema in string of length l. We are now in a position to incrementally introduce the effect of the various selection and search operators associated with the above definition for the special case of a canonical GA with binary genes. Selection Operators Fitness Proportional Selection v Essentially all that we are attempting to model is the probability that individual, h, samples schema, H, or P(h Œ H) v There are several ways of modeling this. Consider the following two part model, Probability of selection µ number of instances of schema H in the population; Probability of selection µ average fitness of schema H relative to the average fitness of all individuals in the population. Thus, (Number of individuals matching (Mean fitness of individuals matching P(h Œ H) = schema H at generation t) schema H) (Mean fitness of individuals in the (Population Size) population) 2

or, P(h Œ H) = m( H, t) f ( H, t) Mf ( t) where m(h, t) is the number of instances of schema H at generation t. v Lemma 1 Under fitness proportional selection the expected number of instances of schema H at time t is, v Implication, E[m(H, t + 1)] = M P(h Œ H) = m(h,t) f (H,t) f (t) Schemas with fitness greater (lower) than the average population fitness are likely to account for proportionally more (less) of the population at the next generation. Strictly speaking for accurate estimates of expectation and probability the population size should be infinite. Search Operators Single point crossover v Reproduction does nothing to improve the fitness of individuals in a population. (Single point) Crossover was the first of two search operators introduced to modify the distribution of schema in the population. In his original work, Holland concentrated on modeling the lower bound alone [1]. v Consider the following individual, h, two matching schema, H 1, H 2 and crossover point between 3 rd and 4 th gene, or, h = 1 0 1 1 1 0 0 H 1 = * 0 1 * * * 0 H 2 = * 0 1 * * * * v Observations, Schema H 1 will naturally be broken by the location of the crossover operator unless the second parent is able to repair the disrupted gene. Schema H 2 emerges unaffected and is therefore independent of the second parent. Schema with long defining length are more likely to be disrupted by single point crossover than schema using short defining lengths. v Lemma 2 Under single point crossover, the (lower bound) probability of schema H surviving at generation t is, P(H survives) = 1 P (H does not survive) 3

a priori selected threshold of applying crossover. v Comment, or, 1- d(h) l -1 P diff (H,t) Where P diff (H, t) is the probability that the second parent does not match schema H; and p c is the In the special case of the worst case lower bound P diff (H, t) = 1. Search Operators Gene wise Mutation v Mutation is applied gene by gene. v In order for schema H to survive, all non * genes in the schema much remain unchanged. v Probability of not changing a gene is, v Require that all o(h) non * genes survive, or (1 p m ) (1 p m ) o(h) v Typically the probability of applying the mutation operator, p m, << 1, thus (1 p m ) o(h) ª 1 o(h) p m v Lemma 3 Under gene wise mutation, the (lower bound) probability of an order o(h) schema H surviving at generation t is, v Theorem 1 The Schema Theorem [1, 2] 1 o(h) p m The expected number of schema H at generation t + 1 when using a canonical GA with proportional selection, single point crossover and gene wise mutation (where the latter are applied at rates p c and p m ) is, v Comments, E[m(H, t + 1)] m(h,t) f (H,t) f (t) Ï d(h) 1- p c 1- l p (H,t) - o(h)p Ì diff m Ó The theorem is described in terms of expectation, thus strictly speaking is only true for the case of a population with an infinite number of members. In the case of finite population sizes the significance of population drift plays an increasingly important role [3]. The above form is naturally specific to the selection and search operators under which it was derived. A more generic form for the Schema Theorem might take the form, E[m(H, t + 1)] m(h, t) a(h, t) {1 b(h, t)} 4

Where a(h, t) is the selection coefficient and b(h, t) is the transcription error [4]. This makes the various contributions more explicit. Specifically for schema H to survive then, or, f (H,t) f (t) a(h, t) {1 b(h, t)} Ï d(h) 1- p c 1- l p (H,t) - o(h)p Ì diff m Ó This is the basis for the observation that short (defining length), low order schema of above average population fitness will be favored by canonical GAs, or the Building Block Hypothesis [1, 2]. Problems v The Schema Theorem as defined by Holland represented a mile stone in the development of Genetic Algorithms in particular and latter in the development of corresponding theorems for Genetic Programming. However, it has also several significant shortcomings which have lead to more modern approaches to theorems for Genetic Algorithms. Consider the following, Only the worst-case scenario is considered. No positive effects of the search operators are considered. This has lead to the development of Exact Schema Theorems [5]; The theorem concentrates on the number of schema surviving not which schema survive. Such considerations have been addressed by the utilization of Markov chains to provide models of behavior associated with specific individuals in the population [6]. Claims of exponential increases in fit schema i.e., if the expectation operator of Theorem References 1 is ignored and the effects of crossover and mutation discounted, the following result was popularized by Goldberg [2], m(h, t + 1) (1 + c) m(h, t), where c is the constant by which fit schema are always fitter than the population average. Unfortunately, this is rather misleading as the average population fitness will tend to increase with t, thus population and fitness of remaining schema will tend to converge with increasing time. [1] J.H. Holland, (1998) Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to biology, control and artificial intelligence. MIT Press, ISBN 0-262-58111-6. (NB original printing 1975). 5

[2] D.E., Goldberg, (1989) Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, ISBN 0-201-15767-5. [3] Rodgers A., Prugel-Bennett A. (1999) Genetic Drift in GA Selection Schemes, IEEE Transactions on Evolutionary Computation. 3(4), pp 298-303. [4] C.R. Reeves, J.E. Rowe (2003) Genetic Algorithms Principles and Perspectives. Kluwer Academic Pub. ISBN 1-4020-7240-6. [5] C. Stephens, H. Waelbroeck (1998) Effective Degrees of Freedom in Genetic Algorithms, Physical review: E, 57(3), pp 3251-3264. [6] Vose M.D., Nix A. (1992) Modeling Genetic Algorithms with Markov Chains, Annals of Mathematics and Artificial Intelligence, 5, pp 79-98. 6