Introduction to Schema Theory




Introduction to Schema Theory: a survey lecture of pessimistic and exact schema theory. William C. Liles (wliles@cs.gmu.edu) and R. Paul Wiegand (paul@tesseract.org), ECLab, George Mason University.

Outline of Discussion. Part I: Overview of Schema Theory. Part II: Pessimistic Schema Theory. Part III: Exact Schema Theory. Part IV: Conclusions.

Overview of Schema Theory: What is a Theory? A theory is a set of analytical tools to help answer questions, together with a particular domain in which to ask those questions. What do theories do? They predict and they explain.

Overview of Schema Theory: What are Schemata? Schemata can be viewed in many ways: as templates specifying groups (sets) of similar chromosomes, as partitions of genome space, as descriptions of hyperplanes through genome space, or as sets of search points sharing some syntactic feature. Example for a binary representation, where * means "don't care": the schema 1**0 has members 1000, 1010, 1100, and 1110.
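
The set view is easy to check mechanically. Here is a minimal Python sketch (not part of the original slides; the function names are illustrative) that expands a binary schema such as 1**0 into its member strings and tests membership:

```python
from itertools import product

def schema_members(schema):
    """Enumerate all binary strings matched by a schema over {0, 1, *}."""
    # Each '*' position ranges over {'0', '1'}; fixed positions stay as given.
    choices = ['01' if c == '*' else c for c in schema]
    return [''.join(bits) for bits in product(*choices)]

def matches(schema, string):
    """A string is in the schema iff it agrees on every fixed position."""
    return len(schema) == len(string) and all(
        s == '*' or s == v for s, v in zip(schema, string))

print(schema_members('1**0'))   # ['1000', '1010', '1100', '1110']
print(matches('1**0', '1010'))  # True
```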

Overview of Schema Theory: What is Schema Theory? Schema theory describes how schemata are expected to propagate from one generation to the next. More specifically, it divides the space into subspaces, quantifies these subspaces, and explains how and why individuals move between subspaces.

Overview of Schema Theory: What are Schema Theorems? Schema theorems are specific analytical models that usually reflect particular representational choices. Some provide only a lower bound on schema growth; others provide tight bounds on schema growth.

Overview of Schema Theory: The Traditional Schema Theorem. A pessimistic prediction of schema growth in a GA; it is NOT the same as the Building Block Hypothesis. From Michalewicz, 1992: SCHEMA THEOREM: Short, low-order, above-average schemata receive exponentially increasing trials in subsequent generations of a genetic algorithm (a prediction). BUILDING BLOCK HYPOTHESIS: A GA seeks near-optimal performance through the juxtaposition of short, low-order, high-performance schemata, called the building blocks (an explanation).

Overview of Schema Theory: Advantages of Schema Theory. It appeals to our intuition of how problems may be solved (i.e., building blocks). It can be formulated in a concise way, such that specific instantiations for particular EAs can be plugged in to the main theorems. With exact models, we can get correct predictions of expectation, as well as bounds on the probability of those expectations.

Overview of Schema Theory: Criticism of Schema Theory. Results and conclusions are of limited use. Why? Traditionally it produces a lower-bound expectation for schema growth; it is not easy to predict global behavior without a recursive solution; and it operates under the assumption that knowing what schemata are doing helps us understand how and why the GA is working (it doesn't always). Applicability is low. Why? It takes a large effort to produce very specific models, and most EAs modeled under schema theory are simple EAs not used in practice.

Outline of Discussion. Part I: Overview of Schema Theory. Part II: Pessimistic Schema Theory. Part III: Exact Schema Theory. Part IV: Conclusions.

Pessimistic Schema Theory: Traditional Schema Theory. It attempts to give insight about how GAs work, describing how the expected number of schema members will at least grow in the next generation. It generally assumes a binary, fixed-length representation; bit-flip mutation and 1-point crossover; and proportional selection. Informally: E[# strings in schema at generation t+1] ≥ (# strings in schema at t) × (relative value of the schema) × (probability the schema survives disruption).

Pessimistic Schema Theory: Notation. Individuals are fixed-length binary strings, $v \in \{0,1\}^l$. Schemata are fixed-length ternary strings, $s \in \{0,1,*\}^l$. A schema defines a set of individual strings: $v \in s$ iff for all $i \in \{1,\dots,l\}$, $v_i = s_i$ or $s_i = *$. The population at time $t$ is a set of binary strings, $P_t = \{v^{(1)}, v^{(2)}, \dots, v^{(n)}\}$, with $v^{(i)} \in \{0,1\}^l$. The match count $m(s,t)$ is the number of strings in $P_t$ that are also in schema $s$: $m(s,t) = |\{v \in P_t : v \in s\}|$.

Pessimistic Schema Theory: Notation (continued). The mean fitness of strings in schema $s$ that are also in $P_t$ is $f(s,t)$; the mean fitness of the population is $f(t)$; the probability of mutation is $p_m$; the probability of crossover is $p_c$. Then $E[m(s,t+1)] \geq m(s,t)\,\frac{f(s,t)}{f(t)}\,\Pr[\text{survive mutation}]\,\Pr[\text{survive crossover}]$.

Pessimistic Schema Theory: Measures of Schemata. The order of a schema is the number of fixed positions, $o(s) = |\{\, i \in \{1,\dots,l\} : s_i \neq * \,\}|$. The defining length is the longest distance between two fixed positions, $\delta(s) = \max(|i-j| : i,j \in \{1,\dots,l\},\; s_i \neq *,\; s_j \neq *)$. Example: for s = *0*11**1*, o(s) = 4 and δ(s) = 6.
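
These two measures are straightforward to compute for binary schemata. A minimal Python sketch (illustrative, not from the slides) that reproduces the example's values:

```python
def order(schema):
    """o(s): number of fixed (non-'*') positions."""
    return sum(1 for c in schema if c != '*')

def defining_length(schema):
    """delta(s): distance between the outermost fixed positions (0 if fewer than two)."""
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if len(fixed) >= 2 else 0

s = '*0*11**1*'
print(order(s), defining_length(s))  # 4 6
```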

Pessimistic Schema Theory: Surviving Mutation. The probability that no mutation occurs at a given position is $1 - p_m$; all mutations are independent, and we only care about the fixed positions. So the probability that the schema survives mutation is $(1-p_m)^{o(s)}$.

Pessimistic Schema Theory: Surviving Crossover. A crossover event that does not divide the defined positions in the schema is harmless. The probability that the crossover point falls so as to break the schema is $\frac{\delta(s)}{l-1}$, so the probability of surviving crossover is at least $1 - p_c\,\frac{\delta(s)}{l-1}$.

Pessimistic Schema Theory: The Traditional Schema Theorem.
$E[m(s,t+1)] \geq m(s,t)\,\frac{f(s,t)}{f(t)}\,(1-p_m)^{o(s)}\left(1 - p_c\,\frac{\delta(s)}{l-1}\right)$
But we can do better, since this ignores the possibility that the breeding partner is selected from the same schema:
$E[m(s,t+1)] \geq m(s,t)\,\frac{f(s,t)}{f(t)}\,(1-p_m)^{o(s)}\left(1 - p_c\,\frac{\delta(s)}{l-1}\left(1 - \frac{m(s,t)\,f(s,t)}{n\,f(t)}\right)\right)$
More generally, we can replace the selection method:
$E[m(s,t+1)] \geq n\,p(s,t)\,(1-p_m)^{o(s)}\left(1 - p_c\,\frac{\delta(s)}{l-1}\,(1 - p(s,t))\right)$
where $p(s,t)$ is the probability of selecting schema $s$ at generation $t$; for proportional selection, $p(s,t) = \frac{m(s,t)\,f(s,t)}{n\,f(t)}$.
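
To show how the pieces fit together, here is a hedged Python sketch that evaluates the first lower bound for a toy population; the fitness function (OneMax) and the parameter values are assumptions made for the example, not part of the slides:

```python
def schema_bound(pop, fitness, schema, p_m, p_c):
    """Lower bound on E[m(s, t+1)] from the traditional schema theorem
    (proportional selection, bit-flip mutation, 1-point crossover)."""
    l = len(schema)
    in_schema = lambda v: all(c == '*' or c == b for c, b in zip(schema, v))
    members = [v for v in pop if in_schema(v)]
    m = len(members)
    if m == 0:
        return 0.0
    f_pop = sum(fitness(v) for v in pop) / len(pop)         # mean population fitness f(t)
    f_s = sum(fitness(v) for v in members) / m              # mean schema fitness f(s,t)
    o = sum(1 for c in schema if c != '*')                  # order o(s)
    fixed = [i for i, c in enumerate(schema) if c != '*']
    delta = fixed[-1] - fixed[0] if len(fixed) >= 2 else 0  # defining length delta(s)
    surv_mut = (1 - p_m) ** o
    surv_xo = 1 - p_c * delta / (l - 1)
    return m * (f_s / f_pop) * surv_mut * surv_xo

# Toy run with OneMax fitness and hypothetical parameter values.
pop = ['1000', '1010', '0110', '1111']
print(schema_bound(pop, fitness=lambda v: v.count('1'),
                   schema='1**0', p_m=0.01, p_c=0.7))
```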

Pessimistic Schema Theory: What does this tell us? It helps explain how $m(s,t)$ is affected by GA operators and helps predict how $m(s,t)$ varies from one generation to the next, but it provides only a lower bound. We want an appropriate schema definition (granularity) and appropriate measures (meaning): the number of individuals in the schema, the average fitness of the schema, the average fitness of the population, the disruptiveness of the operators, etc.

Pessimistic Schema Theory: Genetic Programming Schemata. We can express schemata as a list of ordered pairs: 0*1**11 corresponds to [(0,1), (1,3), (11,6)]. Supposing $v \in \{a,b,c,d\}^l$, a*bc*ad corresponds to [(a,1), (bc,3), (ad,6)]. Traditional schema theory specifies the complete string, so position is important. But we could match partial strings... and we could match strings independent of position...

Pessimistic Schema Theory: GP Schemata (continued). The number of individuals matching $s$ in $P_t$ is $m(s,t)$; the number of substrings matching $s$ in $P_t$ is $\iota(s,t)$ (instantiations). Example population (four strings): abaac, accab, ccccb, abcab. Then m([ab],t) = 3 and ι([ab],t) = 4, while m([a,b],t) = 4 and ι([a,b],t) = 12.
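
For the single-fragment case, the counts are easy to reproduce. The sketch below assumes the slide's symbols group into the four length-5 strings listed above and handles only a single-fragment schema such as [ab]; the multi-fragment case depends on matching conventions not spelled out here:

```python
def substring_counts(pop, pattern):
    """m: number of strings containing the pattern at least once;
       iota: total number of occurrences (instantiations) across the population."""
    occurrences = []
    for v in pop:
        n = sum(1 for i in range(len(v) - len(pattern) + 1)
                if v[i:i + len(pattern)] == pattern)
        occurrences.append(n)
    m = sum(1 for n in occurrences if n > 0)
    iota = sum(occurrences)
    return m, iota

pop = ['abaac', 'accab', 'ccccb', 'abcab']   # assumed grouping of the slide's symbols
print(substring_counts(pop, 'ab'))           # (3, 4), matching the slide's m and iota
```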

Pessimistic Schema Theory: Rosca's Rooted Tree Schemata. Syntactically, every schema is a contiguous tree fragment, every schema includes the root node of the tree, and the # character (rather than *) is used as the don't-care symbol. Semantically, a schema defines a set of programs: s = (- # y) defines the set of programs whose root is - with y as the second argument and any subtree as the first; for example, v = (- (+ x z) y) is a member.

Pessimistic Schema Theory: Rosca's Schemata Notation/Measures. The size of a program $v$ matching schema $s$ is $N(v)$; the fitness of program $v$ is $f(v)$; the order $O(s)$ of a schema is the number of defining symbols it contains; the population at time $t$ is the multiset $P(t)$; the set of programs in $P(t)$ that are also members of schema $s$ is denoted $s \cap P(t)$; and the probability (per child) that mutation is applied is $p_m$.

Pessimistic Schema Theory: Rosca's Schema Theorem.
$E[m(s,t+1)] \geq m(s,t)\,\frac{f(s,t)}{f(t)}\left[\, 1 - (p_m + p_c) \sum_{v \in s \cap P(t)} \frac{O(s)}{N(v)} \cdot \frac{f(v)}{\sum_{v' \in s \cap P(t)} f(v')} \,\right]$
The $\frac{O(s)}{N(v)}$ term estimates the fragility of a schema instance, and the bracketed sum is the probability of disrupting $s$. The idea: divide the space into subspaces containing programs of different sizes and shapes, estimate the fragility of schema instances, and use a weighted sum together with the probability of selection to estimate the probability of disrupting the schema.

Pessimistic Schema Theory: Poli-Langdon Schemata. These schemata have fixed size and shape. The set of functions is denoted F and the set of terminals is denoted T; the = symbol is a don't-care for a single function or terminal. A schema s is a rooted tree composed of nodes from $F \cup T \cup \{=\}$. For example, with F = {+, -, *, /} and T = {1, ..., 9, x, y, z}, the slide shows two example schema trees containing = nodes.

Pessimistic Schema Theory: Poli-Langdon Schema Theory Notation/Measures. The order O(s) is the number of non-= symbols in the schema; the length N(s) is the total number of nodes in the schema; and the defining length L(s) is the number of links in the minimum tree fragment that includes all non-= symbols in the schema. For example, the schema (+ (- 2 x) =) has L(s) = 3 and O(s) = 4; the slide's two further examples have (L = 1, O = 2) and (L = 2, O = 2).
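
These measures can be computed directly from a tree representation. The Python sketch below (illustrative; it represents schemata as nested tuples and uses '=' for don't-care) computes O(s), N(s), and L(s), where L(s) comes from the minimal fragment spanning the defining nodes; it reproduces the first example above:

```python
def nodes(tree, path=()):
    """Yield (path, label) for every node; trees are labels or (label, child, ...)."""
    if isinstance(tree, tuple):
        yield path, tree[0]
        for k, child in enumerate(tree[1:]):
            yield from nodes(child, path + (k,))
    else:
        yield path, tree

def measures(schema):
    all_nodes = list(nodes(schema))
    N = len(all_nodes)                                       # length N(s)
    defining = [p for p, label in all_nodes if label != '=']
    O = len(defining)                                        # order O(s)
    if O <= 1:
        return O, N, 0
    # L(s): links in the minimal fragment containing all defining nodes.
    # That fragment is the union of the paths from their deepest common
    # ancestor (longest common path prefix) down to each defining node.
    depth = min(len(p) for p in defining)
    lca = 0
    while lca < depth and len({p[lca] for p in defining}) == 1:
        lca += 1
    spanned = {p[:k] for p in defining for k in range(lca, len(p) + 1)}
    return O, N, len(spanned) - 1

print(measures(('+', ('-', '2', 'x'), '=')))   # (4, 5, 3): O=4, N=5, L=3
```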

Pessimistic Schema Theory: Poli-Langdon Schema Theory Notation/Measures (continued). The zeroth-order schema with the same structure as schema s is G(s). The conditional probability that s is disrupted by crossover with a parent not in G(s) is $p_{\mathit{diff}}$. The probability of selecting schema s from the population at time t is p(s,t). (The slide illustrates a schema s alongside its zeroth-order counterpart G(s), which has the same shape with every node replaced by =.)

Pessimistic Schema Theory: GP One-Point Crossover. Align the two parents; identify the region that is structurally common to both; choose a crossover point from within that common region; then perform subtree crossover at that point. (The slide illustrates the steps on two example parse trees.)
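
A rough Python sketch of this operator, assuming programs are nested tuples of the form (op, child, ...); the function names and the common-region computation are an illustration of the idea rather than a reference implementation:

```python
import random

def common_region(t1, t2, path=()):
    """Paths of node pairs in the structurally common region: the roots are
    always common, and children are common when their parents have equal arity."""
    region = [path]
    a1 = len(t1) - 1 if isinstance(t1, tuple) else 0
    a2 = len(t2) - 1 if isinstance(t2, tuple) else 0
    if a1 == a2:
        for k in range(a1):
            region += common_region(t1[k + 1], t2[k + 1], path + (k,))
    return region

def get_subtree(t, path):
    return t if not path else get_subtree(t[path[0] + 1], path[1:])

def set_subtree(t, path, sub):
    if not path:
        return sub
    k = path[0]
    return t[:k + 1] + (set_subtree(t[k + 1], path[1:], sub),) + t[k + 2:]

def one_point_crossover(p1, p2, rng=random):
    # Crossover points are the links above the non-root nodes of the common region.
    points = [p for p in common_region(p1, p2) if p]
    if not points:                                # both parents are single leaves
        return p1, p2
    pt = rng.choice(points)                       # same point used in both parents
    return (set_subtree(p1, pt, get_subtree(p2, pt)),
            set_subtree(p2, pt, get_subtree(p1, pt)))

p1 = ('+', ('*', '2', 'x'), ('-', 'y', 'z'))
p2 = ('/', ('+', 'x', 'y'), 'z')
print(one_point_crossover(p1, p2, random.Random(0)))
```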

Pessimistic Schema Theory: Poli-Langdon Theorem (pessimistic, 1-point crossover).
$E[m(s,t+1)] \geq n\,p(s,t)\,(1-p_m)^{O(s)}\left\{\, 1 - p_c\left[\, p_{\mathit{diff}}(t)\,(1 - p(G(s),t)) + \frac{L(s)}{N(s)}\,(p(G(s),t) - p(s,t)) \,\right]\right\}$
This is a pessimistic schema theorem for 1-point crossover using any selection method. The $n\,p(s,t)$ factor accounts for selection and $(1-p_m)^{O(s)}$ for mutation; inside the brackets, the first term is the probability of a disruptive crossover with a parent not in G(s), and the second is the probability of a disruptive crossover with a parent in G(s).

Outline of Discussion. Part I: Overview of Schema Theory. Part II: Pessimistic Schema Theory. Part III: Exact Schema Theory. Part IV: Conclusions.

Exact Schema Theory. Two added elements: tight bounds on $E[m(s,t+1)]$, by forming an equality rather than an inequality, and the ability to estimate variance and thus determine the certainty of the estimate. Exact theory accounts for both creation and survival of schemata, and assumes parents are selected independently. Exact in what sense? Because it is possible to predict with a known certainty whether $m(s,t+1)$ will be above a certain threshold, the expectation operator can be removed from the theorem.

Exact Schema Theory: Transmission Probability. We want to know $\alpha(s,t)$: the probability that, at generation $t$, an individual produced via the genetic operators (including selection and cloning) will be in schema $s$; that is, $\alpha(s,t) = \Pr[s \text{ survives}] + \Pr[s \text{ is constructed}]$. $\alpha(s,t)$ can be quite difficult to elicit. But assuming we knew it, and parents are selected independently, then $m(s,t+1)$ is binomially distributed: $\Pr[m(s,t+1) = k] = \binom{n}{k}\,\alpha(s,t)^k\,(1-\alpha(s,t))^{n-k}$.

Exact Schema Theory: Distribution of $m(s,t+1)$. Now we can compute expectation and variance exactly: $E[m(s,t+1)] = n\,\alpha(s,t)$ and $\mathrm{Var}[m(s,t+1)] = n\,\alpha(s,t)\,(1-\alpha(s,t))$. Using Chebyshev's inequality and some algebra, we get a probabilistic schema theorem: $\Pr\!\left[\, m(s,t+1) > n\alpha(s,t) - k\sqrt{n\alpha(s,t)(1-\alpha(s,t))} \,\right] \geq 1 - \frac{1}{k^2}$ for any given constant $k$. We now have a relationship between the bounding condition and the accuracy of the prediction.
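
A small Python sketch of this calculation (the numbers in the example are hypothetical):

```python
import math

def schema_count_stats(alpha, n, k=2.0):
    """Expectation, variance, and a Chebyshev-style lower bound for m(s, t+1)
    when each of the n offspring independently lands in s with probability alpha."""
    mean = n * alpha
    var = n * alpha * (1 - alpha)
    threshold = mean - k * math.sqrt(var)   # m(s, t+1) exceeds this ...
    prob = 1 - 1 / k**2                     # ... with at least this probability
    return mean, var, threshold, prob

# Hypothetical numbers: transmission probability 0.3, population size 100, k = 2.
print(schema_count_stats(alpha=0.3, n=100, k=2.0))
# (30.0, 21.0, ~20.8, 0.75): m(s, t+1) > 20.8 with probability at least 0.75
```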

Exact Schema Theory: Stephens & Waelbroeck's Schema Theorem. An exact schema theorem for GAs with fixed-length binary representations and one-point crossover (no mutation): $\alpha(s,t) = (1-p_c)\,p(s,t) + \frac{p_c}{l-1}\sum_{i=1}^{l-1} p(L(s,i),t)\;p(R(s,i),t)$, where $L(s,i)$ replaces the right positions $(i+1),\dots,l$ of $s$ with * and $R(s,i)$ replaces the left positions $1,\dots,i$ with *.
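
As an illustration, the sketch below evaluates this $\alpha(s,t)$ for a toy binary population under proportional selection; the helper names (left_part, right_part, p_schema) and the example population are assumptions made for the demo, not part of the slides:

```python
def matches(schema, v):
    return all(c == '*' or c == b for c, b in zip(schema, v))

def p_schema(schema, pop, fit):
    """Probability of selecting a member of the schema under proportional selection."""
    total = sum(fit(v) for v in pop)
    return sum(fit(v) for v in pop if matches(schema, v)) / total

def left_part(s, i):   # L(s, i): keep positions 1..i, '*' elsewhere
    return s[:i] + '*' * (len(s) - i)

def right_part(s, i):  # R(s, i): keep positions i+1..l, '*' elsewhere
    return '*' * i + s[i:]

def alpha(s, pop, fit, p_c):
    """Transmission probability for 1-point crossover, no mutation
    (Stephens & Waelbroeck style)."""
    l = len(s)
    no_xo = (1 - p_c) * p_schema(s, pop, fit)          # cloned parent already in s
    xo = (p_c / (l - 1)) * sum(                        # left part from one parent,
        p_schema(left_part(s, i), pop, fit) *          # right part from the other
        p_schema(right_part(s, i), pop, fit)
        for i in range(1, l))
    return no_xo + xo

pop = ['1000', '1010', '0110', '1111']                 # toy population
print(alpha('1**0', pop, fit=lambda v: v.count('1'), p_c=0.7))
```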

Exact Schema Theory: Fixed Size/Shape GP Schema Theorem. All programs have the same size and shape, with one-point GP crossover: $\alpha(s,t) = (1-p_c)\,p(s,t) + \frac{p_c}{N(s)-1}\sum_{i=1}^{N(s)-1} p(l(s,i),t)\;p(u(s,i),t)$, where $N(s)$ is the number of nodes in schema $s$, $l(s,i)$ replaces all nodes above crossover point $i$ with = nodes, and $u(s,i)$ replaces all nodes below crossover point $i$ with = nodes.

Exact Schema Theory: Hyperschemata. The set of functions is denoted F and the set of terminals T. The = symbol is a don't-care for a single function or terminal; the # symbol is a don't-care for any valid subtree. A hyperschema s is a rooted tree composed of internal nodes from $F \cup \{=\}$ and leaf nodes from $T \cup \{=, \#\}$.

Exact Schema Theory: Hyperschema Example. With F = {+, -, *, /} and T = {1, ..., 9, x, y, z}, the program (+ (* 2 (+ x y)) (/ y z)) is a member of the hyperschema (= (= 2 #) (/ = z)).

Exact Schema Theory: Hyperschemata Notation/Measures. Let $v_i$ denote a member of $P_t$. The number of nodes in the tree fragment representing the common region between programs $v_1$ and $v_2$ is $NC(v_1,v_2)$; the set of indices of crossover points in that common region is $C(v_1,v_2)$. The delta (indicator) function $\delta(x)$ is 1 if $x$ is true and 0 otherwise. $L(s,i)$ is the hyperschema obtained by replacing all nodes on the path between crossover point $i$ and the root node with = nodes, and all subtrees connected to those nodes with # nodes. $U(s,i)$ is the hyperschema obtained by replacing all nodes below crossover point $i$ with a # node.

Exact Schema Theory: Microscopic Exact GP Schema Theorem. Valid for populations of programs of any size and shape; a generalization of Rosca's and Poli-Langdon's schemata (hence "hyper"); one-point crossover, no mutation. Microscopic because it must consider each member of the population in detail:
$\alpha(s,t) = (1-p_c)\,p(s,t) + p_c \sum_{v_1}\sum_{v_2} \frac{p(v_1,t)\,p(v_2,t)}{NC(v_1,v_2)} \sum_{i \in C(v_1,v_2)} \delta(v_1 \in L(s,i))\;\delta(v_2 \in U(s,i))$
where the first two sums are over all individuals in the population.

Exact Schema Theory: Advantages of the Macroscopic Schema Theorem. It is a way of coarse-graining the right-hand side: we do not have to consider each individual in the population, and mainly we are interested in average properties. It becomes a proper generalization of all the schema theorems described so far: Holland's ST, Stephens & Waelbroeck's ST, Poli-Langdon's fixed size & shape ST, and Poli-Langdon's microscopic exact ST.

Exact Schema Theory: Macroscopic Exact GP Schema Theorem. $G_i$ is the order-0 schema of the $i$-th fixed size and shape; enumerating all fixed size and shape order-0 schemata $G_1, G_2, \dots$, we cover the entire search space. For one-point crossover with no mutation:
$\alpha(s,t) = (1-p_c)\,p(s,t) + p_c \sum_j \sum_k \frac{1}{NC(G_j,G_k)} \sum_{i \in C(G_j,G_k)} p(L(s,i) \cap G_j,\,t)\;p(U(s,i) \cap G_k,\,t)$

Exact Schema Theory: Other Macroscopic GP Schema Theorems. Exact macroscopic theorems also exist for homologous crossover, node-invariant subtree-crossing crossover, and standard crossover; and linear, length-only schema theories exist for homologous crossover, headless chicken crossover, and subtree mutation/crossover.

Exact Schema Theory: GP and the Building Block Hypothesis. O'Reilly (1995) suggests crossover is too destructive, so there is no BBH for GP. Langdon & Poli (for 1-point crossover) suggest otherwise: $L(s,i) \cap G_j$ and $U(s,i) \cap G_k$ behave like GA building blocks, and GP really does build solutions via the construction of lower-order building blocks. But GP building blocks do not have to be of above-average fitness, short, or even of particularly low order.

Outline of Discussion. Part I: Overview of Schema Theory. Part II: Pessimistic Schema Theory. Part III: Exact Schema Theory. Part IV: Conclusions.

Conclusions: Traditional versus Exact Schema Theory. Traditional: GA/binary representation; fixed-length representation; a pessimistic inequality provides a lower bound on the expectation. Exact: main focus has been on GP; variable-length representation; uses an equality, so it correctly predicts the expectation, and the probability of that expectation can be computed.

Conclusions: Riccardo Poli's View of GP Schema Theory. Compatibility with existing theory: GP schema theory is a superset of GA schema theory (GAs are a subset of GP), and it overlaps with dynamical systems and Markov modeling. A framework for establishing specialized theorems: there is no single correct schema theorem, it depends on the context of the question; one must find the right level of granularity; and different operators lead to different schema theorems. A valid and useful tool for researchers: the meaning of GP building blocks is clearer; exact GP schema theory can help (and has helped) guide designers to create better representations, operators, and algorithms; and it can tell (and has told) us useful things about how GP works.

Conclusions: What has Schema Theory told us? Prediction: local estimations for schema propagation; local estimations of average fitness improvement; ranking operators for a given fitness function (deriving Price's theorem). Explanation: understanding formally why certain things happen, such as when algorithms behave differently for different initial populations, rate changes of different selection operators, and why (and how) GP adds or removes individuals from a schema. Understanding operator bias: when mutation and recombination have creation/survival advantages over one another in GAs (Spears, 2000); linear 1-point crossover is unbiased with respect to program length (McPhee, 2001); standard crossover is biased toward longer programs, but ones in which primitives are uniformly distributed in quantity and position (a generalized Geiringer's theorem) (Poli, 2002); combining operators may arrest the growth caused by standard crossover (model validation studies) (McPhee, 2002).

Conclusions: What Schema Theory hasn't told us. How to build the perfect EA to solve problems of some particular problem class (omnipotence). What kind of long-term behaviors we can expect from our EAs (dynamical systems, global analysis). Whether or not EAs (nearly) converge to (near) optimal solutions, and under what contexts (dynamical systems?). How long we can expect to wait for an EA to converge (global analysis). What kind of car Sean should buy, without verbal feedback (omnipotence, telepathy).

Conclusions: References.
Goldberg, D. Genetic Algorithms in Search, Optimization and Machine Learning. 1989.
Langdon, W. and Poli, R. Foundations of Genetic Programming. 2002.
Langdon, W. and Poli, R. Tutorial on Foundations of Genetic Programming. In GECCO 2002 Tutorials. 2002.
Michalewicz, Z. Genetic Algorithms + Data Structures = Evolution Programs, 3rd Revised and Extended Edition. 1996.
McPhee, N. et al. A schema theory analysis of the evolution of size in genetic programming with linear representations. In EuroGP 2001 Proceedings. 2001.
Poli, R. et al. On the Search Biases of Homologous Crossover in Linear Genetic Programming and Variable-length Genetic Algorithms. In GECCO 2002 Proceedings. 2002.
Poli, R. and McPhee, N. Markov chain models for GP and variable-length GAs with homologous crossover. In GECCO 2001 Proceedings. 2001.
Spears, W. Evolutionary Algorithms: The Role of Mutation and Recombination. 2000.
Stephens, C. and Waelbroeck, H. Schemata Evolution and Building Blocks. In Evolutionary Computation 7(2). 1999.