Lehrstuhl für Informatik 2




Analytical Learning

Introduction

Explanation is used to distinguish the relevant features of the training examples from the irrelevant ones, so that the examples can be generalised. Prior knowledge and deductive reasoning are used to augment the information provided by the training examples and to reduce the complexity of the hypothesis space. Assumption: the learner's prior knowledge is correct and complete.

Difference between Inductive and Analytical Learning

Example: learn to recognise important classes of games.
Goal: recognise chessboard positions in which black will lose its queen within two moves.

Induction can be employed, but there is a problem: thousands of training examples similar to this one are needed. Explanations given by human beings provide the information needed to rationally generalise from the details of a single example:
- Suggested target hypothesis: board positions in which the black king and queen are simultaneously attacked.
- Not suggested: board positions in which four white pawns are still in their original locations.

Prior knowledge: e.g. knowledge about the rules of chess (the legal moves, how the game is won, ...). Given just this prior knowledge it is in principle possible to calculate the optimal chess move for any board position; in practice this calculation is frustratingly complex. Goal: a learning algorithm that automatically constructs and learns from such explanations.

Analytical learning methods seek a hypothesis that fits the learner's prior knowledge and covers the training examples. Explanation-based learning (EBL) is a form of analytical learning in which the learner processes each new training example by
1. explaining the observed target value for this example in terms of the domain theory,
2. analysing this explanation to determine the general conditions under which the explanation holds, and
3. refining its hypothesis to incorporate these general conditions.

Difference between Inductive and Analytical Learning (2)

The two settings assume different formulations of the learning problem:

Inductive learning:
- Input: a hypothesis space H and a set of training examples D = {⟨x1, f(x1)⟩, ..., ⟨xn, f(xn)⟩}
- Output: a hypothesis h that is consistent with these training examples

Analytical learning:
- Input: a hypothesis space H, a set of training examples D = {⟨x1, f(x1)⟩, ..., ⟨xn, f(xn)⟩}, and a domain theory B consisting of background knowledge that can be used to explain the training examples
- Output: a hypothesis h that is consistent with both the training examples D and the domain theory B

Difference between Inductive and Analytical Learning (3)

Illustration (chess): f(xi) is True if xi is a position in which black will lose its queen within two moves, and False otherwise. H is a set of Horn clauses whose predicates refer to the position or relative position of specific pieces. B is a formalisation of the rules of chess.

New Example: SafeToStack

Given:
- Instance space X: each instance describes a pair of objects, represented by the predicates Type, Color, Volume, Owner, Material, Density and On.
- Hypothesis space H: each hypothesis is a set of Horn clauses. The head of each clause is a literal of the target predicate SafeToStack. The body of each clause is a conjunction of literals based on the same predicates used to describe the instances, plus the predicates LessThan, Equal and GreaterThan and the functions plus, minus and times.
- Target concept: SafeToStack(x,y), e.g. the hypothesis
  SafeToStack(x,y) ← Volume(x,vx) ∧ Volume(y,vy) ∧ LessThan(vx,vy)
- Training example:
  On(Obj1, Obj2), Type(Obj1, Box), Type(Obj2, Endtable),
  Color(Obj1, Red), Color(Obj2, Blue), Volume(Obj1, 2),
  Owner(Obj1, Fred), Owner(Obj2, Louise), Density(Obj1, 0.3),
  Material(Obj1, Cardboard), Material(Obj2, Wood)
- Domain theory B:
  SafeToStack(x,y) ← ¬Fragile(y)
  SafeToStack(x,y) ← Lighter(x,y)
  Lighter(x,y) ← Weight(x,wx) ∧ Weight(y,wy) ∧ LessThan(wx,wy)
  Weight(x,w) ← Volume(x,v) ∧ Density(x,d) ∧ Equal(w, times(v,d))
  Weight(x,5) ← Type(x, Endtable)
  Fragile(x) ← Material(x, Glass)

Determine: a hypothesis from H that is consistent with the training examples and the domain theory.

Learning with Perfect Domain Theories: PROLOG-EBG

A domain theory is said to be correct if each of its assertions is a truthful statement about the world. A domain theory is complete with respect to a given target concept and instance space if the domain theory covers every positive example in the instance space. It is not required that the domain theory be able to prove that negative examples do not satisfy the target concept.
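To make the Horn clause notation concrete, here is the domain theory and the training example rendered as a small logic program (a minimal sketch, assuming SWI-Prolog; the lower-case names and the negation-as-failure reading of ¬Fragile are conventions of this sketch, not of the slides):

    % Domain theory B from the slides, as executable Prolog.
    safe_to_stack(_X, Y) :- \+ fragile(Y).     % SafeToStack(x,y) <- ¬Fragile(y)
    safe_to_stack(X, Y)  :- lighter(X, Y).
    lighter(X, Y) :- weight(X, WX), weight(Y, WY), WX < WY.
    weight(X, W)  :- volume(X, V), density(X, D), W is V * D.
    weight(X, 5)  :- type(X, endtable).
    fragile(X)    :- material(X, glass).

    % The training example as ground facts.
    on(obj1, obj2).
    type(obj1, box).         owner(obj1, fred).
    type(obj2, endtable).    owner(obj2, louise).
    color(obj1, red).        density(obj1, 0.3).
    color(obj2, blue).       material(obj1, cardboard).
    volume(obj1, 2).         material(obj2, wood).

    % ?- safe_to_stack(obj1, obj2).   % provable, e.g. via the Lighter clause:
    % weight(obj1, 0.6) by volume * density, weight(obj2, 5) by the Endtable rule.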

Learning with Perfect Domain Theories: PROLOG-EBG (2)

Question: if the learner must be given a perfect domain theory, why would it need to learn at all?
Answer: there are cases in which it is feasible to provide a perfect domain theory (chess is one). In general, however, it is unreasonable to assume that a perfect domain theory is available; the realistic assumption is that plausible explanations based on imperfect domain theories must be used, rather than exact proofs based on perfect knowledge.

The algorithm:

  PROLOG-EBG(TargetConcept, TrainingExamples, DomainTheory)
    LearnedRules <- {}
    Pos <- the positive examples from TrainingExamples
    for each PositiveExample in Pos that is not covered by LearnedRules do
      1. Explain: Explanation <- an explanation (proof) in terms of the
         DomainTheory that PositiveExample satisfies the TargetConcept
      2. Analyse: SufficientConditions <- the most general set of features of
         PositiveExample sufficient to satisfy the TargetConcept according
         to the Explanation
      3. Refine: LearnedRules <- LearnedRules + NewHornClause, where
         NewHornClause is of the form TargetConcept <- SufficientConditions
    return LearnedRules

Learning with Perfect Domain Theories: PROLOG-EBG (3)

PROLOG-EBG (Kedar-Cabelli and McCarty, 1987) is a sequential covering algorithm. When given a complete and correct domain theory, it is guaranteed to output a hypothesis (a set of rules) that is correct and that covers the observed positive training examples. Its output is a set of logically sufficient conditions for the target concept, according to the domain theory.

An Illustrative Trace

Because the domain theory is correct and complete, the explanation constitutes a proof that the training example satisfies the target concept. In general there may be multiple possible explanations; any or all of them may be used. The explanation is generated by backward-chaining search, as performed by PROLOG.

An Illustrative Trace (2)

The important question in the generalisation process: of the many features that happen to be true of the current training example, which ones are generally relevant to the target concept? The explanation constructs the answer: precisely the features mentioned in the explanation. For SafeToStack, the rule justified by the explanation is

  SafeToStack(x,y) ← Volume(x,2) ∧ Density(x,0.3) ∧ Type(y,Endtable)

whose leaf nodes in the proof tree additionally expect Equal(0.6, times(2,0.3)) and LessThan(0.6,5).

An Illustrative Trace (3)

The explanation of the training example forms the proof of the correctness of this rule. PROLOG-EBG computes the most general rule that can be justified by the explanation, by computing the weakest preimage of the explanation.

Definition: the weakest preimage of a conclusion C with respect to a proof P is the most general set of initial assertions A such that A entails C according to P.

Example:

  SafeToStack(x,y) ← Volume(x,vx) ∧ Density(x,dx) ∧ Equal(wx, times(vx,dx)) ∧ LessThan(wx,5) ∧ Type(y,Endtable)

An Illustrative Trace (4)

PROLOG-EBG computes the weakest preimage of the target concept with respect to the explanation using a general procedure called regression: go iteratively backward through the explanation, first computing the weakest preimage of the target concept with respect to the final proof step in the explanation, then computing the weakest preimage of the resulting expressions with respect to the preceding step, and so on.
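The Explain and Analyse steps, including the regression just described, can be captured by the classic EBG meta-interpreter, which proves the specific goal and its generalisation in parallel through copies of the same clauses and collects the operational leaves of the generalised proof, i.e. the weakest preimage. Below is a self-contained sketch in this spirit (assumptions: SWI-Prolog; the ¬Fragile clause is omitted for clarity, and Equal/LessThan are encoded as the helper predicates equal/2 and less_than/2 so the regressed leaves stay symbolic, as in the trace above):

    :- dynamic safe_to_stack/2, lighter/2, weight/2.

    safe_to_stack(X, Y) :- lighter(X, Y).
    lighter(X, Y) :- weight(X, WX), weight(Y, WY), less_than(WX, WY).
    weight(X, W)  :- volume(X, V), density(X, D), equal(W, times(V, D)).
    weight(X, 5)  :- type(X, endtable).

    % Operational predicates: instance-description facts and evaluable checks.
    operational(volume(_, _)).
    operational(density(_, _)).
    operational(type(_, _)).
    operational(equal(_, _)).
    operational(less_than(_, _)).

    volume(obj1, 2).
    density(obj1, 0.3).
    type(obj2, endtable).
    equal(W, times(V, D)) :- W is V * D.
    less_than(A, B) :- A < B.

    % ebg(+Goal, +GenGoal, -Leaves): prove the specific Goal against the domain
    % theory while resolving the generalised goal GenGoal through fresh copies
    % of the same clauses; Leaves collects the operational leaves of the
    % generalised proof, i.e. the weakest preimage of the target concept.
    ebg(true, true, true) :- !.
    ebg(Goal, GenGoal, GenGoal) :-
        operational(Goal), !,
        call(Goal).                 % verify the leaf on the training example
    ebg((G1, G2), (Gen1, Gen2), (L1, L2)) :- !,
        ebg(G1, Gen1, L1),
        ebg(G2, Gen2, L2).
    ebg(Goal, GenGoal, Leaves) :-
        clause(GenGoal, GenBody),   % one backward-chaining step on the generalised goal
        copy_term((GenGoal :- GenBody), (Goal :- Body)),  % fresh copy for the specific goal
        ebg(Body, GenBody, Leaves).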

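Running the sketch on the training example extracts exactly the regressed rule from the trace (a usage example; output shown schematically, up to bracketing of the conjunction):

    % ?- ebg(safe_to_stack(obj1, obj2), safe_to_stack(X, Y), Leaves),
    %    portray_clause((safe_to_stack(X, Y) :- Leaves)).
    %
    % safe_to_stack(X, Y) :-
    %     volume(X, V), density(X, D), equal(W, times(V, D)),
    %     type(Y, endtable), less_than(W, 5).
    %
    % In the slides' notation:
    % SafeToStack(x,y) <- Volume(x,vx) ∧ Density(x,dx) ∧
    %                     Equal(wx, times(vx,dx)) ∧ LessThan(wx,5) ∧ Type(y,Endtable)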
Remarks on Explanation-Based Learning

Key properties:
- PROLOG-EBG produces justified general hypotheses by using prior knowledge to analyse individual examples.
- The explanation of how an example satisfies the target concept determines which example attributes are relevant: exactly those mentioned by the explanation.
- Regressing the target concept to determine its weakest preimage with respect to the explanation allows deriving more general constraints on the values of the relevant features.
- Each learned Horn clause corresponds to a sufficient condition for satisfying the target concept.
- The generality of the learned Horn clauses depends on the formulation of the domain theory and on the sequence in which the training examples are considered.
- PROLOG-EBG implicitly assumes that the domain theory is correct and complete.

Remarks on Explanation-Based Learning (2)

Related perspectives that help to understand its capabilities and limitations:
- EBL as theory-guided generalisation of examples: rational generalisation from examples makes it possible to avoid the bounds on sample complexity that occur in purely inductive learning.
- EBL as example-guided reformulation of theories: a method for reformulating the domain theory into a more operational form, i.e. creating rules that deductively follow from the domain theory and classify the observed training examples in a single inference step.

Remarks on Explanation-Based Learning (3)

EBL is "just restating what the learner already knows". In what sense does this help learning? Knowledge reformulation: in many tasks the difference between what one knows in principle and what one can efficiently compute in practice may be great. In such situations the complete, perfect domain theory is already known to the (human) learner, and further learning is "simply" a matter of reformulating this knowledge into a form in which it can be used more effectively to select appropriate moves.

Remarks on Explanation-Based Learning (4)

Knowledge compilation: EBL involves reformulating the domain theory to produce general rules that classify examples in a single inference step.

Discovering new features: an interesting capability of PROLOG-EBG is its ability to formulate new features that are not explicit in the description of the training examples but that are needed to describe the general rule underlying them. In the example above it derives the feature Volume × Density < 5, matching the leaves Equal(wx, times(vx,dx)) and LessThan(wx,5) of the learned rule. This capability is similarly found in the hidden units of neural networks: like the BACKPROPAGATION algorithm, PROLOG-EBG automatically formulates such features in its attempt to fit the training data. But whereas in neural networks the features are developed in a statistical process, in PROLOG-EBG they are derived in an analytical process.

Summary

- PROLOG-EBG uses first-order Horn clauses in its domain theory and in its learned hypotheses.
- The explanation is a PROLOG proof, and the hypothesis extracted from the explanation is the weakest preimage of this proof.
- Analytical learning methods construct useful intermediate features as a side effect of analysing individual training examples.
- Other deductive learning procedures can extend the deductive closure of their domain theory.
- PRODIGY and SOAR have demonstrated the utility of explanation-based learning for automatically acquiring effective search-control knowledge that speeds up problem solving.
- Disadvantage: purely deductive implementations such as PROLOG-EBG produce a correct output only if the domain theory is correct.
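To make the feature-discovery point concrete: the leaves Equal(wx, times(vx,dx)) and LessThan(wx,5) compose into a threshold feature that appears nowhere in the raw instance description. A minimal sketch, with a hypothetical predicate name light_enough/1 that is not part of the lecture's vocabulary:

    % Hypothetical composed feature: x's volume times density lies below the
    % Endtable weight of 5. EBG derives this combination analytically; the raw
    % instance description lists only Volume and Density separately.
    light_enough(X) :- volume(X, V), density(X, D), W is V * D, W < 5.

    % ?- light_enough(obj1).   % true, since 2 * 0.3 = 0.6 < 5.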